Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Web Scraping Project

Hi all,

 

Ive built a simple scraper which pulls pricing data instead of copy/pasting it manually.  Currently each Query is pointed to a product URL by changing the "source" of the query.  Is there a way that I can create a table of URLs and have one query look at each of them and extract the Source code data, instead of having separate Queries for each product page and appending them?

 

It would save me considerable copy/paste time if I can acheve this.


Thanks

3 REPLIES 3
Anonymous
Not applicable

Hi @Anonymous 

 

I assume you have something like this.

 

let
    Source = Web.BrowserContents("https:/abc.com/?page=12"),
    //Your addtional tranfomation steps goes below
    .
    ...
    ....
    #"LastStep" = ....
in
    #"LastStep"

 

 

Then you have other URLs for which you need to apply same steps. In that case follow below steps.

 

  1. Create a new table with single column(URL) having all the urls you want. Lets say this table Products.
  2. Now change above query as below, this creates a custom function for you.

 

(url as text) =>
let
    Source = Web.BrowserContents(url), // Replace the hardcoded url with parameter url
    //Your addtional tranfomation steps goes below
    .
    ...
    ....
    #"LastStep" = ....
in
    #"LastStep"​

 

  • Go to the Products table created in step 1. Click on the table icon displayed on top left corner of the table.tableicon.PNG
  • Then choose Invoke Custom Function.
  • Under function query choose the function created in step 2.
  • For url select column name URL  and hit ok.
  • Then click on expand icon as below expand.PNG

Thats all you need I hope.

 

 

Appreciate with kudos by clicking the like button on bottom right.

Please mark as a solution if this solves your problem.

 

Thanks

 

 

 

 

Hi @Anonymous ,

as @Anonymous  said.

 

But if you want to regularly refresh the results in Power BI service, you have to move the dynamic URLs into the query parameters instead. Otherwise you'll get an error complaining about dynamic data sources: https://www.thebiccountant.com/2018/03/22/web-scraping-2-scrape-multiple-pages-power-bi-power-query/

 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

Hi @Anonymous you can create a power query function a Table with URLS and iterate over those URLS to get the data.

 

Thanks.

 

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.