Helper III

Web Scraping Project

Hi all,

 

I've built a simple scraper that pulls pricing data so I don't have to copy and paste it manually. Currently each query points to one product URL, which I set by changing the "Source" step of the query. Is there a way to create a table of URLs and have a single query visit each of them and extract the source code data, instead of keeping a separate query for each product page and appending them?

 

It would save me considerable copy/paste time if I can achieve this.


Thanks

3 REPLIES
Resolver III

Hi @pchapple 

 

I assume you have something like this.

 

let
    Source = Web.BrowserContents("https://abc.com/?page=12"),
    // Your additional transformation steps go below
    .
    ...
    ....
    #"LastStep" = ....
in
    #"LastStep"

 

 

You also have other URLs to which the same steps should be applied. In that case, follow the steps below.

 

  1. Create a new table with a single column (URL) containing all the URLs you want. Let's call this table Products.
  2. Now change the query above as shown below. This turns it into a custom function.

 

(url as text) =>
let
    Source = Web.BrowserContents(url), // Replace the hardcoded URL with the url parameter
    // Your additional transformation steps go below
    .
    ...
    ....
    #"LastStep" = ....
in
    #"LastStep"

 

  • Go to the Products table created in step 1 and click the table icon in the top-left corner of the table.
  • Choose Invoke Custom Function.
  • Under Function query, choose the function created in step 2.
  • For url, select the column name URL and click OK.
  • Then click the expand icon in the header of the new column.
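
For reference, the UI steps above roughly correspond to the M code below. This is only a sketch: GetPrices is a hypothetical name for the function from step 2, its body is a placeholder that returns the raw page HTML in a one-row table (your own transformation steps would go there), and the two URLs and the column names "Data" and "Html" are made up for illustration.

```powerquery
let
    // Hypothetical stand-in for the custom function created in step 2.
    // Replace the body with your own transformation steps.
    GetPrices = (url as text) as table =>
        let
            Source = Web.BrowserContents(url),
            // Placeholder result: one row holding the raw HTML of the page
            Result = #table({"Html"}, {{Source}})
        in
            Result,

    // The Products table from step 1 (example URLs only)
    Products = #table(
        {"URL"},
        {
            {"https://abc.com/?page=12"},
            {"https://abc.com/?page=13"}
        }
    ),

    // "Invoke Custom Function": call GetPrices once per row
    Invoked = Table.AddColumn(Products, "Data", each GetPrices([URL])),

    // "Expand": flatten the nested tables into regular columns
    Expanded = Table.ExpandTableColumn(Invoked, "Data", {"Html"})
in
    Expanded
```

When you expand, list whichever columns your own function actually returns in place of "Html".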

That's all you need, I hope.

 

 

Appreciate with kudos by clicking the like button on bottom right.

Please mark as a solution if this solves your problem.

 

Thanks

 

 

 

 

Hi @pchapple ,

as @sparse-coder  said.

 

But if you want to refresh the results regularly in the Power BI service, you have to keep the base URL static and move the dynamic part of each URL into the query parameters instead. Otherwise the refresh fails with an error complaining about dynamic data sources: https://www.thebiccountant.com/2018/03/22/web-scraping-2-scrape-multiple-pages-power-bi-power-query/
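
A minimal sketch of that idea, assuming the pages differ only in a page query parameter as in the example URL above. It uses Web.Contents with its Query option rather than Web.BrowserContents, which is a swap on my part: Web.Contents returns the raw response and does not render JavaScript, so check that your page still delivers the data this way. The parameter name pageNumber is hypothetical.

```powerquery
(pageNumber as text) as text =>
let
    // The base URL is a constant string, so the service can identify the data source.
    // Only the Query record changes from call to call.
    Source = Web.Contents(
        "https://abc.com/",
        [Query = [page = pageNumber]]
    ),
    // Web.Contents returns binary; convert it to text before parsing the HTML
    Html = Text.FromBinary(Source)
in
    Html
```

The Products table would then hold the page values (for example "12") instead of full URLs, and the function is invoked per row in the same way as before.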

 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!


Resolver III

Hi @pchapple, you can create a Power Query function and a table of URLs, then iterate over those URLs to get the data.

 

Thanks.

 

 
