Anonymous
Not applicable

Website scraping advice

I'm trying to scrape a website with a pretty simple HTML table, but it uses JavaScript for pagination, and I can only get the first 25 results when using the web connector. I've tried using

 [WaitFor = [Timeout = #duration(0,0,0,0)]])

to see if Power BI could pick up the table before the JavaScript loads. I'm not sure if that's how it works, but it hasn't given me any results yet.

 

Is there anything I can do? This is the website and data in question: 

http://www.onequestionshootout.xyz/episodes/series_all.htm

1 ACCEPTED SOLUTION

@Anonymous ,

 

I would suggest using a Python script in Power BI to scrape the website. For how to configure the Python environment and run a Python script in Power BI Desktop, please refer to the doc below:

https://docs.microsoft.com/en-us/power-bi/desktop-python-scripts
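
For anyone trying this route, a minimal sketch of such a script might look like the following. It assumes that all table rows are already present in the HTML the server returns (with JavaScript only paginating them in the browser), which has not been verified for this site; if the rows are fetched by JavaScript, a plain HTTP request will not see them. The names `TableRowParser`, `parse_table_rows` and `fetch_table_rows` are made up for illustration, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableRowParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr>.

    Illustrative helper: it does not distinguish between several
    tables on one page or handle nested tables.
    """

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read (None outside <tr>)
        self._cell = None  # text pieces of the cell being read (None outside a cell)

    def _close_cell(self):
        # Store the current cell, if any, in the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        # Store the current row, if it has any cells.
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td> or </th>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()


def parse_table_rows(html):
    """Return all table rows found in an HTML string as lists of strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_table_rows(url):
    """Download a page and parse its table rows (assumes UTF-8 text)."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return parse_table_rows(html)
```

Calling `fetch_table_rows` with the page URL would return the rows as plain lists. Power BI Desktop's Python script source imports pandas DataFrames, so inside Power BI the rows would still need to be wrapped, for example with `pandas.DataFrame(rows[1:], columns=rows[0])` if the first row holds the column headers.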

 

Community Support Team _ Jimmy Tao

If this post helps, then please consider accepting it as the solution to help other members find it more quickly.


6 REPLIES
Anonymous
Not applicable

I'm afraid my skills at this point won't allow for Python scripting, so in the meantime I've downloaded the page as .html and used the Text/CSV data connector to get the table in plain HTML. The downside, of course, is that I cannot get the latest updates to my report over the internet.

@Anonymous 

Wow! Thank you for sharing the idea of downloading the page as HTML! I was having the same problem as you and was completely stuck. With your solution, I have at least succeeded in extracting a "snapshot" of the data as it currently stands, which is better than no data at all...

I would have never thought of downloading the actual page!

 

Thanks!!





Paul on Linkedin.






@Anonymous ,

 

Power Query only supports simple web scraping. If the website requires dynamic scraping, I'm afraid Power Query won't work.
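
One rough way to tell which case a page falls into is to count the table rows in the raw HTML the server sends, before any JavaScript runs. The sketch below does that with the Python standard library; `count_table_rows` and `fetch_html` are hypothetical helper names, and the interpretation of the count is a heuristic, not a guarantee. If the count is well above the 25 rows shown per page, the pagination is probably done in the browser and the full table is already in the source; if it is around 25 or lower, the remaining rows are likely loaded dynamically.

```python
import re
from urllib.request import urlopen


def count_table_rows(html):
    """Count opening <tr> tags in raw HTML, ignoring case."""
    return len(re.findall(r"<tr\b", html, flags=re.IGNORECASE))


def fetch_html(url):
    """Download a page as text (assumes UTF-8; bad bytes are replaced)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

For example, `count_table_rows(fetch_html(url))` gives the row count for a page at `url`.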

 

Community Support Team _ Jimmy Tao


kcantor
Community Champion

@Anonymous 

Perhaps this resource will help.

https://datachant.com/2017/03/30/web-scraping-power-bi-excel-power-query/









Anonymous
Not applicable

It started out promising, but unfortunately I can't get any parameters from the URL, as it doesn't produce any when you navigate through the pages... Tricky!

