PaulDBrown
Community Champion

Extract Data from a table on a website

Hi everyone,

I am trying to extract data from a table on a website. The problem is that the full data is spread over a number of pages. The table has a filter to select the number of rows per view, but the selection is not reflected in the URL. In other words, the URL always defaults to the first page.

How can I extract the full content?

The table I'm trying to extract the data from is here:
http://www.aebec.org/registro-de-barcos/

Thanks for your help!

Paul.

 





Did I answer your question? Mark my post as a solution!
In doing so, you are also helping me. Thank you!

Proud to be a Super User!
Paul on Linkedin.






1 ACCEPTED SOLUTION

Just an update on this challenge. I have been searching the forum for the past few hours and eventually came across this post, which was right on topic:

 

Website scraping advice

 

It seems that you can extract data from this kind of table by using Python (which is way beyond my current scope).

As a temporary solution, the original poster mentioned downloading the actual HTML page and connecting to the file instead of the website. Not ideal, obviously, but at least I now have a snapshot of the data as it stands at the moment. Extracting the data this way was painless, and all the rows of the table were loaded automatically.
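For anyone curious what the Python route might look like on the same saved file, here is a minimal sketch using only the standard library. It assumes the downloaded HTML contains every row of the table (as the snapshot above did); the names `TableRowParser`, `parse_table_rows` and `read_saved_page` are made up for illustration, and the file name in the usage note is a placeholder.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def _flush_cell(self):
        # Close the open cell, if any, and add its text to the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _flush_row(self):
        # Close the open row, if any, and keep it when it has cells.
        self._flush_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush_row()   # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._flush_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag in ("tr", "table"):
            self._flush_row()


def parse_table_rows(html_text):
    """Return all table rows found in an HTML string."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


def read_saved_page(path, encoding="utf-8"):
    """Parse a page saved to disk (the encoding is an assumption)."""
    with open(path, encoding=encoding) as handle:
        return parse_table_rows(handle.read())
```

Calling `read_saved_page("saved_page.html")` on the downloaded file would return one list per table row, header row included. Like the Power BI file connection, this only reads a snapshot, so the page has to be saved again whenever the data changes.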

 






3 REPLIES
Anonymous
Not applicable

Power BI can do this for you.

-Get Data
-Web

Enter the URL and choose the table that's available.

 


@Anonymous 

Thank you for taking the time to look into this. I did get as far as you have suggested. The problem is that the URL input only returns the first 50 rows out of a total of 280. Normally I would write a function and feed it an input table of page numbers, so that each subsequent page of rows is fetched in turn. However, when you move to the next page of the table on the website, the URL remains static: it is the same URL for every page. So I am lost as to how to obtain the complete dataset.
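To make the usual "function plus list of pages" pattern concrete, here is a rough sketch written in Python purely for illustration. It assumes a site whose page number does appear in the URL, which is exactly what this site lacks; the URL template, `page_url`, `collect_rows` and the `fetch_rows` callable are all hypothetical.

```python
from typing import Callable, Iterable, List

Row = List[str]


def page_url(template: str, page: int) -> str:
    """Build the URL for one page from a template containing '{page}'."""
    return template.format(page=page)


def collect_rows(
    fetch_rows: Callable[[str], List[Row]],
    template: str,
    pages: Iterable[int],
) -> List[Row]:
    """Fetch every page in turn and append its rows to one combined list."""
    rows: List[Row] = []
    for page in pages:
        rows.extend(fetch_rows(page_url(template, page)))
    return rows


if __name__ == "__main__":
    # Stand-in fetcher so the sketch runs without touching the network.
    def fake_fetch(url: str) -> List[Row]:
        return [[url, "row data"]]

    # Hypothetical template; a real site would need its own URL pattern.
    combined = collect_rows(fake_fetch, "https://example.invalid/table?page={page}", range(1, 4))
    print(len(combined))
```

Because the URL here never changes between pages, there is nothing to put in the template, which is why this approach stalls at the first 50 rows.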

 

I would appreciate any input as to how to get the query to run through all of the rows in the table, and not just the first 50.

 

Many thanks for your help!




