gancw1
Helper III

Getting simple HTML table from html file is very slow

I have a report of 8000+ records in HTML format (110,000+ lines) on a SharePoint site, and it takes more than 1 hour to import. Manually converting the report to CSV is not an option. Is there any way to speed up reading the HTML table?

 

My last resort is to run a scheduled Power Automate task that uses Excel Online to convert the HTML table into CSV or XLS format.

 

 

1 ACCEPTED SOLUTION
v-stephen-msft
Community Support

Hi @gancw1 ,

 

You can reduce unnecessary data before importing.

Or you can use the "From Web" option in Power Query to read the HTML table directly from the SharePoint site first, and then clean up the data in Power Query for the next load or refresh.

You can remove unnecessary columns, rename columns, and change data types to optimize the data for your needs.
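As a rough illustration of those clean-up steps, here is a minimal M sketch. The inline #table is only a stand-in for the imported HTML table, and the column names (Column1, OrderId, Product and so on) are hypothetical, not taken from the actual report.

```
let
    // Stand-in for the imported HTML table; columns and values are made up
    Source = #table(
        {"Column1", "Column2", "Column3"},
        {{"1001", "Widget", "unused"}, {"1002", "Gadget", "unused"}}
    ),
    // Keep only the columns that are actually needed
    Kept = Table.SelectColumns(Source, {"Column1", "Column2"}),
    // Give the remaining columns meaningful names
    Renamed = Table.RenameColumns(Kept, {{"Column1", "OrderId"}, {"Column2", "Product"}}),
    // Set explicit data types instead of leaving everything as text
    Typed = Table.TransformColumnTypes(Renamed, {{"OrderId", Int64.Type}, {"Product", type text}})
in
    Typed
```

These steps trim what gets loaded, but they run after the HTML has been read, so on their own they may not shorten the time spent parsing the page.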

Use the "Close & Load" option in Power Query to load the data into it. This will create a connection to the SharePoint site, and you can refresh the data whenever you need to.

If the above steps do not work, you can try running a scheduled Power Automate task to use Excel online to convert the HTML table into CSV or XLS format.

 

Best Regards,

Stephen Tao

 

If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.


4 REPLIES
gancw1
Helper III

There are 2 other options that do not rely on anything outside Power BI:
- Use Web.Page, which is significantly faster but requires IE. This means a gateway is required if the app is published to the PBI service.

- Write M code to parse the HTML. It is not too difficult, and there are some examples on the web.
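For the second option, a minimal sketch of what hand-written parsing could look like is below. It assumes a simple table whose data cells are plain `<td>` elements in lowercase, with no nested tables and the same number of cells in every row. The inline Raw text is a made-up stand-in for the real report; in practice it would come from something like Text.FromBinary(Web.Contents(...)) pointed at the file.

```
let
    // Made-up sample standing in for the downloaded report text
    Raw = "<table><tr><th>Id</th><th>Name</th></tr><tr><td>1</td><td>Alpha</td></tr><tr><td>2</td><td>Beta</td></tr></table>",
    // Split on row openings; the first chunk is everything before the first row
    RowChunks = List.Skip(Text.Split(Raw, "<tr"), 1),
    // For one row chunk, split on cell openings and keep the text between
    // the end of the opening tag and the closing </td>
    ParseRow = (row as text) as list =>
        List.Transform(
            List.Skip(Text.Split(row, "<td"), 1),
            each Text.Trim(Text.BetweenDelimiters(_, ">", "</td>"))
        ),
    Rows = List.Transform(RowChunks, ParseRow),
    // Drop rows with no <td> cells, such as a header row made of <th> cells
    DataRows = List.Select(Rows, each List.Count(_) > 0),
    // Columns get default names (Column1, Column2, ...)
    Result = Table.FromRows(DataRows)
in
    Result
```

This only does plain text splitting, so markup inside a cell stays in the cell value and HTML entities are not decoded. It would need adjusting for uppercase tags or rows with differing cell counts.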

Web.Page still functions and refreshes in the service.

 

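let
    // Download the page and let Web.Page parse it; the result has one row per table found
    Source = Web.Page(Web.Contents("https://ssbipolar.com/2021/05/31/roches-maxim/")),
    // Take the Data column of the first row, which holds that table's contents
    Data0 = Source{0}[Data]
in
    Data0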

I am going to explore using a scheduled Power Automate task to use Excel online to convert the HTML table into CSV or XLS format.
