Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
meirlicht
Frequent Visitor

Working with large amounts of data SQL Server

 

I have Windows computer with 16 GB of RAM,  2.13 GHz CPU, and about 60 million rows of data stored in SQL Server. When querying all the data in SSMS it takes about 30 minutes, I am trying to visualize this data in PowerBI. Importing the data is taking too long and returning a timeout error after 5 hours (I have premium per user) and I have seen a lot of peopel reccomend direct query for larger datasets but it is returning "visual exceeded the available recourses".

 

I am nervous about using incremental refresh since I will not be able to make any changes in PowerQuery once the dataset is uploaded and because I am having difficulty with the inital refresh (as I mentioned import is timing out and the initial refresh for incremental refresh is simply an import from my understanding), I have been experementing with the "Data Load" configurations but without much success.

 

How do people deal with having 100 million+ rows of data in their report? What type of database and connection are they using?

2 ACCEPTED SOLUTIONS
GilbertQ
Super User
Super User

Hi @meirlicht 

 

I would highly recommend you use the Incremental refreshing.

 

If you have got a lot of data, what I would suggest is to manually process each partition in your dataset. By doing it manually for each partition, you can then make sure that each query runs within the 5 hours limit before going onto the next partition to then query.

 

I have got a dataset with over 1 billion rows in total (it took me 5 full days to get the data in), but this is the process I followed to get it done.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

View solution in original post

Thank you so much for your help! I am currently manually refreshing each parition in SSMS via XMLA endpoint. I also switched to PostgreSQL since I was running out of storage on using the free SQL Server Express Edition.  The term "Premium Capacity" is used in the above link which threw me off since I thought it was refering to purchasing a seperae capacity, but it is simply talking about having a premium account.

 

Here are the steps I took incase anyone else is struggling with the same issue:

1: Added a filter at the end of my PowerQuery: Table.SelectRows(#"previous step", each Parameter1) Where Parameter1 is set to False

2: Set up incremental refresh policy

3: Publish to PowerBI and refresh dataset. This refresh took one minute since all the data is filtered out because Parameter1 is set to false. This refresh also created all my partitions.

4: In the dataset settings in PowerBI Service, switch Parameter1 = True (I also truned on "Large dataset storage format")

5: In Tabular editor I connected to the PowerBI dataset and edited the partitions to make them smaller so that none would timeout.

6: In SSMS I connected to the PowerBI dataset and manually processed one partiton at a time using "full process" 

View solution in original post

4 REPLIES 4
GilbertQ
Super User
Super User

Hi @meirlicht 

 

I would highly recommend you use the Incremental refreshing.

 

If you have got a lot of data, what I would suggest is to manually process each partition in your dataset. By doing it manually for each partition, you can then make sure that each query runs within the 5 hours limit before going onto the next partition to then query.

 

I have got a dataset with over 1 billion rows in total (it took me 5 full days to get the data in), but this is the process I followed to get it done.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

@GilbertQ Hello, thank you for responding! How do you manually proccess each partition? Are you referring to the "Prevent timeouts on initial full refresh" section of https://learn.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-xmla? I am using SQL Server Express addition. Do you know if this will still work, or do I need a paid version? Also it seems according to this link I would need to purchase Premium Capacity. Ideally I do not want to purhase further subscriptions. 

Hi @meirlicht 

 

I use Power BI premium Per user which allows me to see the partitions and then I can process them using SSMS or Tabular Editor.

 

SQL Server Express should work fine as far as I am aware.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

Thank you so much for your help! I am currently manually refreshing each parition in SSMS via XMLA endpoint. I also switched to PostgreSQL since I was running out of storage on using the free SQL Server Express Edition.  The term "Premium Capacity" is used in the above link which threw me off since I thought it was refering to purchasing a seperae capacity, but it is simply talking about having a premium account.

 

Here are the steps I took incase anyone else is struggling with the same issue:

1: Added a filter at the end of my PowerQuery: Table.SelectRows(#"previous step", each Parameter1) Where Parameter1 is set to False

2: Set up incremental refresh policy

3: Publish to PowerBI and refresh dataset. This refresh took one minute since all the data is filtered out because Parameter1 is set to false. This refresh also created all my partitions.

4: In the dataset settings in PowerBI Service, switch Parameter1 = True (I also truned on "Large dataset storage format")

5: In Tabular editor I connected to the PowerBI dataset and edited the partitions to make them smaller so that none would timeout.

6: In SSMS I connected to the PowerBI dataset and manually processed one partiton at a time using "full process" 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors