thangtruong (Helper I)

Large dataset refresh in Power BI Service

Hi all,

I'm a newbie in this field.

I created a customer report. I use MySQL Server as my data warehouse, connected it to Power BI, and then published the report online.

The dataset is about 15 million rows covering 6 months, and it takes 2-3 hours to refresh the data in app.powerbi.com (not a refresh in Power BI Desktop).

 

So here are my two problems:

1. My dataset will keep growing in the near future, and the refresh may soon take more than 3 hours.

Is there anything I can do to improve this situation?

 

2. The refresh often fails, with common errors such as the following:

- "Before the data import for Total_Identifies finished, its data source timed out. Double-check whether that data source can process import queries, and if it can, try again."
- "Unable to connect to the data source undefined."
- "Microsoft SQL: Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding."

 

What can I do to fix this?

 

Thank you guys in advance!

 

 

1 ACCEPTED SOLUTION

Community Support

Re: Large dataset refresh in Power BI Service

Hi @thangtruong ,

 

You can use Incremental Refresh to reduce the refresh time if the dataset is on a Premium capacity. Alternatively, you can follow these tips, based on this document, to reduce the size of the dataset or optimize the model; note that some tips may not reduce refresh time. A minimal sketch of the incremental refresh filter follows the list.

 

  • Remove unused tables or columns, where possible.
  • Avoid distinct counts on fields with high cardinality – that is, millions of distinct values.
  • Take steps to avoid fields with unnecessary precision and high cardinality. For example, you could split highly unique datetime values into separate columns – for example, month, year, date, and so on. Or, where possible, use rounding on high-precision fields to lower cardinality (for example, 13.29889 -> 13.3).
  • Use integers instead of strings, where possible.
  • Be wary of DAX functions that need to test every row in a table – for example, RANKX. In the worst case, these functions can exponentially increase run-time and memory requirements given linear increases in table size.
  • When connecting to data sources via DirectQuery, consider indexing columns that are commonly filtered or sliced against. Indexing greatly improves report responsiveness.
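
As a minimal sketch of the incremental refresh setup mentioned above (the server, database, table, and column names here are hypothetical, not from the original report), the fact-table query filters on the reserved RangeStart/RangeEnd DateTime parameters defined in Power Query, and the incremental refresh policy is then configured on that table in Power BI Desktop:

let
    // Hypothetical MySQL source; the CommandTimeout option is discussed further below
    Source = MySQL.Database("myserver", "mydatawarehouse", [CommandTimeout = #duration(0, 2, 0, 0)]),
    Sales = Source{[Schema = "mydatawarehouse", Item = "sales"]}[Data],
    // RangeStart and RangeEnd must be defined as DateTime parameters in Power Query.
    // Keep one bound inclusive and the other exclusive so rows at partition
    // boundaries are neither duplicated nor skipped.
    Filtered = Table.SelectRows(Sales, each [order_date] >= RangeStart and [order_date] < RangeEnd)
in
    Filtered

With a policy such as "store 2 years, refresh the last 7 days", the service only reloads the most recent partitions on each scheduled refresh instead of re-importing all 15 million rows.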

 

For the second question, please increase the timeout value in the connector function, for example:

 

// #duration(0, 2, 0, 0) allows the command to run for up to 2 hours before timing out
MySQL.Database(server, database, [CommandTimeout = #duration(0,2,0,0)])


You may also need to increase the timeout value on the data source side.


Best regards,

 

Community Support Team _ Dong Li
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.


2 REPLIES
Super User III

Re: Large dataset refresh in Power BI Service

@thangtruong The quick things you can do are to ensure that only the data you actually use is being loaded and to remove the rest. Use the Power Query analyzer to tune the ingestion if possible.
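
For instance, here is a hedged Power Query sketch of trimming the load down to only the columns the report uses (the table and column names are hypothetical):

let
    Source = MySQL.Database("myserver", "mydatawarehouse"),
    Raw = Source{[Schema = "mydatawarehouse", Item = "customer_events"]}[Data],
    // Keep only the columns the visuals actually reference; everything else
    // is dropped before the data is compressed into the model
    Trimmed = Table.SelectColumns(Raw, {"customer_id", "event_date", "event_type", "revenue"}),
    // Optionally lower cardinality, for example by truncating a datetime column to a date
    Dated = Table.TransformColumns(Trimmed, {{"event_date", DateTime.Date, type date}})
in
    Dated

Dropping unused columns and reducing column cardinality both shrink the model, which usually shortens the refresh as well.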

Another alternative is to use Analysis Services as a standalone model. I prefer working in Power BI, but sometimes when you have to scale up you need to jump over to Analysis Services with a live connection to offload the long processing times.


Looking for more Power BI tips, tricks & tools? Check out PowerBI.tips, the site I co-own with Mike Carlo. Also, if you are near SE WI, join our Milwaukee Brew City PUG.
