
opticshrew
Resolver II

Using Power BI with Large Datasets

Hey folks,

 

I have Power BI Desktop and an Azure SQL data source that contains 450 million rows of data. Whether I import the data into Desktop or use DirectQuery, the time it takes is unreasonable.

 

The only ways I can see to improve this are the following:

  • Reduce my dataset (not preferable, since all of the data is being reported on)
  • Increase the number of DTUs on my SQL server (an expense I'd rather not take on)

 

Am I using the right product for what I want to do, or should I be using something else from the Microsoft stack?

 

Thanks,

J

1 ACCEPTED SOLUTION
opticshrew
Resolver II

@v-yuezhe-msft

 

I found that if I load my CSVs into Azure Data Lake Store, I can use Azure Data Lake Analytics to select the data I need and push it out to a new CSV. This new CSV should be below the 1 GB limit, and therefore I can use ADLS as my Power BI data source.

 

If the 1 GB limit is reached, I can split the output into multiple CSVs and import them separately, combining them in Power BI.
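
For reference, here is a rough Python/pandas sketch of that filter-then-split idea (the actual step above used Azure Data Lake Analytics; the file names, filter condition, and row threshold below are all placeholders):

```python
import pandas as pd

# Placeholder inputs: the raw ~1M-row CSV extracts pulled from blob storage.
SOURCE_CSVS = ["blob_0001.csv", "blob_0002.csv"]
OUT_PREFIX = "filtered_part"
MAX_ROWS_PER_FILE = 5_000_000  # made-up proxy for staying under the 1 GB dataset limit

buffer, part = [], 0
for path in SOURCE_CSVS:
    # Read each large CSV in chunks so memory use stays bounded.
    for chunk in pd.read_csv(path, chunksize=500_000,
                             parse_dates=["AccessTime", "WriteTime"]):
        # Keep only the rows the report actually needs (example filter).
        keep = chunk[chunk["WriteTime"] >= "2018-01-01"]
        buffer.append(keep)
        if sum(len(b) for b in buffer) >= MAX_ROWS_PER_FILE:
            # Flush a part file once it gets big enough.
            pd.concat(buffer).to_csv(f"{OUT_PREFIX}_{part}.csv", index=False)
            buffer, part = [], part + 1

# Write whatever is left over as the final part file.
if buffer:
    pd.concat(buffer).to_csv(f"{OUT_PREFIX}_{part}.csv", index=False)
```

The part files can then be combined in Power BI Desktop via a folder source or appended queries, as described above.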

 

Thanks for the help,

J


11 REPLIES

It's not useful when you need real-time analytics and your query mode is "DirectQuery".

A potential solution to look for would be to reduce the dataset size used for the initial set of rows Power BI loads to 10 or 100, and then let the end user decide which recordset is actually needed based on their reporting needs (restricting data via filters or other means).

When I tried this on Power BI Desktop recently using DirectQuery, I couldn't get rows past the initial set of 1,000. I used a native query with a 1,000-row limit when connecting to the SQL database. Has anyone else tried this method and got it to work?
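
To make the reduced-initial-load idea above concrete, here is a minimal sketch using pyodbc against Azure SQL (the server, database, table, and column names are placeholders); the same TOP-style statement is the kind of native query you would paste into Power BI's SQL connector:

```python
import pyodbc

# Placeholder connection details - replace with your own Azure SQL server/database.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=mydb;UID=user;PWD=secret"
)

# Pull only a small initial slice; report filters then narrow it further server-side.
INITIAL_QUERY = """
SELECT TOP (100) FilePath, AccessTime, WriteTime
FROM dbo.FileAudit
ORDER BY WriteTime DESC
"""

rows = conn.cursor().execute(INITIAL_QUERY).fetchall()
print(len(rows), "rows returned for the initial view")
```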

Hi,
Hope you are doing well.

I have the same scenario, where more than 1 million records need to be fetched to create the report. In real time, is Power BI a good solution for that? If not, what would be the right solution or tool for this?

Thanks & Regards.

Hi @kotvir

 

I agree and disagree with you 🙂 

 

So yes, this wouldn't be good for streaming data; it just wouldn't work. Ideally you'd want to create a streaming dataset and tackle the whole situation differently.
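
For the streaming route mentioned above, the usual pattern is to push rows into a Power BI streaming (push) dataset over its REST endpoint instead of querying the database. A minimal sketch, assuming Python's requests library and a push URL copied from the dataset's API Info page (the URL and field names below are placeholders):

```python
from datetime import datetime, timezone

import requests

# Placeholder push URL - Power BI generates this when you create a streaming dataset.
PUSH_URL = "https://api.powerbi.com/beta/<workspace-id>/datasets/<dataset-id>/rows?key=<key>"

# Hypothetical row shape; field names must match the streaming dataset's schema.
row = {
    "EventTime": datetime.now(timezone.utc).isoformat(),
    "FileCount": 42,
}

# Each POST appends rows to the push dataset; dashboard tiles update in near real time.
response = requests.post(PUSH_URL, json=[row])
response.raise_for_status()
```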

 

However, in my scenario I'm looking at static data that changes infrequently, and I will be using "Import". Additionally, the dataset we're using contains all the information we need, and we need to view it in its entirety. Essentially, we're looking at SharePoint farms and need to understand file type distribution throughout them. We then need the ability to drill down into a specific server, drive, file, etc.

 

Thanks,

Jay

Any final answer for this? I am also facing the same issue.

I have the same issue. 

My dataset is above 7 million rows and Power BI performs very slowly.
Any advice?

Anonymous
Not applicable

I have seen good performance with over 40 million records. How many tables are you importing, and how many columns are in each? In my experience, leveraging Power BI's relational model to reduce the column count in tables with large numbers of records helps.
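
As a small illustration of that column-reduction idea, here is a hypothetical pandas sketch that moves repeated descriptive columns out of a wide fact table into a dimension table, leaving the fact with just an integer key and the measure (all column names are made up):

```python
import pandas as pd

# Made-up wide table: imagine tens of millions of rows shaped like this.
wide = pd.DataFrame({
    "Server":   ["SP01", "SP01", "SP02"],
    "Drive":    ["D:", "D:", "E:"],
    "FileType": [".docx", ".pdf", ".docx"],
    "SizeMB":   [1.2, 3.4, 0.8],
})

# Dimension table: one row per distinct Server/Drive/FileType combination.
dim = (wide[["Server", "Drive", "FileType"]]
       .drop_duplicates()
       .reset_index(drop=True)
       .rename_axis("DimKey")
       .reset_index())

# Fact table: just the surrogate key and the measure - far fewer columns per row.
fact = wide.merge(dim, on=["Server", "Drive", "FileType"])[["DimKey", "SizeMB"]]

print(dim)
print(fact)
```

In Power BI you would then relate the two tables on DimKey; the low-cardinality dimension compresses well, and the large table carries fewer columns per row.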

I eventually solved it using DirectQuery.
v-yuezhe-msft
Employee

@opticshrew,

In DirectQuery mode, you should have no problem connecting to the Azure SQL database, as data is not imported into the Power BI model in this case. However, there is a 1-million-row limit on the data returned when using DirectQuery; you can get more details in this article.

For example, you can aggregate 10 million rows with your query that runs on the data source, and accurately return the results of that aggregation to Power BI using DirectQuery as long as the data returned to Power BI is less than 1 million rows. If more than 1 million rows would be returned from DirectQuery, Power BI returns an error.

Additionally, in import mode, how long does it take to import data from Azure SQL to Power BI Desktop? I would recommend creating a view to aggregate the data you need and importing the view into Power BI; you can also take a look at the following blog about other best practices.

https://www.sqlbi.com/articles/data-import-best-practices-in-power-bi/
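
A sketch of the kind of aggregating view suggested above (pyodbc; the table, view, and grouping columns are hypothetical): build the aggregation on the Azure SQL side and point Power BI at the view, keeping the returned rows well under the 1-million-row DirectQuery cap.

```python
import pyodbc

# Placeholder connection details - replace with your own Azure SQL server/database.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=mydb;UID=user;PWD=secret"
)
cursor = conn.cursor()

# Hypothetical aggregating view: one row per day and file extension
# instead of one row per file.
cursor.execute("""
CREATE OR ALTER VIEW dbo.vw_FileSummary AS
SELECT CAST(WriteTime AS date) AS WriteDate,
       RIGHT(FilePath, CHARINDEX('.', REVERSE(FilePath))) AS FileExtension,
       COUNT(*) AS FileCount
FROM dbo.FileAudit
GROUP BY CAST(WriteTime AS date),
         RIGHT(FilePath, CHARINDEX('.', REVERSE(FilePath)))
""")
conn.commit()

# Sanity check that the view stays under the 1-million-row DirectQuery cap.
count = cursor.execute("SELECT COUNT(*) FROM dbo.vw_FileSummary").fetchone()[0]
print(f"dbo.vw_FileSummary returns {count:,} rows")
```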

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, please consider accepting it as the solution to help other members find it more quickly.

@v-yuezhe-msft

 

I don't feel that I'm going to be able to aggregate the data; here is an example of what is coming through:

 

FilePath            AccessTime                    WriteTime
C:\FilePathA     01/01/2018 10:00 AM     01/01/2018 10:00 AM

The data is loaded into the SQL database from blob storage. Each blob is a CSV containing roughly 1 million lines. These are loaded into SQL, where they are given the correct column data types. *Note: the FilePath is different each time.*
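
For illustration, a minimal Python sketch of that per-blob load step (pandas + SQLAlchemy; the connection string, file name, and table name are placeholders), enforcing the column data types on the way into SQL:

```python
import pandas as pd
import sqlalchemy

# Placeholder connection string - Azure SQL over ODBC Driver 17.
engine = sqlalchemy.create_engine(
    "mssql+pyodbc://user:secret@myserver.database.windows.net/mydb"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)

# One ~1M-line CSV per blob; enforce the column data types on the way in.
df = pd.read_csv(
    "blob_0001.csv",
    dtype={"FilePath": "string"},
    parse_dates=["AccessTime", "WriteTime"],
)

# Append into the SQL table that Power BI reads from (table name is hypothetical).
df.to_sql("FileAudit", engine, schema="dbo", if_exists="append",
          index=False, chunksize=10_000)
```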

 

As for load time: we don't have 450 million rows of data yet, but this is what we need to plan for. Currently, with around 200,000 rows of data, you're looking at around 20-30 minutes.

 

Would you say it is best to create a report purely on a per-CSV basis?

 

Thanks,

J
