cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
opticshrew Member
Member

Using Power Bi with Large Datasets

Hey folks,

 

I have Power Bi desktop and a Azure SQL data source that contains 450 million rows of data. When I load the data into desktop or using Direct Query the time it takes is unreasonable.

 

The only way I could see an improvement is if I do any of the following:

  • Reduce my dataset (this isn't preferable due to the data being reported on)
  • Increase the number of DTUs on my SQL server (the expense isn't something I want to endure)

 

Am I using the right product for what I want to do or should I be using something else from the Microsoft stack?

 

Thanks,

J

1 ACCEPTED SOLUTION

Accepted Solutions
opticshrew Member
Member

Re: Using Power Bi with Large Datasets

@v-yuezhe-msft

 

I found that if I load by CSVs into Azure Data Lake Store I can use Azure Data Lake Analytics to choose the data I need and push that out to a new CSV. This new CSV should be below that 1GB limit anad therefore I can use ADLS as my Power Bi data source.

 

If the 1GB limit is reach I can split out the CSV and import them seperately combining them in Power Bi.

 

Thanks for the help,

J

View solution in original post

10 REPLIES 10
Moderator v-yuezhe-msft
Moderator

Re: Using Power Bi with Large Datasets

@opticshrew,

In DirectQuery mode, you should have no problem to connect to the Azure SQL database as data is not imported into Power BI model in this case. But there is a 1 million row limit for returning data when using DirectQuery, you can get more details in this article.

For example, you can aggregate 10 million rows with your query that runs on the data source, and accurately return the results of that aggregation to Power BI using DirectQuery as long as the data returned to Power BI is less than 1 million rows. If more than 1 million rows would be returned from DirectQuery, Power BI returns an error.

Additionally, in import mode, how long does it take to import data from Azure SQL to Power BI Desktop? I would recommend you create a view to aggregate the data you need and import the view to Power BI, and you can take a look at the following blog about other best practices.

https://www.sqlbi.com/articles/data-import-best-practices-in-power-bi/

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
opticshrew Member
Member

Re: Using Power Bi with Large Datasets

@v-yuezhe-msft

 

I don't feel that I'm going to be able to aggregate the data, here is an example of what is coming through:

 

FilePath            AccessTime                    WriteTime
C:\FilePathA     01/01/2018 10:00 AM     01/01/2018 10:00 AM

The data is loaded into the SQL db from blob storage. Each blob is a CSV containing roughly 1 million lines. These are then loaded into SQL where they are given the correct column data types. *note the FilePath is differnt each time*

 

As for load time we don't currently have 450 million rows of data yet however this is what we need to plan for. Currently with around 200,000 rows of data your looking at around 20-30 mins.

 

Would you say it is best to create a report purely on a per CSV basis?

 

Thanks,

J

opticshrew Member
Member

Re: Using Power Bi with Large Datasets

@v-yuezhe-msft

 

I found that if I load by CSVs into Azure Data Lake Store I can use Azure Data Lake Analytics to choose the data I need and push that out to a new CSV. This new CSV should be below that 1GB limit anad therefore I can use ADLS as my Power Bi data source.

 

If the 1GB limit is reach I can split out the CSV and import them seperately combining them in Power Bi.

 

Thanks for the help,

J

View solution in original post

kotvir Occasional Visitor
Occasional Visitor

Re: Using Power Bi with Large Datasets

Its not useful, when you need realtime analytics and your query mode is "Direct Query"

Potential solution one looking for should be, reduce the dataset size which is being used to load the inital set of rows by PowerBI to 10 or 100 and than let end user decide the recordset actually needed based on their reporting needs (restricting data via filter or other means).

Highlighted
opticshrew Member
Member

Re: Using Power Bi with Large Datasets

Hi @kotvir

 

I agree and disagree with you 🙂 

 

So yes this wouldn't be good for streaming data, it just wouldn't work. Ideally you'd want to create a streaming data set and tackle the whole situation differently.

 

However, In by scenario I'm looking at static data that changes infrequently and will be using "Import". Additionally the dataset we're using contains all the information we need where we need to view the dataset in it's entirety. Essentially we're looking at SharePoint farms and need to understand file type distribution throughout them. We then need the ability to dril down into a specific server, drive, file etc.

 

Thanks,

Jay

prabhuss Regular Visitor
Regular Visitor

Re: Using Power Bi with Large Datasets

any final answer for this. i am also facing same issue

rebin40 Frequent Visitor
Frequent Visitor

Re: Using Power Bi with Large Datasets

I have the same issue. 

My dataset is above 7 million of rows and power bi performs very slowly. 
Any advices? 

dalenewnham Frequent Visitor
Frequent Visitor

Re: Using Power Bi with Large Datasets

I have had seen good performance with over 40 million records. How many tables are you importing and how many columns in each? Leveraging Power BI's relational model to reduce the column count in tables with large numbers of records helps in my experience.

rebin40 Frequent Visitor
Frequent Visitor

Re: Using Power Bi with Large Datasets

I have eventually solved it using DirectQuery

Helpful resources

Announcements
Ask Amanda Anything Q&A

Ask Amanda Anything Q&A

Learn the answers to some of the questions asked during the Amanda Triple A event.

October 2019 Community Highlights

October 2019 Community Highlights

October was a busy month in the community. Read the recap article to learn about some of the events and content.

New Solution Badges

New Solution Badges

Two waves of brand new solution badges are coming! Read the article for more information on our new community badges.

Analytics in Azure virtual event

Analytics in Azure virtual event

Experience a limitless analytics service built to ingest, prep, manage, and serve data for immediate use in Power BI.

Users Online
Currently online: 249 members 2,841 guests
Please welcome our newest community members: