Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
GaryK
Frequent Visitor

Large Datasets - How can I improve performance?

Very new to PowerBI, so I may be missing something really simple and obvious...

 

I am connect PowerBI to an Oracle database and in most cases, the dataset size is very large, eg 400,000 rows.

I created a view in the source Oracle db and the underlying SQL has 6 table JOINs and then connected PowerBI via DirectQuery to the VIEW.

 

In Desktop I created a Table and a couple of Slicers - the Table is very slow to load, which is understandable considering there are 400,000 data rows returned by the VIEW.

 

In the interest of simplicity, let's assume the VIEW definition is as follows:

 

SELECT Region, Date, SomeValue

FROM myTable

 

 

Note that there is no selection criteria in the SQL.

 

The 2 slicers are based on Region and Date, and seemingly by selecting a single Region (eg Europe) and narrowing the Date range down (eg 1st June 2016 to 30th June 2016), the Table/Visualizations display quicker.

 

My question is whether there is a way to prevent the entire dataset from being loaded into the PowerBI table?

Is there a way that I can delay the renedering of the table until the user has specified their report requirements - that is, for the user to select the Region and Date range in teh slicers?

 

You may argue that I should Import the data instead of using DirectQuery, but this is not ideal for me as there will be multiple complex and large datasets in my solution and I would also prefer the users to see real-time reports rather than periodically re-importing the data.

1 ACCEPTED SOLUTION
GilbertQ
Super User
Super User

Hi @GaryK the first question I would ask is when you say realtime what do you define as being realtime?

 

The reason for this is, is if you define realtime as being hourly or every 5 minutes, if you are looking for the optimal performance then importing your data into the Power BI Model will result in very quick response times when interacting with the data.

 

As @v-haibl-msft has mentioned no actual data is stored in your Power BI Model. So if it is slow it is due to the Source system in your case your Oracle DB that is taking time to run the required queries and return the data back to Power BI. (And as you indicated it is doing 6 joins to underlying tables)

 

I would suggest possibly as a test, import all the data into a Power BI Model, and see if when using the Power BI Model with your slicers if it is quicker. If it is, then at least you will know where the bottleneck lies.

 

In my experience often people ask for realtime data, but unless someone is actually watching it as part of their job, it can be refreshed hourly, which is more than adequate.

 

Another option is if you only want to load specific data into your Power BI Model, you can make use of the Parameters. But note when you do this, every time you change the parameter you will have to re-import your dataset.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

View solution in original post

2 REPLIES 2
GilbertQ
Super User
Super User

Hi @GaryK the first question I would ask is when you say realtime what do you define as being realtime?

 

The reason for this is, is if you define realtime as being hourly or every 5 minutes, if you are looking for the optimal performance then importing your data into the Power BI Model will result in very quick response times when interacting with the data.

 

As @v-haibl-msft has mentioned no actual data is stored in your Power BI Model. So if it is slow it is due to the Source system in your case your Oracle DB that is taking time to run the required queries and return the data back to Power BI. (And as you indicated it is doing 6 joins to underlying tables)

 

I would suggest possibly as a test, import all the data into a Power BI Model, and see if when using the Power BI Model with your slicers if it is quicker. If it is, then at least you will know where the bottleneck lies.

 

In my experience often people ask for realtime data, but unless someone is actually watching it as part of their job, it can be refreshed hourly, which is more than adequate.

 

Another option is if you only want to load specific data into your Power BI Model, you can make use of the Parameters. But note when you do this, every time you change the parameter you will have to re-import your dataset.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

v-haibl-msft
Employee
Employee

@GaryK

 

DirectQuery won’t import data into the model in Power BI. DirectQuery brings the metadata and data structure into Power BI.

When creating or interacting with a visualization, the underlying source will be queried and the time necessary to refresh the visualization is dependent on the performance of the underlying data source. Data Sources are different in terms of response time. SSAS tabular might produce faster result, and normal SQL Server database slower. If you are working for example with SQL Server consider proper indexing, column-store indexes. You can take a look at this article.

 

Best Regards,

Herbert

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.