Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Should I query once and reference or split my table

I'm dealing with a large table that contains a large amount of columns (~120) and I'm looking for the most "performant and open to modifications" way to create a report. Note that in my case, I do not have many rows; I am however interested in the case where there are large amounts of data (and information if it would change if the data was small).

 

I will have to unpivot most of the columns and analyse them per groups (some columns fit well together while others don't, I'm estimating it to about 20 to 30 groups). And for every group, I need 2 to 4 columns to filter my data depending on the user input (there will be a slicer with date and other options).

 

My question regards the performance and the customization design of my report. Should I query the whole table once, then reference it for every group and remove the columns I am not using OR should I create multiple tables, each querying the DB with only the columns I'm needing, knowing  that for every single table created that way, I need 2 to 4 columns in order to filter my data (mostly dates and object_ids) ?

1 ACCEPTED SOLUTION
v-alq-msft
Community Support
Community Support

Hi, @Anonymous 

 

Generally, It is recommended that you reference queries to avoid the duplication of logic across your queries. However, this design approach can contribute to slow data refreshes, and overburden data sources. Consider several queries: Query1 sources data from a web service, and its load is disabled. Query2, Query3, and Query4 all reference Query1, and their outputs are loaded to the data model. When the data model is refreshed, it's often assumed that Power Query retrieves the Query1 result, and that it's reused by referenced queries. This thinking is incorrect. In fact, Power Query executes Query2, Query3, and Query4 separately.

 

For further information, you may refer to the following link.

https://docs.microsoft.com/en-us/power-bi/guidance/power-query-referenced-queries

https://docs.microsoft.com/en-us/power-bi/power-bi-reports-performance

 

Best Regards 

Allan

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

4 REPLIES 4
v-alq-msft
Community Support
Community Support

Hi, @Anonymous 

 

Generally, It is recommended that you reference queries to avoid the duplication of logic across your queries. However, this design approach can contribute to slow data refreshes, and overburden data sources. Consider several queries: Query1 sources data from a web service, and its load is disabled. Query2, Query3, and Query4 all reference Query1, and their outputs are loaded to the data model. When the data model is refreshed, it's often assumed that Power Query retrieves the Query1 result, and that it's reused by referenced queries. This thinking is incorrect. In fact, Power Query executes Query2, Query3, and Query4 separately.

 

For further information, you may refer to the following link.

https://docs.microsoft.com/en-us/power-bi/guidance/power-query-referenced-queries

https://docs.microsoft.com/en-us/power-bi/power-bi-reports-performance

 

Best Regards 

Allan

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Anonymous
Not applicable

@v-alq-msft that is great to know thanks. Sadly, it seems another team is working on datasets and I won't be able to try them out.

 

Do you think it's better to just cut my table into smaller parts then ?

Hi, @Anonymous 

 

Scheduled refresh for imported datasets timeout after two hours. This timeout is increased to five hours for datasets in Premium workspaces. If you encounter this limit, consider reducing the size or complexity of your dataset, or consider breaking the dataset into smaller pieces. Thanks.

 

Best Regards

Allan

 

 

Tad17
Solution Sage
Solution Sage

Hey @Anonymous 

 

My thoughts (not sure what they're worth),

 

I would query the table once as that will cut back on refresh time. 

I would make as many changes as possible to the original table before referencing.

Then create as many reference tables as needed and make any remaining necessary changes on the individual level, though again, if you can do this on the original query table using if statements in whatever columns need to be created I would go that route.

 

If different tables require different columns removed or kept then I would still use referenced tables as this is the fastest and easiest way to go about needing multiple tables from the same source. Again, I would maek as many changes as possible in the original query to save time (including refresh time), resources, and headache.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.