Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
CAMA_UK_18
Frequent Visitor

Small datasets, many queries/steps or large data sets, fewer queries/steps?

Hi

What would you say is the optimal set-up for the scenario below?

 

I have 10 customers (say). For each of those customers, there are three tables (all exactly the same structures) on separate tabs within a sharepoint excel workbook that my colleagues can work on.

 

I then merge/transform each customer's 3 tables to produce one final query per customer. The final queries per customer are then appended to produce my final dataset, ready for measures and visualisations.

 

Is this the right approach? Or, would it be better to first append each customer's A, B and C tables with each other and then merge/transform once, albeing on one much larger A, B and C table?

 

I look forward to receiving some guidance.

1 ACCEPTED SOLUTION
MarkLaf
Solution Sage
Solution Sage

In my experience working with one Excel file with multiple tables of same structure, I generally get best performance by immediately expanding (and automatically combining) tables off the initial Excel.Workbook access function. Pre-reqs/steps:

  1. Make sure inside your Excel workbook that your data tables are formatted as tables. This let's you skip any need to do any transforms before combining tables (namely, Table.PromoteHeaders). They show up as Kind = Table in the access output
    MarkLaf_0-1645123284505.png
  2. From the initial access step (e.g. Excel.Workbook( Web.Contents( [workbook url] ) ) ), filter the table of content to just your tables (not sheets) of same-structure data.
    MarkLaf_1-1645123365364.png
  3. Now, you can simply expand the Data column to get your combined table. You'll have the added benefit of keeping a "Source Table" column with the Name or Item column (others can be removed before expanding)
    MarkLaf_2-1645123626079.pngMarkLaf_3-1645123819308.png

I've found that this minimizes repeated queries to the data source, and makes maintenance easier (e.g. schema drift like when column name changes; can just fix in the one expand step rather than multiple places)

View solution in original post

2 REPLIES 2
MarkLaf
Solution Sage
Solution Sage

In my experience working with one Excel file with multiple tables of same structure, I generally get best performance by immediately expanding (and automatically combining) tables off the initial Excel.Workbook access function. Pre-reqs/steps:

  1. Make sure inside your Excel workbook that your data tables are formatted as tables. This let's you skip any need to do any transforms before combining tables (namely, Table.PromoteHeaders). They show up as Kind = Table in the access output
    MarkLaf_0-1645123284505.png
  2. From the initial access step (e.g. Excel.Workbook( Web.Contents( [workbook url] ) ) ), filter the table of content to just your tables (not sheets) of same-structure data.
    MarkLaf_1-1645123365364.png
  3. Now, you can simply expand the Data column to get your combined table. You'll have the added benefit of keeping a "Source Table" column with the Name or Item column (others can be removed before expanding)
    MarkLaf_2-1645123626079.pngMarkLaf_3-1645123819308.png

I've found that this minimizes repeated queries to the data source, and makes maintenance easier (e.g. schema drift like when column name changes; can just fix in the one expand step rather than multiple places)

@MarkLaf  I don't think I could get over to you how much this post has transformed my work! Thank you!!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.