Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
SaaM
Helper II
Helper II

Duplicate or refer Table ? Or someting else

Hi All,

 

i am facing terrible performance in a power query table after a couple of joins and transformation.

Let me give you a broad overview over the steps i do.

It would be great to find out which option i have to apply to improve the performance.

My steps:

1) I load a table from Excel (12k rows and 20 columns):

 

2) I add around 38 calculated columns with some transformations with power query

 

So far, it takes nearly 50 secs to load and create the Table A1 with all the added columns.

 

When I try to join Table A1 with another table B, I see that it does all the calculations of all steps again.

 

Is there anyway to use a copy of my final table (something like copy values in a new table) in order to use it for the join ?

 

I tried the duplicate and reference options of my final table A1, but It does the recalculation every time when  I do the join with My duplicated/referred table A1 and the table B ....

 

Thanks in advance

 

Kind regards

1 ACCEPTED SOLUTION
Anonymous
Not applicable

No!

You have to use a separate expression for 

Table.SelectColumns(Table.SelectRows(Table1, (sel)=> sel[#"filter_item"] = 1)),{"key_1", "field_1", "field_2", "field_3", "field_4", "field_5"})

 

for instance:

 

 

TABBUFF=Table.Buffer(Table.SelectColumns(Table.SelectRows(Table1, (sel)=> sel[#"filter_item"] = 1)),{"key_1", "field_1", "field_2", "field_3", "field_4", "field_5"}),

 

 

 

then the rest (check the sintax: ican't do)

 

 

#"join_sum_T1_T2_" = let #"GroupedLines" = Table.Group(TABBUFF, {"key", "filter_item"}, {{"sum_field_1", each List.Sum([#"field_1"]), type nullable number}, {"sum_field_2", each List.Sum([field_32), type nullable number}, {"Ssum_field_3", each List.Sum([#"field_3"]), type nullable number}, {"sum_field_4", each List.Sum([#"field_4"]), type nullable number}}), #"MergedQueries" = Table.NestedJoin(#"Table2_source", {"key_2"}, 
#"GroupedLines", {"key_1"}, "Table1_sum", JoinKind.LeftOuter), 
#"Tableexpanded" = Table.ExpandTableColumn(#"MergedQueries", "Table1_sum", {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"}, {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"}) 
in #"Tableexpanded",

 

 

 

 

 

View solution in original post

13 REPLIES 13
SaaM
Helper II
Helper II

@Jimmy801  @Anonymous 

I tried the Table.buffer but it takes the same time to evaluate all the queries.

I will try to refactor my previous code to optimize its performances

 

Thanks anyway

 

Kind regards

Saam

Anonymous
Not applicable

But before using table.buffer did you strip this piece of code from inside the table.group function?

 

Table.SelectColumns(Table.SelectRows(Table1, (sel)=> sel[#"filter_item"] = 1),{"key_1", "field_1", "field_2", "field_3", "field_4", "field_5"})

 

 Can you show the code you used?

The code I used is:

 

#"join_sum_T1_T2_" = let #"GroupedLines" = Table.Group(Table.Buffer(Table.SelectColumns(Table.SelectRows(Table1, (sel)=> sel[#"filter_item"] = 1)),{"key_1", "field_1", "field_2", "field_3", "field_4", "field_5"}), {"key", "filter_item"}, {{"sum_field_1", each List.Sum([#"field_1"]), type nullable number}, {"sum_field_2", each List.Sum([field_32), type nullable number}, {"Ssum_field_3", each List.Sum([#"field_3"]), type nullable number}, {"sum_field_4", each List.Sum([#"field_4"]), type nullable number}}), #"MergedQueries" = Table.NestedJoin(#"Table2_source", {"key_2"}, #"GroupedLines", {"key_1"}, "Table1_sum", JoinKind.LeftOuter), #"Tableexpanded" = Table.ExpandTableColumn(#"MergedQueries", "Table1_sum", {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"}, {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"}) in #"Tableexpanded",

 

Anonymous
Not applicable

No!

You have to use a separate expression for 

Table.SelectColumns(Table.SelectRows(Table1, (sel)=> sel[#"filter_item"] = 1)),{"key_1", "field_1", "field_2", "field_3", "field_4", "field_5"})

 

for instance:

 

 

TABBUFF=Table.Buffer(Table.SelectColumns(Table.SelectRows(Table1, (sel)=> sel[#"filter_item"] = 1)),{"key_1", "field_1", "field_2", "field_3", "field_4", "field_5"}),

 

 

 

then the rest (check the sintax: ican't do)

 

 

#"join_sum_T1_T2_" = let #"GroupedLines" = Table.Group(TABBUFF, {"key", "filter_item"}, {{"sum_field_1", each List.Sum([#"field_1"]), type nullable number}, {"sum_field_2", each List.Sum([field_32), type nullable number}, {"Ssum_field_3", each List.Sum([#"field_3"]), type nullable number}, {"sum_field_4", each List.Sum([#"field_4"]), type nullable number}}), #"MergedQueries" = Table.NestedJoin(#"Table2_source", {"key_2"}, 
#"GroupedLines", {"key_1"}, "Table1_sum", JoinKind.LeftOuter), 
#"Tableexpanded" = Table.ExpandTableColumn(#"MergedQueries", "Table1_sum", {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"}, {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"}) 
in #"Tableexpanded",

 

 

 

 

 

Thank you all guys.

It worked that way !

 

Anonymous
Not applicable

"It worked that way !"

 

Could You quantify the time difference between before and after "that way"?

 It drastically decreased from 10' to 4'

Jimmy801
Community Champion
Community Champion

Hello @SaaM 

 

do you how long it takes to load after the join? Do you there some other transformation of the joined data? What data you are using out of the joined data?

Could you post the M-code here `?

 

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Hello,

 

Here is my M-code for one of the queries to create my columns and it takes too much time (around 10 mn)

I follow the same steps for other columns

Just for information my Table1 takes around 50 sec to add the 38 columns to the original source

 

my key_1 and key_2 are Text and I sum quantities

        #"join_sum_T1_T2_" = let

            #"GroupedLines" = Table.Group(Table.SelectColumns(Table.SelectRows(Table1, (sel)=> sel[#"filter_item"] = 1),{"key_1", "field_1", "field_2", "field_3", "field_4", "field_5"}), {"key", "filter_item"}, {{"sum_field_1", each List.Sum([#"field_1"]), type nullable number}, {"sum_field_2", each List.Sum([field_32), type nullable number}, {"Ssum_field_3", each List.Sum([#"field_3"]), type nullable number}, {"sum_field_4", each List.Sum([#"field_4"]), type nullable number}}),

            #"MergedQueries" = Table.NestedJoin(#"Table2_source", {"key_2"}, #"GroupedLines", {"key_1"}, "Table1_sum", JoinKind.LeftOuter),

            #"Tableexpanded" = Table.ExpandTableColumn(#"MergedQueries", "Table1_sum", {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"}, {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"})

            in
            #"Tableexpanded",

 

The equivalent SQL code would be:

        with GroupedLines as (select key, sum(field_1) as sum_field_1, sum(field_2) as sum_field_2, sum(field_3) as sum_field_3, sum(field_4) as sum_field_4 
        from Table1
        where filter_item = 1
        group by key)
        select * from Table2_source
        left join GroupedLines
        on key_1 = key_2

 

Thanks

 

Kind regards

Jimmy801
Community Champion
Community Champion

Hello @SaaM 

 

you are referencing 2 external tables. One is the Table1 that takes 10 seconds to load. What about the table Table2_source? How long this takes to load?

At the end your final query needs to load both tables and then apply also your steps. 

I also tried to apply some changes to your code. Check out if this helps also

        #"join_sum_T1_T2_" = let

            BufferTable1 = Table.Buffer(Table.SelectColumns(Table.SelectRows(Table1, (sel)=> sel[#"filter_item"] = 1),{"key_1", "field_1", "field_2", "field_3", "field_4", "field_5"})),
BufferTable2 = Table.Buffer(#"Table2_source"),
#"GroupedLines" = Table.Group(BufferTable1 , {"key", "filter_item"}, {{"sum_field_1", each List.Sum([#"field_1"]), type nullable number}, {"sum_field_2", each List.Sum([field_32), type nullable number}, {"Ssum_field_3", each List.Sum([#"field_3"]), type nullable number}, {"sum_field_4", each List.Sum([#"field_4"]), type nullable number}}),

            #"MergedQueries" = Table.NestedJoin(BufferTable2 , {"key_2"}, #"GroupedLines", {"key_1"}, "Table1_sum", JoinKind.LeftOuter),

            #"Tableexpanded" = Table.ExpandTableColumn(#"MergedQueries", "Table1_sum", {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"}, {"sum_field_1", "sum_field_2", "sum_field_3", "sum_field_4"})

            in
            #"Tableexpanded",

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Anonymous
Not applicable

If adding 38 columns takes 50 ", the 10 'depends on the rest of the code not on recalculating the columns. try to strip the table.SelectRows and table select column statements from within the table.group and buffer the last table before grouby.

 

edhans
Super User
Super User

You can try to use Table.Buffer() around the last step of one of your tables. It would be helpful though to know what the 38 custom columns are - (calculated columns are a DAX thing, not Power Query) and then we could see if you had a model issue or a better way to approach this.

 

How to get good help fast. Help us help you.

How To Ask A Technical Question If you Really Want An Answer

How to Get Your Question Answered Quickly - Give us a good and concise explanation
How to provide sample data in the Power BI Forum - Provide data in a table format per the link, or share an Excel/CSV file via OneDrive, Dropbox, etc.. Provide expected output using a screenshot of Excel or other image. Do not provide a screenshot of the source data. I cannot paste an image into Power BI tables.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Thanks for your reply.

 

I tried your solution Table.Buffer(myTable), but it reacts the same if i did a reference or a duplicate, i.e it executes all the previous queries again and again.

 

What I meant by calculated columns, it's added columns to the original Source with power query formulas (sums of columns with group by key, row counts with group by key etc...).

 

I start in my model with my original Table with 20 columns and end with a Table with 58 columns (20+38 added columns)

 

 

What I am looking for is something similar or that does the same thing as in Excel (Copy and paste table values without the formulas and calculus).

SaaM_1-1616232946107.png

 

 

Kind regards

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors