Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
LaurentZ
Helper I
Helper I

Compare 2 tables easily - how do you do ?

Hi,

 

I've 2 tables:

- the first contains a forecast on a month (with a breakdown by customers and products) => imported in PQ as "OLD"

- the second countains the true realization on the same month => imported in PQ as "NEW"

 

I would like to get a third table giving me the difference New vs Old on all possible breakdown customer/products

Then append the 3 tables and Load to a Pivot Table, for which I can select the type of data I want to display: True, Forcast, Change vs Forecast.

 

I tried a way that works but my query is long and may crash sometimes:

Duplicate OLD, multiply each value column *-1 (18 columns!), then Append NEW and OLD(-1) and then GroupBy... it works but that's soooo long.

alternatively I unpivot my 18 columns, multiply *-1 and then repivoting... but that's even worst in term of timings.

 

 

PowerQuery Only, I work in Excel not in Power BI.

 

Do you know a better way ?

Thank you. 

11 REPLIES 11
Jimmy801
Community Champion
Community Champion

Hello @LaurentZ 

 

here another approach by using Table.Join, expanding the result, making your calculation and reshaping it. 

After that join all 3 created tables

let
    New = #table
	(
		{"Scenario","Customer","Product","Quantity"},
		{
			{"Actuals","A","1","5"},	{"Actuals","B","1","5"},	{"Actuals","B","2","1"}
		}
	),
    Old = #table
	(
		{"Scenario","Customer","Product","Quantity"},
		{
			{"Forecast","A","1","3"},	{"Forecast","B","2","2"}
		}
	),
	Difference = 
	let 
		ToJoin = Table.NestedJoin(New, {"Customer", "Product"}, Old, {"Customer", "Product"}, "Old", JoinKind.FullOuter ),
    	#"Expanded Old" = Table.ExpandTableColumn(ToJoin, "Old", {"Scenario", "Customer", "Product", "Quantity"}, {"Old.Scenario", "Old.Customer", "Old.Product", "Old.Quantity"}),
    	#"Replaced Value" = Table.ReplaceValue(#"Expanded Old",null,"0",Replacer.ReplaceValue,Table.ColumnNames(#"Expanded Old")),
    	#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each Number.From([Quantity])-Number.From([Old.Quantity])),
    	#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each "Difference"),
    	#"Removed Other Columns" = Table.SelectColumns(#"Added Custom1",{"Custom.1", "Custom", "Customer", "Product"}),
    	#"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Custom", "Quantity"}, {"Custom.1", "Scenario"}})
	in 
		#"Renamed Columns",
	Append = New & Old & Difference,
    #"Changed Type" = Table.TransformColumnTypes(Append,{{"Quantity", Int64.Type}})
in
    #"Changed Type"

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Dear @Jimmy801@v-alq-msft@ziying35 , @Anonymous , @Greg_Deckler ,

Thank you all for your answer.

 

I saw some solutions but all are working with small tables and limited volume of data.

So let me share a table (excel) with totally fake data (figures are generated with RANDBETWEEN function).

OneDrive - Excel File   (edit I also put a pbix file, but I need an output in Excel, not in PowerBI)

 

So there are 2 tabs: NEW and OLD with Customer / Product in common, and up to 12 "Value Drivers", already imported in PowerQuery.

The aim here is to create a table corresponding to NEW - OLD, by keeping the same level of data (customer/product/all value drivers), named DIF

 

And at the end, append NEW, OLD & DIF and play with the Status to display the data I want (Actuals, Forecast, Difference) - no problem here I know how to do it.

 

Until now I tried this: 

Take OLD, multiply all Value Column * -1, then append NEW and OLD-1 and then GroupBy to only kepp the difference... very long.

 

I'm using this workaround but I would really like to get a better way, more flexible if you know how to proceed.

 

Thank you.

 

Laurent

 

 

 

Anonymous
Not applicable

 @LaurentZ 

I had not given code of any kind, I had limited myself to sharing some reflections (of common sense) on the dimensional analysis of the problem.
One of those considerations (*) I have roughly put it into practice on the tables you provided, just to get an idea of the times.
It seems to me that everything you ask for can be done within minutes.
I attach a link to the pbix file.

 

https://giga.allegati.tim.it/r?d=alice.it&wr=rocco.lupoi@telecomitalia.it&ws=sprmnt21&e=alice.it&c=R...

 

(*) the one that suggests the Roman dictum "divide et impera!"

 

empty post

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

I got the following result in less than 1':

 

image.png

 

 

you have to group by Customer field both table NEW and OLD:

 

image.png

 

then produce the difference Table:

 

 

let
    Source = Table.NestedJoin(OLD, {"Customer"}, NEW, {"Customer"}, "NEW", JoinKind.LeftOuter),
    #"Expanded NEW" = Table.ExpandTableColumn(Source, "NEW", {"cust"}, {"NEW.cust"}),
    #"Added Custom" = Table.AddColumn(#"Expanded NEW", "diff", each diff([NEW.cust],[cust])),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"diff"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"diff", "cust"}})
in
    #"Renamed Columns"

 

 

which use this function:

 

 

let
    diff = (old,new)=>
    let
    names=Table.ColumnNames(old), 
    mm=Table.TransformRows(old,  (rowOld)=> Record.FromList({"Difference",rowOld[Customer],rowOld[Product]}&List.Transform({3..20}, each let rowNew=new{[Product=rowOld[Product]]}? in Record.FieldValues(rowOld){_}-Record.FieldValues(rowNew){_}),names))
    in Table.FromRecords(mm)
in 
  diff

 

 

and finally put all together:

 

 

let
    Source = Table.Combine({OLD[[cust]], NEW[[cust]], Difference}),
    #"Expanded cust" = Table.ExpandTableColumn(Source, "cust", {"Scenario", "Customer", "Product", "Val Driv 1", "Val Driv 2", "Val Driv 3", "Val Driv 4", "Val Driv 5", "Val Driv 6", "Val Driv 7", "Val Driv 8", "Val Driv 9", "Val Driv 10", "Val Driv 11", "Val Driv 12", "Val Driv 13", "Val Driv 14", "Val Driv 15", "Val Driv 16", "Val Driv 17", "Val Driv 18"}, {"Scenario", "Customer", "Product", "Val Driv 1", "Val Driv 2", "Val Driv 3", "Val Driv 4", "Val Driv 5", "Val Driv 6", "Val Driv 7", "Val Driv 8", "Val Driv 9", "Val Driv 10", "Val Driv 11", "Val Driv 12", "Val Driv 13", "Val Driv 14", "Val Driv 15", "Val Driv 16", "Val Driv 17", "Val Driv 18"})
in
    #"Expanded cust"

 

 

PS

you could find usefull place some if.. the .. else  in the function to check case where new and old doesn't match!

 

 

 

 

Thank you all for your answers and help.

I was off few days, so let me have a look at all your solutions and test them, then I'll tell you which solution suits me.

v-alq-msft
Community Support
Community Support

Hi, @LaurentZ 

 

Based on your description, I created data to reproduce your scenario. The pbix file is attached in the end.

Actual:

b1.png

 

Forecast:

b2.png

 

You may create a blank query and input the following codes to create a table for difference.

let
    Source = Table.AddColumn(Actual, "New Scenario", each "Difference"),
    #"Added Custom" = Table.AddColumn(Source, "New Quantity", each let customer=[Customer],product=[Product],
tab=Table.SelectRows(Forecast,each [Customer]=customer and [Product]=product)
in 
if Table.RowCount(tab)>0 then
[Quantity]-tab[Quantity]{0}
else [Quantity]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Scenario", "Quantity"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"New Scenario", "Customer", "Product", "New Quantity"}),
    #"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"New Scenario", "Scenario"}, {"New Quantity", "Quantity"}})
in
    #"Renamed Columns"

b3.png

 

Then you may create another blank query for final result as below.

let
    #"Appended Query" = Table.Combine({ Actual, Forecast, New})
in
    #"Appended Query"

b4.png

 

Best Regards

Allan

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

 

 

 

Anonymous
Not applicable

 Put it in very general terms, the question is schematized by saying that having to "cross" two tables of 10 ^ 5 lines, 10 ^ 10 comparisons will have to be made in the worst case. All these comparisons take a certain amount of time, which can be different if you use different lookup or merge techniques or something else.

but if instead we could divide the problem into, say, 100 under problems of 10 ^ 3 elements, the comparisons would be 10 ^ 3 X 10 ^ 3 X 100. Therefore, we would have a time 100 times shorter, with the same technique, compared to the original time.

In this case, how could one divide the problem into subproblems?

We should have more information to find an effective solution.

For example, how many customers are there?

How many products are there?

 

####

another way would be to limit the search for the row corresponding to the current new row in the old table to a small subset of the rows instead of scrolling through all 100,000 rows for each search. If you manage to limit the search to a group of 1000 lines, you improve the search time by 100 times.

 

 

just to give an idea, but an effective solution strongly depends on the structure of the tables, you can sort the tables in increasing order with respect to the customer column and when you do the search you use Table.PositionOf to find the first (and only) occurrence of the row with the same customer and product. The next search is done starting from the previous position on a reduced table table.skip (tab, positionof); again you can limit the search towards the end of the table by stopping when records are found with customerIDold> customerIDnew.
in this way, for each search, a few hundred comparisons are made instead of 100000. In theory, obtaining a gain of 1000 times!?!?!?!?

 

Greg_Deckler
Super User
Super User

@LaurentZ - Not really enough information to go on, please first check if your issue is a common issue listed here: https://community.powerbi.com/t5/Community-Blog/Before-You-Post-Read-This/ba-p/1116882

Also, please see this post regarding How to Get Your Question Answered Quickly: https://community.powerbi.com/t5/Community-Blog/How-to-Get-Your-Question-Answered-Quickly/ba-p/38490

The most important parts are:
1. Sample data as text, use the table tool in the editing bar
2. Expected output from sample data
3. Explanation in words of how to get from 1. to 2.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Ok, then

I've this table in Excel imported in Power Query as NEW:

ScenarioCustomerProductQuantity
ActualsA15
ActualsB15
ActualsB21

 

I've this table in Excel imported in Power Query as OLD:

ScenarioCustomerProductQuantity
ForecastA13
ForecastB22

 

First, I want a creation of a table by Power Query, giving me the difference between Tables NEW and OLD

ScenarioCustomerProductQuantity
DifferenceA12
DifferenceB15
DifferenceB2-1

 

and finaly append those 3 tables to a final table:

ScenarioCustomerProductQuantity
ActualsA15
ActualsB15
ActualsB21
ForecastA13
ForecastB22
DifferenceA12
DifferenceB15
DifferenceB2-1

 

How would you proceed  ?

 

Note that this is very simplified, database are both >100,000 lines by tables and I've 18 columns of drivers for values (Quantity, turnover...)

 

ziying35
Impactful Individual
Impactful Individual

Hi, @LaurentZ 

Can you remove sensitive information from your real data and upload it to a cloud drive and share the link here? That way, when people help you solve the problem, they can write code that is more comprehensive in terms of operational etticiency

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors