Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
BE50
Frequent Visitor

Compare 2 datasets (budget vs realized, quotation vs project etc)

I am building a delivery validator for a large contractor. They send in a quotation with x items (could be work or material). During building process and after delivery they need to validate if they delivered according to qutation, which items they did not deliver, which items they delivered extra and where delivered numbers differ from quote.

In PQ I need to compare 2 datasets (quotation and project) and get a list with all items from both lists. This way it is much easier to validate 10.000 lines. I prepare the datasets with an index column. The index is generated from values in other columns and identical in the to tables.

Compare_2_datasets.jpg

In resulting table I will have all id's from both tables with no duplicates. I will add columns, some are common (InfoA) and some are unique (columns InfoX resp. InfoY). This way we can now validate delivered vs quoated.

What would be the best way to do that? The datasets could be ~10.000 rows each with combined 30 columns.

1 ACCEPTED SOLUTION
Vijay_A_Verma
Super User
Super User

See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test

Code for table named Project

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nQxMFTSUXJMTARRkYZKsTpgQSOIIIiKNIIJGkMEQVSkMVTQyACq0gCs1AAmDDUVTEcCydhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [IDX = _t, InfoA = _t, InfoY = _t])
in
    Source

 Code for table named Quotation

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nQxMFTSUXJMTARREYZKsTpgQSOIIIiKMIIJGkMEQVSEMVTQ0ACq3QCs3wAmDDMVYizQ3FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [IDX = _t, InfoA = _t, InfoX = _t]),
    #"Merged Queries" = Table.NestedJoin(Source, {"IDX"}, Project, {"IDX"}, "Project", JoinKind.LeftOuter),
    #"Expanded Project" = Table.ExpandTableColumn(#"Merged Queries", "Project", {"InfoY"}, {"InfoY"}),
    Custom1 = Table.NestedJoin(Source, {"IDX"}, Project, {"IDX"}, "Project", JoinKind.RightAnti),
    #"Removed Columns" = Table.RemoveColumns(Custom1,{"IDX", "InfoA"}),
    #"Expanded Project1" = Table.ExpandTableColumn(#"Removed Columns", "Project", {"IDX", "InfoA", "InfoY"}, {"IDX", "InfoA", "InfoY"}),
    #"Appended Query" = Table.Combine({#"Expanded Project", #"Expanded Project1"})
in
    #"Appended Query"

 

View solution in original post

2 REPLIES 2
BE50
Frequent Visitor

Thank you to Vijay. Your solution is very elegant. I merge queries all the time, but never thought of using a step in same query as source 🌟 🌟 🌟

I came up with this solution:

1) Load each source separate (query only): qQuote, qProject

2) Rename columns so they have distinct names (Q_Unit, P_Unit, Q_Amount, P_Amount etc.)

3) Add IDX column in each cource

4) New qIDX query: Merge the 2, remove other columns, remove duplicates. I now have a combined index.

5) Merge qIDX and qQuote. Second merge with qProject. Reorder columns.

 

But I will try your approach also 😀

Vijay_A_Verma
Super User
Super User

See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test

Code for table named Project

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nQxMFTSUXJMTARRkYZKsTpgQSOIIIiKNIIJGkMEQVSkMVTQyACq0gCs1AAmDDUVTEcCydhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [IDX = _t, InfoA = _t, InfoY = _t])
in
    Source

 Code for table named Quotation

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nQxMFTSUXJMTARREYZKsTpgQSOIIIiKMIIJGkMEQVSEMVTQ0ACq3QCs3wAmDDMVYizQ3FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [IDX = _t, InfoA = _t, InfoX = _t]),
    #"Merged Queries" = Table.NestedJoin(Source, {"IDX"}, Project, {"IDX"}, "Project", JoinKind.LeftOuter),
    #"Expanded Project" = Table.ExpandTableColumn(#"Merged Queries", "Project", {"InfoY"}, {"InfoY"}),
    Custom1 = Table.NestedJoin(Source, {"IDX"}, Project, {"IDX"}, "Project", JoinKind.RightAnti),
    #"Removed Columns" = Table.RemoveColumns(Custom1,{"IDX", "InfoA"}),
    #"Expanded Project1" = Table.ExpandTableColumn(#"Removed Columns", "Project", {"IDX", "InfoA", "InfoY"}, {"IDX", "InfoA", "InfoY"}),
    #"Appended Query" = Table.Combine({#"Expanded Project", #"Expanded Project1"})
in
    #"Appended Query"

 

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel1

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

Top Solution Authors
Top Kudoed Authors