Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
BE50
Frequent Visitor

Compare 2 datasets (budget vs realized, quotation vs project etc)

I am building a delivery validator for a large contractor. They send in a quotation with x items (could be work or material). During building process and after delivery they need to validate if they delivered according to qutation, which items they did not deliver, which items they delivered extra and where delivered numbers differ from quote.

In PQ I need to compare 2 datasets (quotation and project) and get a list with all items from both lists. This way it is much easier to validate 10.000 lines. I prepare the datasets with an index column. The index is generated from values in other columns and identical in the to tables.

Compare_2_datasets.jpg

In resulting table I will have all id's from both tables with no duplicates. I will add columns, some are common (InfoA) and some are unique (columns InfoX resp. InfoY). This way we can now validate delivered vs quoated.

What would be the best way to do that? The datasets could be ~10.000 rows each with combined 30 columns.

1 ACCEPTED SOLUTION
Vijay_A_Verma
Super User
Super User

See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test

Code for table named Project

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nQxMFTSUXJMTARRkYZKsTpgQSOIIIiKNIIJGkMEQVSkMVTQyACq0gCs1AAmDDUVTEcCydhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [IDX = _t, InfoA = _t, InfoY = _t])
in
    Source

 Code for table named Quotation

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nQxMFTSUXJMTARREYZKsTpgQSOIIIiKMIIJGkMEQVSEMVTQ0ACq3QCs3wAmDDMVYizQ3FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [IDX = _t, InfoA = _t, InfoX = _t]),
    #"Merged Queries" = Table.NestedJoin(Source, {"IDX"}, Project, {"IDX"}, "Project", JoinKind.LeftOuter),
    #"Expanded Project" = Table.ExpandTableColumn(#"Merged Queries", "Project", {"InfoY"}, {"InfoY"}),
    Custom1 = Table.NestedJoin(Source, {"IDX"}, Project, {"IDX"}, "Project", JoinKind.RightAnti),
    #"Removed Columns" = Table.RemoveColumns(Custom1,{"IDX", "InfoA"}),
    #"Expanded Project1" = Table.ExpandTableColumn(#"Removed Columns", "Project", {"IDX", "InfoA", "InfoY"}, {"IDX", "InfoA", "InfoY"}),
    #"Appended Query" = Table.Combine({#"Expanded Project", #"Expanded Project1"})
in
    #"Appended Query"

 

View solution in original post

2 REPLIES 2
BE50
Frequent Visitor

Thank you to Vijay. Your solution is very elegant. I merge queries all the time, but never thought of using a step in same query as source 🌟 🌟 🌟

I came up with this solution:

1) Load each source separate (query only): qQuote, qProject

2) Rename columns so they have distinct names (Q_Unit, P_Unit, Q_Amount, P_Amount etc.)

3) Add IDX column in each cource

4) New qIDX query: Merge the 2, remove other columns, remove duplicates. I now have a combined index.

5) Merge qIDX and qQuote. Second merge with qProject. Reorder columns.

 

But I will try your approach also 😀

Vijay_A_Verma
Super User
Super User

See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test

Code for table named Project

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nQxMFTSUXJMTARRkYZKsTpgQSOIIIiKNIIJGkMEQVSkMVTQyACq0gCs1AAmDDUVTEcCydhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [IDX = _t, InfoA = _t, InfoY = _t])
in
    Source

 Code for table named Quotation

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8nQxMFTSUXJMTARREYZKsTpgQSOIIIiKMIIJGkMEQVSEMVTQ0ACq3QCs3wAmDDMVYizQ3FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [IDX = _t, InfoA = _t, InfoX = _t]),
    #"Merged Queries" = Table.NestedJoin(Source, {"IDX"}, Project, {"IDX"}, "Project", JoinKind.LeftOuter),
    #"Expanded Project" = Table.ExpandTableColumn(#"Merged Queries", "Project", {"InfoY"}, {"InfoY"}),
    Custom1 = Table.NestedJoin(Source, {"IDX"}, Project, {"IDX"}, "Project", JoinKind.RightAnti),
    #"Removed Columns" = Table.RemoveColumns(Custom1,{"IDX", "InfoA"}),
    #"Expanded Project1" = Table.ExpandTableColumn(#"Removed Columns", "Project", {"IDX", "InfoA", "InfoY"}, {"IDX", "InfoA", "InfoY"}),
    #"Appended Query" = Table.Combine({#"Expanded Project", #"Expanded Project1"})
in
    #"Appended Query"

 

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors