Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
thowa
Regular Visitor

Overlaying/merging two tables to add missing data in empty cells and new rows/columns

I have two tables each with a column of unique keys (dates).

The other columns have identical headers, albeit there may be rows and columns in one table absent in the other and vice versa.

The data in the columns is mostly but not entirly identical and one table may contain data missing in the other and vice versa.

 

I am looking for a solution workable in Excel Power Query that essentially creates an overlay of the two tables with one table having priority over the other. Output should be a new table that cotains all data of the first table plus the additional data contained in the other table.

 

Picture two people making observcations every day (rows) and log those observations every hour (columns). But someimtes one of them will miss an observation or they will not log identical observations every hour.

I want to get a table combining those observations with one observer having precedent over the other:

 

Table1 - has precedence:

Date123
1.1.20001 6
2.1.20007 4
3.1.200034 
4.1.20006

 

5

 

Table2 (note the different but overlapping headers):

Date2345
1.1.20002466
2.1.20003 7

 

3.1.20004 3 2

 

OutputTable - green is the new data from Table2 after "Overlay":

Date12345
1.1.200012666
2.1.20007347

 

3.1.2000343 2
4.1.20006

 

5  

Green = new data from Table2

Red = data from Table 1 with precedence over different data in Table2

 

Is there a simply way to do this without using an if then formula cycling through each individual cell (in reality those tables have thousands of cells)?

The key question is filling missing data (values "2" in 2 of 1.1.2000 and value missing value 3 in 2 of 3.1.2000 and discarding value 7 for 3 of 2.1.2000 of Table2 in favour of the preceding value 4 in Table1.

I know how to add the rows (Table.Combine) and columns (Table.Join) if it cannot be done in more elegant procedure together with the "overlay".

 

Note that I need this for Excel Power Query, not Power BI!

3 ACCEPTED SOLUTIONS
Anonymous
Not applicable

Hi @thowa ,

 

Sorry pressed a wrong button :).

 

I posted my solution in the thread that you are referring to:

https://community.powerbi.com/t5/Power-Query/Function-to-update-any-table-with-new-values-from-anoth...

 

Kind regards,

JB

View solution in original post

camargos88
Community Champion
Community Champion

@thowa ,

 

Table Result ->

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQz1DMyMDBQ0lEyBGIFIDZTitWJVjJCSJhDJUzAEsYICWOwIEgSJGGCkDADYiAyVYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, #"1" = _t, #"2" = _t, #"3" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"1", Int64.Type}, {"2", Int64.Type}, {"3", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Date"}, "Attribute", "Value"),
#"Appended Query" = Table.Combine({#"Unpivoted Other Columns", #"Anti Join"}),
#"Pivoted Column" = Table.Pivot(#"Appended Query", List.Distinct(#"Appended Query"[Attribute]), "Attribute", "Value")
in
#"Pivoted Column"

 

Table (2) ->

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQz1DMyMDBQ0lEyAmITIDYD41idaCUjhKQxECsAsTkQg+WMEXIgTQowBUZKsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, #"2" = _t, #"3" = _t, #"4" = _t, #"5" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"2", Int64.Type}, {"3", Int64.Type}, {"4", Int64.Type}, {"5", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Date"}, "Attribute", "Value")
in
#"Unpivoted Other Columns"

 

Anti join ->

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQz1DMyMDBQ0lEyBGIFIDZTitWJVjJCSJhDJUzAEsYICWOwIEgSJGGCkDADYiAyVYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, #"1" = _t, #"2" = _t, #"3" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"1", Int64.Type}, {"2", Int64.Type}, {"3", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Date"}, "Attribute", "Value"),
#"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns", {"Date", "Attribute"}, #"Table (2)", {"Date", "Attribute"}, "Unpivoted Other Columns", JoinKind.RightAnti),
#"Removed Other Columns" = Table.SelectColumns(#"Merged Queries",{"Unpivoted Other Columns"}),
#"Expanded Unpivoted Other Columns" = Table.ExpandTableColumn(#"Removed Other Columns", "Unpivoted Other Columns", {"Date", "Attribute", "Value"}, {"Date", "Attribute", "Value"})
in
#"Expanded Unpivoted Other Columns"

 

Ricardo

 



Did I answer your question? Mark my post as a solution!

Proud to be a Super User!



View solution in original post

Thanks to

@camargos88and @Anonymous for the accepted solutions.

 

I have summarized both possibilities below - not sure which one performs better.

 

Camargos' solution coded into a function:

(BottomTable as table, TopTable as table)=>

let
    //Will overlay the Bottom Table with the TopTable.
    //Empty cells in Top table are transparent, i.e. existing data from BottomTable will persist unless the TopTable has data in the same cell.
    //New rows or columns from TopTable will be added
    //There must be a "Date" column as anchor column for the overlay in both tables
    //Columns may come back in a mixed siquence
    
    #"Changed Type1" = Table.TransformColumnTypes(TopTable,{"Date", type date}),
    #"Unpivoted Other Columns1" = Table.UnpivotOtherColumns(#"Changed Type1", {"Date"}, "Attribute", "Value"),

    #"Changed Type2" = Table.TransformColumnTypes(BottomTable,{{"Date", type date}}),
    #"Unpivoted Other Columns2" = Table.UnpivotOtherColumns(#"Changed Type2", {"Date"}, "Attribute", "Value"),

    #"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns1", {"Date", "Attribute"}, #"Unpivoted Other Columns2", {"Date", "Attribute"}, "Unpivoted Other Columns", JoinKind.RightAnti),
    #"Removed Other Columns" = Table.SelectColumns(#"Merged Queries",{"Unpivoted Other Columns"}),
    #"Expanded Unpivoted Other Columns" = Table.ExpandTableColumn(#"Removed Other Columns", "Unpivoted Other Columns", {"Date", "Attribute", "Value"}, {"Date", "Attribute", "Value"}),
    //Changed Name to keep Query similar to post (as if output from "Anti Join" Query)

    #"Appended Query" = Table.Combine({#"Unpivoted Other Columns1", #"Expanded Unpivoted Other Columns"}),
    Result = Table.Pivot(#"Appended Query", List.Distinct(#"Appended Query"[Attribute]), "Attribute", "Value")
in
    Result

 

jborro's solution (slighly changed in table names etc.):

(BottomTable as table, TopTable as table, AnchorColumnName as list)=>

let
    //Will overlay the Bottom Table with the TopTable.
    //Empty cells in Top table are transparent, i.e. existing data from BottomTable will persist unless the TopTable has data in the same cell.
    //New rows or columns from TopTable will be added
    //There must be a "Date" column as anchor column for the overlay in both tables
    //Columns may come back in a mixed siquence
    //IMPORTANT: the AnchorColumnNae Parameter must be the name of one of the columns! If the Power Query interface is used, you will be asked to pick a column from one of the tables and that will not work.
    AttributeFields = AnchorColumnName,
    ValueFields = List.RemoveItems(Table.ColumnNames(Source), AttributeFields),
    ValueFieldFunction = List.Accumulate(ValueFields, {}, (a, n) => a & {{n, (x)=>List.First(List.RemoveNulls(Table.Column(x, n)))}}),
    Source = Table.Combine({TopTable, BottomTable}),
    #"Results" = Table.Group(Source, AttributeFields, ValueFieldFunction)

in
    #"Results"

 

THANKS

View solution in original post

7 REPLIES 7
thowa
Regular Visitor

Just saw a similar proposal here (un-replied at this time): Function to update any table with new values from another table 

Seems like I am not the only one wondering...

Anonymous
Not applicable

Hi @thowa ,

 

Sorry pressed a wrong button :).

 

I posted my solution in the thread that you are referring to:

https://community.powerbi.com/t5/Power-Query/Function-to-update-any-table-with-new-values-from-anoth...

 

Kind regards,

JB

camargos88
Community Champion
Community Champion

Hi @thowa ,

 

I believe you need to apply the right anti join and append it to the first table.

 

 

 directquery_sqlserverdb.png

 

Check this file: Download PBIX 

 

Ricardo



Did I answer your question? Mark my post as a solution!

Proud to be a Super User!



@camargos88 

Thanks for the quick response!

I was also thinking that Table.Join with the right Joinkind should be the right track - but I couldn't figure out a solution.

Your's seems to work judging from the result, but I cannot open the file as I don't actually have Power BI (hence solution for Excel Power Query requested; code should be the same unless it is an unsupported feature).

 

Happy to accept the solution, if you can post the code.

 

Thomas

camargos88
Community Champion
Community Champion

@thowa ,

 

Table Result ->

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQz1DMyMDBQ0lEyBGIFIDZTitWJVjJCSJhDJUzAEsYICWOwIEgSJGGCkDADYiAyVYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, #"1" = _t, #"2" = _t, #"3" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"1", Int64.Type}, {"2", Int64.Type}, {"3", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Date"}, "Attribute", "Value"),
#"Appended Query" = Table.Combine({#"Unpivoted Other Columns", #"Anti Join"}),
#"Pivoted Column" = Table.Pivot(#"Appended Query", List.Distinct(#"Appended Query"[Attribute]), "Attribute", "Value")
in
#"Pivoted Column"

 

Table (2) ->

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQz1DMyMDBQ0lEyAmITIDYD41idaCUjhKQxECsAsTkQg+WMEXIgTQowBUZKsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, #"2" = _t, #"3" = _t, #"4" = _t, #"5" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"2", Int64.Type}, {"3", Int64.Type}, {"4", Int64.Type}, {"5", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Date"}, "Attribute", "Value")
in
#"Unpivoted Other Columns"

 

Anti join ->

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQz1DMyMDBQ0lEyBGIFIDZTitWJVjJCSJhDJUzAEsYICWOwIEgSJGGCkDADYiAyVYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, #"1" = _t, #"2" = _t, #"3" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"1", Int64.Type}, {"2", Int64.Type}, {"3", Int64.Type}}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Date"}, "Attribute", "Value"),
#"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns", {"Date", "Attribute"}, #"Table (2)", {"Date", "Attribute"}, "Unpivoted Other Columns", JoinKind.RightAnti),
#"Removed Other Columns" = Table.SelectColumns(#"Merged Queries",{"Unpivoted Other Columns"}),
#"Expanded Unpivoted Other Columns" = Table.ExpandTableColumn(#"Removed Other Columns", "Unpivoted Other Columns", {"Date", "Attribute", "Value"}, {"Date", "Attribute", "Value"})
in
#"Expanded Unpivoted Other Columns"

 

Ricardo

 



Did I answer your question? Mark my post as a solution!

Proud to be a Super User!



Thanks to

@camargos88and @Anonymous for the accepted solutions.

 

I have summarized both possibilities below - not sure which one performs better.

 

Camargos' solution coded into a function:

(BottomTable as table, TopTable as table)=>

let
    //Will overlay the Bottom Table with the TopTable.
    //Empty cells in Top table are transparent, i.e. existing data from BottomTable will persist unless the TopTable has data in the same cell.
    //New rows or columns from TopTable will be added
    //There must be a "Date" column as anchor column for the overlay in both tables
    //Columns may come back in a mixed siquence
    
    #"Changed Type1" = Table.TransformColumnTypes(TopTable,{"Date", type date}),
    #"Unpivoted Other Columns1" = Table.UnpivotOtherColumns(#"Changed Type1", {"Date"}, "Attribute", "Value"),

    #"Changed Type2" = Table.TransformColumnTypes(BottomTable,{{"Date", type date}}),
    #"Unpivoted Other Columns2" = Table.UnpivotOtherColumns(#"Changed Type2", {"Date"}, "Attribute", "Value"),

    #"Merged Queries" = Table.NestedJoin(#"Unpivoted Other Columns1", {"Date", "Attribute"}, #"Unpivoted Other Columns2", {"Date", "Attribute"}, "Unpivoted Other Columns", JoinKind.RightAnti),
    #"Removed Other Columns" = Table.SelectColumns(#"Merged Queries",{"Unpivoted Other Columns"}),
    #"Expanded Unpivoted Other Columns" = Table.ExpandTableColumn(#"Removed Other Columns", "Unpivoted Other Columns", {"Date", "Attribute", "Value"}, {"Date", "Attribute", "Value"}),
    //Changed Name to keep Query similar to post (as if output from "Anti Join" Query)

    #"Appended Query" = Table.Combine({#"Unpivoted Other Columns1", #"Expanded Unpivoted Other Columns"}),
    Result = Table.Pivot(#"Appended Query", List.Distinct(#"Appended Query"[Attribute]), "Attribute", "Value")
in
    Result

 

jborro's solution (slighly changed in table names etc.):

(BottomTable as table, TopTable as table, AnchorColumnName as list)=>

let
    //Will overlay the Bottom Table with the TopTable.
    //Empty cells in Top table are transparent, i.e. existing data from BottomTable will persist unless the TopTable has data in the same cell.
    //New rows or columns from TopTable will be added
    //There must be a "Date" column as anchor column for the overlay in both tables
    //Columns may come back in a mixed siquence
    //IMPORTANT: the AnchorColumnNae Parameter must be the name of one of the columns! If the Power Query interface is used, you will be asked to pick a column from one of the tables and that will not work.
    AttributeFields = AnchorColumnName,
    ValueFields = List.RemoveItems(Table.ColumnNames(Source), AttributeFields),
    ValueFieldFunction = List.Accumulate(ValueFields, {}, (a, n) => a & {{n, (x)=>List.First(List.RemoveNulls(Table.Column(x, n)))}}),
    Source = Table.Combine({TopTable, BottomTable}),
    #"Results" = Table.Group(Source, AttributeFields, ValueFieldFunction)

in
    #"Results"

 

THANKS

Thanks for the summary @thowa .

 

Let's call the 2 tables DictionaryTbl (the one that needs updating) and ChangeTbl (the one containing the updated values)

Can I confirm that both solutions:

1. Update DictionaryTbl when there are changed values in ChangeTbl

2. Add to DictionaryTbl when there are new values in ChangeTbl (as indicated by the ID column -- ChangeTbl has some ID values that do not exist in DictionaryTbl) ?

 

Thanks!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors