Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
zuludogm
Advocate II
Advocate II

Building pivot of secondary relationships in power query

There may be an answer to this and I simply have not effectively formed the question, but I'd like some insight on how best to build a matrix of relationships that are computed through secondary relationships.  I can already do this in R, but have a sense that this could be possible natively in M and I've not the skills to do it. Perhaps someone here can propose a solution.

 

Let's say that we have a list of Companies (A, B, C, and D) and they have products in a list of Industries (1, 2, 3, and 4).  I want to know in how many industries do any two companies compete.  (By corollary, one could also as by how many companies do any two industries connect?)

 

Let's say the companies are in these industries:

Company, Industries

A: is in industries 1, 2, and 3

B: 2, 4

C: 1, 3, 4

😧 1, 2

 

The output would be a cartesian join that would show:

A and B are competing in 1 common industry

A and C in 2

A and D in 2

B and C in 1

B and D in 1

C and D in 1

 

I can do this by jumping out to R and then doing a simple lapply() on an intersect() of the cartesian join of Companies to Companies.  But is there a way to do this using just m?

 

 

1 ACCEPTED SOLUTION

Hi Lydia

Thanks for the quick reply.  I started to reply back that your solution is framing the problem, but does not reach the target outcome and then... as I worked with your file, the answer suddenly came to me.

 

This can be done in one-shot, but I'll take it step-by-step to be clear.

 

1) Let's use your source as the reference.  I called this SourceTable:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTLUMdIxVorViVZyAvKMdEzAbGewjDGU5wJRpxQbCwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Company = _t, Industries = _t])
in
    Source

2) Let's make the Cartesian Join of the Companies from your SourceTable.  I call this t1CartJoin:

let
    Source = SourceTable,
//remove all columns except the Company names and clean out duplicates (if any). #"Removed Columns" = Table.RemoveColumns(Source,{"Industries"}), #"Removed Duplicates" = Table.Distinct(#"Removed Columns"),

//now join the SourceTable back in as a new column. #"Added Custom" = Table.AddColumn(#"Removed Duplicates", "Custom", each Table1),
//now expand the table with only the Company name column. This creates a Cartesian Join.
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Company"}, {"Company.1"}),
//now it's time to clean up. rename the columns and get rid of duplicates.
#"Renamed Columns" = Table.RenameColumns(#"Expanded Custom",{{"Company.1", "To.Company"}}), #"Removed Duplicates1" = Table.Distinct(#"Renamed Columns"),
//final step in clean-up is to make each row of the cartesian join unique.
// we don't need companies referring to themselves (e.g. A to A) and we don't
// need both A to B and B to A, so we filter out all rows that are not alphabetical.
#"Added Conditional Column" = Table.AddColumn(#"Removed Duplicates1", "Equal", each if [Company] = [To.Company] then true else if [Company] > [To.Company] then true else false), #"Filtered Rows" = Table.SelectRows(#"Added Conditional Column", each ([Equal] = false)), #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"Equal"}) in #"Removed Columns1"

3) Let's now do the magic... we will create a table called t2RMatrix that has the target outcome:

//This takes the t1CartJoin and merges it back with the original SourceTable to create a
//  table of sets.  For example:
//  Company = A, Industries = 1,2,3
//  Company = B, Industries = 2,4

//  It then splits by delimiter into rows, which actually creates a Cartesian Join of Industries from
//  the Cartesian Join of Companies.

//  Then dedupe and pivot, and unpivot and you're done.

let
    Source = t1CartJoin,
   // Merge Queries to add the Industries related to the FromCompanies.
    #"Merged Queries" = Table.NestedJoin(Source,{"Company"},SourceTable,{"Company"},"SourceTable",JoinKind.LeftOuter),
    #"Expanded SourceTable" = Table.ExpandTableColumn(#"Merged Queries", "SourceTable", {"Industries"}, {"Industries"}),

   // Merge Queries to add the Industries related to the ToCompanies.
    #"Merged Queries1" = Table.NestedJoin(#"Expanded SourceTable",{"To.Company"},SourceTable,{"Company"},"SourceTable",JoinKind.LeftOuter),
    #"Expanded SourceTable1" = Table.ExpandTableColumn(#"Merged Queries1", "SourceTable", {"Industries"}, {"Industries.1"}),

   // Clean up to make it more readable
    #"Renamed Columns" = Table.RenameColumns(#"Expanded SourceTable1",{{"Industries.1", "To.Industries"}}),

   // Split FromIndustries into Rows
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Renamed Columns", {{"Industries", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Industries"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Industries", Int64.Type}}),

   // Split ToIndustries into Rows
    #"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"To.Industries", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "To.Industries"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"To.Industries", Int64.Type}}),

   // Now create an Intersect by testing where the From and To Industries are the same.  Filter out the rows where they are not.
    #"Added Conditional Column" = Table.AddColumn(#"Changed Type1", "Custom", each if [Industries] = [To.Industries] then 1 else 0),
    #"Filtered Rows" = Table.SelectRows(#"Added Conditional Column", each ([Custom] = 1)),

   // Clean out un-needed columns.
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Industries", "To.Industries"}),

   // Pivot to sum up the common industries by each From-To Company relationship.
    #"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Company]), "Company", "Custom", List.Sum),

   // Unpivot that back into a simple table of From.Company, To.Company, common Industries
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Pivoted Column", {"To.Company"}, "Attribute", "Value"),
    #"Renamed Columns1" = Table.RenameColumns(#"Unpivoted Columns",{{"Attribute", "From.Company"}, {"Value", "Industries"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns1",{"From.Company", "To.Company", "Industries"})
in
    #"Reordered Columns"

So, this is embarassing - by way of your answer, I found the answer (or an answer).

 

I was doing this a somewhat hard way by jumping out to R and then pushing the results back.  On the plus side, the R method can be created as a Function.  This method could also be set up as a Function with some work.

 

I run into this need to know secondary relationships pretty often. I call it the BagTag problem, because it is reversable (TagBag or BagTag are two sides of the same question). You see it when you look at any kind of basket analysis.  For example, Twitter data -> How many common hashtags or followers do these two people use/have?

 

It can be addressed in DAX by using many-to-many relationships in a basket analysis, but I sometimes want the results in the table instead of crunched in DAX.

 

OK - to wrap things up, here it is in a single shot that eliminates some un-necessary steps:

let
    Source = SourceTable,
    #"Added Custom" = Table.AddColumn(Source, "Custom", each SourceTable),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Company"}, {"Custom.Company"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Custom",{{"Custom.Company", "To.Company"}, {"Company", "From.Company"}}),
    #"Added Conditional Column" = Table.AddColumn(#"Renamed Columns", "Custom", each if [From.Company] >= [To.Company] then true else false),
    #"Filtered Rows" = Table.SelectRows(#"Added Conditional Column", each ([Custom] = false)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom"}),
    #"Merged Queries" = Table.NestedJoin(#"Removed Columns",{"To.Company"},SourceTable,{"Company"},"SourceTable",JoinKind.LeftOuter),
    #"Expanded SourceTable" = Table.ExpandTableColumn(#"Merged Queries", "SourceTable", {"Industries"}, {"SourceTable.Industries"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Expanded SourceTable",{{"SourceTable.Industries", "To.Industries"}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Renamed Columns1", {{"Industries", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Industries"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"From.Company", type text}, {"Industries", Int64.Type}}),
    #"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"To.Industries", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "To.Industries"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"To.Industries", Int64.Type}}),
    #"Added Conditional Column1" = Table.AddColumn(#"Changed Type1", "Custom", each if [Industries] = [To.Industries] then 1 else 0),
    #"Filtered Rows1" = Table.SelectRows(#"Added Conditional Column1", each ([Custom] = 1)),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Industries", "To.Industries"}),
    #"Pivoted Column" = Table.Pivot(#"Removed Columns1", List.Distinct(#"Removed Columns1"[From.Company]), "From.Company", "Custom", List.Sum),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Pivoted Column", {"To.Company"}, "Attribute", "Value"),
    #"Renamed Columns2" = Table.RenameColumns(#"Unpivoted Columns",{{"Attribute", "From.Company"}, {"Value", "Common Industries"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns2",{"From.Company", "To.Company", "Common Industries"})
in
    #"Reordered Columns"

 

Best regards

Mark

View solution in original post

2 REPLIES 2
v-yuezhe-msft
Employee
Employee

@zuludogm,

Why not directly create a Matrix visual as shown in the following PBIX file?

https://1drv.ms/u/s!AhsotbnGu1Nok3D9krhThdZNXVbl

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hi Lydia

Thanks for the quick reply.  I started to reply back that your solution is framing the problem, but does not reach the target outcome and then... as I worked with your file, the answer suddenly came to me.

 

This can be done in one-shot, but I'll take it step-by-step to be clear.

 

1) Let's use your source as the reference.  I called this SourceTable:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTLUMdIxVorViVZyAvKMdEzAbGewjDGU5wJRpxQbCwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Company = _t, Industries = _t])
in
    Source

2) Let's make the Cartesian Join of the Companies from your SourceTable.  I call this t1CartJoin:

let
    Source = SourceTable,
//remove all columns except the Company names and clean out duplicates (if any). #"Removed Columns" = Table.RemoveColumns(Source,{"Industries"}), #"Removed Duplicates" = Table.Distinct(#"Removed Columns"),

//now join the SourceTable back in as a new column. #"Added Custom" = Table.AddColumn(#"Removed Duplicates", "Custom", each Table1),
//now expand the table with only the Company name column. This creates a Cartesian Join.
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Company"}, {"Company.1"}),
//now it's time to clean up. rename the columns and get rid of duplicates.
#"Renamed Columns" = Table.RenameColumns(#"Expanded Custom",{{"Company.1", "To.Company"}}), #"Removed Duplicates1" = Table.Distinct(#"Renamed Columns"),
//final step in clean-up is to make each row of the cartesian join unique.
// we don't need companies referring to themselves (e.g. A to A) and we don't
// need both A to B and B to A, so we filter out all rows that are not alphabetical.
#"Added Conditional Column" = Table.AddColumn(#"Removed Duplicates1", "Equal", each if [Company] = [To.Company] then true else if [Company] > [To.Company] then true else false), #"Filtered Rows" = Table.SelectRows(#"Added Conditional Column", each ([Equal] = false)), #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"Equal"}) in #"Removed Columns1"

3) Let's now do the magic... we will create a table called t2RMatrix that has the target outcome:

//This takes the t1CartJoin and merges it back with the original SourceTable to create a
//  table of sets.  For example:
//  Company = A, Industries = 1,2,3
//  Company = B, Industries = 2,4

//  It then splits by delimiter into rows, which actually creates a Cartesian Join of Industries from
//  the Cartesian Join of Companies.

//  Then dedupe and pivot, and unpivot and you're done.

let
    Source = t1CartJoin,
   // Merge Queries to add the Industries related to the FromCompanies.
    #"Merged Queries" = Table.NestedJoin(Source,{"Company"},SourceTable,{"Company"},"SourceTable",JoinKind.LeftOuter),
    #"Expanded SourceTable" = Table.ExpandTableColumn(#"Merged Queries", "SourceTable", {"Industries"}, {"Industries"}),

   // Merge Queries to add the Industries related to the ToCompanies.
    #"Merged Queries1" = Table.NestedJoin(#"Expanded SourceTable",{"To.Company"},SourceTable,{"Company"},"SourceTable",JoinKind.LeftOuter),
    #"Expanded SourceTable1" = Table.ExpandTableColumn(#"Merged Queries1", "SourceTable", {"Industries"}, {"Industries.1"}),

   // Clean up to make it more readable
    #"Renamed Columns" = Table.RenameColumns(#"Expanded SourceTable1",{{"Industries.1", "To.Industries"}}),

   // Split FromIndustries into Rows
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Renamed Columns", {{"Industries", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Industries"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Industries", Int64.Type}}),

   // Split ToIndustries into Rows
    #"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"To.Industries", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "To.Industries"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"To.Industries", Int64.Type}}),

   // Now create an Intersect by testing where the From and To Industries are the same.  Filter out the rows where they are not.
    #"Added Conditional Column" = Table.AddColumn(#"Changed Type1", "Custom", each if [Industries] = [To.Industries] then 1 else 0),
    #"Filtered Rows" = Table.SelectRows(#"Added Conditional Column", each ([Custom] = 1)),

   // Clean out un-needed columns.
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Industries", "To.Industries"}),

   // Pivot to sum up the common industries by each From-To Company relationship.
    #"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Company]), "Company", "Custom", List.Sum),

   // Unpivot that back into a simple table of From.Company, To.Company, common Industries
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Pivoted Column", {"To.Company"}, "Attribute", "Value"),
    #"Renamed Columns1" = Table.RenameColumns(#"Unpivoted Columns",{{"Attribute", "From.Company"}, {"Value", "Industries"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns1",{"From.Company", "To.Company", "Industries"})
in
    #"Reordered Columns"

So, this is embarassing - by way of your answer, I found the answer (or an answer).

 

I was doing this a somewhat hard way by jumping out to R and then pushing the results back.  On the plus side, the R method can be created as a Function.  This method could also be set up as a Function with some work.

 

I run into this need to know secondary relationships pretty often. I call it the BagTag problem, because it is reversable (TagBag or BagTag are two sides of the same question). You see it when you look at any kind of basket analysis.  For example, Twitter data -> How many common hashtags or followers do these two people use/have?

 

It can be addressed in DAX by using many-to-many relationships in a basket analysis, but I sometimes want the results in the table instead of crunched in DAX.

 

OK - to wrap things up, here it is in a single shot that eliminates some un-necessary steps:

let
    Source = SourceTable,
    #"Added Custom" = Table.AddColumn(Source, "Custom", each SourceTable),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Company"}, {"Custom.Company"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Custom",{{"Custom.Company", "To.Company"}, {"Company", "From.Company"}}),
    #"Added Conditional Column" = Table.AddColumn(#"Renamed Columns", "Custom", each if [From.Company] >= [To.Company] then true else false),
    #"Filtered Rows" = Table.SelectRows(#"Added Conditional Column", each ([Custom] = false)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom"}),
    #"Merged Queries" = Table.NestedJoin(#"Removed Columns",{"To.Company"},SourceTable,{"Company"},"SourceTable",JoinKind.LeftOuter),
    #"Expanded SourceTable" = Table.ExpandTableColumn(#"Merged Queries", "SourceTable", {"Industries"}, {"SourceTable.Industries"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Expanded SourceTable",{{"SourceTable.Industries", "To.Industries"}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Renamed Columns1", {{"Industries", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Industries"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"From.Company", type text}, {"Industries", Int64.Type}}),
    #"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"To.Industries", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "To.Industries"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"To.Industries", Int64.Type}}),
    #"Added Conditional Column1" = Table.AddColumn(#"Changed Type1", "Custom", each if [Industries] = [To.Industries] then 1 else 0),
    #"Filtered Rows1" = Table.SelectRows(#"Added Conditional Column1", each ([Custom] = 1)),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Industries", "To.Industries"}),
    #"Pivoted Column" = Table.Pivot(#"Removed Columns1", List.Distinct(#"Removed Columns1"[From.Company]), "From.Company", "Custom", List.Sum),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Pivoted Column", {"To.Company"}, "Attribute", "Value"),
    #"Renamed Columns2" = Table.RenameColumns(#"Unpivoted Columns",{{"Attribute", "From.Company"}, {"Value", "Common Industries"}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns2",{"From.Company", "To.Company", "Common Industries"})
in
    #"Reordered Columns"

 

Best regards

Mark

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.