Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
I have a table with direct relations between so-called 'routings' (let's say: process steps). Now I want to derive all indirect relations. For example when routing 1 directly preceeds 2, 2 directly preceeds 3, 3 directly preceeds 4, then 1 INdirectly preceeds 3 and 4 and 2 INdirectly preceeds 4. I use Power Query within Excel 2016.
Example table with direct relations:
From | To | RelationType |
1 | 2 | Direct |
1 | 3 | Direct |
2 | 4 | Direct |
3 | 5 | Direct |
5 | 6 | Direct |
6 | 7 | Direct |
6 | 8 | Direct |
And this is what the result should be (so comprising direct and indirect relations):
From | To | RelationType |
1 | 2 | Direct |
1 | 3 | Direct |
2 | 4 | Direct |
3 | 5 | Direct |
5 | 6 | Direct |
1 | 4 | Indirect |
1 | 5 | Indirect |
1 | 6 | Indirect |
1 | 7 | Indirect |
1 | 8 | Indirect |
3 | 6 | Indirect |
3 | 7 | Indirect |
3 | 8 | Indirect |
5 | 7 | Indirect |
5 | 8 | Indirect |
This is how I want to derive all indirect relationships:
1. Start with routings that aren't present in the To column (they're at the start of a chain of routings) (level 1);
2. Determine their successors (level 2);
3. Determine the successors of those successors (level 3);
4. Set indirect relationships between predecessors and successors of level 2;
5. Take routings of level 2 as level 1 and perform steps 1 to 4 again. This until the result of step 3 is empty (ends of chains are reached).
I am trying to create a custom recursive function for this, but it's a hard job. How could this best be solved?
Solved! Go to Solution.
@Nolock: I worked out the solution to prevent endless loops. In short: the 'froms' that are already done, are remember in a list and not offered to be done in a next recursion. At the end of the query self-referencing transitions are removed.
let Source = Excel.CurrentWorkbook(), tmpInput = Source{[Name="tmpInput"]}[Content], ChangedType = Table.TransformColumnTypes(tmpInput,{{"From", Int64.Type}, {"To", Int64.Type}, {"RelationType", type text}}), // get list of all descendants fnTransitiveRelationList = (sourceTbl as table, curToBeDoneList as list, alreadyDoneList as list) as list => let curNumber = List.First(curToBeDoneList), rowsStartingWithCurNumber = Table.SelectRows(sourceTbl, each [From] = curNumber), alreadyDoneList = List.Combine({alreadyDoneList, {curNumber}}), result = if Table.IsEmpty(rowsStartingWithCurNumber) and List.IsEmpty(curToBeDoneList) then {} else let toList = Table.Column(rowsStartingWithCurNumber, "To"), nextToBeDoneList = List.Distinct( List.Combine( { List.RemoveFirstN(curToBeDoneList, 1), toList } ) ), nextToBeDoneListNoAlreadyDone = List.Difference(nextToBeDoneList,alreadyDoneList), recursiveResultList = @fnTransitiveRelationList(sourceTbl, nextToBeDoneListNoAlreadyDone, alreadyDoneList), curRecursiveResultList = List.Distinct( List.Combine( { toList, recursiveResultList } ) ) in curRecursiveResultList in result, // create a table from all descendants with a from column and relation type = indirect fnTransitiveRelationTable = (sourceTbl as table, from as number) as table => let recursiveList = fnTransitiveRelationList(sourceTbl, {from},{}), recordList = List.Transform(recursiveList, each [From = from, To = _, RelationType = "Indirect"]), result = Table.FromRecords(recordList) in result, // add a column TableOfDescendants TableOfDescendents = Table.AddColumn(ChangedType, "TableOfDescendents", each fnTransitiveRelationTable(ChangedType, [From])), // combine input table with new descendants TableOfAllDescentantsTables = Table.Combine({ChangedType, Table.Combine(TableOfDescendents[TableOfDescendents])}), // distinct on columns From and To Result = Table.SelectRows(Table.Distinct(TableOfAllDescentantsTables, {"From", "To"}), each [From] <> [To]) in Result
Hi @JVos,
I've written the recursive part of the task and you have now a list of all descendent for every row.Unfortunately I can't finish it now because of time pressure. I'll continue tomorrow if you don't finish it by yourself till then.
EDIT: The working solution with comments. It creates a list of all descendents for every row and then merges the result with the origin table.
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTICYpfMotTkEqVYHYiQMaoQSIUJqhBIhSmqEIhrhioE4ppjClkgCcUCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [From = _t, To = _t, RelationType = _t]), ChangedType = Table.TransformColumnTypes(Source,{{"From", Int64.Type}, {"To", Int64.Type}, {"RelationType", type text}}), // get list of all descendants fnTransitiveRelationList = (sourceTbl as table, curToBeDoneList as list) as list => let curNumber = List.First(curToBeDoneList), rowsStartingWithCurNumber = Table.SelectRows(sourceTbl, each [From] = curNumber), result = if Table.IsEmpty(rowsStartingWithCurNumber) and List.IsEmpty(curToBeDoneList) then {} else let toList = Table.Column(rowsStartingWithCurNumber, "To"), nextToBeDoneList = List.Distinct( List.Combine( { List.RemoveFirstN(curToBeDoneList, 1), toList } ) ), recursiveResultList = @fnTransitiveRelationList(sourceTbl, nextToBeDoneList), curRecursiveResultList = List.Distinct( List.Combine( { toList, recursiveResultList } ) ) in curRecursiveResultList in result, // create a table from all descendants with a from column and relation type = indirect fnTransitiveRelationTable = (sourceTbl as table, from as number) as table => let recursiveList = fnTransitiveRelationList(sourceTbl, {from}), recordList = List.Transform(recursiveList, each [From = from, To = _, RelationType = "Indirect"]), result = Table.FromRecords(recordList) in result, // add a column TableOfDescendants TableOfDescendents = Table.AddColumn(ChangedType, "TableOfDescendents", each fnTransitiveRelationTable(ChangedType, [From])), // combine input table with new descendants TableOfAllDescentantsTables = Table.Combine({ChangedType, Table.Combine(TableOfDescendents[TableOfDescendents])}), // distinct on columns From and To Result = Table.Distinct(TableOfAllDescentantsTables, {"From", "To"}) in Result
Hi @Nolock: in the basis, I get your solution to work in my Excel file. Two things:
1. Can you shortly explain how you created the Base64 string that is present in your code?
2. With the real data - above I gave a simple example - the query function is running endless. This is maybe because the relations aren't strict transitve. For example:
From | To | RelationType |
67 | 79 | Direct |
79 | 69 | Direct |
79 | 71 | Direct |
71 | 72 | Direct |
72 | 73 | Direct |
67 | 94 | Direct |
94 | 95 | Direct |
95 | 96 | Direct |
Two paths to get from 67 to 71:
67 => 79 => 71
67 => 94 => 95 => 96 => 71
Note that in this real example there are no loops defined. However I need also to cope with such situations.
I am going to try to find a solution building on the code you provided. But if you quickly know how to solve this, I am pleased with your help.
You get the Base64 string when you create a table with help of PowerQuery Editor GUI via Home / External Data / Enter Data.
@Nolock: Can you help me a little bit more regarding the base64 string, please? The Power Query Editor that starts from Excel doesn't have the 'External Data' group. Excel has on its 'Data' tab the 'Get & Transform Data' group and there the option 'From Table/Range', but that's not giving me a base64 string.
Hi @JVos,
the Base64 code contains only a sample of data which I've used in my solution.
In Excel: Click on Data / Get Data / From Other Sources / Blank Query
It will open the PowerQuery Editor. Click on Advanced Editor and replace those 4 lines with the code I've posted.
You can see the result of the query in the background.
What you explained, is what I already understood. My question is: how to create that base64 string?
@Nolock: The duplicate paths to get from 67 to 71 aren't the problem, I found already. I am going to investigate now my complete set with data what might be the problem (probably a loop).
Problem is indeed a loop in the from-to's, e.g. 76 > 77 and 77 > 76.
@Nolock: I worked out the solution to prevent endless loops. In short: the 'froms' that are already done, are remember in a list and not offered to be done in a next recursion. At the end of the query self-referencing transitions are removed.
let Source = Excel.CurrentWorkbook(), tmpInput = Source{[Name="tmpInput"]}[Content], ChangedType = Table.TransformColumnTypes(tmpInput,{{"From", Int64.Type}, {"To", Int64.Type}, {"RelationType", type text}}), // get list of all descendants fnTransitiveRelationList = (sourceTbl as table, curToBeDoneList as list, alreadyDoneList as list) as list => let curNumber = List.First(curToBeDoneList), rowsStartingWithCurNumber = Table.SelectRows(sourceTbl, each [From] = curNumber), alreadyDoneList = List.Combine({alreadyDoneList, {curNumber}}), result = if Table.IsEmpty(rowsStartingWithCurNumber) and List.IsEmpty(curToBeDoneList) then {} else let toList = Table.Column(rowsStartingWithCurNumber, "To"), nextToBeDoneList = List.Distinct( List.Combine( { List.RemoveFirstN(curToBeDoneList, 1), toList } ) ), nextToBeDoneListNoAlreadyDone = List.Difference(nextToBeDoneList,alreadyDoneList), recursiveResultList = @fnTransitiveRelationList(sourceTbl, nextToBeDoneListNoAlreadyDone, alreadyDoneList), curRecursiveResultList = List.Distinct( List.Combine( { toList, recursiveResultList } ) ) in curRecursiveResultList in result, // create a table from all descendants with a from column and relation type = indirect fnTransitiveRelationTable = (sourceTbl as table, from as number) as table => let recursiveList = fnTransitiveRelationList(sourceTbl, {from},{}), recordList = List.Transform(recursiveList, each [From = from, To = _, RelationType = "Indirect"]), result = Table.FromRecords(recordList) in result, // add a column TableOfDescendants TableOfDescendents = Table.AddColumn(ChangedType, "TableOfDescendents", each fnTransitiveRelationTable(ChangedType, [From])), // combine input table with new descendants TableOfAllDescentantsTables = Table.Combine({ChangedType, Table.Combine(TableOfDescendents[TableOfDescendents])}), // distinct on columns From and To Result = Table.SelectRows(Table.Distinct(TableOfAllDescentantsTables, {"From", "To"}), each [From] <> [To]) in Result
@Nolock : I noticed a performance issue. Please see: https://community.powerbi.com/t5/Power-Query/Performance-issue-at-expanding-or-combining-tablecolumn.... If you have any idea...
Hi @JVos,
ok, I will look at it. Btw. I will publish a blog post about a transitive closure in Power Query next Tuesday. You can find it then here: https://community.powerbi.com/t5/Community-Blog/bg-p/community_blog
Hi @Nolock: Thank you for your solution. In the meantime I figured out a solution by myself, but it's very resource / time consuming when there are more than 5 iterations. I am going to try you solution next week.