In the last few months I have been answering dozens of users’ questions. And there were at least three or four of you with different use cases but the same nominator: I have a table with two columns From and To. And for every distinct value of the column From I want to see all nodes which I can reach by traversing the edges From --> To. This problem has a name: transitive closure.
What is a transitive closure?
The omniscient Wikipedia says to transitive closure: “In mathematics, the transitive closure of a binary relation R on a set X is the smallest relation on X that contains R and is transitive.
For example, if X is a set of airports and xRy means "there is a direct flight from airport x to airport y" (for x and y in X), then the transitive closure of R on X is the relation R+ such that x R+ y means "it is possible to fly from x to y in one or more flights". Informally, the transitive closure gives you the set of all places you can get to from any starting place.”
A picture is worth a thousand words:
What we want to achieve is shown on the following image:
The code
Let’s start with our data sample from the previous illustration. It is a table with two columns From and To. Every row is a directed edge from From to To.
It will be recursion, again, I know, but I don’t know any other way to solve this problem in PowerQuery. But before we come to a recursive function, let us create key-value pairs where key is a node and value is a list of all nodes, which you can reach directly from the key. The result won’t be a table but a record instead. (Why do I do that? It will be much faster – you can read more about this performance boost in my recent post Lookup Table in Power Query.)
// get all direct descendants as a record (key-value pairs From ==> To) DirectDescendantsRecord = List.Accumulate( FromDistinctList, [], (state, current) => state & Expression.Evaluate( "[" & current & "= Table.Column(Table.SelectRows(BufferedTable, each [From] = current), ""To"")]", [Table.Column = Table.Column, Table.SelectRows = Table.SelectRows, BufferedTable = BufferedTable, current = current] ) ),
This fast access to all direct descendants of a node will be very helpful in the next step, in the recursive function fnTransitiveRelationList. This function gets a list of all nodes which we should visit and the newly created record with key-value pairs. It goes recursively through all the nodes until the list is empty. As a result, you get all nodes which you can visit from the current node.
// get recursively list of all descendants fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list => let result = if List.IsEmpty(toBeDoneList) then {} else let newToBeDoneList = List.RemoveItems( // combine all lists together List.Combine( // get a list of direct descendants for every value List.Transform( toBeDoneList, each Record.FieldOrDefault(directDescendantsRecord, _, {}) ) ), toBeDoneList ) in List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)}) in result,
And that’s all. The last part is only a call of the recursive function.
// get recursive all descendants of current value TransitiveClosure = Table.AddColumn( TableFromList, "AllDescendantRelations", (_) => let allRelations = fnTransitiveRelationList({[From]}, DirectDescendantsRecord), removeItself = List.RemoveItems(allRelations, {[From]}), toText = Text.Combine(removeItself, ";") in toText, type list )
And the whole code in one piece.
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTJSitWBsIzBLCMgywTMMgayLMAsUyDLDMwyA7LM4SygbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [From = _t, To = _t]), ChangedType = Table.TransformColumnTypes(Source,{{"From", type text}, {"To", type text}}), // buffer the source table BufferedTable = Table.Buffer(ChangedType), // create a distinct sorted list of From values FromDistinctList = List.Sort(List.Distinct(Table.Column(BufferedTable, "From"))), // get all direct descendants as a record (key-value pairs From ==> To) DirectDescendantsRecord = List.Accumulate( FromDistinctList, [], (state, current) => state & Expression.Evaluate( "[" & current & "= Table.Column(Table.SelectRows(BufferedTable, each [From] = current), ""To"")]", [Table.Column = Table.Column, Table.SelectRows = Table.SelectRows, BufferedTable = BufferedTable, current = current] ) ), // create a table of distinct From values TableFromList = Table.FromList(FromDistinctList, Splitter.SplitByNothing(), {"From"}), // get recursively list of all descendants fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list => let result = if List.IsEmpty(toBeDoneList) then {} else let newToBeDoneList = List.RemoveItems( // combine all lists together List.Combine( // get a list of direct descendants for every value List.Transform( toBeDoneList, each Record.FieldOrDefault(directDescendantsRecord, _, {}) ) ), toBeDoneList ) in List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)}) in result, // get recursive all descendants of current value TransitiveClosure = Table.AddColumn( TableFromList, "AllDescendantRelations", (_) => let allRelations = fnTransitiveRelationList({[From]}, DirectDescendantsRecord), removeItself = List.RemoveItems(allRelations, {[From]}), toText = Text.Combine(removeItself, ";") in toText, type list ) in TransitiveClosure
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.