cancel
Showing results for
Did you mean:

## Transitive Closure in Power Query

In the last few months I have been answering dozens of users’ questions. And there were at least three or four of you with different use cases but the same nominator: I have a table with two columns From and To. And for every distinct value of the column From I want to see all nodes which I can reach by traversing the edges From --> To. This problem has a name: transitive closure.

What is a transitive closure?

The omniscient Wikipedia says to transitive closure: “In mathematics, the transitive closure of a binary relation R on a set X is the smallest relation on X that contains R and is transitive.
For example, if X is a set of airports and xRy means "there is a direct flight from airport x to airport y" (for x and y in X), then the transitive closure of R on X is the relation R+ such that x R+ y means "it is possible to fly from x to y in one or more flights". Informally, the transitive closure gives you the set of all places you can get to from any starting place.”

A picture is worth a thousand words:

What we want to achieve is shown on the following image:

The code

Let’s start with our data sample from the previous illustration. It is a table with two columns From and To. Every row is a directed edge from From to To.

It will be recursion, again, I know, but I don’t know any other way to solve this problem in PowerQuery. But before we come to a recursive function, let us create key-value pairs where key is a node and value is a list of all nodes, which you can reach directly from the key. The result won’t be a table but a record instead. (Why do I do that? It will be much faster – you can read more about this performance boost in my recent post Lookup Table in Power Query.)

```// get all direct descendants as a record (key-value pairs From ==> To)
DirectDescendantsRecord = List.Accumulate(
FromDistinctList,
[],
(state, current) =>
state &
Expression.Evaluate(
"[" & current & "= Table.Column(Table.SelectRows(BufferedTable, each [From] = current), ""To"")]",
[Table.Column = Table.Column, Table.SelectRows = Table.SelectRows, BufferedTable = BufferedTable, current = current]
)
),```

This fast access to all direct descendants of a node will be very helpful in the next step, in the recursive function fnTransitiveRelationList. This function gets a list of all nodes which we should visit and the newly created record with key-value pairs. It goes recursively through all the nodes until the list is empty. As a result, you get all nodes which you can visit from the current node.

```// get recursively list of all descendants
fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list =>
let
result =
if List.IsEmpty(toBeDoneList) then
{}
else
let
newToBeDoneList = List.RemoveItems(
// combine all lists together
List.Combine(
// get a list of direct descendants for every value
List.Transform(
toBeDoneList,
each Record.FieldOrDefault(directDescendantsRecord, _, {})
)
),
toBeDoneList
)
in
List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)})
in
result,```

And that’s all. The last part is only a call of the recursive function.

```// get recursive all descendants of current value
TableFromList,
"AllDescendantRelations",
(_) =>
let
allRelations = fnTransitiveRelationList({[From]}, DirectDescendantsRecord),
removeItself = List.RemoveItems(allRelations, {[From]}),
toText = Text.Combine(removeItself, ";")
in
toText,
type list
)```

And the whole code in one piece.

```let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTJSitWBsIzBLCMgywTMMgayLMAsUyDLDMwyA7LM4SygbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [From = _t, To = _t]),
ChangedType = Table.TransformColumnTypes(Source,{{"From", type text}, {"To", type text}}),

// buffer the source table
BufferedTable = Table.Buffer(ChangedType),

// create a distinct sorted list of From values
FromDistinctList = List.Sort(List.Distinct(Table.Column(BufferedTable, "From"))),

// get all direct descendants as a record (key-value pairs From ==> To)
DirectDescendantsRecord = List.Accumulate(
FromDistinctList,
[],
(state, current) =>
state &
Expression.Evaluate(
"[" & current & "= Table.Column(Table.SelectRows(BufferedTable, each [From] = current), ""To"")]",
[Table.Column = Table.Column, Table.SelectRows = Table.SelectRows, BufferedTable = BufferedTable, current = current]
)
),

// create a table of distinct From values
TableFromList = Table.FromList(FromDistinctList, Splitter.SplitByNothing(), {"From"}),

// get recursively list of all descendants
fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list =>
let
result =
if List.IsEmpty(toBeDoneList) then
{}
else
let
newToBeDoneList = List.RemoveItems(
// combine all lists together
List.Combine(
// get a list of direct descendants for every value
List.Transform(
toBeDoneList,
each Record.FieldOrDefault(directDescendantsRecord, _, {})
)
),
toBeDoneList
)
in
List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)})
in
result,

// get recursive all descendants of current value
TableFromList,
"AllDescendantRelations",
(_) =>
let
allRelations = fnTransitiveRelationList({[From]}, DirectDescendantsRecord),
removeItself = List.RemoveItems(allRelations, {[From]}),
toText = Text.Combine(removeItself, ";")
in
toText,
type list
)
in
TransitiveClosure```

Very cool stuff @Nolock !

With regards to recursion, you can use List.Generate as replacement, as it us usually supposed to be faster: https://www.thebiccountant.com/2017/09/26/recursion-m-beginners/

And with regards to the lookups, an inner merge could also be an alternative (giving sufficient performance for me at least  https://www.thebiccountant.com/2017/02/14/dynamically-flatten-parent-child-hierarchies-in-dax-and-po...

Hi @ImkeF,

I will try to rewrite the recursion using an interation and I will also prepare some performance tests.

Hi @Nolock,

Thank you for publishing this. Can you tell me how I can make the step from the column 'AllDescendantRelations' to records? On the TransitiveClosure-result, I can choose for "Expand to new rows" (code: Table.ExpandListColumn(TransitiveClosure, "AllDescendantRelations")) but I get the error "We cannot convert the value "2;3;4;8" to type Table.".

BTW: I see you omitted the part of the solution for cyclic relations (which is relevant for your airplanes example, too).

JVos

Hi @Nolock,

I found a solution to the problem mentioned in my previous comment. I changed the 'TransitiveClosure' part as follows (I commented out what is in your code):

```    TransitiveClosure = Table.AddColumn(
TableFromList,
"AllDescendantRelations",
(_) =>
let
allRelations = fnTransitiveRelationList({[From]}, DirectDescendantsRecord),
removeItself = List.RemoveItems(allRelations, {[From]})
//toText = Text.Combine(removeItself, ";")
in
//toText,
removeItself,
type list
),```

I would like to know: why did you add Text.Combine? I can't imagine you did this without a reason.

JVos

@Nolock: I added functionality for cyclic relations to 'fnTransitiveRelationList':

```    // get recursively list of all descendants
//fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list =>
fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record, alreadyDoneList as list) as list =>
let
result =
if List.IsEmpty(toBeDoneList) then
{}
else
let

newToBeDoneList = List.RemoveItems(
// combine all lists together
List.Combine(
// get a list of direct descendants for every value
List.Transform(
toBeDoneList,
each Record.FieldOrDefault(directDescendantsRecord, _, {})
)
),
toBeDoneList
//)
),

in
//List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)})
in
result,```

And in 'TransitiveClosure':

```//allRelations = fnTransitiveRelationList({[ROUTING_ID_From]}, DirectDescendantsRecord),
allRelations = fnTransitiveRelationList({[ROUTING_ID_From]}, DirectDescendantsRecord,{}),```

JVos

@Nolock: I had a performance problem with my data. Investigating it with a test set, I found that a chain of 10 relations (1 => 2, 2 => 3, ... , 9 => 10) can just be done on my computer. With a chain of 11, I loose my patience...

@Nolock: I see a comment regarding cyclic relations is not there anymore, so here it is again.

To support cyclic relations, I adjusted the code of the 'fnTransitiveRelationList' part as follows:

```    // get recursively list of all descendants
//fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list =>
fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record, alreadyDoneList as list) as list =>
let
result =
if List.IsEmpty(toBeDoneList) then
{}
else
let

newToBeDoneList = List.RemoveItems(
// combine all lists together
List.Combine(
// get a list of direct descendants for every value
List.Transform(
toBeDoneList,
each Record.FieldOrDefault(directDescendantsRecord, _, {})
)
),
toBeDoneList
//)
),

in
//List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)})
in
result,```

And in the 'TransitiveClosure' part:

```//allRelations = fnTransitiveRelationList({[ROUTING_ID_From]}, DirectDescendantsRecord),
allRelations = fnTransitiveRelationList({[ROUTING_ID_From]}, DirectDescendantsRecord,{}),```

@Nolock: I've already posted several comments - still another one. This is about extra functionality I need, because there is an extra attribute that determines which indirect relations can be derived from the direct relations.

See the picture below. nodes 1, 2, 4 and 5 are country-specific, node 3 is country independent. Indirect relations 1 => 4 and 2 => 5 are allowed. But 1 => 5 and 2 => 4 not.

Do you have a solution for this? In the earlier provided solutions - with tables instead of lists - I can figure out a solution.

JVos

Hi @ImkeF and @JVos,

I have rewritten the algorithm. It uses an iteration instead of a recursion.

It can also handle cycles as you can see in the test data sample.

In the beginning of the code is also another data sample which shows, that it works with paths longer than 10.

```let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("Hcu5EQAwCAPBXhQT+DfUwtB/G5aV3GxymegwDJQla5jSpLq0qC1t6kiHutKlXHIqpPhvQ9UD", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [From = _t, To = _t]),
ChangedType = Table.TransformColumnTypes(Source,{{"From", type text}, {"To", type text}}),

/*
x1 = Table.FromList({1..20}, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
x2 = Table.RenameColumns(x1, {"Column1", "From"}),
x3 = Table.AddColumn(x2, "To", each [From] + 1),
x4 = Table.TransformColumns(x3, {{"From", Text.From}, {"To", Text.From}}),
ChangedType = Table.TransformColumnTypes(x4,{{"From", type text}, {"To", type text}}),
*/

// buffer the source table
BufferedTable = Table.Buffer(ChangedType),

// create a distinct sorted list of From values
FromDistinctList = List.Sort(List.Distinct(Table.Column(BufferedTable, "From"))),

// get all direct descendants as a record (key-value pairs From ==> To)
DirectDescendantsRecord = List.Accumulate(
FromDistinctList,
[],
(state, current) =>
state &
Expression.Evaluate(
"[" & current & "= Table.Column(Table.SelectRows(BufferedTable, each [From] = current), ""To"")]",
[Table.Column = Table.Column, Table.SelectRows = Table.SelectRows, BufferedTable = BufferedTable, current = current]
)
),

// get a count of all nested elements
fnCountOfRecordElements = (rec as record) => List.Count(List.Combine(Record.FieldValues(rec))),

fnNextStep = (
lastStepResult as record
) as record =>
let
// get descendants of the last step
lastDescendents = lastStepResult[result],

newDescendents = List.Transform(
Record.FieldNames(lastDescendents),
(fieldName) =>
let
// last descendents of a field name
values = Record.Field(lastDescendents, fieldName),
// for all last descendents get their descendents
newIndirectDescendents = List.Combine(
List.Transform(
values,
(value) => Record.FieldOrDefault(lastDescendents, value, {})
)
),
// remove descendents which we have already visited (cares of cycles)
// combine old and new descendents
allDescendants = List.Distinct(
),
// create a record of field name and all its descendants
result = Expression.Evaluate("[" & fieldName & "=allDescendants]", [allDescendants = allDescendants])
in
result
),

// create a new descendents record as a combination of all till now found descendents
combinedRecord = Record.Combine(newDescendents)
in
[
result = combinedRecord,
countOfRecordElements = fnCountOfRecordElements(combinedRecord),
lastCountOfRecordElements = lastStepResult[countOfRecordElements]
],

// get a list of all descendants (iteration)
fnTransitiveRelationList = (
directDescendantsRecord as record
) as table =>
let
// iterate until 2 following each other iterations have the same count of elements
whileTrue =
List.Generate(
// init
() =>
[
// all descendants found till now
result = directDescendantsRecord,
// count of all nested elements
countOfRecordElements = fnCountOfRecordElements(directDescendantsRecord),
// count of last all nested elements
lastCountOfRecordElements = -1
],
// do while counts of all nested elements are different
(current) => current[lastCountOfRecordElements] <> current[countOfRecordElements],
// calculate next step
(last) => fnNextStep(last)
),
// get last item of the list (the list contains all iterations of the algorithm)
lastItem = List.Last(whileTrue),
// get property result
combinedRecord = lastItem[result],
// create a table
toTable = Record.ToTable(combinedRecord),
// concat values
combineDescendants = Table.TransformColumns(toTable, {{"Value", each Text.Combine(_, ";")}})
in
combineDescendants,

// get recursive all descendants of current value
TransitiveClosure = fnTransitiveRelationList(DirectDescendantsRecord)
in
TransitiveClosure```

Hi @JVos,

to your last request about forbidden transitive relations: If you had a table with all forbidden relations, it should be possible to remove them from the result in every step of the iteration. But I think such table will be huge. Also a table of allowed relations will be huge. You know your scenario better than anybody else. You have to decide what is better.

Hi @Nolock:

1) Thank you for your new algorithm! It might last a week before I will work on it and give you feedback.

2) About forbidden transitive relations: indeed it's not workable to have a table with all forbidden relations. So I was thinking that at the moment of defining a transitive relation to check whether it's an allowed one. So my initial table of from-to's should not only have the identification of the node, but also the country-attribute for both the from-node as well as the to-node. But I have to look at your latest code to see if I can extend it for this subject.

JVos

Top Kudoed Posts
Latest Articles
Archives