Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
CGlembocki
Frequent Visitor

Reducing number of edges in multi-level, multi-parent, parent-child hierarchies

Hi,

 

First, Power BI is AWESOME! Thanks to all who put it together and contribute to this forum. I'm a recent convert and am continually surprised how much my weekly Excel tasks can be replaced by Power BI. 

 

Second, I'm trying to recreate the functionality of one of my Excel templates in Power BI. I'm starting with a directed edge table for a family (each row specifying a parent node and a child node). 

 

Capture.PNG

 

I need to reduce the number of edges by filtering out those edges that skip a vertex. The following graph shows all edges from the above table:

 

DirectedGraphwithAllEdges.png

 

For example, I need to filter out edge {a,h} because it skips over vertex b. In other words, we do not need to specify that grandparent a is related to child h because we already know grandparent a is related to parent b and parent b is related child h. 

 

The next graph shows the remaining edges after the vertex-skipping edges have been removed: 

 

 

DirectedGraphReducedEdges.png

 

In Excel, I have been able to determine which edges skip over existing vertices as follows:

  1. For each parent, creating an array of all of its children
  2. For each child, creating an array of all of its parents
  3. For each edge in my original table, comparing the array of children with the array of parents. If there is at least one vertex in common, I filter out that edge. Otherwise, I keep it for the final graph.

All of these operations are based on array formulas, meaning they are easy to break. 

 

For a Power BI solution, my initial thought is to deterime, for any edge X/Z, is there a vertex Y that appears as a child of X AND as a parent of Z. This suggests forming and comparing two arrays. I've used Excel's MATCH function but know there has to be a better way in Power BI. 

 

LOOKUPVALUE does not seem to work as it fails when multiple values are present. 

 

CALCULATE looking for the first NONBLANK does not seem to work because I need to check all rows with Z as a child, not just the first one.

 

I'm hesitant to start spawining separate columns for each parent and each child (using SUMMARIZE or GROUP) because the lookups (specifically, using SUMPRODUCT to find the intersections of specific arrays) causes Excel to slow to a crawl and I do not want to reproduce that issue in Power BI. 

 

I tried Merging Queries but that only duplicated the child edge. 

 

Would a Measure be a better approach? If so, how?

 

Ideally, I want to create a table of reduced edges based solely on directing Power BI to import a query. Any thoughts on how to get to this goal would be appreciated. 

 

Also, if anyone has any suggestions on which Power BI visual to use to create the final top-down ordered graph would be appreciated as well. I tried ForceGraph but that renders a circular distribution:

 

PowerBIEdges.PNG

Thanks

 

 

1 ACCEPTED SOLUTION
Phil_Seamark
Employee
Employee

Hi @CGlembocki

 

This might be getting close.  If you add this calculated column to your table it should show you which rows it thinks it should keep or throw out, which you can then filter.

 

Or you can use this code as the basis for a new calculated table that only contains the rows you want to keep

 

Column = 
VAR ParentArray = FILTER('Table1','Table1'[Parent]=EARLIER('Table1'[Parent]))
VAR ChildArray = SELECTCOLUMNS(FILTER('Table1','Table1'[Child]=EARLIER('Table1'[Child])),"Parent2",[Parent],"Child2",[Child])
VAR ArrayTest = GENERATE(ParentArray,FILTER(ChildArray,[Parent2] = [Child]))
RETURN if(COUNTROWS(ArrayTest)>0,"Keep","Remove")

To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

View solution in original post

2 REPLIES 2
Phil_Seamark
Employee
Employee

Hi @CGlembocki

 

This might be getting close.  If you add this calculated column to your table it should show you which rows it thinks it should keep or throw out, which you can then filter.

 

Or you can use this code as the basis for a new calculated table that only contains the rows you want to keep

 

Column = 
VAR ParentArray = FILTER('Table1','Table1'[Parent]=EARLIER('Table1'[Parent]))
VAR ChildArray = SELECTCOLUMNS(FILTER('Table1','Table1'[Child]=EARLIER('Table1'[Child])),"Parent2",[Parent],"Child2",[Child])
VAR ArrayTest = GENERATE(ParentArray,FILTER(ChildArray,[Parent2] = [Child]))
RETURN if(COUNTROWS(ArrayTest)>0,"Keep","Remove")

To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

Thanks @Phil_Seamark!

 

The code works!

 

Chris

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.