cancel
Showing results for 
Search instead for 
Did you mean: 

Distinct Column Combinations

Super User
660 Views
Highlighted
Super User
Super User

Distinct Column Combinations

Suppose you have some data like this:

 

ID            DimKey3            DimKey4          DimKey5          DimKey6

1 blue red green green
2 blue blue red blue
3 blue red blue blue
4 green green red green
5 green green blue green
6 red red red red
7 blue blue blue blue
8 green green green green
9 blue green red green
10 blue red green green

 

You wish to find how many unique combinations that you have for just DimKey3, DimKey4 and DimKey5. DimKey6 does not matter.

 

To get the number of distinct rows where order matters:

 

mDistinctCombos = 
VAR tmptable1 = SELECTCOLUMNS(DistinctCombos,"key3",[DimKey3],"key4",[DimKey4],"key5",[DimKey5])
VAR tmptable2 = DISTINCT(tmptable1)
VAR distcount = COUNTROWS(tmptable2)
RETURN distcount

 

 

To get the number of distinct rows where order does not matter:

 

mDistinctCombosUnique = 
// This measure gets unique keys where order doesn't matter. In other words, red, green, blue is the same as blue, red, green
// Start by getting only the columns we care about
VAR tmptable1 = SELECTCOLUMNS(DistinctCombos,"key3",[DimKey3],"key4",[DimKey4],"key5",[DimKey5])
// Add columns that return the MAX and MIN of all 3 columns for each row
VAR tmptable1a = ADDCOLUMNS(tmptable1,"max",MAX(MAX([key3],[key4]),[key5]),"min",MIN(MIN([key3],[key4]),[key5]))
// Create a single string with all 3 column values and remove the max and min values calculated above, we now have our 3rd value
VAR tmptable1b = ADDCOLUMNS(tmptable1a,"mid",SUBSTITUTE(SUBSTITUTE([key3] & [key4] & [key5],[max],"",1),[min],"",1))
// We have now effectively "sorted" the values in the three columns for each row, so select our new 3 "sorted" columns
VAR tmptable2 = SELECTCOLUMNS(tmptable1b,"max",[max],"min",[min],"mid",[mid])
// Now get the distinct rows in the table
VAR tmptable3 = DISTINCT(tmptable2)
// Count the number of rows
VAR distcount = COUNTROWS(tmptable3)
RETURN distcount

 

 

 

 

 


Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!