Anonymous
Not applicable

Levenshtein String Distance Algorithm In DAX

Hello,

 

So far, this article is the closest I've come to trying to reach a measure that I've been working on for a while...

 

Measure equivalent for string similarity formula

 

I know it can be done in SQL by creating a scalar-valued function. I am wondering whether it can be done in Power BI using some combination of CALCULATETABLE, SELECTEDVALUE, and an iterator function such as SUMX and/or RANKX. Here is an example of what I am trying to do...

 

I have one static column, shown below. I want to create a measure that compares the [TestColumn] value selected in a slicer against every other value in the same column and returns the top n similarity percentages.

 

TestColumn

Leaf
Leaves
Trees
Leafly
Lost Lake
Hawaii
Free
Moist
posture
Classical
Classic
Jobe
Job
Freedom
Lost Music
rap
R&B
Rapper
Rapped
Wrap
Wrrap
Wrapper
Boy
Boys
Boston
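For readers who want to check expected results outside Power BI, the ranking described above can be sketched in Python. This is only an illustration under assumptions, not a DAX solution: the function names (`levenshtein`, `similarity`, `top_n_similar`), the case-insensitive comparison, and the choice to express similarity as one minus the edit distance divided by the longer length are all mine rather than anything specified in this thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed definition: 1 - distance / length of the longer string."""
    a, b = a.lower(), b.lower()  # assumption: ignore case
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(a, b) / longest


def top_n_similar(selected: str, values: list[str], n: int = 3):
    """Return the n values most similar to `selected`, excluding itself."""
    scored = [(v, similarity(selected, v)) for v in values if v != selected]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


if __name__ == "__main__":
    sample = ["Leaf", "Leaves", "Trees", "Leafly", "Lost Lake", "Boston"]
    for value, score in top_n_similar("Leaf", sample, n=3):
        print(f"{value}: {score:.0%}")
```

Each comparison costs time proportional to the product of the two string lengths, so comparing every pair in a list of tens of thousands of names is expensive; that is worth keeping in mind for a large vendor list.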
 

The reason for this question is to help a company do a massive cleanup of a 32k vendor list. 

 

Please let me know if this is possible in DAX. 

 

Thanks.

 

 

1 ACCEPTED SOLUTION
v-gizhi-msft
Community Support

Hi,

 

Please first create a separate slicer table with the same values as your original table.

Create this column:

Column = LEN('Table'[TestColumn])

Then try this measure:

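Measure = 
-- Value picked in the slicer (from the separate slicer table)
VAR SlicerText =
    SELECTEDVALUE ( 'Slicer Table'[TestColumn] )
-- Value in the current filter context of 'Table'
VAR TableText =
    SELECTEDVALUE ( 'Table'[TestColumn] )
-- Compare up to the length of the longer string
VAR length =
    MAX ( LEN ( SlicerText ), LEN ( TableText ) )
-- One row per character position, holding the character from each string
VAR TestTable =
    ADDCOLUMNS (
        GENERATESERIES ( 1, length, 1 ),
        "InSlicer", MID ( SlicerText, [Value], 1 ),
        "InTable", MID ( TableText, [Value], 1 )
    )
RETURN
    -- Share of positions where both strings have the same character
    COUNTROWS ( FILTER ( TestTable, [InSlicer] = [InTable] ) )
        / COUNTROWS ( TestTable )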
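
To make clear what this measure computes, here is a rough Python mirror of its logic. It is a sketch for checking values by hand, not part of the solution: the function name `positional_match_ratio` is hypothetical, and lowercasing both strings is my assumption to imitate DAX's default case-insensitive string comparison. Note that the measure compares characters position by position; it is not an edit distance such as Levenshtein.

```python
def positional_match_ratio(slicer_text: str, table_text: str) -> float:
    """Fraction of character positions at which both strings agree."""
    # Assumption: imitate a case-insensitive comparison.
    a = slicer_text.lower()
    b = table_text.lower()
    length = max(len(a), len(b))  # same role as the `length` variable
    if length == 0:
        return 0.0  # nothing to compare
    matches = sum(
        1
        for i in range(length)
        # Positions past the end of the shorter string never match.
        if i < len(a) and i < len(b) and a[i] == b[i]
    )
    return matches / length


if __name__ == "__main__":
    for pair in [("Leaf", "Leaves"), ("Wrap", "rap"), ("Classic", "Classical")]:
        print(pair, positional_match_ratio(*pair))
```

Because matching is positional, a single missing leading character shifts every later position: "Wrap" against "rap" scores zero here, although the two differ by only one deleted letter. That may matter when the goal is catching near-duplicate names.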

When you select one value in the slicer, the result shows:

[Screenshot: 20.PNG]

Here is my test pbix file:

pbix 

Hope this helps.

 

Best Regards,

Giotto Zhi

 


4 REPLIES

The .pbix file could not be found for download.

Anonymous
Not applicable

@v-gizhi-msft 

 

Thank you, thank you, thank you! Works great!

Greg_Deckler
Super User

So what would be the expected output from the sample data you have provided? Are you basically trying to determine how many characters each value has in common with all of the other values in the column? 

