Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Fuzzy Lookup

One of the most required functionalities in terms of data transformation for Power BI is the ability to do Fuzzy Lookup on two datasets so that input text values with minor errors can still be mapped to a dimension in PowerBI.

I know best practices would redirect this to the usage of a ETL tool and take care of this in a DW etc.... but most uses cases are within the universe of Self Service BI using Excel files.

 

Let me know if there is a workaround either using M, R or other approach.

 

Thank you

7 REPLIES 7
DrewBbc
Helper I
Helper I

let
Table1 = #table(type table [value = text],
{ {"spruce"}, {"cedar"}, {"birch"}, {"pine"}, {"poplar"}, {"aspen"},
{"baobab"}, {"larch"}, {"willow"} }),

Table2 = #table(type table [word = text],
{ {"sprce"}, {"cidar"}, {"pane"}, {"aspern"}, {"babab"} }),

AddCol = Table.AddColumn(Table2, "Custom", each Table1),
Expand = Table.ExpandTableColumn(AddCol, "Custom", {"value"}, {"value"}),

FuzzyCalc = Table.AddColumn(Expand, "Prct", each
2 * List.Count( List.Intersect(
{ Text.ToList([word]), Text.ToList([value]) } ) )
/ (List.Count(Text.ToList([word])) + List.Count(Text.ToList([value])))),

SortRows = Table.Sort(FuzzyCalc,{{"word", Order.Ascending}, {"Prct", Order.Descending}}),
AddIndex = Table.AddIndexColumn(SortRows, "Index", 1, 1),
RemoveDupls = Table.Distinct(AddIndex, {"word"})
in
RemoveDupls

v-shex-msft
Community Support
Community Support

Hi @Anonymous,

 

Currently, power bi not support fuzzy lookup. Is only supported to use full query(direct support) and part query(need some additional operations to work through it).  It not support to lookup the similar result(e.g. use "abc" to find out the "aac","a1c") and search with wildcard.

 

BTW, you can submit your requirement to ideas to help us improve the power bi.

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.
Anonymous
Not applicable

Hi Xiaoxin,

I.know that Power BI does not have this support yet... but for example it is possible to use R to do data transformation correct?
So imagine that in the same dataset I have two columns, one with all values and another that would have only the exact matches... Having this, my idea was to use R to fuzzy match one of the columns with the distinct values of the other one (possible values) and add a third column with the results.
Is this possible?

Thanks

HI @Anonymous,

 

If you meant use r script to deal with original data source, r script support these operations.


I think t-sql fuzzy matching is more simple than r script, you can refer to below links:

Fuzzy Logic function in R as in Matlab

Fuzzy matching using T-SQL

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.
Greg_Deckler
Super User
Super User

Why not just use binning?


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

Binning would require me to know all the values to select... but this is dynamic and tomorrow the items may be different.

Or am I understanding your suggestion wrong?

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.