erlicp
Frequent Visitor

Fuzzy Grouping

Hello, 

 

I have a long list of account names that I have compiled from several files into a dataflow. I then applied the fuzzy grouping function to the entire list of accounts. Picture below for reference:

[Screenshot: erlicp_1-1673979703611.png]

My question is: how do I make use of the grouped account names? In Power BI Desktop there are data groups that I currently use to group these account names together. In the screenshot below you can see the Harvard account name group highlighted in red.

[Screenshot: erlicp_3-1673980028132.png]

 

Is there any way to use fuzzy grouping to create these data groups inside my dataflow, or do I have to create the groups manually in the desktop version?

 

*Please be gentle, I am very new to data analytics and Power BI.

 

 

1 ACCEPTED SOLUTION
edhans
Super User

Data Groups are done in the model, not in Power Query or source data. 

If you wanted to group them in the dataflow, you'd need to create a conditional column that adds the right grouping as another column:
if [field] = "Boston's Children" then "Harvard Med"
else if [field] = "something else" then "Harvard Public Health"

and so on.

I'd argue the conditional column is the better approach from a modeling standpoint, but it is more tedious than dragging and dropping values into data groups.
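For anyone who wants to see the conditional column as a complete query, here is a minimal sketch in Power Query M. The table, the column names `AccountName` and `AccountGroup`, and the final `else` branch (unmatched names keep their own name) are assumptions for illustration and do not come from the original post.

```powerquery
let
    // Hypothetical sample data; in a real dataflow this would be the combined accounts query.
    Accounts = #table(
        type table [AccountName = text],
        {
            {"Boston's Children"},
            {"something else"},
            {"Unrelated Account"}
        }
    ),
    // Add the group as a new text column.
    // Assumption: names that match no rule fall back to their own name.
    AddGroup = Table.AddColumn(
        Accounts,
        "AccountGroup",
        each
            if [AccountName] = "Boston's Children" then "Harvard Med"
            else if [AccountName] = "something else" then "Harvard Public Health"
            else [AccountName],
        type text
    )
in
    AddGroup
```

Because the group is an ordinary column, it travels with the dataflow and can be used in any model that consumes it, which is the modeling advantage mentioned above.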



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

erlicp
Frequent Visitor

Yeah, I'm not sure either of these options is feasible: the list of accounts contains roughly 500k rows. Maybe I am better off using the built-in ML modules to try to group the accounts. I've already done at least a few thousand manually that I could use as a potential training set.

 

edhans
Super User

You could create a list to merge against to produce your group values, and a Fuzzy Merge is available, which means the list doesn't need an entry for every one of the 500K possible values.
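As a rough illustration of the list-plus-fuzzy-merge idea, the sketch below joins the accounts to a small hand-maintained list with `Table.FuzzyNestedJoin`. The sample rows, the `GroupList` table, and the column names are all hypothetical. Which rows actually match depends on the similarity threshold (0.80 by default), so treat this as a starting point and not as a tested grouping.

```powerquery
let
    // Hypothetical account names standing in for the 500K-row list.
    Accounts = #table(
        type table [AccountName = text],
        {
            {"Harvard Medical School"},
            {"Harvard Med School"},
            {"Harvard School of Public Health"}
        }
    ),
    // Hypothetical hand-maintained list: one row per canonical name and its group.
    GroupList = #table(
        type table [CanonicalName = text, AccountGroup = text],
        {
            {"Harvard Medical School", "Harvard Med"},
            {"Harvard School of Public Health", "Harvard Public Health"}
        }
    ),
    // Left outer fuzzy join: every account row is kept,
    // with its best match (if any) nested in the "Match" column.
    Merged = Table.FuzzyNestedJoin(
        Accounts, {"AccountName"},
        GroupList, {"CanonicalName"},
        "Match",
        JoinKind.LeftOuter,
        [IgnoreCase = true, IgnoreSpace = true, NumberOfMatches = 1]
    ),
    // Pull the group out of the nested table; unmatched rows get null.
    Expanded = Table.ExpandTableColumn(Merged, "Match", {"AccountGroup"})
in
    Expanded
```

Rows that come back with a null `AccountGroup` are the ones the list does not yet cover, which gives a natural worklist for extending it.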
erlicp
Frequent Visitor

"You could create a list to merge": when you say "create a list", are you referring to making a transformation table?

 

edhans
Super User

I just mean a list of items (not a Power Query "List") that could be pulled in and then fuzzy merged. For example, if you lowered the similarity threshold in Fuzzy Merge, pretty much anything with Harvard in it would match and could be grouped to the Harvard section.

It is like all AI-type features, though. It may work 95-97% of the time, and for the rest you have to keep adding exceptions.
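To make the threshold and the exceptions concrete, here is a hedged sketch that passes two more options to `Table.FuzzyNestedJoin`: `Threshold` and `TransformationTable`. The 0.5 threshold is an illustrative guess that would need tuning against real data, and the sample names and the `Exceptions` table are hypothetical. A transformation table must have `From` and `To` columns, and its values are mapped before matching.

```powerquery
let
    // Hypothetical account names, including one the fuzzy match is unlikely to catch.
    Accounts = #table(
        type table [AccountName = text],
        {
            {"Harvard Univ."},
            {"Harvard Med School"},
            {"HMS Boston"}
        }
    ),
    // Hypothetical one-row list for the Harvard section.
    GroupList = #table(
        type table [CanonicalName = text, AccountGroup = text],
        {
            {"Harvard", "Harvard"}
        }
    ),
    // Hypothetical exceptions: values the fuzzy match misses are mapped explicitly.
    Exceptions = #table(
        type table [From = text, To = text],
        {
            {"HMS Boston", "Harvard"}
        }
    ),
    // A lower Threshold accepts weaker matches (the default is 0.80).
    Merged = Table.FuzzyNestedJoin(
        Accounts, {"AccountName"},
        GroupList, {"CanonicalName"},
        "Match",
        JoinKind.LeftOuter,
        [Threshold = 0.5, IgnoreCase = true, TransformationTable = Exceptions]
    ),
    // Unmatched rows get null and can be reviewed or added to Exceptions.
    Expanded = Table.ExpandTableColumn(Merged, "Match", {"AccountGroup"})
in
    Expanded
```

Lowering the threshold trades missed matches for false ones, so it is worth spot-checking the results before trusting the groups across all 500K rows.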
