Solved: Re: Cluster algorithm

chawong · ‎07-07-2017

For the "Automatically find clusters" capability in Power BI Desktop, what type of algorithm/logic is used to arrive at the clusters? I'm trying to understand how the tool arrives at the clusters based on the parameters I input. Thanks

v-sihou-msft · ‎07-09-2017

@chawong

For details about the Microsoft Clustering Algorithm, please refer to the article below:

https://docs.microsoft.com/en-us/sql/analysis-services/data-mining/microsoft-clustering-algorithm

Regards,

Greg_Deckler · ‎07-10-2017

@v-sihou-msft - From the technical reference linked from the article you provided:

The Microsoft Clustering algorithm provides two methods for creating clusters and assigning data points to the clusters. The first, the K-means algorithm, is a hard clustering method. This means that a data point can belong to only one cluster, and that a single probability is calculated for the membership of each data point in that cluster. The second method, the Expectation Maximization (EM) method, is a soft clustering method. This means that a data point always belongs to multiple clusters, and that a probability is calculated for each combination of data point and cluster.

You can choose which algorithm to use by setting the CLUSTERING_METHOD parameter. The default method for clustering is scalable EM.

https://docs.microsoft.com/en-us/sql/analysis-services/data-mining/microsoft-clustering-algorithm-te...
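
To make the hard vs. soft distinction in that quote concrete, here is a minimal Python sketch. It is not Power BI's or Analysis Services' implementation; the function names, the two example centers, and the equal-weight Gaussian assumption with a shared `sigma` are all made up for illustration.

```python
import math

def hard_assign(x, centers):
    """K-means style: return the index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: (x - centers[k]) ** 2)

def soft_assign(x, centers, sigma=1.0):
    """EM style (E-step only): one membership probability per cluster.

    Assumption for this sketch: equal-weight Gaussian clusters that
    share the same standard deviation `sigma`.
    """
    weights = [math.exp(-((x - c) ** 2) / (2 * sigma ** 2)) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]

centers = [0.0, 4.0]   # hypothetical cluster centers
point = 1.5

label = hard_assign(point, centers)   # exactly one cluster index
probs = soft_assign(point, centers)   # a probability for every cluster

print("hard assignment:", label)
print("soft assignment:", [round(p, 3) for p in probs])
```

With hard assignment the point gets a single cluster label; with soft assignment it gets a probability for each cluster, and those probabilities sum to 1.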

So, does Power BI use K-Means or EM? Sounds like it is likely EM if that is the default.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

B_Real · ‎05-22-2018

@Greg_Deckler did you ever get an answer to this? It's still not obvious which clustering method is used in 'Automatically Find Clusters' in Power BI, or how to actually change it.

Any ideas @v-sihou-msft?

I'm not sure this question is solved yet.

webportal · ‎05-13-2019

@B_Real I believe it is K-means clustering, because it doesn't work when you try to cluster using categorical variables.
K-means uses averages to determine cluster centroids, so only numerical values are accepted.
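
As a rough illustration of that reasoning (not Power BI's actual code, which isn't documented in this thread), here is a small plain-Python K-means sketch. The names `centroid` and `kmeans`, the sample points, and the starting centers are hypothetical. The centroid step is just a per-column average, which is why a categorical column breaks it.

```python
def centroid(points):
    """Centroid of a cluster: the per-dimension average of its points."""
    return tuple(sum(dim) / len(dim) for dim in zip(*points))

def kmeans(points, centers, iterations=10):
    """Plain Lloyd's iteration: assign to nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])),
            )
            clusters[nearest].append(p)
        # Keep the previous center if a cluster ends up empty.
        centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

numeric = [(1.0, 2.0), (1.5, 1.8), (8.0, 8.0), (9.0, 9.5)]
centers, clusters = kmeans(numeric, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)

# A categorical column has no meaningful average, so the centroid step fails.
try:
    centroid([("red",), ("blue",)])
    categorical_ok = True
except TypeError:
    categorical_ok = False
print("categorical input accepted:", categorical_ok)
```

Note that failing on categorical input is consistent with K-means but doesn't prove it; a tool could also restrict inputs to numeric fields for other reasons.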

Hope this helps, I also couldn't find any specific documentation about this.

zkazimov · ‎09-26-2017

Hi Guys,

I have customers clustered in Power BI based on margin. When I did the clustering, I had the date filter set to this month. Now, when I change the date filter to this year, it does not recluster. The this-month clustering grouped the data into 5 groups with a total of 461 customers. This year still shows 461, which I know is incorrect.

See images below. Is there anything I can do to ensure it reclusters once the filter is changed to any other date?

Greg_Deckler · ‎07-07-2017

My guess would be K-Means.
