For the "Automatically find clusters" capability in Power BI Desktop, what type of algorithm/logic is used to arrive at the clusters? I'm trying to understand how the tool arrives at the clusters based on the parameters I input. Thanks
For details about the Microsoft Clustering Algorithm, please refer to the article below:
https://docs.microsoft.com/en-us/sql/analysis-services/data-mining/microsoft-clustering-algorithm
Regards,
@v-sihou-msft - From the technical reference linked in the article you provided:
The Microsoft Clustering algorithm provides two methods for creating clusters and assigning data points to the clusters. The first, the K-means algorithm, is a hard clustering method. This means that a data point can belong to only one cluster, and that a single probability is calculated for the membership of each data point in that cluster. The second method, the Expectation Maximization (EM) method, is a soft clustering method. This means that a data point always belongs to multiple clusters, and that a probability is calculated for each combination of data point and cluster.
You can choose which algorithm to use by setting the CLUSTERING_METHOD parameter. The default method for clustering is scalable EM.
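To make the hard vs. soft distinction in the quoted passage concrete, here is a minimal Python sketch. It is only an illustration of the two assignment styles, not Microsoft's or Power BI's implementation. The function names, the 1-D data, and the assumption of equal-weight Gaussian clusters with a shared standard deviation `sigma` are all hypothetical choices made for the example.

```python
import math

def hard_assign(point, centers):
    """K-means style: return the index of the single nearest center."""
    return min(range(len(centers)), key=lambda i: abs(point - centers[i]))

def soft_assign(point, centers, sigma=1.0):
    """EM style (E-step only): return one membership probability per cluster.

    Assumption for this sketch: every cluster is a Gaussian with the same
    standard deviation `sigma` and the same prior weight.
    """
    weights = [math.exp(-((point - c) ** 2) / (2 * sigma ** 2)) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]

centers = [0.0, 4.0]   # two hypothetical cluster centers
point = 1.5            # a data point lying between them

hard = hard_assign(point, centers)   # a single cluster index
soft = soft_assign(point, centers)   # probabilities that sum to 1

print("hard assignment:", hard)
print("soft assignment:", soft)
```

The hard assignment returns exactly one cluster for the point, while the soft assignment returns a probability for every cluster, with the nearer cluster receiving the larger share.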
So, does Power BI use K-Means or EM? Sounds like it is likely EM if that is the default.
@Greg_Deckler did you ever get an answer to this? It's still not obvious which clustering method is used in 'Automatically Find Clusters' in Power BI, or how to actually change it.
Any ideas @v-sihou-msft?
I'm not sure this question is solved yet.
@B_Real I believe it is K-means clustering because it doesn't work when you try to cluster using categorical variables.
K-means uses averages to determine cluster centroids, so only numerical values are accepted.
Hope this helps, I also couldn't find any specific documentation about this.
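The reasoning in the post above (centroids are averages, so inputs must be numeric) can be sketched with a bare-bones K-means loop in Python. This is a generic textbook version with made-up data and explicitly supplied starting centroids. It does not confirm which algorithm Power BI actually uses, and the function name `kmeans` and its parameters are hypothetical.

```python
def kmeans(points, initial_centroids, iterations=10):
    """Plain K-means (Lloyd's algorithm) on tuples of numbers."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance).
        for idx, p in enumerate(points):
            labels[idx] = min(
                range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: each centroid becomes the mean of its members.
        # This averaging is why the inputs have to be numeric.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, labels

# Made-up data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, initial_centroids=[points[0], points[3]])

print(labels)
print(centroids)
```

The update step divides a sum of coordinates by a count, which has no meaning for a value like a product category unless it is first encoded as numbers.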
Hi Guys,
I have customers clustered in Power BI based on margin. When I did the clustering, I had the date filter set to this month. Now, when I change the date filter to this year, it does not recluster. The clustering for this month grouped the data into 5 groups with a total of 461 customers. This year still shows 461, which I know is incorrect.
See images below. Is there anything I can do to ensure it reclusters when the filter is changed to any other date?
My guess would be K-Means.