cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Frequent Visitor

Efficient way to calculate top % within a category

Hi, I've got a huge data set of >1.5mil rows loaded in Power BI. I'm trying to segmentise the top 2%, next 20%, remaining 78% by value within each category (A, B, C). I've been adding columns using M instead of DAX as I read that the performance is faster since the data set is quite big. I was exploring creating tables based on top N values but that might be quite tedious; also sorting from high to low and assigning % might be an option too but it'll be too taxing on the performance. Any suggestions on efficient ways to do it?

 

Thank you!

 

Resulting table should add the 4th column with tagging

PersonCategoryValueTagging
Person 1A100Top 2%
Person 2A80Next 20%
Person 3A60Remaining 70%
Person 4A40Remaining 70%
Person 5A20Remaining 70%
Person 6A10Remaining 70%
Person 7B200...
Person 8C50...
Person 9C50...

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
Community Support
Community Support

Hi @kucci 

You may try to build calculated columns, and it works well.

I build a sample which has 1 million rows to have a test.

Rank = RANKX(FILTER(Sheet2,Sheet2[Category]=EARLIER(Sheet2[Category])),Sheet2[Value],,DESC,Dense)
Tag = 
Var  _MaxRank = MAXX(FILTER(Sheet2,Sheet2[Category] = EARLIER(Sheet2[Category])),Sheet2[Rank])
return
SWITCH(TRUE(),Sheet2[Rank]<=_MaxRank*0.02,"	Top 2%",Sheet2[Rank]<=_MaxRank*0.22,"Next 20%","Remaining 70%")

Result:

1.png

If this reply still could't help you solve your problem, could you tell me your calculate logic to calculate TopN?

Did you calcualte the Top N by the rank for each category?

 

You can download the pbix file from this link: Efficient way to calculate top % within a category

 

Best Regards,

Rico Zhou

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly. 

 

View solution in original post

3 REPLIES 3
Highlighted
Super User IV
Super User IV

Highlighted
Community Support
Community Support

Hi @kucci 

You may try to build calculated columns, and it works well.

I build a sample which has 1 million rows to have a test.

Rank = RANKX(FILTER(Sheet2,Sheet2[Category]=EARLIER(Sheet2[Category])),Sheet2[Value],,DESC,Dense)
Tag = 
Var  _MaxRank = MAXX(FILTER(Sheet2,Sheet2[Category] = EARLIER(Sheet2[Category])),Sheet2[Rank])
return
SWITCH(TRUE(),Sheet2[Rank]<=_MaxRank*0.02,"	Top 2%",Sheet2[Rank]<=_MaxRank*0.22,"Next 20%","Remaining 70%")

Result:

1.png

If this reply still could't help you solve your problem, could you tell me your calculate logic to calculate TopN?

Did you calcualte the Top N by the rank for each category?

 

You can download the pbix file from this link: Efficient way to calculate top % within a category

 

Best Regards,

Rico Zhou

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly. 

 

View solution in original post

Highlighted

Thanks, this worked well! I didn't realise that I had to create a calculated column instead of a measure.

Helpful resources

Announcements
Community Conference

Power Platform Community Conference

Check out the on demand sessions that are available now!

Community Conference

Microsoft Power Platform Communities

Check out the Winners!

secondImage

Power Platform 2020 release wave 2 plan

Features releasing from October 2020 through March 2021

Top Solution Authors
Top Kudoed Authors