Solved: Re: Cumulative % by category for pareto

hwong · ‎06-28-2021

I am trying to create a pareto chart using the column and line chart but am running into an issue trying to calculate the cumulative % that is required in a pareto.

I have 2 tables, Tests and Test Anomalies linked by Test ID:

Given the following data, I would like to see results below when sorting the results by the count of failures per anomaly type:

Tests

Test ID	Operator	Result	Date
1	ABC	Pass	Sunday, June 27, 2021
2	ABC	Fail	Sunday, June 27, 2021
3	XYZ	Pass	Sunday, June 27, 2021
4	XYZ	Fail	Sunday, June 27, 2021
5	ABC	Fail	Sunday, June 27, 2021
6	XYZ	Pass	Monday, June 28, 2021
7	XYZ	Fail	Monday, June 28, 2021
8	ABC	Pass	Monday, June 28, 2021
9	ABC	Fail	Monday, June 28, 2021
10	XYZ	Fail	Monday, June 28, 2021

Test Anomalies

Test ID	Anomaly
2	C
2	B
4	A
5	C
7	B
9	C
9	B
10	C
10	B
10	A

Expected Result

Anomaly	Count of Anomaly (sort DESC by this column)	Cumulative %
B	4	40%
C	4	80%
A	2	100%

How can I calculate the Cumulative % in this scenario?

P.S. I have read articles and seen other answers where there is a sortable column of pre-calculated values for calculating cumulative %, but in this case there is no sort column because the Count of Anomaly will be dynamic based on filter selection. For example, the user can filter by Date and that would change the sort order.

Sample PBIX can be downloaded here:

https://1drv.ms/u/s!Arsjvofj3uEZh8sJnGi8OVry7dWAQw?e=vgZzZv

hwong · ‎07-01-2021

Anyway, I figured it out using some temporary tables to get the "sort column". It was a lot more involved than I had anticipated. Here's the solution I ended up with:

Cumulative % = 
    VAR countSummaryTable =
        CALCULATETABLE(
            SUMMARIZE(
                'Test Anomalies',
                'Test Anomalies'[Anomaly],
                "Anomaly Count", COUNT('Test Anomalies'[Anomaly])
            ),
            REMOVEFILTERS('Test Anomalies'[Anomaly])
        )
    VAR rankedSummaryTable =
        ADDCOLUMNS(
            countSummaryTable,
            "Anomaly Count Rank", RANKX(countSummaryTable, 
                                    RANKX(countSummaryTable, [Anomaly Count] + DIVIDE(
                                                                                RANKX(countSummaryTable, [Anomaly]),
                                                                                COUNTROWS(countSummaryTable) + 1)
                                    ), ,ASC,Dense))
    VAR selectedAnomalyRank = MAXX(FILTER(rankedSummaryTable, [Anomaly]=SELECTEDVALUE('Test Anomalies'[Anomaly])), [Anomaly Count Rank])
    VAR totalAnomalies = SUMX(countSummaryTable, [Anomaly Count])
    RETURN        
        DIVIDE(
            SUMX(
                FILTER(rankedSummaryTable, [Anomaly Count Rank] <= selectedAnomalyRank),
                [Anomaly Count]
            ),
            totalAnomalies
        )

View solution in original post

Ashish_Mathur · ‎06-30-2021

Hi,

Any reason why Anomaly is in the order of B,C,A and not C,B,A when Both B and C have the same count? Also, for this specific question, what is the use of the Tests table. For your expected, all we need is the Test Anomalies Table.

Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

hwong · ‎07-01-2021

Hi,

In case of a tie (in this case B and C have the same count), it doesn't matter what order the Anomalies show up in. It just turns out that Power BI column chart and table showed it in the B,C,A order when I put the data in a visual and sorted it by the anomaly count.

The reason for the Tests table is because I want to filter by date. In the example data, filtering to show only Monday vs Sunday should result in a different ordering. For Sunday only, the pareto ordering becomes "C, A, B", and for Monday only, the pareto ordering becomes "B, C, A".

Thanks,

Hubert

v-yingjl · ‎06-30-2021

Hi @hwong ,

To calculate the running total value, you must need to have a 'sorted column'.

Based on your report, since you use the date column to filter, so needs another 'sorted column' to calculate the running total value.

Refer: Dynamically Calculate A Power BI Running Total Or Cumulative Total

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

hwong · ‎07-01-2021

Hi,

I need to sort by the count of the anomalies, which must be dynamic to account for filters and slicers. This is why I don't know what to use for a "sort column". The sort order is not known until the visual is presented and filters/slicers are accounted for.

Thanks,

Hubert

hwong · ‎07-01-2021

Anyway, I figured it out using some temporary tables to get the "sort column". It was a lot more involved than I had anticipated. Here's the solution I ended up with:

Cumulative % = 
    VAR countSummaryTable =
        CALCULATETABLE(
            SUMMARIZE(
                'Test Anomalies',
                'Test Anomalies'[Anomaly],
                "Anomaly Count", COUNT('Test Anomalies'[Anomaly])
            ),
            REMOVEFILTERS('Test Anomalies'[Anomaly])
        )
    VAR rankedSummaryTable =
        ADDCOLUMNS(
            countSummaryTable,
            "Anomaly Count Rank", RANKX(countSummaryTable, 
                                    RANKX(countSummaryTable, [Anomaly Count] + DIVIDE(
                                                                                RANKX(countSummaryTable, [Anomaly]),
                                                                                COUNTROWS(countSummaryTable) + 1)
                                    ), ,ASC,Dense))
    VAR selectedAnomalyRank = MAXX(FILTER(rankedSummaryTable, [Anomaly]=SELECTEDVALUE('Test Anomalies'[Anomaly])), [Anomaly Count Rank])
    VAR totalAnomalies = SUMX(countSummaryTable, [Anomaly Count])
    RETURN        
        DIVIDE(
            SUMX(
                FILTER(rankedSummaryTable, [Anomaly Count Rank] <= selectedAnomalyRank),
                [Anomaly Count]
            ),
            totalAnomalies
        )

Cumulative % by category for pareto

Helpful resources

New forum boards available in Real-Time Intelligence.

Power BI Monthly Update - May 2024

How to Get Your Question Answered Quickly

Jumpstart your career with the Fabric Career Hub

Cumulative % by category for pareto

Helpful resources

New forum boards available in Real-Time Intelligence.

Power BI Monthly Update - May 2024

How to Get Your Question Answered Quickly