Solved: Calculating an average of an average

freddiec03 · ‎01-26-2023

I am trying to find an average per user based off an average measure from two columns. Here is some sample data. This table made up of data from multiple tables. The table below it is the expected output I would like to see. Whenever I attempt this im just using the same measure for (Average Accuracry) as the same as (accuracry). The measure below works for the initial table but for the end game it appears that its taking the sum of all the prices paid divided by the sum of all the highs set and giving a much high %

Accuracy (%) = if(ISBLANK(max(table[high])),BLANK(),sum(Total[Paid])/max(table[high]))

Main Table
			*measure
Item	Paid	High	Accuracy	High Setter
a	2	3	67%	Bob
b	3	3	100%	Steve
c	5	4	125%	Jim
d	5	6	83%	Bob
e	2	2	100%	Lisa
f	3	4	75%	Steve
g	5	4	125%	Jim
h	4	4	100%	Jim
i	4	5	80%	Steve
j	2	3	67%	Bob
k	1	1	100%	Lisa

Output
High Setter	Average Accuracy
Bob	72.30%
Steve	85%
Jim	116.67%
Lisa	100%

lbendlin · ‎01-28-2023

Accuracy =
VAR a =
    ADDCOLUMNS(
        VALUES( 'Change Log'[ID] ),
        "HD",
            VAR mxid = [ID]
            RETURN
                CALCULATE(
                    MAX( 'Change Log'[Date] ),
                    REMOVEFILTERS( 'Change Log'[Name] ),
                    'Change Log'[ID] = mxid
                ),
        "HS",
            VAR mxid = [ID]
            VAR mxd =
                CALCULATE(
                    MAX( 'Change Log'[Date] ),
                    REMOVEFILTERS( 'Change Log'[Name] ),
                    'Change Log'[ID] = mxid
                )
            RETURN
                CALCULATE(
                    MAX( 'Change Log'[Name] ),
                    ALLSELECTED( 'Change Log' ),
                    'Change Log'[ID] = mxid,
                    'Change Log'[Date] = mxd
                ),
        "HV",
            VAR mxid = [ID]
            VAR mxd =
                CALCULATE(
                    MAX( 'Change Log'[Date] ),
                    REMOVEFILTERS( 'Change Log'[Name] ),
                    'Change Log'[ID] = mxid
                )
            RETURN
                CALCULATE(
                    SUM( 'Change Log'[High] ),
                    ALLSELECTED( 'Change Log' ),
                    'Change Log'[ID] = mxid,
                    'Change Log'[Date] = mxd
                ),
        "PDT",
            VAR mxid = [ID]
            RETURN
                CALCULATE( SUM( 'Payment Details'[Total] ), 'Payment Details'[ID] = mxid )
    )
VAR b =
    FILTER( a, [HS] = MAX( 'Change Log'[Name] ) )
RETURN
    AVERAGEX( b, DIVIDE( [HV], [PDT], 0 ) )

View solution in original post

Greg_Deckler · ‎01-27-2023

@freddiec03 It's either one of these two solutions but I can't be sure which:

First, please vote for this idea: https://ideas.powerbi.com/ideas/idea/?ideaid=082203f1-594f-4ba7-ac87-bb91096c742e

This looks like a measure totals problem. Very common. See my post about it here: https://community.powerbi.com/t5/DAX-Commands-and-Tips/Dealing-with-Measure-Totals/td-p/63376

Also, this Quick Measure, Measure Totals, The Final Word should get you what you need:
https://community.powerbi.com/t5/Quick-Measures-Gallery/Measure-Totals-The-Final-Word/m-p/547907

Also: https://youtu.be/uXRriTN0cfY
And: https://youtu.be/n4TYhF2ARe8

or:

This looks like a measure aggregation problem. See my blog article about that here: https://community.powerbi.com/t5/Community-Blog/Design-Pattern-Groups-and-Super-Groups/ba-p/138149

The pattern is:
MinScoreMeasure = MINX ( SUMMARIZE ( Table, Table[Group] , "Measure",[YourMeasure] ), [Measure])
MaxScoreMeasure = MAXX ( SUMMARIZE ( Table, Table[Group] , "Measure",[YourMeasure] ), [Measure])
AvgScoreMeasure = AVERAGEX ( SUMMARIZE ( Table, Table[Group] , "Measure",[YourMeasure] ), [Measure])
etc.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

grantsamborn · ‎01-26-2023

Hi @freddiec03

Have you considered using a calculated table? (or is that cheating?)

/********** Measure **********/
Accuracy (%) = 
IF(
    ISBLANK( MAX( 'Table'[High] ) ),
    BLANK(),
    SUM( 'Table'[Paid] )
        / MAX( 'Table'[High] )
)

/********** Calculated Table */
zSummary = 
SUMMARIZECOLUMNS(
    'Table'[Item],
    'Table'[High Setter],
    "Accuracy", [Accuracy (%)]
)

/********** Measure **********/
zAvg = 
AVERAGEX(
    FILTER(
        ALL( 'zSummary' ),
        'zSummary'[High Setter] = SELECTEDVALUE( zSummary[High Setter] )
    ),
    [Accuracy]
)

pbix: Average_of_an_Average.pbix

freddiec03 · ‎01-26-2023

So I actually did a bad job at explaining this.... in your item column the goal is to have a unique item ID but whenever I bring over the High data there is multiple highs for each item... I want to just look at the most recent high submission. (there is a datetime column to create a measure if needed just not sure how to proceed)

grantsamborn · ‎01-26-2023

Hi @freddiec03

Is there any way you could come up with a pbix with sample data?

Grant

freddiec03 · ‎01-27-2023

Average of an Average

first time posting a pbix file on here so let me know. One not to add is on the Date column that the format is DateTime and has times inserted as well in the same column. Just simplified the data as easy as possible for you though. The way the table is set up is how I have tried connecting 3 different data sets. I created one measure for Accuracy % as well. Let me know your thoughts or if you have any questions! Thanks!

grantsamborn · ‎01-27-2023

Hi @freddiec03

Are these your expected results?

Bob	156.07
Dan	157.59
Jill	103.41
Jim	135.53

freddiec03 · ‎01-27-2023

I would like to see the final two tables to look like the ones below. I only want to show the most recent high based off date and for the user average I would like it to only calculate for the ones represented in the table not for everyone that has inputed a high (as you can see in the pbix that multiple ppl have submitted a high but the table will only reflect the latest submission)

ID	Company	Total (Paid)	High	Change Accuracy	Name (High Setter)	Date	Manager
1	Sheetz	10	9	111%	Dan	1/14/2023	Lisa
2	McDonalds	7	8	87.50%	Bob	1/15/2023	Steve
3	Arby	8	6	133%	Jim	1/18/2023	Wes
4	Wendys	6	7	85.71%	Dan	1/26/2023	Jess
5	Subway	9	8	112.50%	JIm	1/16/2023	Mike
6	Walmart	8	6	133.33%	JIm	1/22/2023	Ken
7	Chick Fila	10	7	142.86%	Jim	1/22/2023	Bill

High Setter	Accuracy
Dan	98%
Bob	87.50%
Jim	131%

grantsamborn · ‎01-27-2023

Hi @freddiec03

I'm not having much luck at this. (tried changing relationships, "average of average" pattern...)

Maybe you should call in some of the big-guns...

Ashish_Mathur · ‎01-27-2023

Just cannot understand your question. Explain the expected result in detail.

Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

grantsamborn · ‎01-27-2023

Hi @Ashish_Mathur

I sent Freddie a msg on the side for clarification. These seem to be the required steps.

1. For each company, filter out all records except for the latest for that company to find [Setter] (employee).
2. For each record in the result set, calculate [Change Accuracy %].
3. For each employee in the result set, calculate an average of that employee's [Change Accuracy %] amounts.

lbendlin · ‎01-28-2023

(as you can see in the pbix that multiple ppl have submitted a high but the table will only reflect the latest submission)

That is an assumption bordering on a fallacy. How can you guarantee that each company has at least one entry in the change log? What should be done if there are two changes for the same company and date?

Please let me know why the sum total in my version doesn't match your sample.

For the second deliverable I would change the perspective and iterate through the list of names, check if they are high setter for any of the companies, and then apply the averagex.

freddiec03 · ‎01-30-2023

for the question on what if a high is submitted on the same day the format and data I am pulling is pulling DateTime so we have a time aspect as well... in my example I just used a simplified version of the data

lbendlin · ‎01-28-2023

Accuracy =
VAR a =
    ADDCOLUMNS(
        VALUES( 'Change Log'[ID] ),
        "HD",
            VAR mxid = [ID]
            RETURN
                CALCULATE(
                    MAX( 'Change Log'[Date] ),
                    REMOVEFILTERS( 'Change Log'[Name] ),
                    'Change Log'[ID] = mxid
                ),
        "HS",
            VAR mxid = [ID]
            VAR mxd =
                CALCULATE(
                    MAX( 'Change Log'[Date] ),
                    REMOVEFILTERS( 'Change Log'[Name] ),
                    'Change Log'[ID] = mxid
                )
            RETURN
                CALCULATE(
                    MAX( 'Change Log'[Name] ),
                    ALLSELECTED( 'Change Log' ),
                    'Change Log'[ID] = mxid,
                    'Change Log'[Date] = mxd
                ),
        "HV",
            VAR mxid = [ID]
            VAR mxd =
                CALCULATE(
                    MAX( 'Change Log'[Date] ),
                    REMOVEFILTERS( 'Change Log'[Name] ),
                    'Change Log'[ID] = mxid
                )
            RETURN
                CALCULATE(
                    SUM( 'Change Log'[High] ),
                    ALLSELECTED( 'Change Log' ),
                    'Change Log'[ID] = mxid,
                    'Change Log'[Date] = mxd
                ),
        "PDT",
            VAR mxid = [ID]
            RETURN
                CALCULATE( SUM( 'Payment Details'[Total] ), 'Payment Details'[ID] = mxid )
    )
VAR b =
    FILTER( a, [HS] = MAX( 'Change Log'[Name] ) )
RETURN
    AVERAGEX( b, DIVIDE( [HV], [PDT], 0 ) )

freddiec03 · ‎01-31-2023

This gets me very close to where I need to be but the biggest problem I am seeing is for the PDT column it is returning a calculated value rather than a data point that was set. The return values of 17,12,15,10 arent values that were submitted by the High setters.

lbendlin · ‎01-31-2023

Please see my question further up.

Ashish_Mathur · ‎01-27-2023

I cannot understand. Someone else who does will help. Sorry.

Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

grantsamborn · ‎01-27-2023

Hi @freddiec03

I took a look at your model and am trying a couple of other methods but with no luck yet.

(The model has changed quite a bit.)

I'll let you know if I have any luck.

freddiec03 · ‎01-27-2023

Yea apoligies as I didnt do to well explianing to begin with as some data is sensitive to share.... new to the forums here so if theres anything else you need please let me know

freddiec03 · ‎01-26-2023

When creating a calculate table Im getting an error shot back at me "The expression refers to multiple columns. Multiple columns cannot be converted to a scalar value."

grantsamborn · ‎01-26-2023

Hi @freddiec03

Have you checked the pbix that I attached?

This might seem like a simple question but... Are you sure you created a Calculated Table, not a Calculated Column? the error you describe appears in that case.

Calculating an average of an average

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly