Solved: DAX Average

am_i_really · ‎02-06-2023

Hi All,

I'm working on a test project for school but I'm lost... I was given a sample report and I have to create a matrix table like below:

This is the question:

We'd like to compare how durable the Director/Actor relationships are by continent, represented by Netflix releases. We'd like to measure this by average number of movies done together, average length of time between their first and last movie together, and average number of movies done together per year of director's longevity. Please put together a matrix visualization with the results of this analysis and please also try to complete this challenge without modifying the data model. That means no calculated columns, calculated tables, or query modifications

I've attached the file: https://1drv.ms/u/s!AnJmyZTL2iv4gZEhp2wzhSZzgZyrDw?e=YpEK8c

Any help at all will be highly appreciated. Thank you in advance!

Greg_Deckler · ‎02-07-2023

@am_i_really Try this as I believe this provides the results shown:

AvgMoviesTogether 2 = 
    VAR __Table = SUMMARIZE(FILTER('Netflix',[director]<>BLANK() && [cast] <> BLANK() && [director]<>[cast]),[director],[cast],"__MoviesTogether", COUNTROWS(DISTINCT('Netflix'[show_id])))
    VAR __Result = AVERAGEX(__Table,[__MoviesTogether])
RETURN
    __Result

Now, as to which is "correct", that's a bit of a different matter. The dataset seems kind of jank. For example, HARRY BRADBEER as director did only one show, s2010 but for some reason the Netflix table lists it basically three times, once for Dramas, Children & Family Movies and Action & Adventure. Seems like kind of a hacky way to put the same thing in three categories I guess. So, the GROUPBY method counts this as 3 because it does not distinguish DISTINCT show_id while the second method using SUMMARIZE counts this "correctly" as 1 since it accounts for DISTINCT show_id.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

sandy2112 · ‎06-13-2023

Average Movies Together =

AVERAGEX(

SUMMARIZE(

FILTER(Netflix,Netflix[director]<>BLANK() || Netflix[cast]<>BLANK() && CONTAINSSTRING(Netflix[subcategory],"Movies")),

Netflix[director],

Netflix[cast],

"No of Movies Together",DISTINCTCOUNT(Netflix[show_id])

),

[No of Movies Together]

)

AverageReleaseLength =

AVERAGEX(

SUMMARIZE(

FILTER(Netflix,Netflix[director] <> BLANK() && Netflix[cast] <> BLANK() &&CONTAINSSTRING(Netflix[subcategory],"Movies")),

Netflix[director],

Netflix[cast],

"Min of Release Year",MIN(Netflix[Year_added]),

"Max of Release Year",MAX(Netflix[Year_added])

),

[Max of Release Year]-[Min of Release Year]+1

)

----

Average Longevity is still showing some different values when we keep under continent

Average Longevity =

VAR T1 =

SUMMARIZE (

FILTER (

'netflix',

[director] <> BLANK ()

&& [cast] <> BLANK ()

&& [show_id] <> BLANK ()

&& [director] <> [cast]

&& RELATED ( 'Content'[category] ) = "Movie"

),

'netflix'[director],

'netflix'[cast],

"Longevity",

CALCULATE (

MAX ( 'Content'[release_year] ) - MIN ( 'Content'[release_year] ) + 1,

ALLEXCEPT ( Netflix, 'Netflix'[director] ),

CROSSFILTER ( 'Content'[show_id], Netflix[show_id], BOTH )

),

"# of movies", COUNT(Netflix[cast])

)

RETURN

DIVIDE ( SUMX ( T1, [# of movies] ), SUMX ( T1, [Longevity] ) )

Greg_Deckler · ‎02-06-2023

@am_i_really Here is the first one. The others should be a fairly similar pattern. Not sure if the screen shot is supposed to be the answers because I get different numbers. In the formula below I provide 2 options. __Table filters out the rows where the director is blank (null) which I feel would be the right way to do it. __Table1 does not include this filter. Currently, this formula returns results for __Table and not __Table1 but you can easily switch it.

AvgMoviesTogether = 
    VAR __Table = GROUPBY(FILTER('Netflix',[director]<>BLANK()),[director],[cast],"__MoviesTogether", COUNTX(CURRENTGROUP(), [show_id]))
    VAR __Table1 = GROUPBY('Netflix',[cast],"__MoviesTogether", COUNTX(CURRENTGROUP(), [show_id]))
    VAR __Result = AVERAGEX(__Table,[__MoviesTogether])
RETURN
    __Result

If your professor gives you low marks for not using CALCULATE, refer them to this:

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

am_i_really · ‎02-07-2023

@Greg_Deckler

Thank you for your answer but its not giving the result that he specified. The screenshot that was attached is what he said the results have to be. Any idea how we may be able to get that?

M609 · ‎10-14-2023

Did you find any solution for the other 2 columns?

Greg_Deckler · ‎02-07-2023

@am_i_really Try this as I believe this provides the results shown:

AvgMoviesTogether 2 = 
    VAR __Table = SUMMARIZE(FILTER('Netflix',[director]<>BLANK() && [cast] <> BLANK() && [director]<>[cast]),[director],[cast],"__MoviesTogether", COUNTROWS(DISTINCT('Netflix'[show_id])))
    VAR __Result = AVERAGEX(__Table,[__MoviesTogether])
RETURN
    __Result

Now, as to which is "correct", that's a bit of a different matter. The dataset seems kind of jank. For example, HARRY BRADBEER as director did only one show, s2010 but for some reason the Netflix table lists it basically three times, once for Dramas, Children & Family Movies and Action & Adventure. Seems like kind of a hacky way to put the same thing in three categories I guess. So, the GROUPBY method counts this as 3 because it does not distinguish DISTINCT show_id while the second method using SUMMARIZE counts this "correctly" as 1 since it accounts for DISTINCT show_id.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

am_i_really · ‎02-07-2023

Thank you so much for your help and the explanation, the solution worked! If you don't mind can you please help with the other 2 columns as well?

Greg_Deckler · ‎02-07-2023

@am_i_really I'm not sure how the other two columns are being calculated in terms of how those numbers were arrived at. For example, if I use what I think is logical I get different results. PBIX is attached beneath signature.

AvgRelLength = 
    VAR __Table = 
        SUMMARIZE(
            FILTER('Netflix',[director]<>BLANK() && [cast] <> BLANK() && [director]<>[cast]),
            [director],[cast],[show_id],
            "__ReleaseYear", MAXX(FILTER('Content',[show_id] = EARLIER([show_id])),[release_year])
        )
    VAR __Table1 = 
        ADDCOLUMNS(
            SUMMARIZE(
                __Table,
                [director],[cast],
                "__MinYear",MINX(FILTER(__Table,[director]=EARLIER([director]) && [cast] = EARLIER([cast])), [__ReleaseYear]),
                "__MaxYear",MAXX(FILTER(__Table,[director]=EARLIER([director]) && [cast] = EARLIER([cast])), [__ReleaseYear])
            ),
            "__Diff", [__MaxYear] - [__MinYear]
        )
    VAR __Result = AVERAGEX(FILTER(__Table1,[__Diff]>0),[__Diff])
RETURN
    __Result

and

AvgMoviesLong = 
    VAR __Table = 
        SUMMARIZE(
            FILTER('Netflix',[director]<>BLANK() && [cast] <> BLANK() && [director]<>[cast]),
            [director],[cast],[show_id],
            "__ReleaseYear", MAXX(FILTER('Content',[show_id] = EARLIER([show_id])),[release_year])
        )
    VAR __Table1 = 
        ADDCOLUMNS(
            ADDCOLUMNS(
                SUMMARIZE(
                    __Table,
                    [director],[cast],
                    "__MinYear",MINX(FILTER(__Table,[director]=EARLIER([director])), [__ReleaseYear]),
                    "__MaxYear",MAXX(FILTER(__Table,[director]=EARLIER([director])), [__ReleaseYear]),
                    "__Count",COUNTROWS(FILTER(__Table,[director] = EARLIER([director]) && [cast] = EARLIER([cast])))
                ),
                "__Diff", [__MaxYear] - [__MinYear]
            ),
            "__NumPerYear", DIVIDE([__Count],[__Diff],0)
        )
    VAR __Result = AVERAGEX(FILTER(__Table1,[__Diff]>0),[__NumPerYear])
RETURN
    __Result

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

DAX Average

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024