Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
am_i_really
Frequent Visitor

DAX Average

Hi All,

I'm working on a test project for school but I'm lost...  I was given a sample report and I have to create a matrix table like below:

am_i_really_0-1675715223006.png

This is the question:

We'd like to compare how durable the Director/Actor relationships are by continent, represented by Netflix releases. We'd like to measure this by average number of movies done together, average length of time between their first and last movie together, and average number of movies done together per year of director's longevity. Please put together a matrix visualization with the results of this analysis and please also try to complete this challenge without modifying the data model. That means no calculated columns, calculated tables, or query modifications

 

I've attached the file: https://1drv.ms/u/s!AnJmyZTL2iv4gZEhp2wzhSZzgZyrDw?e=YpEK8c 

Any help at all will be highly appreciated. Thank you in advance!

1 ACCEPTED SOLUTION

@am_i_really Try this as I believe this provides the results shown:

 

AvgMoviesTogether 2 = 
    VAR __Table = SUMMARIZE(FILTER('Netflix',[director]<>BLANK() && [cast] <> BLANK() && [director]<>[cast]),[director],[cast],"__MoviesTogether", COUNTROWS(DISTINCT('Netflix'[show_id])))
    VAR __Result = AVERAGEX(__Table,[__MoviesTogether])
RETURN
    __Result

 

 

Now, as to which is "correct", that's a bit of a different matter. The dataset seems kind of jank. For example, HARRY BRADBEER as director did only one show, s2010 but for some reason the Netflix table lists it basically three times, once for Dramas, Children & Family Movies and Action & Adventure. Seems like kind of a hacky way to put the same thing in three categories I guess. So, the GROUPBY method counts this as 3 because it does not distinguish DISTINCT show_id while the second method using SUMMARIZE counts this "correctly" as 1 since it accounts for DISTINCT show_id.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

7 REPLIES 7
sandy2112
Frequent Visitor

Average Movies Together =
AVERAGEX(
    SUMMARIZE(
        FILTER(Netflix,Netflix[director]<>BLANK() || Netflix[cast]<>BLANK() && CONTAINSSTRING(Netflix[subcategory],"Movies")),
        Netflix[director],
        Netflix[cast],
        "No of Movies Together",DISTINCTCOUNT(Netflix[show_id])
    ),
   [No of Movies Together]
)
AverageReleaseLength =
    AVERAGEX(        
        SUMMARIZE(
          FILTER(Netflix,Netflix[director] <> BLANK() && Netflix[cast] <> BLANK() &&CONTAINSSTRING(Netflix[subcategory],"Movies")),
          Netflix[director],
          Netflix[cast],
          "Min of Release Year",MIN(Netflix[Year_added]),
          "Max of Release Year",MAX(Netflix[Year_added])
        ),
        [Max of Release Year]-[Min of Release Year]+1
    )
----
Average Longevity is still showing some different values when we keep under continent
Average Longevity =
VAR T1 =
    SUMMARIZE (
        FILTER (
            'netflix',
            [director] <> BLANK ()
                && [cast] <> BLANK ()
                && [show_id] <> BLANK ()
                && [director] <> [cast]
                && RELATED ( 'Content'[category] ) = "Movie"
        ),
        'netflix'[director],
        'netflix'[cast],
        "Longevity",
            CALCULATE (
                MAX ( 'Content'[release_year] ) - MIN ( 'Content'[release_year] ) + 1,
                ALLEXCEPT ( Netflix, 'Netflix'[director] ),
                CROSSFILTER ( 'Content'[show_id], Netflix[show_id], BOTH )
            ),
        "# of movies", COUNT(Netflix[cast])
    )
RETURN
    DIVIDE ( SUMX ( T1, [# of movies] ), SUMX ( T1, [Longevity] ) )
Greg_Deckler
Super User
Super User

@am_i_really Here is the first one. The others should be a fairly similar pattern. Not sure if the screen shot is supposed to be the answers because I get different numbers. In the formula below I provide 2 options. __Table filters out the rows where the director is blank (null) which I feel would be the right way to do it. __Table1 does not include this filter. Currently, this formula returns results for __Table and not __Table1 but you can easily switch it.

 

AvgMoviesTogether = 
    VAR __Table = GROUPBY(FILTER('Netflix',[director]<>BLANK()),[director],[cast],"__MoviesTogether", COUNTX(CURRENTGROUP(), [show_id]))
    VAR __Table1 = GROUPBY('Netflix',[cast],"__MoviesTogether", COUNTX(CURRENTGROUP(), [show_id]))
    VAR __Result = AVERAGEX(__Table,[__MoviesTogether])
RETURN
    __Result

If your professor gives you low marks for not using CALCULATE, refer them to this:

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

@Greg_Deckler 

 

Thank you for your answer but its not giving the result that he specified. The screenshot that was attached is what he said the results have to be. Any idea how we may be able to get that?

Did you find any solution for the other 2 columns? 

@am_i_really Try this as I believe this provides the results shown:

 

AvgMoviesTogether 2 = 
    VAR __Table = SUMMARIZE(FILTER('Netflix',[director]<>BLANK() && [cast] <> BLANK() && [director]<>[cast]),[director],[cast],"__MoviesTogether", COUNTROWS(DISTINCT('Netflix'[show_id])))
    VAR __Result = AVERAGEX(__Table,[__MoviesTogether])
RETURN
    __Result

 

 

Now, as to which is "correct", that's a bit of a different matter. The dataset seems kind of jank. For example, HARRY BRADBEER as director did only one show, s2010 but for some reason the Netflix table lists it basically three times, once for Dramas, Children & Family Movies and Action & Adventure. Seems like kind of a hacky way to put the same thing in three categories I guess. So, the GROUPBY method counts this as 3 because it does not distinguish DISTINCT show_id while the second method using SUMMARIZE counts this "correctly" as 1 since it accounts for DISTINCT show_id.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Thank you so much for your help and the explanation, the solution worked! If you don't mind can you please help with the other 2 columns as well?

@am_i_really I'm not sure how the other two columns are being calculated in terms of how those numbers were arrived at. For example, if I use what I think is logical I get different results. PBIX is attached beneath signature.

AvgRelLength = 
    VAR __Table = 
        SUMMARIZE(
            FILTER('Netflix',[director]<>BLANK() && [cast] <> BLANK() && [director]<>[cast]),
            [director],[cast],[show_id],
            "__ReleaseYear", MAXX(FILTER('Content',[show_id] = EARLIER([show_id])),[release_year])
        )
    VAR __Table1 = 
        ADDCOLUMNS(
            SUMMARIZE(
                __Table,
                [director],[cast],
                "__MinYear",MINX(FILTER(__Table,[director]=EARLIER([director]) && [cast] = EARLIER([cast])), [__ReleaseYear]),
                "__MaxYear",MAXX(FILTER(__Table,[director]=EARLIER([director]) && [cast] = EARLIER([cast])), [__ReleaseYear])
            ),
            "__Diff", [__MaxYear] - [__MinYear]
        )
    VAR __Result = AVERAGEX(FILTER(__Table1,[__Diff]>0),[__Diff])
RETURN
    __Result

and

AvgMoviesLong = 
    VAR __Table = 
        SUMMARIZE(
            FILTER('Netflix',[director]<>BLANK() && [cast] <> BLANK() && [director]<>[cast]),
            [director],[cast],[show_id],
            "__ReleaseYear", MAXX(FILTER('Content',[show_id] = EARLIER([show_id])),[release_year])
        )
    VAR __Table1 = 
        ADDCOLUMNS(
            ADDCOLUMNS(
                SUMMARIZE(
                    __Table,
                    [director],[cast],
                    "__MinYear",MINX(FILTER(__Table,[director]=EARLIER([director])), [__ReleaseYear]),
                    "__MaxYear",MAXX(FILTER(__Table,[director]=EARLIER([director])), [__ReleaseYear]),
                    "__Count",COUNTROWS(FILTER(__Table,[director] = EARLIER([director]) && [cast] = EARLIER([cast])))
                ),
                "__Diff", [__MaxYear] - [__MinYear]
            ),
            "__NumPerYear", DIVIDE([__Count],[__Diff],0)
        )
    VAR __Result = AVERAGEX(FILTER(__Table1,[__Diff]>0),[__NumPerYear])
RETURN
    __Result

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors