Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
DanielaAmadeuPr
Frequent Visitor

Select three different columns using measure

Hello everyone,

I'm studying measures through a classic rock top500 ranking which I found on Kaggle.

For my purpose I intend to select three different columns: Music, YearRanking, Top500.

 

Music contains all the music from this database;

YearRanking contains all the years, from 2015 to 2022, that the interview was applied;

Top500 contains a ranking that goes from 0 to 500, where 0 means that nobody rated that song and 500 means that this song was rated, but it's in the last position.

 

Ok, so the problem I'm facing is:

I want to filter this 3 columns to find the 5 most rated song, from 1 to 5, where ALL these songs must be presents in 2015, 2016, 2017, 2018, 2019, 2020, 2021 AND 2022.

 

How can I solve that?

I tried to get the closest I can, using this measure:

 

TopMusic = CALCULATE(COUNTROWS(classic_rock_playlist), classic_rock_playlist[Top500]<>0, classic_rock_playlist]<6)

 

But when I use this as a solution, the logic applied here is: Find the songs that is rated from 1 to 5 in any year from 2015 to 2022.

But what I want is to find the songs that is rated from 1 to 5 in all the years.

 

Thanks in advance.

1 ACCEPTED SOLUTION

@DanielaAmadeuPr It's not a trivial problem to solve. Your filter clause is the problem with your CALCULATE and you really shouldn't be using CALCULATE as a beginner. CALCULATE is an incredibly complex function. With your filter clause you are always going to be in the position of getting music that appears in the top 5 for one or more years but not all years. That's the problem you have to solve for. So, to explain the code:

TopMusicTable = 
/*
First, get a table for all music that appears in the top 5 and group that by Year and a unique Key in this case since there are multiple songs named One. The Key is simply a concatenation of artist and music with a | character in between. Use TOCSV(__Table) in the RETURN to visualize the table that is returned. You can use a Card visual for this.
*/
  VAR __Table = 
    SUMMARIZE(
      FILTER('classic_rock_playlist',[Top500]<>0 && [Top500]<6),
      [YearRanking],
      [Key]
    )
/*
This simply counts the distinct Year values so now you know how many years you are dealing with
*/
  VAR __NumYears = COUNTROWS(DISTINCT('classic_rock_playlist'[YearRanking]))
/*
/*
Next, we add a column called __NumYears to our base table. This column calculates how times that music appears in the base table (__Table). It does this by getting the Key adn then counting the rows in the base table where the Key matches. Again, use TOCSV(__Table1) as the RETURN value to visualize this table in a Card visual for example.
*/
  VAR __Table1 = 
    ADDCOLUMNS(__Table, "__NumYears", 
    VAR __Music = [Key]
    VAR __Result = COUNTROWS(FILTER(__Table,[Key] = __Music))
        RETURN
          __Result
    )
/*
Now all we have to do is to filter the table with the additional column (__Table1) where the __NumYears column matches our __NumYears VAR that we created, meaning that the song appeared in all years because they match. We only want the key column as a return value so we use SELEECTCOLUMNS for that and just we only want distinct values so we use DISTINCT as well.
*/
  VAR __Result = DISTINCT(SELECTCOLUMNS(FILTER(__Table1, [__NumYears] = __NumYears),"Key",[Key]))
RETURN
  __Result

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

18 REPLIES 18
Greg_Deckler
Super User
Super User

@DanielaAmadeuPr I'm thinking something like this:

Measure =
  VAR __Table = 
    SUMMARIZE(
      FILTER('classic_rock_playlist',[Top500]>0 && [Top500]<6)
      [YearRanking],
      [Music]
    )
  VAR __NumYears = COUNTROWS(DISTINCT('classic_rock_playlist'[YearRanking]))
  VAR __Table1 = 
    ADDCOLUMNS(
      __Table,
      "__NumYears" = 
          VAR __Music = [Music]
          VAR __Result = COUNTROWS(FILTER(__Table,[Music] = __Music))
        RETURN
          __Result
    )
  VAR __Result = COUNTROWS(FILTER(__Table1, [__NumYears] = __NumYears))
RETURN
  __Result

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Hello, @Greg_Deckler 
First of all, thanks for helping me.
I tried your code as a solution, but it's only filtering the musics that are in some of the years.

Here is your code:

TopMusic = 
VAR __Table = 
    SUMMARIZE(
      FILTER('classic_rock_playlist',[Top500]<>0 && [Top500]<6),
      [YearRanking],
      [Music]
    )
  VAR __NumYears = COUNTROWS(DISTINCT('classic_rock_playlist'[YearRanking]))
  VAR __Table1 = 
    ADDCOLUMNS(__Table, "NumYears", 
    VAR __Music = [Music]
    VAR __Result = COUNTROWS(FILTER(__Table,[Music] = __Music))
        RETURN
          __Result
    )
  VAR __Result = COUNTROWS(FILTER(__Table1, __NumYears = __NumYears))
RETURN
  __Result


And here is the bar chart:
sample.png

 As you can see in the purple bar, it's a value that occurs only in 2022, but it's present when it should not be present.

😞

@DanielaAmadeuPr Yeah, kind of flying blind without the dataset, is it this one?

Classic Rock - Top 500 songs | Kaggle

If not, which dataset are you using?


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Yes!

@DanielaAmadeuPr OK, here you go. PBIX is attached below signature.

TopMusic = 
VAR __Table = 
    SUMMARIZE(
      FILTER('classic_rock_playlist',[Top500]<>0 && [Top500]<6),
      [YearRanking],
      [Music]
    )
  VAR __NumYears = COUNTROWS(DISTINCT('classic_rock_playlist'[YearRanking]))
  VAR __Table1 = 
    ADDCOLUMNS(__Table, "__NumYears", 
    VAR __Music = [Music]
    VAR __Result = COUNTROWS(FILTER(__Table,[Music] = __Music))
        RETURN
          __Result
    )
  VAR __Result = COUNTROWS(DISTINCT(SELECTCOLUMNS(FILTER(__Table1, [__NumYears] = __NumYears),"__Music",[Music])))
RETURN
  __Result

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Hello, @Greg_Deckler 

Apologies for the delay.
I went back to studying the problem, and came very close to the solution. My only impediment has been replacing SELECTEDVALUE for another measure that fits better.

Here's my new measure:

 

TopMusic =
    CALCULATE(SELECTEDVALUE(classic_rock_playlist[Top500]), classic_rock_playlist[Top500]<6 ,classic_rock_playlist[YearRanking]="2015")
 
The result:
 
classicrock.png
 
As you can see, if one value is present in Top500, it repeats in all the years that follows. But there's only three Top500 that repeats in the following years. So I really believe that my problem is the SELECTEDVALUE, because it returns only one value.

@DanielaAmadeuPr I don't think that SELECTEDVALUE is your issue. See updated PBIX file that presents your visual. PBIX is attached below signature.

Greg_Deckler_0-1675022431351.png

 

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

@Greg_Deckler I see.
Ok, it works better. But as a beginner, is there an easier way to solve that?

I can study this solution of course, but my initial idea was to be capable of solving this task in a way that I can understand.

 

If it's not possible ok, I can handle that. The solution you proposed is incredible and I gonna study it.

 

Thanks for being so patient.

@DanielaAmadeuPr It's not a trivial problem to solve. Your filter clause is the problem with your CALCULATE and you really shouldn't be using CALCULATE as a beginner. CALCULATE is an incredibly complex function. With your filter clause you are always going to be in the position of getting music that appears in the top 5 for one or more years but not all years. That's the problem you have to solve for. So, to explain the code:

TopMusicTable = 
/*
First, get a table for all music that appears in the top 5 and group that by Year and a unique Key in this case since there are multiple songs named One. The Key is simply a concatenation of artist and music with a | character in between. Use TOCSV(__Table) in the RETURN to visualize the table that is returned. You can use a Card visual for this.
*/
  VAR __Table = 
    SUMMARIZE(
      FILTER('classic_rock_playlist',[Top500]<>0 && [Top500]<6),
      [YearRanking],
      [Key]
    )
/*
This simply counts the distinct Year values so now you know how many years you are dealing with
*/
  VAR __NumYears = COUNTROWS(DISTINCT('classic_rock_playlist'[YearRanking]))
/*
/*
Next, we add a column called __NumYears to our base table. This column calculates how times that music appears in the base table (__Table). It does this by getting the Key adn then counting the rows in the base table where the Key matches. Again, use TOCSV(__Table1) as the RETURN value to visualize this table in a Card visual for example.
*/
  VAR __Table1 = 
    ADDCOLUMNS(__Table, "__NumYears", 
    VAR __Music = [Key]
    VAR __Result = COUNTROWS(FILTER(__Table,[Key] = __Music))
        RETURN
          __Result
    )
/*
Now all we have to do is to filter the table with the additional column (__Table1) where the __NumYears column matches our __NumYears VAR that we created, meaning that the song appeared in all years because they match. We only want the key column as a return value so we use SELEECTCOLUMNS for that and just we only want distinct values so we use DISTINCT as well.
*/
  VAR __Result = DISTINCT(SELECTCOLUMNS(FILTER(__Table1, [__NumYears] = __NumYears),"Key",[Key]))
RETURN
  __Result

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Hello @Greg_Deckler 

I was wondering... if I can't use CALCULATE as a beginner, what should I use instead?

 

Thanks

@DanielaAmadeuPr Use this approach (below). Also, my new channel DAX For Humans is all about this technique although it's very early in the series so far.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

@Greg_Deckler 

With your table solution, I now have a new problem: I need to filter the Blank songs generated by your table. I don't see it as an effective solution.

 

 

teste.png

@DanielaAmadeuPr 

You could do this, updated PBIX see page 2 attached.

Rankings Measure = 
VAR __Table = 
    SUMMARIZE(
      FILTER(ALL('classic_rock_playlist'),[Top500]<>0 && [Top500]<6),
      [YearRanking],
      [Key]
    )
  VAR __NumYears = COUNTROWS(DISTINCT(ALL('classic_rock_playlist'[YearRanking])))
  VAR __Table1 = 
    ADDCOLUMNS(__Table, "__NumYears", 
    VAR __Music = [Key]
    VAR __Result = COUNTROWS(FILTER(__Table,[Key] = __Music))
        RETURN
          __Result
    )
  VAR __TopTable = DISTINCT(SELECTCOLUMNS(FILTER(__Table1, [__NumYears] = __NumYears),"Key",[Key]))
  VAR __Key = MAX('classic_rock_playlist'[Key])
  VAR __Result = IF(__Key IN SELECTCOLUMNS(__TopTable,"__Key",[Key]),MAX('classic_rock_playlist'[Top500]), BLANK())
RETURN
  __Result

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Thanks a lot for explaining even the tiny details. I'll keep studying DAX and CALCULATE even more.

Hey, @Greg_Deckler 

Thanks again for helping me.

The requirements is:
The values of the ranking must be from 1 to 5;
The songs that are from 1 to 5 must be present in all years from 2015 until 2022.
There are only three songs that follow those rules:
Stairway to heaven, One, and Bohemian Rhapsody.

Thanks again!!!

@DanielaAmadeuPr Right, the PBIX file does exactly that and comes out with an answer of 3. You can even revise the measure to CONCATENATEX them up so you get the actual songs back. Here's the CONCATENATEX version:

TopMusic2 = 
VAR __Table = 
    SUMMARIZE(
      FILTER('classic_rock_playlist',[Top500]<>0 && [Top500]<6),
      [YearRanking],
      [Music]
    )
  VAR __NumYears = COUNTROWS(DISTINCT('classic_rock_playlist'[YearRanking]))
  VAR __Table1 = 
    ADDCOLUMNS(__Table, "__NumYears", 
    VAR __Music = [Music]
    VAR __Result = COUNTROWS(FILTER(__Table,[Music] = __Music))
        RETURN
          __Result
    )
  VAR __Result = CONCATENATEX(DISTINCT(SELECTCOLUMNS(FILTER(__Table1, [__NumYears] = __NumYears),"__Music",[Music])),[__Music],",")
RETURN
  __Result

 

Greg_Deckler_0-1674424037830.png

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
FreemanZ
Super User
Super User

hi @DanielaAmadeuPr 

what context do you have for the measure?

try like:

TopMusic = 
CALCULATE(
        COUNTROWS(classic_rock_playlist), 
        classic_rock_playlist[Top500]<>0, 
        classic_rock_playlist[Top500]<6,
        classic_rock_playlist[YearRanking]>=2015,
        classic_rock_playlist[YearRanking]<=2022
)

Hello @FreemanZ ,
I was trying exactly this as a solution. But the return I get is the same: I can see all the musics ranking from 1 to 5 that are in one year or another.

 

Thanks.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.