Solved: Re: Complex Filtering and Ranking Issue

rob2 · ‎12-26-2019

I have a very simple dataset. A competitors and calendar table (PBIX file attached). However, having issues trying to create a measure to account for multiple criteria. Faily new to Power BI.....

Competitors table: competitor_name, SKU, scrape_date, price, first_scrape_date.

Calendar table: date

I have two slicers that I would like the user to control:

1) date

2) # of competitors to show

I would like to show a line chart with the following criteria:

Criteria 1) Filter competitors & items that have a "first_scrape_date" prior to 6/29/2016.

Critera 2) Sum the # of scrapes for the "eligible" competitor/items from #1 between 6/29/2016 and 12/1/2016 by Vendor.

Criteria 3) Take the top 2 competitors with the most scrapes from #2 and plot them on a line chart.

It should look like this (created in Excel):

PBIX File: https://www.dropbox.com/s/n618p4a2wz0ifer/Competitor_Example%20v3.pbix?dl=0

Thanks for your help.

kentyler · ‎12-31-2019

Thank you for giving such a clear explanation.
Here is a partial solution. I have worked it out with multiple small measures so I could debug it as I went

My first step was to create a slider. I wanted a slider that would only set a date range and not filter the competitors table, so I created a new calendar table and based by slider on that.
I went round and round on this problem. Finally i decided that it was a question about the competitors, and not about the records of the scrapes. So for clarity I created a table of competitors (this would be a dimension table)

Now I wrote a measure to calculate the total scrapes for each competitor. I ran it through daxformatter.com to make it easier to understand

Total Scrapes =
VAR competitor =
    SELECTEDVALUE ( Comp[Competitor] )
VAR filter_start_date =
    MIN ( Calendar_Slicer[Date] )
VAR filter_end_date =
    MAX ( Calendar_Slicer[Date] )
VAR scrapes =
    CALCULATE (
        COUNTROWS ( Competitors ),
        ALL ( Competitors ),
        Competitors[comp_name] = competitor,
        Competitors[first_scrape_date] < filter_start_date,
        Competitors[scrape_date] >= filter_start_date,
        Competitors[scrape_date] <= filter_end_date
    )
RETURN
    scrapes

First the code gets the competitor from the competitor table that is in the filter context (SELECTEDVALUE)

Then it gets the first and last dates from the slicer. One of the complications of this logic is that it is not just about one date condition, but 2.
Then it uses CALCULATE to filter the rows of the contributors table.

It applies ALL() to remove any external filters

then it filters for competitors who first scrape date is before the first date in the slicer

then it filters for scrapes with dates between the start and end dates of the slicer

Now that we have a measure to count scrapes, we apply TOPN to create a table listing the top 2 competitors
This is the table measure:

Top 2 = topn(2,Comp,[Total Scrapes])
,It gets the top 2 records from the Comp table (the table of Competitors) and runs the measure Total Scrapes to get the number of scrapes for each competitor.
From your example it looks like you then want to calculate the average amount for each of these competitors for the start date of the slicer date range to the end date ? Do you want to get the average per day, per month, or just the start and end averages. Those would be further calculations that would build on the measures I've described here.

Did this post answer your question? Mark it as a solution so others can find it!

Help when you know. Ask when you don't!

Join the conversation at We Talk BI find out more about me at Slow BI

View solution in original post

kentyler · ‎12-31-2019

Thank you for giving such a clear explanation.
Here is a partial solution. I have worked it out with multiple small measures so I could debug it as I went

My first step was to create a slider. I wanted a slider that would only set a date range and not filter the competitors table, so I created a new calendar table and based by slider on that.
I went round and round on this problem. Finally i decided that it was a question about the competitors, and not about the records of the scrapes. So for clarity I created a table of competitors (this would be a dimension table)

Now I wrote a measure to calculate the total scrapes for each competitor. I ran it through daxformatter.com to make it easier to understand

Total Scrapes =
VAR competitor =
    SELECTEDVALUE ( Comp[Competitor] )
VAR filter_start_date =
    MIN ( Calendar_Slicer[Date] )
VAR filter_end_date =
    MAX ( Calendar_Slicer[Date] )
VAR scrapes =
    CALCULATE (
        COUNTROWS ( Competitors ),
        ALL ( Competitors ),
        Competitors[comp_name] = competitor,
        Competitors[first_scrape_date] < filter_start_date,
        Competitors[scrape_date] >= filter_start_date,
        Competitors[scrape_date] <= filter_end_date
    )
RETURN
    scrapes

First the code gets the competitor from the competitor table that is in the filter context (SELECTEDVALUE)

Then it gets the first and last dates from the slicer. One of the complications of this logic is that it is not just about one date condition, but 2.
Then it uses CALCULATE to filter the rows of the contributors table.

It applies ALL() to remove any external filters

then it filters for competitors who first scrape date is before the first date in the slicer

then it filters for scrapes with dates between the start and end dates of the slicer

Now that we have a measure to count scrapes, we apply TOPN to create a table listing the top 2 competitors
This is the table measure:

Top 2 = topn(2,Comp,[Total Scrapes])
,It gets the top 2 records from the Comp table (the table of Competitors) and runs the measure Total Scrapes to get the number of scrapes for each competitor.
From your example it looks like you then want to calculate the average amount for each of these competitors for the start date of the slicer date range to the end date ? Do you want to get the average per day, per month, or just the start and end averages. Those would be further calculations that would build on the measures I've described here.

Did this post answer your question? Mark it as a solution so others can find it!

Help when you know. Ask when you don't!

Join the conversation at We Talk BI find out more about me at Slow BI

rob2 · ‎01-06-2020

Thanks for looking into this.

Below is the link to the updated PBIX file including your measures:

https://www.dropbox.com/s/do8alneb5i1sh1d/Competitor_Example%20v3.pbix?dl=0

I changed the measure your provided to get the average price for each competitor. Looking good so far:

However, when I try to graph the measure I get data for competitors C & D. Is there a way to get rid of them and only graph the data points for competitor A & B? Thanks!!

kentyler · ‎01-06-2020

I was able to fix the graph by applying the same filter as the one you used in the upper right hand table.

I'm a personal Power Bi Trainer I learn something every time I answer a question

The Golden Rules for Power BI

Use a Calendar table. A custom Date tables is preferable to using the automatic date/time handling capabilities of Power BI. https://www.youtube.com/watch?v=FxiAYGbCfAQ
Build your data model as a Star Schema. Creating a star schema in Power BI is the best practice to improve performance and more importantly, to ensure accurate results! https://www.youtube.com/watch?v=1Kilya6aUQw
Use a small set up sample data when developing. When building your measures and calculated columns always use a small amount of sample data so that it will be easier to confirm that you are getting the right numbers.
Store all your intermediate calculations in VARs when you’re writing measures. You can return these intermediate VARs instead of your final result to check on your steps along the way.

Did this post answer your question? Mark it as a solution so others can find it!

Help when you know. Ask when you don't!

Join the conversation at We Talk BI find out more about me at Slow BI

rob2 · ‎01-06-2020

I would like the filtering to be done dynamically without worrying about what TOPN is selected. In practice - dataset is much bigger and complicated.

Thanks.

kentyler · ‎01-06-2020

How about doing a screen share and taking a look. As the question gets more complex it gets harder and harder to deal with it by posting. Just email me ( ken@8thfold.com ) a time and day and I'll send you a meeting invitation.

Did this post answer your question? Mark it as a solution so others can find it!

Help when you know. Ask when you don't!

Join the conversation at We Talk BI find out more about me at Slow BI

Complex Filtering and Ranking Issue

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024