Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
messyjesse_
Frequent Visitor

Cohort comparison over a date range--overlapping values on a line chart

Hi all,

 

I have a very basic multidimensional dataset: a default rate on annual cohorts, per week, per location. I'm using a standard star schema data model (first screenshot below). For this project, "location" and "college" are synonymous.

 

Based on the earliest cohort's starting date, I calculated out "FirstDayOfWeek" for each record. Each cohort is tracked over a period of several years. With a FirstDayOfWeek calculation, data points can occur on the same day. For an extremely simplified example, the 53rd week of the 2015 cohort and the 1st week of the 2016 cohort will have the same "FirstDayOfWeek". Some mocked-up data is shown in the second screenshot below. (I have seven total locations.)

 

The basic 2D line chart only permits a single field for Legend and Value--as such, I am currently averaging the default rate across all locations for each cohort year and plotting them against the date range. This outputs nicely when combined with tooltips to describe the week recorded for each cohort (third screenshot). The lines stop at the end of each cohort's reporting period. (2016 and 2017 are still active.)

 

What I'd like to do is have a separate line for each location in each cohort. Changing the legend from Cohort to Location gets me really close (visually) to what I want to accomplish, but as you can see in the fourth screenshot, the average is including all data points reported on a given date, including overlapping cohort data.

 

Any thoughts? If any of this is unclear, let me know. I could provide the PBIX file if needed but I would need to change some of the data elements.

 

Thanks in advance!

 

 

Star schema data modelStar schema data model

 

Sample data showing three dimensions: cohort, week, locationSample data showing three dimensions: cohort, week, locationLine chart over a date range. Dates determined by multiplying WeekNo by 7 and adding the days to a predetermined start date. Each line represents the average of all locations.Line chart over a date range. Dates determined by multiplying WeekNo by 7 and adding the days to a predetermined start date. Each line represents the average of all locations.Changing the legend from cohort to location gets me really close visually to what I want; however, the average is outputting based on the data overlapping between cohorts with records on the same day.Changing the legend from cohort to location gets me really close visually to what I want; however, the average is outputting based on the data overlapping between cohorts with records on the same day.

3 REPLIES 3
messyjesse_
Frequent Visitor

For the life of me I cannot edit my message due to some "invalid HTML" nonsense, so I just want to expand this thought:

 

... as you can see in the fourth screenshot, the average is including all data points reported on a given date, including overlapping cohort data. I want to avoid averaging overlapping data points for different cohorts.

Hi @messyjesse_

Could you share me your pbix for better analysis?

You could upload your file to OneDrive and copy the link, then paste the link into the Url of Insert/Edit link.

 

Best Regards

Maggie

These invalid HTML error messages are driving me batty. Here's the PBIX.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.