Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello!
I'm having trouble trying to make visualizations about COVID status change on various employees.
I have an SQL source table like this:
Index | Employee | Date | Status | Region |
1 | Person A | 16/08/2020 | No symptoms | North |
2 | Person B | 16/08/2020 | Suspect | North |
3 | Person C | 16/08/2020 | No symptoms | West |
4 | Person A | 17/08/2020 | Suspect | North |
5 | Person D | 17/08/2020 | No symptoms | West |
6 | Person E | 17/08/2020 | No symptoms | East |
7 | Person F | 17/08/2020 | Confirmed | West |
8 | Person A | 18/08/2020 | Confirmed | North |
9 | Person B | 18/08/2020 | Suspect | North |
10 | Person C | 18/08/2020 | Suspect | West |
Link: https://docs.google.com/spreadsheets/d/1eP_zSoF0t37OQ4xxCf46TwxE5X0ICdl_4Je4H_OWzsA/edit?usp=sharing
The goal is to create matrix visuals like this for each region:
Behaviour note: In the example above we see that on the first day (16/08/2020) we only have employee "Person C" input in the West region, with status "No symptoms". He didn't have an update on the next day (17/08/2020), so he keeps counting as a "No symptoms". On day 18/08/2020, however, his status is changed to "Suspect".
The source table is going to have many many thousands of lines. What are your suggestions?
Thanks in advance for your time!
Solved! Go to Solution.
Here is a variation that looks at the question a bit different. Namely, it also considers what happens outside of the test range, especially before. For that an additional measure "Unknown" is added to cover all the days before the first result for each person.
A Dates table is used in disconnected mode. The measures are added to the matrix with "show on rows". That gives the added advantage that you can control the sort order, making it a bit more logical. A Totals row can be added if desired, but it's not really useful as it will be the same for all days. Below shows the example when "West" is filtered.
Here are the measures:
Unknown :=
var d = max(SymptomDates[Date])
var e = SELECTCOLUMNS(allselected(Symptoms[Employee]),"emp",Symptoms[Employee])
var s= ADDCOLUMNS(e,"FirstDate",CALCULATE(min(Symptoms[Date]),Filter(allselected(Symptoms),Symptoms[Employee]=[emp])))
var c= ADDCOLUMNS(s,"Check",if([FirstDate]>d,1,0))
return sumx(c,[Check])
No Symptoms :=
var d = max(SymptomDates[Date])
var e = SELECTCOLUMNS(allselected(Symptoms[Employee]),"emp",Symptoms[Employee])
var ld= ADDCOLUMNS(e,"LastDate",CALCULATE(max(Symptoms[Date]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]<=d && Symptoms[Status] in {"No Symptoms","Suspect","Confirmed"})))
var c= ADDCOLUMNS(ld,"Check",if(CALCULATE(max(Symptoms[Status]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]=[LastDate]))="No Symptoms",1,0))
return sumx(c,[Check])
Suspect :=
var d = max(SymptomDates[Date])
var e = SELECTCOLUMNS(allselected(Symptoms[Employee]),"emp",Symptoms[Employee])
var ld= ADDCOLUMNS(e,"LastDate",CALCULATE(max(Symptoms[Date]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]<=d && Symptoms[Status] in {"No Symptoms","Suspect","Confirmed"})))
var c= ADDCOLUMNS(ld,"Check",if(CALCULATE(max(Symptoms[Status]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]=[LastDate]))="Suspect",1,0))
return sumx(c,[Check])
Confirmed :=
var d = max(SymptomDates[Date])
var e = SELECTCOLUMNS(allselected(Symptoms[Employee]),"emp",Symptoms[Employee])
var ld= ADDCOLUMNS(e,"LastDate",CALCULATE(max(Symptoms[Date]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]<=d && Symptoms[Status] in {"No Symptoms","Suspect","Confirmed"})))
var c= ADDCOLUMNS(ld,"Check",if(CALCULATE(max(Symptoms[Status]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]=[LastDate]))="Confirmed",1,0))
return sumx(c,[Check])
yes, there's room for improvement, for sure...
Here is one way to do it.
Add a disconnected table (no relationship to other tables), by clicking on New Table on the Modeling tab and enter the expression below.
StatusValues = VALUES(Covid[Status])
Make a matrix visual with the above column on rows, your Date column on column, and this measure in Values (replace Covid with your actual table name).
Latest Status Count =
SUMX (
VALUES ( StatusValues[Status] ),
CALCULATE (
VAR thisstatus =
SELECTEDVALUE ( StatusValues[Status] )
VAR thisdate =
MIN ( Covid[Date] )
VAR summary =
ADDCOLUMNS (
ALL ( Covid[Employee] ),
"latest", CALCULATE (
LASTNONBLANKVALUE ( Covid[Date], MIN ( Covid[Status] ) ),
ALL ( Covid[Date] ),
Covid[Date] <= thisdate
)
)
RETURN
COUNTROWS ( FILTER ( summary, [latest] = thisstatus ) ) + 0
)
)
If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Here is a variation that looks at the question a bit different. Namely, it also considers what happens outside of the test range, especially before. For that an additional measure "Unknown" is added to cover all the days before the first result for each person.
A Dates table is used in disconnected mode. The measures are added to the matrix with "show on rows". That gives the added advantage that you can control the sort order, making it a bit more logical. A Totals row can be added if desired, but it's not really useful as it will be the same for all days. Below shows the example when "West" is filtered.
Here are the measures:
Unknown :=
var d = max(SymptomDates[Date])
var e = SELECTCOLUMNS(allselected(Symptoms[Employee]),"emp",Symptoms[Employee])
var s= ADDCOLUMNS(e,"FirstDate",CALCULATE(min(Symptoms[Date]),Filter(allselected(Symptoms),Symptoms[Employee]=[emp])))
var c= ADDCOLUMNS(s,"Check",if([FirstDate]>d,1,0))
return sumx(c,[Check])
No Symptoms :=
var d = max(SymptomDates[Date])
var e = SELECTCOLUMNS(allselected(Symptoms[Employee]),"emp",Symptoms[Employee])
var ld= ADDCOLUMNS(e,"LastDate",CALCULATE(max(Symptoms[Date]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]<=d && Symptoms[Status] in {"No Symptoms","Suspect","Confirmed"})))
var c= ADDCOLUMNS(ld,"Check",if(CALCULATE(max(Symptoms[Status]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]=[LastDate]))="No Symptoms",1,0))
return sumx(c,[Check])
Suspect :=
var d = max(SymptomDates[Date])
var e = SELECTCOLUMNS(allselected(Symptoms[Employee]),"emp",Symptoms[Employee])
var ld= ADDCOLUMNS(e,"LastDate",CALCULATE(max(Symptoms[Date]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]<=d && Symptoms[Status] in {"No Symptoms","Suspect","Confirmed"})))
var c= ADDCOLUMNS(ld,"Check",if(CALCULATE(max(Symptoms[Status]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]=[LastDate]))="Suspect",1,0))
return sumx(c,[Check])
Confirmed :=
var d = max(SymptomDates[Date])
var e = SELECTCOLUMNS(allselected(Symptoms[Employee]),"emp",Symptoms[Employee])
var ld= ADDCOLUMNS(e,"LastDate",CALCULATE(max(Symptoms[Date]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]<=d && Symptoms[Status] in {"No Symptoms","Suspect","Confirmed"})))
var c= ADDCOLUMNS(ld,"Check",if(CALCULATE(max(Symptoms[Status]),filter(ALLSELECTED(Symptoms),Symptoms[Employee]=[emp] && Symptoms[Date]=[LastDate]))="Confirmed",1,0))
return sumx(c,[Check])
yes, there's room for improvement, for sure...
Thank you @lbendlin and @mahoneypat for the replies.
I applied Ibedlin's solution but I don't get it right. My final matrix always shows the number of the final day for all the days, as seen below:
I've double-checked the Measure's code and the date table (disconnected). First I went for CALENDARAUTO() and then CALENDAR(FIRSTDATE(Symptoms[Date]),LASTDATE(Symptoms[Date])). Do you have a suggestion about this problem?
I uploaded the .pbix file here: https://drive.google.com/file/d/153iCRT5u80nXolMgY1jMBcXFlRoYgyIE/view?usp=sharing
Thanks all!
You used the wrong date field in the visual. It needs to be SymptomDates[Date], not Symptoms[Date].
I would also recommed this changed formula for SymptomDates
Thank you very much! Now it works.
I also marked the post before as the solution.
One approach is to fill the gaps in between measurements. Use the CROSSFILTER(,,NONE) pattern between your fact table and your dates table and a LASTNONBLANKVALUE base measure to fill them. That approach scales well.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
104 | |
101 | |
79 | |
72 | |
64 |
User | Count |
---|---|
142 | |
108 | |
101 | |
81 | |
74 |