Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello,
I have a dataset where there are multiple patients and each patient has multiple values (several lab results ) . The report needs to have the ability to filter/slice on the diagnosis of the patient such as (Diabetes 1, Diabetes 2). Each patient can have 1 or more diagnoses.
We need to show the average lab result per patient/patients for the diagnoses selected but we don't want to incorrectly reflect a multiplied value of the lab results based on how many diagnoses exist.
Example: (patient abc ghi has 2 diagnoses so their lab events are duplicated, 1 for each diagnosis).
labeventId Patient Lab result Diagnosis
1 abc 1.6 Diabetes 1
1 abc 1.6 Diabetes 2
2 abc 5.0 Diabetes 1
2 abc 5.0 Diabetes 2
3 def 4 Diabetes 1
4 ghi 2.0 Diabetes 1
4 ghi 2,0 Diabetes 2
Using the above would result in counting a single labevent twice (1.6, 1.6). We want to count each event once only no matter how many diagnoses there are.
What is the best way to model this:
a. Include the diagnosis in the lab result dataset
b. have a separate table with the patientid and diagnoses and join to the labevent based on patient.
In either one how can we ensure the actual lab results don't get duplicated/inflated based on the number of diagnoses?
Solved! Go to Solution.
I resolved this by including the diagnosis in the main result data. Even though there are multiple records per lab event if there is more than one diagnosis to a patient, the measures are not duplicating their values.
I believe the true correct model would be to have a separate table for patients and a separate table with patientId, diagnosis. Then to relate Patient > PatientDiagnosis and relate "labEvent" to Patient. However, when I did this, regardless of the crossjoin and cardinality configuration of the joins/relationships, the measures were not calculating correctly so I had to resort to including the diagnosis in the main labEvent table and not use relationships.
Hi @CB,
You may create a calculated table with the formula below.
Table = SUMMARIZE ( 'Table1'; 'Table1'[Labeventid]; 'Table1'[Patient]; "result"; CALCULATE ( MAX ( Table1[Lab result] ); FILTER ( ALL ( Table1 ); 'Table1'[Labeventid] = MAX ( 'Table1'[Labeventid] ) && 'Table1'[Patient] = MAX ( 'Table1'[Patient] ) ) ) )
Then you will get the out put below.
For more details, you could have a reference of the test file attached below.
Best Regards,
Cherry
Hello,
Thank you for your input (apologies for the delay in getting back to this). The calculated table would not suffice as all my slicers still need to be taken into account. Is there a way I can do something like a subquery that says "pull all the patients that have a diagnosis in ( table that holds patientid , diagnosis) " rather than say join the lab event data to the diagnosis data which would pull multiple records of each labevent if there were multiple diagnoses?
I resolved this by including the diagnosis in the main result data. Even though there are multiple records per lab event if there is more than one diagnosis to a patient, the measures are not duplicating their values.
I believe the true correct model would be to have a separate table for patients and a separate table with patientId, diagnosis. Then to relate Patient > PatientDiagnosis and relate "labEvent" to Patient. However, when I did this, regardless of the crossjoin and cardinality configuration of the joins/relationships, the measures were not calculating correctly so I had to resort to including the diagnosis in the main labEvent table and not use relationships.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
112 | |
100 | |
80 | |
64 | |
57 |
User | Count |
---|---|
146 | |
110 | |
93 | |
84 | |
67 |