Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
CB
Frequent Visitor

Best Modeling - Joining to multiple records without duplicating/multiplying the values

Hello,

I have a dataset where there are multiple patients and each patient has multiple values (several lab results ) . The report needs to have the ability to filter/slice on the diagnosis of the patient such as (Diabetes 1, Diabetes 2). Each patient can have 1 or more diagnoses. 

 

We need to show the average lab result per patient/patients for the diagnoses selected but we don't want to incorrectly reflect a multiplied value of the lab results based on how many diagnoses exist. 

 

Example: (patient abc ghi has 2 diagnoses so their lab events are duplicated, 1 for each diagnosis). 
labeventId  Patient      Lab result    Diagnosis

            1       abc            1.6              Diabetes 1

            1       abc            1.6              Diabetes 2

            2      abc             5.0              Diabetes 1
            2      abc             5.0              Diabetes 2

            3      def              4                 Diabetes 1

            4      ghi              2.0              Diabetes 1

            4      ghi              2,0              Diabetes 2

 

Using the above would result in counting a single labevent twice (1.6, 1.6). We want to count each event once only no matter how many diagnoses there are. 

 

What is the best way to model this:

a. Include the diagnosis in the lab result dataset 

b. have a separate table with the patientid and diagnoses and join to the labevent based on patient. 

 

In either one how can we ensure the actual lab results don't get duplicated/inflated based on the number of diagnoses? 

 

 

1 ACCEPTED SOLUTION
CB
Frequent Visitor

I resolved this by including the diagnosis in the main result data. Even though there are multiple records per lab event if there is more than one diagnosis to a patient, the measures are not duplicating their values. 

 

I believe the true correct model would be to have a separate table for patients and a separate table with patientId, diagnosis. Then to relate Patient > PatientDiagnosis and relate "labEvent" to Patient. However, when I did this, regardless of the crossjoin and cardinality configuration of the joins/relationships, the measures were not calculating correctly so I had to resort to including the diagnosis in the main labEvent table and not use relationships. 

View solution in original post

3 REPLIES 3
v-piga-msft
Resident Rockstar
Resident Rockstar

Hi @CB,

 

You may create a calculated table with the formula below.

 

Table =
SUMMARIZE (
    'Table1';
    'Table1'[Labeventid];
    'Table1'[Patient];
    "result"; CALCULATE (
        MAX ( Table1[Lab result] );
        FILTER (
            ALL ( Table1 );
            'Table1'[Labeventid] = MAX ( 'Table1'[Labeventid] )
                && 'Table1'[Patient] = MAX ( 'Table1'[Patient] )
        )
    )
)

Then you will get the out put below.

 

Capture.PNG

 

For more details, you could have a reference of the test file attached below.

 

Best Regards,

Cherry

Community Support Team _ Cherry Gao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hello,

Thank you for your input (apologies for the delay in getting back to this). The calculated table would not suffice as all my slicers still need to be taken into account. Is there a way I can do something like a subquery that says "pull all the patients that have a diagnosis in ( table that holds patientid , diagnosis) "  rather than say join the lab event data to the diagnosis data which would pull multiple records of each labevent if there were multiple diagnoses? 

CB
Frequent Visitor

I resolved this by including the diagnosis in the main result data. Even though there are multiple records per lab event if there is more than one diagnosis to a patient, the measures are not duplicating their values. 

 

I believe the true correct model would be to have a separate table for patients and a separate table with patientId, diagnosis. Then to relate Patient > PatientDiagnosis and relate "labEvent" to Patient. However, when I did this, regardless of the crossjoin and cardinality configuration of the joins/relationships, the measures were not calculating correctly so I had to resort to including the diagnosis in the main labEvent table and not use relationships. 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.