Anonymous

Need help with duplicated data

Hi everyone, I've just started using Power BI and have seen some of its capabilities. However, I've run into a problem with structuring my data.

 

I'd like Power BI to count the number of specific veterinary treatments per week per county. My raw data comes from public records, so I cannot keep restructuring it. I've made a simple graph that counts the treatments, but because of how the data is structured, treatments are counted several times.

 

In my dataset, each treatment often has several entries (one per week), as treatments usually run over several weeks. Normally, just removing duplicates would solve this. However, sites can treat multiple times a year, so that option would also give me the wrong count.

 

So, is there a way to get Power BI to ignore treatments with fewer than 5 weeks between them? I've attached a small sample from my sheet below.

 

https://ibb.co/k5pcQmg

 

1 ACCEPTED SOLUTION

@Anonymous

OK, I refined what we had yesterday a bit, with no real changes. Following the logic explained earlier, we use three calculated columns to build an ID per treatment, which you can then use as you please. See below. Here's your .pbix modified, with the three new columns. Does this solve your issue?

 

1. First column

 

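Table1[TreatmentChange] = 
-- Flags the first row of each treatment with 1; every other row gets 0.
VAR MaxWeekGap = 2 -- largest gap (in weeks) that still counts as the same treatment
VAR CurrentWeek = Table1[Uke]
-- Latest earlier week reported by the same site (Lokalitetsnummer)
VAR PreviousWeek =
    CALCULATE (
        MAX ( Table1[Uke] );
        Table1[Uke] < CurrentWeek;
        ALLEXCEPT ( Table1; Table1[Lokalitetsnummer] )
    )
RETURN
    IF (
        ISBLANK ( PreviousWeek );
        1; -- first report for this site
        IF (
            ( CurrentWeek - PreviousWeek ) > MaxWeekGap;
            1; -- gap too long: a new treatment starts
            0
        )
    )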

2. Second column. The treatment number is determined as you explained: for a specific farm (Lokalitetsnummer), we start with treatment number 1 and move on to treatment number 2 only when more than two weeks have elapsed since the site last reported.

 

Table1[TreatmentNumber] = 
-- Running total of TreatmentChange per site, up to and including the current week
VAR CurrentWeek = Table1[Uke]
RETURN
    CALCULATE (
        SUM ( Table1[TreatmentChange] );
        Table1[Uke] <= CurrentWeek;
        ALLEXCEPT ( Table1; Table1[Lokalitetsnummer] )
    )

3. Third column, with the treatment ID. It is a concatenation of Lokalitetsnummer and the treatment number. For example, the site with Lokalitetsnummer 11116 has 11116_1, 11116_2 and 11116_3: three different treatments.

 

Table1[TreatmentID] = Table1[Lokalitetsnummer] & "_" & Table1[TreatmentNumber] 
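
With the three columns in place, counting treatments should come down to counting distinct IDs. Below is a minimal sketch of two measures that could do this. The measure names (Treatment Count, Treatments Started) are illustrative and not part of the modified .pbix; only Table1 and its columns come from the columns defined above. The second measure assumes each site has at most one row per week, so that exactly one row per treatment carries TreatmentChange = 1.

```dax
Treatment Count =
    -- Hypothetical measure name. Each treatment is counted once,
    -- however many weekly rows it spans in the current filter context.
    DISTINCTCOUNT ( Table1[TreatmentID] )

Treatments Started =
    -- Hypothetical measure name. Counts a treatment only in the week it starts,
    -- i.e. on the row where TreatmentChange = 1.
    -- Assumption: at most one row per site per week.
    SUM ( Table1[TreatmentChange] )
```

With Uke on the axis and your county column as a legend or slicer, Treatment Count would show the treatments reporting in each week, while Treatments Started would show only the new ones. Over a longer period, both should give the same total as long as the one-row-per-week assumption holds.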

 
