Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Identify the duplicates in complex data set

Hi,

 

I am sorry, but it is me again. Your formula was great, but it brought up another few issue,s after i started using big data set.

 

As i mentioned i included index but it didn't seem to resolve the issue for THE bigger data set.

 

So i will use another example to explain:

 

 I need to identify how many duplicated activities there are raised for the same post code and street within the last 24 hours since completion date.  However, first activity after previous activity was completed wouldn't count as duplicate as it is classed as a new entry, but any other activity after would be duplicate.

 

The formula which i use don't seem to like that completion date is the same hours and seconds. But also, includes some of activities, whihc are completed at the same time but the first activity was raised later then second one.

 

Worth to mention that i sorted the data in advance editor :  #"Sorting" = Table.Sort(#"Changed Type",{{"Postcode", Order.Ascending}, {"CompletedDate", Order.Ascending},{"Street", Order.Ascending}})

 

Also,I included Index based on post code and used the below formula in Report tab and not Data  . Example data attached.

ActivityRaisedDateCompletedDateStreetPostcodeOpsRegion_Join
0123 456ABCD01/01/1980 15:1902/01/1980 12:00Little streetAB1 2CDEastern South
0123 456BCDE01/01/1980 15:2602/01/1980 12:00Little streetAB1 2CDEastern South
0123 456CABFE01/01/1980 15:5202/01/1980 12:00Little streetAB1 2CDEastern South

 

Duplicated(24hrs) =

var dupCount = CALCULATE(COUNTROWS(Table1),FILTER(Table1,Table1[Postcode] = EARLIER(Table1[Postcode])&& Table1[Street]=EARLIER(Table1[Street])&& Table1[CompletedDate]>=EARLIEr(Table1[CompletedDate])-1&&Table1[CompletedDate]=Table1[CompletedDate] && Table1[CompletedDate]<=EARLIER(Table1[CompletedDate])&& Table1[Index PostCode]=Table1[Index PostCode] &&Table1[Index PostCode]<=EARLIER(Table1[Index PostCode])))

var lastraiseddate = CALCULATE(MIN(Table1[RaisedDate]),FILTER(Table1,Table1[Postcode]=EARLIER(Table1[Postcode])&& Table1[Street]=EARLIER(Table1[Street])&& Table1[Index PostCode]=EARLIER(Table1[Index PostCode])-1))

RETURN IF(Table1[RaisedDate]<=Table1[CompletedDate]&& Table1[RaisedDate]>=lastraiseddate&& dupCount>1,"Y","N")

 

Thank you so much for your help in advance.

3 REPLIES 3
Anonymous
Not applicable

Hi Lydia,

 

I included Index based on post code and used the below formula in Report tab.

 

Index PostCode =
RANKX ( Table1, Table1[Postcode],, ASC, DENSE )
 
I am not allowed to share an actual data, but i will make some exaples and will post it asap.
 
Thank you
Kind rEgards,
Dovile
v-yuezhe-msft
Employee
Employee

 
Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
v-yuezhe-msft
Employee
Employee

@Anonymous ,

How do you add index in your table? Could you please share complete data of this table and post expected result based on the sample data here?

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.