Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anno2019
Helper IV
Helper IV

How to TAG duplicates using a specific criteria and rules using query editor

Hi Guru's

I received assistance from @v-lili6-msft 

The link to the solution is below, works well but I need to know if this is possible in Query Editor.

https://community.powerbi.com/t5/Desktop/How-to-TAG-duplicates-using-a-specific-criteria-and-rules/t...

 

The problem I am finding is that I need to delete the "ignore" rows before loading because i have too much data to process, it would help if I could run this before loading.

 

The formula for the calculated column is as follows:  A link to the workbook can be found in the original post, link above.

Primary File Number line TAG = var countsales=CALCULATE(COUNTA('Table'[Account Owner Role]),FILTER('Table','Table'[File Number]=EARLIER('Table'[File Number])&&'Table'[Account Owner Role]="Sales")) return
var highestinvoice=CALCULATE(MAX('Table'[Invoice Amount]),FILTER('Table','Table'[File Number]=EARLIER('Table'[File Number])&&'Table'[Account Owner Role]="Sales")) return
IF(countsales<>BLANK(),IF('Table'[Invoice Amount]=highestinvoice&&'Table'[Account Owner Role]="Sales","Primary","Ignore"),IF('Table'[Invoice Amount]=CALCULATE(MAX('Table'[Invoice Amount]),FILTER('Table','Table'[File Number]=EARLIER('Table'[File Number]))),"Primary","Ignore"))

 

1 ACCEPTED SOLUTION

Hi @Anno2019 

 

Please see attached file.
I used a different apporach and it appears quite fast with the sample data of 40k+ row

Just change the source of data to your xlsx file and see the last 4 steps

 

 

PQryee.png


Regards
Zubair

Please try my custom visuals

View solution in original post

12 REPLIES 12
Zubair_Muhammad
Community Champion
Community Champion

@Anno2019 

 

Attaching the sample file as well with above formula


Regards
Zubair

Please try my custom visuals

@Zubair_Muhammad 

Thank you so much for your assistance, it works well with the example data :-).

Need to pick your brain further please.

I had removed the column Invoice Number from my data, now it fails and I am not sure how to fix because the formula you assisted with uses this column [Invoice Number].  I tried replacing it with other column data but failed.

Also, the [file number] column is also failing as my data actually contains file numbers that have text in it aswell.  Not sure why this makes a difference.  Can you help? 

@Anno2019 

 

Are you performing other Power Query transformations in your Dataset as well?

If so.... #"Changed Type" should be replaced by the last step's name in your Query Editor before adding the custom column

 

=let 
T1=Table.SelectRows(#"Changed Type",
(x)=>x[File Number]=[File Number] 
and x[Account Owner Role]="Sales"),
T2=Table.SelectRows(#"Changed Type",
(x)=>x[File Number]=[File Number]),
mytable=if Table.IsEmpty(T1) then T2 else T1
in
if [Invoice Number]=
Table.Max(mytable,"Invoice Amount")[Invoice Number]
then
"Primary"
else
"Ignore"

Pqry.png


Regards
Zubair

Please try my custom visuals

@Zubair_Muhammad 

Yes, I am, it is the first thing I did.  But having trouble with the fact that I had to removed the column called [Invoice Number], I tried replacing the formula with other column names but have failed.  not sure which one to replace it with.

also, would it matter if my column for [file number] does not only contain numbers?  just making sure...

@Anno2019 

 

You can also use Invoice Amount or any other unique field ( highlighted in red font below)

File Number's data type shouldn't matter

 

let 
T1=Table.SelectRows(#"Changed Type",
(x)=>x[File Number]=[File Number] 
and x[Account Owner Role]="Sales"),
T2=Table.SelectRows(#"Changed Type",
(x)=>x[File Number]=[File Number]),
mytable=if Table.IsEmpty(T1) then T2 else T1
in
if [Invoice Amount]=
Table.Max(mytable,"Invoice Amount")[Invoice Amount]
then
"Primary"
else
"Ignore"

 


Regards
Zubair

Please try my custom visuals

@Zubair_Muhammad, thanks again... I am running it now but for some reason it is taking forever to process 🙂

my file is around 9MB large with 44000 rows, not sure why it is processing over 3GB of data and it is still going 🙂  Will let you know tomorrow if I picked up any errors.

Any idea on why it is taking this long and processes this much data?

 

file size processing.JPG

Hi
Could you share your file?
I will try few other tricks to get the optimized solution.

Regards
Zubair

Please try my custom visuals

Hi @Anno2019 

 

Please see attached file.
I used a different apporach and it appears quite fast with the sample data of 40k+ row

Just change the source of data to your xlsx file and see the last 4 steps

 

 

PQryee.png


Regards
Zubair

Please try my custom visuals

@Zubair_Muhammad 

 

Three words for you!!  LIKE A BOSS!!!

Works perfectly, I will have to test with a larger size file as I expect my final data to have over 400000 rows before de-duplicating 🙂

Wish me luck.  And thanks again.

@Zubair_Muhammad  Sure, but will need time to change some of the names etc etc for security purposes.  Will upload once completed.

Zubair_Muhammad
Community Champion
Community Champion

@Anno2019 

 

Try this custom column formula

 

It works with sample data

 

=let 
T1=Table.SelectRows(#"Changed Type",
(x)=>x[File Number]=[File Number] 
and x[Account Owner Role]="Sales"),
T2=Table.SelectRows(#"Changed Type",
(x)=>x[File Number]=[File Number]),
mytable=if Table.IsEmpty(T1) then T2 else T1
in
if [Invoice Number]=
Table.Max(mytable,"Invoice Amount")[Invoice Number]
then
"Primary"
else
"Ignore"

Regards
Zubair

Please try my custom visuals

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.