Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Remove Specific Duplicate Row

Hi,

 

Is there a way to pick which row gets removed when you remove duplicates? I have a list of opportunities, some of which are duplicates. They can have a state of Active or Inactive. I want to remove all duplicate opportunities, but if any of the duplicates have a state of Active then I want to keep that one. It doesn't matter if there are multiple opportunities with state Active, just keep one of those if there is one. If there are duplicates and all the duplicates are Inactive, then any of those records that are kept are fine.

 

Does that make sense? Basically I want to say look at the column of opportunity IDs, if you see duplicates, check the state and if one state is Active and another is Inactive, keep the Active one.

 

If I knew how it decides which one it keeps when it removes duplicates, that would help.

 

Thanks

6 REPLIES 6
ambi95
Helper I
Helper I

Hey, did you ever get a solution to this? because I have the exact same problem statement

PauloStri
New Member

Hello, I have the same need. Could you please tell us how to do it?

The concept is simple: remove duplicates based on a condition (value from another column).

Please, it shouldn't be so complex. Could someone help us?

Thanks a lot. Best regards

v-lili6-msft
Community Support
Community Support

hi  @Anonymous 

I would suggest you do a filter before "Remove duplicates" step.

 

Regards,

Lin

Community Support Team _ Lin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
amitchandak
Super User
Super User

@Anonymous , You can use remove duplicate in Edit query/data transformation

https://radacad.com/remove-duplicate-doesnt-work-in-power-query-for-power-bi-here-is-the-solution

 

Another option is that you create a new table using summarize like

summarize(Table,Table[ID],"Status",max(Table[Status]), "Value",min(Table[Value]))

 

Add remove columns as per need.

Anonymous
Not applicable

Hi @amitchandak 

 

That article doesn't mention anything about removing select duplicates based on values in another column.

 

Summarize wouldn't work, I need to do this in Power Query because there are a lot more data transformations to do afterwards.

Could you not do it in two steps - before you remove duplicates, make some sort of calculated column that counts the instances of that row's value in the column you're doing the dupe check on, then make a conditional column that sees if the new column's more than 1 and your status is inactive, and use that to delete your inactive dupes first? Not particularly efficient, but it should work

 

edit - probably need to add in some condition to consider the situation where there's two rows and they're both inactive, but the general principle is the same

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.