dyabes
Helper I

Remove duplicate and keep row based on a column with the lowest value

I'm trying to remove duplicates on the ID column but would like to keep the row with the lowest date value, that is, the earliest instance of each ID. I tried the Sort -> Table.Buffer -> Remove Duplicates approach, but that process is incredibly slow, most likely because the dataset is an append of multiple large CSV files.

Are there other approaches that are more efficient?

[Screenshot: 2018-12-09_9-09-24.png]

Thanks in advance!

David

2 ACCEPTED SOLUTIONS
AlexisOlson
Super User

I'd suggest doing a Group By in the query editor. Group by ID and use Min as the aggregation type for the Date column.
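In M terms, the Group By step suggested above might look roughly like the sketch below. The sample table and the step names are hypothetical stand-ins for the appended CSV data; only the `ID` and `Date` column names come from the question.

```powerquery
let
    // Hypothetical sample data standing in for the appended CSV files
    Source = #table(
        type table [ID = Int64.Type, Date = date],
        {
            {1, #date(2018, 3, 1)},
            {1, #date(2018, 1, 15)},
            {2, #date(2018, 2, 10)}
        }
    ),
    // One row per ID, keeping only the earliest Date for that ID
    Grouped = Table.Group(
        Source,
        {"ID"},
        {{"Date", each List.Min([Date]), type date}}
    )
in
    Grouped
```

Note that this returns only the grouping column and the aggregated column; any other columns in the source table are dropped by the grouping.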


Hi @dyabes,

I've attached a sample file for your reference.

Regards,

Cherie

Community Support Team _ Cherie Chen

6 REPLIES
Anonymous

Hi Alexis,

I am facing a similar issue. I have three columns: Supplier Name, Status, and Points. Supplier Name contains duplicate values, while Status and Points are unique. I need to display each supplier with its lowest Points value and the corresponding Status. Basically, I need to remove duplicate suppliers, keeping the record with the lowest points.

Regards,

Arun
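For the supplier case described here, one possible approach is to group by supplier and keep the whole lowest-Points row rather than just the minimum value. The sketch below assumes the three column names given in the post; the sample rows, the step names, and the intermediate `LowestRow` column are hypothetical.

```powerquery
let
    // Hypothetical sample data using the column names from the post
    Source = #table(
        type table [#"Supplier Name" = text, Status = text, Points = Int64.Type],
        {
            {"Contoso", "Active", 40},
            {"Contoso", "On hold", 25},
            {"Fabrikam", "Active", 30}
        }
    ),
    // For each supplier, Table.Min returns the row with the smallest Points as a record
    Grouped = Table.Group(
        Source,
        {"Supplier Name"},
        {{"LowestRow", each Table.Min(_, "Points"), type record}}
    ),
    // Expand the record back into regular columns
    Expanded = Table.ExpandRecordColumn(Grouped, "LowestRow", {"Status", "Points"})
in
    Expanded
```

With this sample data, Contoso keeps its "On hold" row with 25 points and Fabrikam keeps its only row. The expanded columns lose their data types, so a Changed Type step may be needed afterwards.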

Thank you. I think I should have provided the complete dataset I'm working on. I also need to keep the corresponding row values from the other columns.

[Screenshot: 2018-12-09_10-36-51.png]

Anonymous

Hi,

You need to use Group By.

Select Group By, choose the ID column as the grouping column, then select Min as the Operation and Date as the Column.
You will get your result.

You can do what I suggested and then merge the extra column(s) back in afterwards.
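A minimal sketch of that group-then-merge idea, assuming the merge is an inner join on both ID and the minimum Date. The `Amount` column, the sample rows, and the step names are hypothetical; the real extra columns would be listed in the expand step instead.

```powerquery
let
    // Hypothetical sample data; Amount stands in for the extra columns
    Source = #table(
        type table [ID = Int64.Type, Date = date, Amount = number],
        {
            {1, #date(2018, 3, 1), 10},
            {1, #date(2018, 1, 15), 20},
            {2, #date(2018, 2, 10), 30}
        }
    ),
    // Step 1: earliest Date per ID
    Earliest = Table.Group(
        Source,
        {"ID"},
        {{"Date", each List.Min([Date]), type date}}
    ),
    // Step 2: join back to the original rows on ID and Date
    Merged = Table.NestedJoin(
        Earliest, {"ID", "Date"},
        Source, {"ID", "Date"},
        "Rows", JoinKind.Inner
    ),
    // Step 3: bring the extra column(s) back in
    Expanded = Table.ExpandTableColumn(Merged, "Rows", {"Amount"})
in
    Expanded
```

If an ID has several rows sharing the same earliest date, the join returns all of them, so a final Remove Duplicates on ID may still be needed in that case.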
