Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
jjmauras
Advocate I
Advocate I

Remove or Hide Duplicate Values

Is there a way to Remove or Ignore duplicate values when they occur in a column?

 

I'm pulling data from our SQL Server. I have 2 related tables. Every once in a while a duplicate value will occur in my One to Many relationship which causes failure. I'm looking for a way to remove/ignore a duplicate value so I can avoid this. Thanks.

1 ACCEPTED SOLUTION
Greg_Deckler
Super User
Super User

There is a Remove Duplicates step when editing your query, it is in the Reduce Rows section of the ribbon, the M code is:

#"Removed Duplicates" = Table.Distinct(#"[Previous Step]", {"[Column Name]"})


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

12 REPLIES 12
Anonymous
Not applicable

I am having the same issue where I am pulling information from a SQL table but can not force the query to remove duplicates based on newest or most recent. I am attempting to remove duplicates out of an account number column, and wanting to keep only the most recent record for each account number. 

 

I connect to the SQL table. 

Edit Queires,

Sort the Table by Entry Date

Then remove duplicates from Account Number

 

When i check the output, I am still seeing records that are not the most recent for specific account numbers. 

 

I was told it is because itr was not sorted at the source. I do not have access to the source to change the sort order. 

 

What can I add in the language to force the sort in the imported data?

 

let
Source = Sql.Databases("123esco.cloudapp.net"),
123_Esco = Source{[Name="123_Esco"]}[Data],
dbo_Meters = CLOUDDATA{[Schema="dbo",Item="Meters"]}[Data],
#"Sorted Rows" = Table.Sort(dbo_Meters,{{"EntryDate", Order.Descending}}),
#"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"UtilityAccountNumber"})
in
#"Removed Duplicates"

 

@Anonymous, I ran into this issue in Power Query as well. I followed the problem in this article by Ken Puls, and it fixed the problem on the first shot.

Anonymous
Not applicable

Thanks I will take a look

 

@Anonymous If you have access to query the table, you could write a query to only retrieve the latest records. Here are a number of solutions that can be used as a guide for you if you aren't very familiar with SQL. You can write the SQL Query in the Advanced Options when you are connecting to the database.


Looking for more Power BI tips, tricks & tools? Check out PowerBI.tips the site I co-own with Mike Carlo. Also, if you are near SE WI? Join our PUG Milwaukee Brew City PUG
Greg_Deckler
Super User
Super User

There is a Remove Duplicates step when editing your query, it is in the Reduce Rows section of the ribbon, the M code is:

#"Removed Duplicates" = Table.Distinct(#"[Previous Step]", {"[Column Name]"})


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Using Table.Distinct, which rows are kept (not removed)?

 

For instance, if I try to remove duplicates based on a "user_id" column, the first row with each user_id will be kept, and the following rows with that user_id will be removed? (wondering if I need to sort before to make sure I keep the rows I want to keep)

I have done this on a table but then when I try to join to another table on that field I get the error that I can't create the relationship as one of the columns must have unique values. Isn't applying that step getting me to where it's unique or am I missing a step?

Thanks everyone. @Greg_Deckler I think I'm going to go with this solution for the moment as it worked right away. My question is will it stick? For example is it a one time thing or will it remove duplicates every time the SQL data is refreshed?

Yes, query steps stick. When you run a query, it is really just executing a series of steps. You can see the steps recorded on the righthand side. of the query editing window.

 

Handling this in the source is great, but not always possible. As an end-user focused tool, it is not reasonable to expect that users will have access to the backend to make changes, write stored procedures, expert knowledge to write SQL, etc.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

@jjmauras Best practice would be to handle this at the source. Per @Seth_C_Bauer solution.

Business Intelligence Architect / Consultant

@jjmauras You can handle this in SQL with a CTE or subquery using the row_number() function. This will handle everything automatically.

 


Looking for more Power BI tips, tricks & tools? Check out PowerBI.tips the site I co-own with Mike Carlo. Also, if you are near SE WI? Join our PUG Milwaukee Brew City PUG
Fabiola_K
Advocate IV
Advocate IV

I'm not an expert, but i've done in these cases is to create a third table with the
distinct values, and use it as a bridge between both tables.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.