Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Jase71ds
Advocate I
Advocate I

PBI PowerQuery: Don't remove the duplicates, SHOW me the Duplicates!

In PBI's PQ, I have a table of ~30K rows. I know there are duplicates because I see the row count drop by ~200 when I "Remove Duplicates".

Here's my problem -  I need a query/table to show me what those dups are. (IOW, don't remove dups, rather remove everything that is NOT a duplicate). This is not hard in Excel, but so far, I can't figure out a PQ solution (and I don't know M code - though I'm happy to apply it to the script editor if someone gives me some).

Thanks for any help.

Jase.

1 ACCEPTED SOLUTION
collinq
Super User
Super User

Hi @Jase71ds ,

 

If you are wanting to keep the original query as is the easiest way to get this information (in my thinking) is to Reference that original query.  In the reference, just do a Group By and group on the field that you are using to find duplicates and use the "count" .  That will give you a count of every row in the group.  Then, filter on everything greater than one on the count column.

 

I would appreciate Kudos if my response was helpful. I would also appreciate it if you would Mark this As a Solution if it solved the problem. Thanks!



Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!
Private message me for consulting or training needs.




View solution in original post

4 REPLIES 4
collinq
Super User
Super User

Hi @Jase71ds ,

 

If you are wanting to keep the original query as is the easiest way to get this information (in my thinking) is to Reference that original query.  In the reference, just do a Group By and group on the field that you are using to find duplicates and use the "count" .  That will give you a count of every row in the group.  Then, filter on everything greater than one on the count column.

 

I would appreciate Kudos if my response was helpful. I would also appreciate it if you would Mark this As a Solution if it solved the problem. Thanks!



Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!
Private message me for consulting or training needs.




Wow!

That worked amazingly well. So crazy simple.

Here's what I did:

1. Group By...

2. Selected the column that had some duplicates

3. Operation: Count Distinct Rows

-----------------

The output column then generated 1's and 2's. I filtered for the 2's, and there it was, all the rows that had contained duplicates!

Now I know what info to send to our data guy and ask him to clean up.

Thanks so much. Such an easy solution!!

Jase.

Hey @Jase71ds ,

 

I am glad it worked!




Did I answer your question? Mark my post as a solution!

Proud to be a Datanaut!
Private message me for consulting or training needs.




Hmmm... I have never used this feature. Looking at it now.

I  might need to spend a few minutes on YT.

I'll get back with you.

Thanks!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.