cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
ivyhai Member
Member

Identify duplicates and show up records

I've got contact table with columns: names, emails. 

 

I want to identify duplicates in emails and show up their names. 

for example, show me the table like this.   any ideas, thanks. CRM.PNG

 

1 ACCEPTED SOLUTION

Accepted Solutions
Super User
Super User

Re: Identify duplicates and show up records

You could create a calculated column with the following formula:

Count Emails = 
Var Emails = [Emails]
RETURN

CALCULATE(
    COUNTROWS('Your Table'),
    all('Your Table'),
    'Your Table'[Emails] = Emails
)

From here, you could simply look for any rows where the Count Emails is greater than 1.


   

              Did I answer your question? Mark my post as a solution!
       

Proud to be a Datanaut!


   


13 REPLIES 13
Super User
Super User

Re: Identify duplicates and show up records

You could create a calculated column with the following formula:

Count Emails = 
Var Emails = [Emails]
RETURN

CALCULATE(
    COUNTROWS('Your Table'),
    all('Your Table'),
    'Your Table'[Emails] = Emails
)

From here, you could simply look for any rows where the Count Emails is greater than 1.


   

              Did I answer your question? Mark my post as a solution!
       

Proud to be a Datanaut!


   


ivyhai Member
Member

Re: Identify duplicates and show up records

@Ross73312 not toally understand this DAX, but it's toally work!!!.  many thanks!!!

Super User
Super User

Re: Identify duplicates and show up records

Here is a quick explaination for your learning:


This formula is run on a row by row basis, so keepthat in mind.

Var Emails = [Emails]
RETURN

This part takes the answer on the current row in the field [Emails] and places it in memory to be used later on (called a variable).

CALCULATE() is used to run a statement, but with extra conditions

COUNTROWS('Your Table') is going to get us the count we care about, but since we are doing this calculation row by row, there is only 1 row that could be counted

 

all('Your Table') is used to tell the CALCULATE to look at every row in the table, instead of just the current row.  Of course we don't actually want every row, or else we'd get the same number

'Your Table'[Emails] = Emails  is a filter that is going to be applied to the all('Your Table').  This will filter the ALL down until it only has in it rows where the [Email] field contains the value we put into that variable in the first step.


   

              Did I answer your question? Mark my post as a solution!
       

Proud to be a Datanaut!


   


Wiz Frequent Visitor
Frequent Visitor

Re: Identify duplicates and show up records

@Ross73312 - I'm new to PowerBI. I'm curious to know if the order of the other functions matter when using the CALCULATE function.

Super User
Super User

Re: Identify duplicates and show up records

Good question! The documentation doesn't seem to specify, however I always assume that it do the worst case scenario of whatever i'm attempting.  So i write statements as best as possible, then test if I get the desired result.


   

              Did I answer your question? Mark my post as a solution!
       

Proud to be a Datanaut!


   


edday Frequent Visitor
Frequent Visitor

Re: Identify duplicates and show up records



@Ross73312

 

Hi Ross. This code works for me as well but I am a bit confused:

 

Are you able to just use:

 

Count Emails = 

CALCULATE(
    COUNTROWS('Your Table'),
    all('Your Table'),
    'Your Table'[Emails]
)

 (ie. exclude the VAR function)? Correct me if I am wrong, but doesn't that just define the variable 'Invoice'? Shouldn't the code work the same without it (albeit less efficiently). I tried excluding that part of the coding and it didn't work (comes up with an error saying "Cannot convert value ............. of type Text to type True/False".

 

Thanks so much in advance!

 

edday.

 

 

Super User
Super User

Re: Identify duplicates and show up records

I can't think if i've ever attempted to write this query in the method you have described, please try and let me know.  The reasons for my structure (based on my attempts and other similar languages i've used):

Firstly, Calculate throws a hissy fit over certain peices of information being used in the constraits.  The variable gets around this problem. I often do this as a goto as i know it will always work.

Secondly, when you need to do something like 'Your Table'[Emails] = 'Your Table'[Emails].  Where what you are trying to do is say 'Only return lines where the Email address field matches the current email', this doesn't work in DAX as i've written it.  In your example, if DAX takes your line 'Your Table'[Emails] to mean that, i'd love to know that it does it that way.  I'm on vacation so i can't really just load it up and test it right now.  I still prefer my method for  the code readability as i'm explicit in exactly what i'm doing.  This is often important when i come back to old projects later down the line, however if your method actually works it would be a neat DAX trick to know about.


   

              Did I answer your question? Mark my post as a solution!
       

Proud to be a Datanaut!


   


edday Frequent Visitor
Frequent Visitor

Re: Identify duplicates and show up records

@Ross73312

 

Thanks for the clarification Ross. I have been unsuccessful in using your code without the Var function. Specifically Power BI comes up with the error "Cannot convert value …………. of type Text to type True/False". I tried directly converting the column of data into the True/False data type through the query editor, but was still unsuccessful. 

 

At this point I am not sure if your code without the Var would work or not, but your explanation makes sense. I'd say Power BI is throwing a hissy fit indeed.

 

Thanks again. 

gtamir Member
Member

Re: Identify duplicates and show up records

Why it doesn't work for me? 

Count LAK_NUM =
VAR LAK_NUM = [LAK_NUM] RETURN
CALCULATE(
COUNTROWS('Customers'),
all('Customers'),
'Customers'[LAK_NUM]= LAK_NUM
)

 

I get an error: The value for 'LAK_NUM' cannot be determined. Either 'LAK_NUM' doesn't exist, or there is no current row for a column named 'LAK_NUM'.

I was not able to insert an image to the post.