Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply

Seperate duplicate and unique

Hello Community,
I am trying to get the unique id and duplicate it from the below dataset. Can anyone have some suggestions? 
I have tried with column from example in Pow
community_1.jpg

1 ACCEPTED SOLUTION
v-yangliu-msft
Community Support
Community Support

Hi  @tejapowerbi123 ,

Here are the steps you can follow:

1. Create calculated column.

ID1 = MID('Dataset'[ID],4,6)
ID2 =
 MID('Dataset'[ID],22,6)
IF1 =
var _select=SELECTCOLUMNS('Dataset',"D1",[ID1])
return
IF(
    'Dataset'[ID2] in _select,1,0)
IF2 =
var _select=SELECTCOLUMNS('Dataset',"D2",[ID2])
return
IF(
    'Dataset'[ID1] in _select,1,0)
Unique =
SWITCH(
    TRUE(),
    'Dataset'[IF1]=0&&'Dataset'[IF2]=1,'Dataset'[ID2],
    'Dataset'[IF1]=1&&'Dataset'[IF2]=0,'Dataset'[ID1],
    'Dataset'[IF1]=0&&'Dataset'[IF2]=0,'Dataset'[ID1]
)
Duplicate =
SWITCH(
    TRUE(),
    'Dataset'[IF1]=0&&'Dataset'[IF2]=1,'Dataset'[ID1],
    'Dataset'[IF1]=1&&'Dataset'[IF2]=0,'Dataset'[ID2],
    'Dataset'[IF1]=0&&'Dataset'[IF2]=0,'Dataset'[ID2]   
)

vyangliumsft_0-1654738025497.png

2. Create calculated table.

Final Result =
SUMMARIZE('Dataset','Dataset'[ID],'Dataset'[Unique],'Dataset'[Duplicate])

3. Result:

vyangliumsft_1-1654738025498.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

View solution in original post

10 REPLIES 10
v-yangliu-msft
Community Support
Community Support

Hi  @tejapowerbi123 ,

Here are the steps you can follow:

1. Create calculated column.

ID1 = MID('Dataset'[ID],4,6)
ID2 =
 MID('Dataset'[ID],22,6)
IF1 =
var _select=SELECTCOLUMNS('Dataset',"D1",[ID1])
return
IF(
    'Dataset'[ID2] in _select,1,0)
IF2 =
var _select=SELECTCOLUMNS('Dataset',"D2",[ID2])
return
IF(
    'Dataset'[ID1] in _select,1,0)
Unique =
SWITCH(
    TRUE(),
    'Dataset'[IF1]=0&&'Dataset'[IF2]=1,'Dataset'[ID2],
    'Dataset'[IF1]=1&&'Dataset'[IF2]=0,'Dataset'[ID1],
    'Dataset'[IF1]=0&&'Dataset'[IF2]=0,'Dataset'[ID1]
)
Duplicate =
SWITCH(
    TRUE(),
    'Dataset'[IF1]=0&&'Dataset'[IF2]=1,'Dataset'[ID1],
    'Dataset'[IF1]=1&&'Dataset'[IF2]=0,'Dataset'[ID2],
    'Dataset'[IF1]=0&&'Dataset'[IF2]=0,'Dataset'[ID2]   
)

vyangliumsft_0-1654738025497.png

2. Create calculated table.

Final Result =
SUMMARIZE('Dataset','Dataset'[ID],'Dataset'[Unique],'Dataset'[Duplicate])

3. Result:

vyangliumsft_1-1654738025498.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

@v-yangliu-msft Thank you so much 🙂😀

lbendlin
Super User
Super User

Do you need this in DAX or can it be a Power Query solution?  How big is your actual table?

@lbendlin 
Anything works for me dax or power query.My original source is in Azure SQL and size of the table is approximate 12,000 Rows and 6 columns.

provide some more sample data, in usable format please. Not as screenshot.

@lbendlin 
Here is the dataset and result for your reference,

Dataset
ID
id:383249,related id:386244
id:387739,related id:383249
id:387692,related id:383249
id:392116,related id:386244
id:387739,related id:392116
id:395555,related id:395556

 

Final Result
IDUniqueDuplicate
id:383249,related id:386244386244383249
id:387739,related id:383249387739383249
id:387692,related id:383249387692383249
id:392116,related id:386244386244392116
id:387739,related id:392116387739392116
id:395555,related id:395556395555395556

Please explain the expected result in line 1 and 4. 386244 is not unique.

@lbendlin sorry it was a typo,

Dataset
ID
id:383249,related id:386244
id:387739,related id:383249
id:387692,related id:383249
id:392116,related id:386245
id:387739,related id:392116
id:395555,related id:395556

 

Final Result
IDUniqueDuplicate
id:383249,related id:386244386244383249
id:387739,related id:383249387739383249
id:387692,related id:383249387692383249
id:392116,related id:386244386245392116
id:387739,related id:392116387739392116
id:395555,related id:395556395555

395556

 

Please explain line 6. Why is 395556 a duplicate?

 

As you can see there are many scenarios to consider. Please rethink your requirement.

@lbendlin If it is just one the row then keep it as it is that is the explaniation for line 6.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.