Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Using First 6 Characters in a Table to Remove Duplicates

Good afternoon,

 

First and foremost, I apologize for the poor question name. I wasn't sure what else to name this post.

 

Second, it's important to note that currently I'm unable to make changes in the Power Query editor as some changes are currently being implemented to the underlying data.

 

I'm needing some assistance with how to best approach this issue, or it there's a way to do this on the report side sans query editor.

 

I have a table that has a column with names and sources. All the names in the name column are supposed to be unique values, however some of the names are basically the same except for an additional dash in the middle of the name. Also, the 2 (basically) similar names also have 2 different sources. The other thing I do wish to note is that I'm only dealing with 2 sources.

Name Example: 325143 Article1

                          325143 - Article 1

 

The other conundrum is that 1 of the names is in a different name format compared to the other names (please refer to the name example above).

Anomaly Name: 4815 - 557799 Article2

                           4815 - 557799 - Article2

 

Basically, my question is this:

1. Is there a way to return the unique name values, prioritizing the name without a dash before the article#? For further clarification, I have attached a sample of what the table looks like currently (Original Visual) and what the desired outcome is (Desired Visual).

Crystal77_0-1646855826591.png

Hopefully, this makes sense and thank you in advance for any assistance!

1 ACCEPTED SOLUTION

Hi @Anonymous 

 

Sorry for the late reply. I didn't think of a good method only using DAX at the report side. I think of a method which is a combination of Power Query and DAX. You may have a look at it. 

 

First in Power Query Editor, add two custom columns to extract the numbers and Name length for each row. 

let _allNumbers = Text.Select([Name], {"0".."9"}) in if Text.Length(_allNumbers)>= 10 then Text.Start(_allNumbers,10) else Text.Start(_allNumbers,6)
Text.Length([Name])

vjingzhang_0-1647595780162.png

 

After applying the data to the model, add a new column with the following DAX. 

Flag = 
VAR _shortnameLen = CALCULATE(MIN(SourceData[Name Length]),ALLEXCEPT(SourceData,SourceData[Numbers]))
RETURN
IF(SourceData[Name Length]=_shortnameLen,1,0)

vjingzhang_1-1647596249023.png

 

You can put this new Flag column on the table visual as a filter field to show values whose Flag is 1. 

vjingzhang_2-1647596391075.png

 

Hope it helps.

 

Best Regards,
Community Support Team _ Jing
If this post helps, please Accept it as Solution to help other members find it.

View solution in original post

5 REPLIES 5
Anonymous
Not applicable

Good morning,

Yes, all the article numbers are unique. Is there a way to extract or distinguish the sources outside of the Power Query editor?

Regarding the last 2 rows, ideally it would be better if all the numbers were extracted.

 

I have found a way to do this in the PQE, but as it frequently updates, I wanted to see if there was a way to do make those changes without it.

 

Thank you.

Hi @Anonymous 

 

Here is another method only using Power Query. You can download the pbix to see details. The previous method sample is also included in it. 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyNjE1U9BVcCwqyUzOSTVU0lEKz8zOVIrVgUsiSbnn56fnpCIkzWGSRlgkLWCSxlgkLRF2mqDbaWaAkDRFljQ2sTAESlmagyBCjRluNUgqYC6IBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Source = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"Source", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Numbers", each let _allNumbers = Text.Select([Name], {"0".."9"}) in if Text.Length(_allNumbers)>= 10 then Text.Start(_allNumbers,10) else Text.Start(_allNumbers,6)),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Name Length", each Text.Length([Name])),
    #"Grouped Rows" = Table.Group(#"Added Custom1", {"Numbers"}, {{"All Data", each _, type table [Name=nullable text, Source=nullable text, Numbers=text, Name Length=number]}}),
    #"Added Custom2" = Table.AddColumn(#"Grouped Rows", "All Data Sort", each Table.Sort([All Data],{"Name Length",Order.Descending})),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "RemoveData", each if Table.RowCount([All Data Sort]) > 1 then Table.RemoveFirstN([All Data Sort],1) else [All Data Sort]),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom3",{"RemoveData"}),
    #"Expanded RemoveData" = Table.ExpandTableColumn(#"Removed Other Columns", "RemoveData", {"Name", "Source", "Numbers", "Name Length"}, {"Name", "Source", "Numbers", "Name Length"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Expanded RemoveData",{"Source", "Name"})
in
    #"Removed Other Columns1"

 

Jing

Hi @Anonymous 

 

Sorry for the late reply. I didn't think of a good method only using DAX at the report side. I think of a method which is a combination of Power Query and DAX. You may have a look at it. 

 

First in Power Query Editor, add two custom columns to extract the numbers and Name length for each row. 

let _allNumbers = Text.Select([Name], {"0".."9"}) in if Text.Length(_allNumbers)>= 10 then Text.Start(_allNumbers,10) else Text.Start(_allNumbers,6)
Text.Length([Name])

vjingzhang_0-1647595780162.png

 

After applying the data to the model, add a new column with the following DAX. 

Flag = 
VAR _shortnameLen = CALCULATE(MIN(SourceData[Name Length]),ALLEXCEPT(SourceData,SourceData[Numbers]))
RETURN
IF(SourceData[Name Length]=_shortnameLen,1,0)

vjingzhang_1-1647596249023.png

 

You can put this new Flag column on the table visual as a filter field to show values whose Flag is 1. 

vjingzhang_2-1647596391075.png

 

Hope it helps.

 

Best Regards,
Community Support Team _ Jing
If this post helps, please Accept it as Solution to help other members find it.

Anonymous
Not applicable

Thank you for all your help! I actually did it this way as well.

v-jingzhang
Community Support
Community Support

Hi @Anonymous 

 

Are the first six number characters unique for every article? If so, we can consider extracting them to decide whether an article is duplicated or have different sources. And the last two rows Article6 has a different name format from other articles, can we extract "3481" only or extract "979797" or extract both together to tell it's unique?

vjingzhang_0-1647251764439.png

 

If you want to remove rows, usually using Power Query Editor is more suggested. On the report side, we cannot remove rows, so we usually create a measure to decide whether a value meets some conditions and filter out values that do or do not meet. The measure will play as a visual-level filter. 

 

Let me know more details about the questions in the first paragraph so I can try to provide a possible solution. 

 

Best Regards,
Community Support Team _ Jing

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.