Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Merge rows based on duplicate ID field

Hi I am looking at trying to merge two field rows based on a duplicate value

ID

RefID (Duplicate Field)

Location (To Merge)

Area (To Merge)

1

1

UK

Cheshire

2

1

UK

Lancashire

3

2

USA

New York

4

2

USA

New York

5

2

UK

Cheshire

6

3

USA

New York

7

4

USA

New York

8

4

USA

Wisconsin

 

Output

ID

RefID

Location

Area

1

1

UK

Cheshire; Lancashire

3

2

USA; UK

New York; Cheshire

6

3

USA

New York

7

4

USA; USA

New York; Wisconsin

 

Also is it possible for the field to not have duplicate values such as “UK; UK; USA; UK; USA” but only have one unique value ie “UK; USA.

 

Thanks

1 ACCEPTED SOLUTION
Anonymous
Not applicable

@Anonymous  - Try this Power Query script:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSAeNQbyDhnJFanJFZlKoUqxOtZIQs45OYl5yIkDMGCoHkQ4MdgaRfarlCZH5RNljKBLeUKUwKwy4zoIAxdk3mQAET7FIWKFLhmcXJ+XnFmXlKsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, RefID = _t, Location = _t, Area = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"RefID", Int64.Type}, {"Location", type text}, {"Area", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"RefID"}, {{"All Rows For RefID", each _, type table [ID=number, RefID=number, Location=text, Area=text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "ID", each List.First([All Rows For RefID][ID])),
    DistinctLocations = Table.AddColumn(#"Added Custom", "Location", each List.Sort(List.Distinct([All Rows For RefID][Location]))),
    ConcatenateLocations = Table.TransformColumns(DistinctLocations, {"Location", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
    DistinctAreas = Table.AddColumn(ConcatenateLocations, "Area", each List.Sort(List.Distinct([All Rows For RefID][Area]))),
    ConcatenateAreas = Table.TransformColumns(DistinctAreas, {"Area", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
    #"Removed Columns" = Table.RemoveColumns(ConcatenateAreas,{"All Rows For RefID"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"ID", "RefID", "Location", "Area"})
in
    #"Reordered Columns"
I hope this helps. If it does, please Mark as a solution.
I also appreciate Kudos.

View solution in original post

4 REPLIES 4
Anonymous
Not applicable

@Anonymous  - Try this Power Query script:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSAeNQbyDhnJFanJFZlKoUqxOtZIQs45OYl5yIkDMGCoHkQ4MdgaRfarlCZH5RNljKBLeUKUwKwy4zoIAxdk3mQAET7FIWKFLhmcXJ+XnFmXlKsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, RefID = _t, Location = _t, Area = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"RefID", Int64.Type}, {"Location", type text}, {"Area", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"RefID"}, {{"All Rows For RefID", each _, type table [ID=number, RefID=number, Location=text, Area=text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "ID", each List.First([All Rows For RefID][ID])),
    DistinctLocations = Table.AddColumn(#"Added Custom", "Location", each List.Sort(List.Distinct([All Rows For RefID][Location]))),
    ConcatenateLocations = Table.TransformColumns(DistinctLocations, {"Location", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
    DistinctAreas = Table.AddColumn(ConcatenateLocations, "Area", each List.Sort(List.Distinct([All Rows For RefID][Area]))),
    ConcatenateAreas = Table.TransformColumns(DistinctAreas, {"Area", each Text.Combine(List.Transform(_, Text.From), ";"), type text}),
    #"Removed Columns" = Table.RemoveColumns(ConcatenateAreas,{"All Rows For RefID"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"ID", "RefID", "Location", "Area"})
in
    #"Reordered Columns"
I hope this helps. If it does, please Mark as a solution.
I also appreciate Kudos.
Anonymous
Not applicable

Amazing this worked and have learnt more about Group By 🙂

Greg_Deckler
Super User
Super User

You can do this in DAX using the following:

Table 2 = SUMMARIZE('Table1',[RefID (Duplicate Field)],"ID",MIN('Table1'[ID]),"Location",CONCATENATEX(DISTINCT('Table1'[Location (To Merge)]),[Location (To Merge)],";"),"Area",CONCATENATEX(DISTINCT('Table1'[Area (To Merge)]),[Area (To Merge)],";"))

 

@ImkeF  should be able to provide a Power Query solution


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
amitchandak
Super User
Super User

Try using

https://docs.microsoft.com/en-us/dax/concatenatex-function-dax

 

If required also summarize function

https://docs.microsoft.com/en-us/dax/summarize-function-dax

 

Appreciate your Kudos. In case, this is the solution you are looking for, mark it as the Solution. In case it does not help, please provide additional information and mark me with @
Thanks.

My Recent Blog - https://community.powerbi.com/t5/Community-Blog/Comparing-Data-Across-Date-Ranges/ba-p/823601

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.