Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
bullius
Helper V
Helper V

How to count duplicate values in M

Hi

 

I have a table that looks like this:

 

IDPersonID
1A
2A
3B
4C
5D
6E
7F
8G
9G
10G

 

I want to add a column that counts how many employments each PersonID has, e.g.:

 

EmploymentIDPersonIDEmploymentsCount
1A2
2A2
3B1
4C1
5D1
6E1
7F1
8G3
9G3
10G3

 

I want to do this in the Query Editor, so using M.

 

Thanks.

3 ACCEPTED SOLUTIONS
MFelix
Super User
Super User

Hi @bullius,

 

You need to do a group by the Person ID and then do a Merge of the Step before the Group by and the Group by result, this part you can do it by merging the table with itself and then changing the first table to the step you want.

 

See the M code for a input table in power query.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJUitWJVjKCs4yBLCcwywTIcgazTIEsFzDLDMhyBbPMgSw3MMsCyHIHsyzhLEMDCDMWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, PersonID = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PersonID", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"PersonID"}, {{"Count", each Table.RowCount(_), type number}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type",{"PersonID"},#"Grouped Rows",{"PersonID"},"Grouped Rows",JoinKind.LeftOuter),
    #"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"Count"}, {"Grouped Rows.Count"})
in
    #"Expanded Grouped Rows"

Step by step:

group.png

 

Regards,

MFelix

 


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



View solution in original post

Hi @bullius,

 

Using the option by @MarcelBeug,

 

I have redone the M code:

 

    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJUitWJVjKCs4yBLCcwywTIcgazTIEsFzDLDMhyBbPMgSw3MMsCyHIHsyzhLEMDCDMWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, PersonID = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PersonID", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"PersonID"}, {{"Count", each _, type table}, {"PersonID.1", each Table.RowCount(_), type number}}),
    #"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"ID"}, {"ID"})
in
    #"Expanded Count"

So this one is for Marcel 😄

 

Regards

MFelix


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



View solution in original post

Thanks @MFelix.

 

I would have done it a little bit different:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJUitWJVjKCs4yBLCcwywTIcgazTIEsFzDLDMhyBbPMgSw3MMsCyHIHsyzhLEMDCDMWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, PersonID = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PersonID", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"PersonID"}, {{"AllRows", each _, Value.Type(#"Changed Type")}, {"Count", each Table.RowCount(_), type number}}),
    #"Expanded AllRows" = Table.ExpandTableColumn(#"Grouped Rows", "AllRows", {"ID"}, {"ID"})
in
    #"Expanded AllRows"

 

You missed the video part between 0:30 - 0:50 where I replace type table.

 

A disadvantage (or bug or design error or issue) of operation "All Rows" in Group By:
all column types of the nested tables are reset to "Any", which you can see after expansion.

 

That's why I always replace type table with Value.Type(step name) where step name is the same step name as the first parameter of Table.Group.

 

This is all explained in this video fragment, which is actually a part of a playlist of 3 videos about Value.Type.

Specializing in Power Query Formula Language (M)

View solution in original post

9 REPLIES 9
MFelix
Super User
Super User

Hi @bullius,

 

You need to do a group by the Person ID and then do a Merge of the Step before the Group by and the Group by result, this part you can do it by merging the table with itself and then changing the first table to the step you want.

 

See the M code for a input table in power query.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJUitWJVjKCs4yBLCcwywTIcgazTIEsFzDLDMhyBbPMgSw3MMsCyHIHsyzhLEMDCDMWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, PersonID = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PersonID", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"PersonID"}, {{"Count", each Table.RowCount(_), type number}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type",{"PersonID"},#"Grouped Rows",{"PersonID"},"Grouped Rows",JoinKind.LeftOuter),
    #"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"Count"}, {"Grouped Rows.Count"})
in
    #"Expanded Grouped Rows"

Step by step:

group.png

 

Regards,

MFelix

 


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Using Group By with (additional) operation "All Rows" is equivalent with merging 2 tables.

Specializing in Power Query Formula Language (M)

Thanks for all your responses!

 

@MarcelBeug, could you give an example of the code that would include the "All Rows" operation?

Hi @bullius,

 

Using the option by @MarcelBeug,

 

I have redone the M code:

 

    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJUitWJVjKCs4yBLCcwywTIcgazTIEsFzDLDMhyBbPMgSw3MMsCyHIHsyzhLEMDCDMWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, PersonID = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PersonID", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"PersonID"}, {{"Count", each _, type table}, {"PersonID.1", each Table.RowCount(_), type number}}),
    #"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"ID"}, {"ID"})
in
    #"Expanded Count"

So this one is for Marcel 😄

 

Regards

MFelix


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Thanks @MFelix.

 

I would have done it a little bit different:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXJUitWJVjKCs4yBLCcwywTIcgazTIEsFzDLDMhyBbPMgSw3MMsCyHIHsyzhLEMDCDMWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, PersonID = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PersonID", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"PersonID"}, {{"AllRows", each _, Value.Type(#"Changed Type")}, {"Count", each Table.RowCount(_), type number}}),
    #"Expanded AllRows" = Table.ExpandTableColumn(#"Grouped Rows", "AllRows", {"ID"}, {"ID"})
in
    #"Expanded AllRows"

 

You missed the video part between 0:30 - 0:50 where I replace type table.

 

A disadvantage (or bug or design error or issue) of operation "All Rows" in Group By:
all column types of the nested tables are reset to "Any", which you can see after expansion.

 

That's why I always replace type table with Value.Type(step name) where step name is the same step name as the first parameter of Table.Group.

 

This is all explained in this video fragment, which is actually a part of a playlist of 3 videos about Value.Type.

Specializing in Power Query Formula Language (M)

Hi @MarcelBeug,

 

You are correct I have made the change but copied the incorrect M code :D.

 

MFelix


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Thanks @MarcelBeug,

 

Always learning :D:D

 

MFelix


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



jthomson
Solution Sage
Solution Sage

There's almost certainly a more efficient way to do this than the below, but you can:

 

- Duplicate your query

- Group the rows in the duplicate by person ID, including a column that counts the number of rows

- Join this back into your original query

 

edit - just remembered that I do something quite similar in one of my reports and duplicating it isn't necessary at all, you can do similar to steps 2/3 above, but reword the grouping step so that it creates a new table, so something like

 

NewTable = Table.Group(#"Previous Step", {"Employee ID"}, {{"Employment Count", each Table.RowCount(_), type number}}),

ImportNewTable = Table.Join(#"Previous Step", "Employee ID", NewTable, "Employee ID")

MarcelBeug
Community Champion
Community Champion

You can use Group By.

 

Coincidentally I answered a similar question with a video.

In the video, a minimum is returned instead of a count, otherwise itfits your requirements,

 

Specializing in Power Query Formula Language (M)

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.