Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
rayinOz
Helper III
Helper III

Finding multiple entries / Creating new table

Hello,

 

So I have a table that has the following columns:

 

| Username | Course Name | Course Status |

 

The table contains data related to course enrolments/completions.

 

There is Course A and Course B.

 

Users are only supposed to enrol in one course, but we have situations where a user enrols in both.

 

So, I want to find the users who are enroled into both courses. Idealy a new table with the users are are enroled into both courses.

 

Thoughts?

 

Thanks!

 

RayinOz

2 ACCEPTED SOLUTIONS
MarcelBeug
Community Champion
Community Champion

Power Query solution:

You can use remove and keep duplicates:

If you can have multiple records per Username/Course Name combination: first select Username and Course Name and remove duplicates (Home - Remove Rows - Remove Duplicates)

Select Username and select Home - Keep Rows - Keep duplciates

Select Username and select Home - Remove Rows - Remove Duplciates

Remove the other columns.

 

Resulting code:

let
    Source = CourseEnrolments,
    #"Removed Duplicates" = Table.Distinct(Source, {"Username", "Course Name"}),
    #"Kept Duplicates" = let columnNames = {"Username"}, addCount = Table.Group(#"Removed Duplicates", columnNames, {{"Count", Table.RowCount, type number}}), selectDuplicates = Table.SelectRows(addCount, each [Count] > 1), removeCount = Table.RemoveColumns(selectDuplicates, "Count") in Table.Join(#"Removed Duplicates", columnNames, removeCount, columnNames, JoinKind.Inner),
    #"Removed Duplicates1" = Table.Distinct(#"Kept Duplicates", {"Username"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates1",{"Course Name", "Course Status"})
in
    #"Removed Columns"

 

Specializing in Power Query Formula Language (M)

View solution in original post

Yes: before removing duplicates, sort descending on date, wrap the code in Table.Buffer(....) and then remove duplicates.

 

In case you want to preserve the original sort order: add an index column first.
After removing duplciates, sort on that index column and remove the index column

Specializing in Power Query Formula Language (M)

View solution in original post

6 REPLIES 6
MarcelBeug
Community Champion
Community Champion

Power Query solution:

You can use remove and keep duplicates:

If you can have multiple records per Username/Course Name combination: first select Username and Course Name and remove duplicates (Home - Remove Rows - Remove Duplicates)

Select Username and select Home - Keep Rows - Keep duplciates

Select Username and select Home - Remove Rows - Remove Duplciates

Remove the other columns.

 

Resulting code:

let
    Source = CourseEnrolments,
    #"Removed Duplicates" = Table.Distinct(Source, {"Username", "Course Name"}),
    #"Kept Duplicates" = let columnNames = {"Username"}, addCount = Table.Group(#"Removed Duplicates", columnNames, {{"Count", Table.RowCount, type number}}), selectDuplicates = Table.SelectRows(addCount, each [Count] > 1), removeCount = Table.RemoveColumns(selectDuplicates, "Count") in Table.Join(#"Removed Duplicates", columnNames, removeCount, columnNames, JoinKind.Inner),
    #"Removed Duplicates1" = Table.Distinct(#"Kept Duplicates", {"Username"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates1",{"Course Name", "Course Status"})
in
    #"Removed Columns"

 

Specializing in Power Query Formula Language (M)

Marcel,

 

Actually, I can't get it to work correctly. If I select Username and remove duplicates, it will remove all but one entry. Which is OK, however, I do want it to keep the entry with the latest date... 

 

Here's a screenshot of an example of a duplicate entry.

2017-04-20_15-07-47.png

 

 

 

In this table, the highlighted user (aaj) has completed two courses and I want to eliminate one of the entries... the one with the recent date I want to keep... the older date I want to ignore. In this example the dates are the same, so it doesn't matter which one goes away.

 

Does that make sense? Is this possible?

 

Rayinoz

 

 

Yes: before removing duplicates, sort descending on date, wrap the code in Table.Buffer(....) and then remove duplicates.

 

In case you want to preserve the original sort order: add an index column first.
After removing duplciates, sort on that index column and remove the index column

Specializing in Power Query Formula Language (M)

nevermind, i figured it out... duh... LOL

 

= Table.Buffer(Table.Sort(#"Added Conditional Column",{{"Completion Date", Order.Descending}}))

Thanks Marcel... how do I wrap the code in the Table.buffer?

 

Here's the code that I do have.... 

 

    #"Sorted Rows" = Table.Sort(#"Added Conditional Column",{{"Completion Date", Order.Descending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Username"})
in
    #"Removed Duplicates"

Thanks for your help... I appreciate it so much!

 

Rayinoz

Thank you Marcel!! 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.