Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
MT88
New Member

Compare rows in different columns

Hi @ everyone,

 

I have a table with dublicates in two different columns and a third column which got a date.

I want to compare the rows in the first two columns (where the duplicates are) and if the next row contains the same information as the row before, the script should keep the row with the newer date and delete the row with the older date.

 

Example:

The script would see that row 2 got the same information as row 1 = Name and the Course is a duplicate,

so next it looks at the column "Date" and keep the row with the newer date = "11.01.2022" and delete the row with the older date = "19.07.2021".

 

NameCourseDate
Mr TBasketball11.01.2022
Mr TBasketball19.07.2021
Mrs LGolf10.05.2022
Mr MFootball20.12.2021

 

 

Can anyone help me ? I couldnt find a topic with the same problem.

1 ACCEPTED SOLUTION
AlB
Super User
Super User

@MT88 

You need to read the table in correctly in Excel. Assuming the data is formatted as table and named "Tabelle3", as in the xlsx you have shared:

let
    Source = Excel.CurrentWorkbook(),
    Tabelle3 = Source{[Name="Tabelle3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Tabelle3,{{"Mitarbeiter", type text}, {"Kurs", type text}, {"Ablaufdatum", type date}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Mitarbeiter", "Kurs"}, {{"Ablaufdatum", each List.Max([Ablaufdatum]), type date}})
in
    #"Grouped Rows"

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

View solution in original post

8 REPLIES 8
AlB
Super User
Super User

@MT88 

You need to read the table in correctly in Excel. Assuming the data is formatted as table and named "Tabelle3", as in the xlsx you have shared:

let
    Source = Excel.CurrentWorkbook(),
    Tabelle3 = Source{[Name="Tabelle3"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Tabelle3,{{"Mitarbeiter", type text}, {"Kurs", type text}, {"Ablaufdatum", type date}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Mitarbeiter", "Kurs"}, {{"Ablaufdatum", each List.Max([Ablaufdatum]), type date}})
in
    #"Grouped Rows"

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

I did it (in excel) ! It was the name of the table that was wrong in my code -.-

 

And in Power Bi Power Query the last error was a syntax error -> comma

Finally found it then it worked.

 

THANK YOU !

AlB
Super User
Super User

@MT88 

It works with no issues on my side. Note the only relevant step below is the last one: #"Grouped Rows"

The previous ones are just to get the data in

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WilDSUXJKLM5OLUlKzMkBcowMdf2TS3SNjJRidbBJGyBLR2FIG+t6lebApCPRpQ0NdR1L03WNDMHSjujSBua6LqnJMN1OGIYbIduNKY3icmd0aWMD3eDUApjdLhh2W8KdFgsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Mitarbeiter = _t, Kurs = _t, Ablaufdatum = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Mitarbeiter", type text}, {"Kurs", type text}, {"Ablaufdatum", type date}}),

    #"Grouped Rows" = Table.Group(#"Changed Type", {"Mitarbeiter", "Kurs"}, {{"Ablaufdatum", each List.Max([Ablaufdatum]), type nullable date}})
in
    #"Grouped Rows"

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

When I copy the code, I always get the result for the sample or previous case, but not from my table I actually loaded. When I want the to change the source, its gets me the previous result or an error.

So the code works but not for my actually source its shows only the result for the sample.

 

My Code (my Problem is to link it to the actually table / source):

let
Source = Excel.CurrentWorkbook

#"Grouped Rows" = Table.Group(#"Changed Type", {"Mitarbeiter", "Kurs"}, {{"Ablaufdatum", each List.Max([Ablaufdatum]), type nullable date}})
in
#"Grouped Rows"

 

I also tried (in Power Bi / Power Query):

let
Source = Table.FromRows #"Ablaufende Kurse"
#"Grouped Rows" = Table.Group(#"Changed Type", {"Mitarbeiter", "Kurs"}, {{"Ablaufdatum", each List.Max([Ablaufdatum]), type nullable date}})
in
#"Grouped Rows"

AlB
Super User
Super User

@MT88 

I'd need a sample of #"Ablaufende Kurse" to be able to reproduce the issue

You might not need these two steps:

#"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Ablaufdatum", type date}}, "en-GB"),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"Mitarbeiter", type text}, {"Kurs", type text}}),

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

MT88
New Member

First : Thanks for that solution ! 🙂

It works for the example but not for the real table.

 

Can you explain me how I convert the script for the real table ?

I tried to rebuild it for my table but it ends with an expression error.

 

A value of type "Table" can not be convert in type "List"

Details:

Value = [Table]

Type = [Type]

 

Name of Table / Source = Ablaufende Kurse

Table is an excel sheet.

 

Names of Columns:

Name = Mitarbeiter

Course = Kurs

Date = Ablaufdatum

 

Heres a copy of my code:

 


let
Source = Table.FromRows(#"Ablaufende Kurse", let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Mitarbeiter = _t, Kurs = _t, Ablaufdatum = _t]),
#"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Ablaufdatum", type date}}, "en-GB"),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"Mitarbeiter", type text}, {"Kurs", type text}}),


#"Grouped Rows" = Table.Group(#"Changed Type", {"Mitarbeiter", "Kurs"}, {{"Count", each List.Max([Ablaufdatum]), type nullable date}})
in
#"Grouped Rows"


 


 

 

 

AlB
Super User
Super User

@MT88 

Another option is to group by Name and Course and select the max date as the operation:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8i1SCFHSUXJKLM5OLUlKzMkBcgwN9QwM9YwMjIyUYnVwKLHUMzAHKTGEKilW8AEKu+fnpIFkDfQMTFEM8AWKuuXnw7QbGegZGkG1xwIA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Course = _t, Date = _t]),
    #"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Date", type date}}, "en-GB"),
    #"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"Name", type text}, {"Course", type text}}),


    #"Grouped Rows" = Table.Group(#"Changed Type", {"Name", "Course"}, {{"Count", each List.Max([Date]), type nullable date}})
in
    #"Grouped Rows"

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

AlB
Super User
Super User

Hi @MT88 

Place the following M code in a blank query to see the steps.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8i1SCFHSUXJKLM5OLUlKzMkBcgwN9QwM9YwMjIyUYnVwKLHUMzAHKTGEKilW8AEKu+fnpIFkDfQMTFEM8AWKuuXnw7QbGegZGkG1xwIA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Course = _t, Date = _t]),
    #"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Date", type date}}, "en-GB"),
    #"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"Name", type text}, {"Course", type text}}),

    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Name", Order.Ascending}, {"Course", Order.Ascending}, {"Date", Order.Descending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Name", "Course"})
in
    #"Removed Duplicates"

It first sorts rows, so that the earlier dates stay on top, then removes duplicates based on Name-Course. The duplicate kept will be the one appearing first in the table (from top to bottom)

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors