Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
ratercero
Helper III
Helper III

On Power Query Editor, how to filter rows with the earliest date each name aprears

Hello I merged 4 tables, that have duplicate values, I want to keep only the one that has the earliest date for each "Lote" (ignoring case), how can I fix this? The code I have removes duplicates but not in a defined order:

 

 

let
    Source1 = Table.AddColumn(PNC_TELA, "Source", each "PNC_TELA"),
    Source3 = Table.AddColumn(TONO_MALO, "Source", each "TONO_MALO"),
    #"Removed Other Columns" = Table.SelectColumns(Source3,{"Fecha de Ingreso", "Lote", "Defecto 1", "Disposición", "CatCausa", "Causa", "Comentarios", "Proceso Responsable", "Source"}),
    Source2 = Table.AddColumn(PNC_PARTES_CORTADAS, "Source", each "PNC_PARTES_CORTADAS"),
    Source4 = Table.AddColumn(VARIACION_TONO, "Source", each "VARIACION_TONO"),

SourceMerge = Table.Combine({#"Removed Other Columns",Source4, Source1, Source2}),
    #"Filtered Rows1" = Table.SelectRows(SourceMerge, each [Lote] <> null and [Lote] <> ""),
    #"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows1",{{"Fecha de Ingreso", type date}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each true),
  #"Remove Duplicates" = Table.Distinct(#"Filtered Rows", { "Lote", Comparer.OrdinalIgnoreCase })
in
    #"Remove Duplicates"

Best Regards

14 REPLIES 14
Vvelarde
Community Champion
Community Champion

@ratercero

 

Hi, please post a Data Sample to replicate your case.

 

Regards

 

Victor

Lima - Peru




Lima - Peru

Hi @Vvelarde,

 

Here it is:

 

Data Sample

 

 

 

 

 

@ratercero

 

One option is Group By Your Columns and Min Date

 

Regards

 

Victor




Lima - Peru

@Vvelarde,

 

can you provide guidance on how to perform this?

@ratercero

 

Hi please review the image 

 

GroupByMinDate.gif

 

Regards

 

Victor




Lima - Peru

@Vvelarde,

 

This does not work as I intend, due to the values in the other columns are not the same, keeps duplicate "lote". I need something that determines the minimum date for each "lote" and keep that row only.

 

@ratercero

 

Your data is like this:

 

Lote OTHERCOL1 OTHERCOL2 Date

AAA1   FFFF         GGGGG        15/02/17

AAA2   GGFFD    RRRRR         11/02/17

AAA1    RERER    GGGF         03/02/17

AAA2    RRR        AAASD       15/02/17

 

And you want to keep this:

 

Lote OTHERCOL1 OTHERCOL2 Date

AAA1    RERER    GGGF         03/02/17

AAA2   GGFFD    RRRRR         11/02/17 

 

Its ok this.




Lima - Peru

Exactly, 

@ratercero

 

OK, please look the image.

 

GroupByMinDate.gif

 

Regards

 

Victor

Lima - Peru




Lima - Peru

@Vvelarde,

 

what this does, is that shows only dates on the listings, causing even more duplicates on my data as in you date those dates do no duplicate it you do not see it on your query, please review my data sample once ome maybe I made a mistake while replicating what is on the image.

 

Data Sample

@ratercero

 

Upss i made a mistake in the merge. The Merge is with Lote from Both Tables.

 

 




Lima - Peru

@Vvelarde,

 

This kept the table as before the merge, based on you idea tried something on the query, this finds the earliest date on the table, then I just have to filter all "true", I need to do the same thing but earliest date for each batch in order to get the result Im looking for.

 

Maybe you can help me review.

 

Here is the M code:

 

let
    Source1 = Table.AddColumn(PNC_TELA, "Source", each "PNC_TELA"),
    Source3 = Table.AddColumn(TONO_MALO, "Source", each "TONO_MALO"),
    #"Removed Other Columns" = Table.SelectColumns(Source3,{"Fecha de Ingreso", "Lote", "Defecto 1", "Disposición", "CatCausa", "Causa", "Comentarios", "Proceso Responsable", "Source"}),
    Source2 = Table.AddColumn(PNC_PARTES_CORTADAS, "Source", each "PNC_PARTES_CORTADAS"),
    Source4 = Table.AddColumn(VARIACION_TONO, "Source", each "VARIACION_TONO"),

SourceMerge = Table.Combine({#"Removed Other Columns",Source4, Source1, Source2}),
    #"Filtered Rows1" = Table.SelectRows(SourceMerge, each [Lote] <> null and [Lote] <> ""),
    #"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows1",{{"Fecha de Ingreso", type date}}),
    #"Filtered Rows2" = Table.SelectRows(#"Changed Type", each [Disposición] <> "APROBADO" and Text.StartsWith([Defecto 1], "3")),
    #"FindMinDate" = Table.AddColumn( #"Filtered Rows2", "MinDate", each if [Fecha de Ingreso]=List.Min(#"Filtered Rows2"[Fecha de Ingreso]) then "true" else "false")
in
    FindMinDate

@ratercero

 

I can't review the Query Editor in your PBIX because is a external source. 

 

To a easy solution can you send me a Excel file with part of the data (Combined the 4 sources and with the duplicates). 

 

Regards

 

Victor

 

 

 

 

 

 

 




Lima - Peru

@Vvelarde,

 

Here it is:

 

Excel Sample

 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.