Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
jochendecraene
Helper IV
Helper IV

Preserve ID based on Max value of index

Hi,

 

I've got this table that contains ID's of calls, where most of the calls have sub ID's. I'm looking for a solution where I can only preserve the ID's with the highest index, so I can see the end status of every call. So I only want the yellow ones ...

 

Knipsel.JPG

 

Is this possible?

 

1 ACCEPTED SOLUTION
Vijay_A_Verma
Super User
Super User

Since, you have already sorted Index column (if not sort on Index), See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RY9BCoMwFETvkrWL6rSaLm21eoaKDD1E72/mJzCLgcfjE/KOIw331KVZY+A/nV2xD1vhr9rRdrSdbCfbbJttnwVfGgOrxa3gW2Ng/QN62963Q8FFY2C7hS18q6BVY2CzCvpoDGwvKGjTCLdBQbtGuA3Z1m1Q0FcjWtt5AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Index = _t, ID1 = _t, ID2 = _t, Status = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index.1", 0, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(#"Added Index", "Select", each try if [ID1]<>#"Added Index"[ID1]{[Index.1]+1} then "Y" else "N" otherwise "Y"),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Select] = "Y")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index.1", "Select"})
in
    #"Removed Columns"

 

View solution in original post

7 REPLIES 7
Vijay_A_Verma
Super User
Super User

Since, you have already sorted Index column (if not sort on Index), See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RY9BCoMwFETvkrWL6rSaLm21eoaKDD1E72/mJzCLgcfjE/KOIw331KVZY+A/nV2xD1vhr9rRdrSdbCfbbJttnwVfGgOrxa3gW2Ng/QN62963Q8FFY2C7hS18q6BVY2CzCvpoDGwvKGjTCLdBQbtGuA3Z1m1Q0FcjWtt5AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Index = _t, ID1 = _t, ID2 = _t, Status = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index.1", 0, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(#"Added Index", "Select", each try if [ID1]<>#"Added Index"[ID1]{[Index.1]+1} then "Y" else "N" otherwise "Y"),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Select] = "Y")),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index.1", "Select"})
in
    #"Removed Columns"

 

@Vijay_A_Verma 

the solution work fine on my smalle testset, but it takes to much processing power when I try to apply this to the real dataset. After 1 hour of refreshing only 350 rowes were loaded. Without this code it takes less dan 1 minute ...

Below are 2 approaches tested on 1 million records. First approach based on using buffered list to filter took 160 seconds whereas second approach was based on @AlexisOlson 's approach of using Records (for Max) in the column. The second approach is faster and took 100 seconds for 1 million records.

Excel file (contains VBA code for timing queries) used for this testing with both approaches can be downloaded from - https://1drv.ms/x/s!Akd5y6ruJhvhuXoDyaB0q3-Gj7QA?e=m2SvxC 

Edit - There is also an Inner Merge approach which is also quite fast. This is approach 3 below.

Approach 1

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID1"}, {{"Max", each List.Max([Index]), type number}}),
    List = List.Buffer(#"Grouped Rows"[Max]),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each (List.Contains(List,[Index])))
in
    #"Filtered Rows"

 

Approach 2

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID1"}, {{"Max", each List.Max([Index]), type nullable number}, {"All", each Table.Max(_, "Index"), type record}}),
    #"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"ID1", "Max"}),
    #"Expanded All" = Table.ExpandRecordColumn(#"Removed Columns", "All", {"Index", "ID1", "ID2", "Status"}, {"Index", "ID1", "ID2", "Status"})
in
    #"Expanded All"

Approach 3

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID1"}, {{"Max", each List.Max([Index]), type nullable number}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Index"}, #"Grouped Rows", {"Max"}, "Removed Columns", JoinKind.Inner),
    #"Removed Columns1" = Table.RemoveColumns(#"Merged Queries",{"Removed Columns"})
in
    #"Removed Columns1"

 

@Vijay_A_Verma  thnx

 

I can not open or download the excel file. I get the message the file is to big ....

Yeah, using table lookup logic inside of custom columns tends to be highly inefficient when the table is very large.

 

I'd recommend trying the last couple of methods I describe in my blog post here:

Select Distinct Rows Ordered by Another Column

@Vijay_A_Verma thank you very much!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors