Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello everybody,
I am trying to use the Group-By function in the PowerQuery but I am facing some issues. I have data that looks somewhat like the table below, and I would like to only keep the rows that has the max date per customer number. In other words, one line per customer with the latest date stamp.
Customer number | Date | Name | Current city | Status |
11111 | 01-01-2020 | Circle | New York | Open |
11111 | 01-01-2019 | Circle | New York | Closed |
22222 | 01-01-2019 | Square | San Fransisco | In progress |
22222 | 01-01-2020 | Square | Boston | Open |
22222 | 01-03-2021 | Square | Boston | Closed |
33333 | 01-01-2021 | Triangle | Los Angeles | In progress |
I only want to keep the following rows (in bold above):
Customer number | Date | Name | Current city | Status |
11111 | 01-01-2020 | Circle | New York | Open |
22222 | 01-03-2021 | Square | Boston | Closed |
33333 | 01-01-2021 | Triangle | Los Angeles | In progress |
I would like to create the group-by function to have (1) one row per customer number, and (2) choose the row that has the max date and (3) keep the current city, name and status from the row with the max date.
The reason that I am having problems with this is that if I choose for example "MAX" on both the date column and on the "city" column, I want the date column to have higher priority so Power BI does not keep the row based on the "MAX" city name.
Any ideas how I set up the group-by function to work like this?
Thanks in advance!
Solved! Go to Solution.
This should work
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bY8xC8JADIX/ynFzhbvr5KgFQRAd6iKlw1FDKZakJop/30unq/aRIQnvIy9NY73KFtb5TarggjNJaVEN3I2g3Rk+5kb8SO1lArRt8Y/5bQYtmWokgftMBdWCSkP9fEdWpo5oDhxRBukozUc0E1PPILJKB5fTe5IXYZ4x95fq96v+LF+pyi8oceUhYj9/dSIxO+xhBPnN134B", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [CountryCustomerCode = _t, #"Date (B)" = _t, Name = _t, #"Current city" = _t, #"Status " = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date (B)", type date}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"CountryCustomerCode"}, {{"AA", each let t =_ in Table.SelectRows(t, each [#"Date (B)"] = List.Max(t[#"Date (B)"])), type table }}),
#"Expanded AA" = Table.ExpandTableColumn(#"Grouped Rows", "AA", {"Date (B)", "Name", "Current city", "Status "}, {"Date (B)", "Name", "Current city", "Status "})
in
#"Expanded AA"
My advice is to avoid spaces in ColumnNames and Step/TableNames, otherwise you have to use #"".
Hi, the solution here actually works, it's a simple way to do it and doesn't crash my Power BI!
Hi, the solution here actually works, it's a simple way to do it and doesn't crash my Power BI!
Hi again, it seems like the above solution is making the report to crash, probably because it is a very big table (many rows) that I have.. do you have any proposal for how to make a lighter solution? Thanks in advance.
This should do the job.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bY8xC8JADIX/ynFzhbvr5KgFQRAd6iKlw1FDKZakJop/30unq/aRIQnvIy9NY73KFtb5TarggjNJaVEN3I2g3Rk+5kb8SO1lArRt8Y/5bQYtmWokgftMBdWCSkP9fEdWpo5oDhxRBukozUc0E1PPILJKB5fTe5IXYZ4x95fq96v+LF+pyi8oceUhYj9/dSIxO+xhBPnN134B", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Customer number" = _t, Date = _t, #"Name " = _t, #"Current city" = _t, Status = _t]),
ChangedType = Table.TransformColumnTypes(Source,{{"Date", type date}}),
Grouped = Table.ExpandTableColumn(Table.Group(ChangedType, {"Customer number"}, {{"A", each let t=_ in Table.SelectRows(t, each [Date]= List.Max(t[Date]))}}), "A", List.RemoveFirstN(Table.ColumnNames(Source),1))
in
Grouped
Hi @Jakinta , thanks for your response!
I am getting some errros though when I apply the code. In reality my "Customer number" column is called "CountryCustomerCode" and if I exchange "Customer number" with "CountryCustomerCode" I get an error:
Expression.Error: The field 'CountryCustomerCode' already exists in the record.
Also, my "Date" field is called "(B) Date" field, does that make a difference?
Lastly, the does the "CountryCustomerCode" need to be in the first column for this to work?
My code looks like this now (i have referenced another table):
let
Source = #"Main Database",
#"ChangedType" = Table.TransformColumnTypes(Source,{{"(B) Date", type date}}),
Grouped = Table.ExpandTableColumn(Table.Group(#"ChangedType", {"CountryCustomerCode"}, {{"A", each let t=_ in Table.SelectRows(t, each [Date]= List.Max(t[Date]))}}), "A", List.RemoveFirstN(Table.ColumnNames(Source),1))
in
Grouped
Yes, I supposed that your grouping column would be the 1st column according to sample you provided.
You can change List.RemoveFirst... part to
List.RemoveItems(Table.ColumnNames(Source), {"CountryCustomerCode"})
or provide proper sample data.
Or just do grouping first then expand later with columns you need. I merged those 2 steps...
Hi @Jakinta , thanks for your response! I can also just move the column to be the first one in my data - no problems.
I am not super skilled on the Power Query coding so I am not exactly sure how I would modify the code. Even though I move the column I get the same error.
Also if I try to modify the code I get the same errors (probably I am doing something wrong):
let
Source = #"(B) Main Database",
#"ChangedType" = Table.TransformColumnTypes(Source,{{"(B) Date", type date}}),
Grouped = Table.ExpandTableColumn(Table.Group(#"ChangedType", {"CountryCustomerCode"}, {{"A", each let t=_ in Table.SelectRows(t, each [Date]= List.Max(t[Date]))}}), "A",
List.RemoveItems(Table.ColumnNames(Source), {"CountryCustomerCode"}))
in
Grouped
So my sample data looks like this now:
CountryCustomerCode | Date (B) | Name | Current city | Status |
11111 | 01-01-2020 | Circle | New York | Open |
11111 | 01-01-2019 | Circle | New York | Closed |
22222 | 01-01-2019 | Square | San Fransisco | In progress |
22222 | 01-01-2020 | Square | Boston | Open |
22222 | 01-03-2021 | Square | Boston | Closed |
33333 | 01-01-2021 | Triangle | Los Angeles | In progress |
This should work
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bY8xC8JADIX/ynFzhbvr5KgFQRAd6iKlw1FDKZakJop/30unq/aRIQnvIy9NY73KFtb5TarggjNJaVEN3I2g3Rk+5kb8SO1lArRt8Y/5bQYtmWokgftMBdWCSkP9fEdWpo5oDhxRBukozUc0E1PPILJKB5fTe5IXYZ4x95fq96v+LF+pyi8oceUhYj9/dSIxO+xhBPnN134B", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [CountryCustomerCode = _t, #"Date (B)" = _t, Name = _t, #"Current city" = _t, #"Status " = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date (B)", type date}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"CountryCustomerCode"}, {{"AA", each let t =_ in Table.SelectRows(t, each [#"Date (B)"] = List.Max(t[#"Date (B)"])), type table }}),
#"Expanded AA" = Table.ExpandTableColumn(#"Grouped Rows", "AA", {"Date (B)", "Name", "Current city", "Status "}, {"Date (B)", "Name", "Current city", "Status "})
in
#"Expanded AA"
My advice is to avoid spaces in ColumnNames and Step/TableNames, otherwise you have to use #"".
Thank you very much, it seems to work!
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.