Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Dear community,
I just can't figure out how to create a table just like the one below.
I want to identify anomalies in our customer database.
Each address should have a unique Zipcode but I found some having more than 1.
I have a dataset of more then 30K records, so it cannot be done manually.
How can I use PowerQuery to create this table???
Thank you for your support
Address #record in dataset #of different Zipcode
ABC 3 1
CDE 14 3
FGH 10 1
...
Solved! Go to Solution.
Hi @DanFromMontreal ,
I created a data sample:
Please simply use "Group by" to get the count and distinctCount:
Best Regards,
Eyelyn Qin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
try this
let
Origine = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUTIEYUOlWJ1opSQg0wiEjcDcZCDTGISNwVyYYiOoNIhvAsImcM3mQGxqCuamAplmIGyGImturhQbCwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [address = _t, zipcode = _t, otherthink = _t]),
#"Modificato tipo" = Table.TransformColumnTypes(Origine,{{"address", type text}, {"zipcode", Int64.Type}, {"otherthink", Int64.Type}}),
#"Raggruppate righe" = Table.Group(#"Modificato tipo", {"address"}, { {"ziplist", each List.Count(List.Distinct([zipcode]))}})
in
#"Raggruppate righe"
PS
what you have left out is to load an example table with the starting data and a table with the desired result
Hi @DanFromMontreal ,
I created a data sample:
Please simply use "Group by" to get the count and distinctCount:
Best Regards,
Eyelyn Qin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Great explanation @v-eqin-msft
If I understand correctly, it works only if there is only 2 columns in my table (Address, Zipcode).
My table has 10 different columns. I had to remove all of them and keep only the Address & Zipcode to make it work.
What if I wanted to add a column to have a Distinct count from another column, such as Customer?
The aggregate function would not work
#Address #Record in dataset #Distinct Zipcode #Distinct Customer
Regards,
You should groupby Address and get count and list of ZipCode.
let
Origine = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("Pcy5DQAwCEPRXVynyUWGQRS59l8hmCLFk74lhComEjJlWFIsz0Il5vasVGPyuFGLeTw79ZjXU0j+q0EDZg8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [address = _t, zipcode = _t, otherthink = _t]),
#"Modificato tipo" = Table.TransformColumnTypes(Origine,{{"address", type text}, {"zipcode", Int64.Type}, {"otherthink", Int64.Type}}),
#"Raggruppate righe" = Table.Group(#"Modificato tipo", {"address"}, {{"Conteggio", each Table.RowCount(_), Int64.Type}, {"ziplist", each _[zipcode]}})
in
#"Raggruppate righe"
Rocco,
Tested your solution and it does not fully answer my need.
I was able to regroup to obtain the number of record in dataset (conteggio).
But the part of having the number of DIFFERENT zipcode for each address, that, your solution does not provide.
Again, in my example, for the address CDE, there is 14 records and out of those 14 addresses, there is 3 different zipcode associated to it.
Am I missing something?
Thank you for your support.
try this
let
Origine = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUTIEYUOlWJ1opSQg0wiEjcDcZCDTGISNwVyYYiOoNIhvAsImcM3mQGxqCuamAplmIGyGImturhQbCwA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [address = _t, zipcode = _t, otherthink = _t]),
#"Modificato tipo" = Table.TransformColumnTypes(Origine,{{"address", type text}, {"zipcode", Int64.Type}, {"otherthink", Int64.Type}}),
#"Raggruppate righe" = Table.Group(#"Modificato tipo", {"address"}, { {"ziplist", each List.Count(List.Distinct([zipcode]))}})
in
#"Raggruppate righe"
PS
what you have left out is to load an example table with the starting data and a table with the desired result
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.