Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi Experts,
I am consolidating the data and need your ideas and how to do it
Problem statement: I have about 65+ columns of data (not ordered) and I need to consolidate to group them and the way is to find each column /List with every list and get a matching elements number and by analyzing each column, would give some decision making input what data source can be eliminated by bringing the missing elements and thus my idea to compress these 65 to as less as possible.
I would have to do vlookup/xlookup() against half the columns say about 40 columns to get this data manually.
Here is the sample in the picutre. Please can you give me ideas / M code which can do this
Thanks a lot
Hi, @Anonymous
Any update?
May I ask if your problem has been solved? Is the above post helpful to you?
If it does, could you please mark the post which help as Answered? It will help the others in the community find the solution easily if they face the same problem with you. Thank you.
Best Regards,
Community Support Team _ Zeon Zheng
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Again thank you every one each one is tested all good, can you please give me hint what else to change to make the code working.
here is the changed Source data.
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t, Column6 = _t, Column7 = _t, Column8 = _t, Column9 = _t, Column10 = _t, Column11 = _t, Column12 = _t, Column13 = _t, Column14 = _t, Column15 = _t, Column16 = _t, Column17 = _t, Column18 = _t, Column19 = _t, Column20 = _t,Column21 = _t, Column22 = _t, Column23 = _t, Column24 = _t, Column25 = _t,Column26 = _t, Column27 = _t, Column28 = _t, Column29 = _t, Column30 = _t,Column31 = _t, Column32 = _t, Column33 = _t, Column34 = _t, Column35 = _t,Column36 = _t, Column37 = _t, Column38 = _t, Column39 = _t, Column40 = _t]),
Thanks a ton.
@Anonymous my solution is based on loop; so whether the data set has 40 or 400 columns the loop will work. I am not sure if you had a chance to try out the solution with the real data yet.
If you prefer mine, once you defined Source, copy everything from CT to
#"Removed Columns"
@smpa01 Thank you but I tried all the solutions 1st and last one works with many columns but strips to 4 columns as I see the follow up code of x and y etc but your code throws up error may be easy fix in the step AI. Please can you have a look.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t, Column6 = _t, Column7 = _t, Column8 = _t, Column9 = _t, Column10 = _t, Column11 = _t, Column12 = _t, Column13 = _t, Column14 = _t, Column15 = _t, Column16 = _t, Column17 = _t, Column18 = _t, Column19 = _t, Column20 = _t,Column21 = _t, Column22 = _t, Column23 = _t, Column24 = _t, Column25 = _t,Column26 = _t, Column27 = _t, Column28 = _t, Column29 = _t, Column30 = _t,Column31 = _t, Column32 = _t, Column33 = _t, Column34 = _t, Column35 = _t,Column36 = _t, Column37 = _t, Column38 = _t, Column39 = _t, Column40 = _t]),
CT = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}}),
DH = Table.DemoteHeaders(CT),
CN = Record.ToList(DH{0}),
src=List.Generate(
()=>[i=0,j=List.RemoveItems(Table.Column(CT,CN{i}),{""})],
each [i]<List.Count(CN),
each [i=[i]+1,j=List.RemoveItems(Table.Column(CT,CN{i}),{""})],
each [j]
),
C2T = Table.FromList(src, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
AI = Table.AddIndexColumn(C2T, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(AI, "Custom", each let x = [Index],
a= [Column1],
c = CN{x},
y=
List.Generate(
()=>[i=0,j=x,k=src{i},l=List.Count(List.Intersect({a,k})),m=c&" has "&Text.From(l)&" matches with "&CN{i}],
each [i]<List.Count(src),
each [i=[i]+1,j=x,k=src{i},l=List.Count(List.Intersect({a,k})),m=c&" has "&Text.From(l)&" matches with "&CN{i}],
each [k= if [i]<>[j] then [m] else null]
) in y),
#"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Column1", each Text.Combine(List.Transform(_, Text.From), ","), type text}),
#"Expanded Custom" = Table.ExpandListColumn(#"Extracted Values", "Custom"),
#"Expanded Custom1" = Table.ExpandRecordColumn(#"Expanded Custom", "Custom", {"k"}, {"k"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom1", each ([k] <> null)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index"})
in
#"Removed Columns"
Thank you
@Anonymous change to
Table.AddIndexColumn(C2T, "Index", 0, 1)
Thank you, it went past that error and struck at list.(Extracted Values)
Also the CT step is showing errors as I have 700 rows, it is showing error in every row with only 7 rows.
Table.AddIndexColumn(table as table, newColumnName as text, optional initialValue as nullable number, optional increment as nullable number, optional columnType as nullable type) as table
It went past this, syntax says you are right, not sure why that error has come
but now I have error saying "40 keys were specified, but 4 values were provided" details: List
@Anonymous provide a dummy dataset with 40 columns with maybe 5 rows each
Sure @smpa01 Thank you. But unfortunately today whole is in some meetings. I will do this later today. Thanks a ton again
@Anonymous no worries. Tag me back with @ when you can make that available.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
Custom1 = let
a=Table.ColumnNames(Source)
in
Table.FromColumns(
List.Transform(
a,
each List.Transform(
List.RemoveItems(a,{_}),
(x)=>
let
aa=List.RemoveItems(Table.Column(Source,x),{"",null}),
bb=List.RemoveItems(Table.Column(Source,_),{"",null}),
cc=List.Count(
List.PositionOfAny(
aa,
bb,
2
)
)
in
Number.ToText(cc,x&":0"&(if List.Count(aa)=cc or List.Count(bb)=cc then " & all match" else ""))
)
),
a
)
in
Custom1
@Anonymous can you try this
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
CT = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}}),
DH = Table.DemoteHeaders(CT),
CN = Record.ToList(DH{0}),
src=List.Generate(
()=>[i=0,j=List.RemoveItems(Table.Column(CT,CN{i}),{""})],
each [i]<List.Count(CN),
each [i=[i]+1,j=List.RemoveItems(Table.Column(CT,CN{i}),{""})],
each [j]
),
C2T = Table.FromList(src, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
AI = Table.AddIndexColumn(C2T, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(AI, "Custom", each let x = [Index],
a= [Column1],
c = CN{x},
y=
List.Generate(
()=>[i=0,j=x,k=src{i},l=List.Count(List.Intersect({a,k})),m=c&" has "&Text.From(l)&" matches with "&CN{i}],
each [i]<List.Count(src),
each [i=[i]+1,j=x,k=src{i},l=List.Count(List.Intersect({a,k})),m=c&" has "&Text.From(l)&" matches with "&CN{i}],
each [k= if [i]<>[j] then [m] else null]
) in y),
#"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Column1", each Text.Combine(List.Transform(_, Text.From), ","), type text}),
#"Expanded Custom" = Table.ExpandListColumn(#"Extracted Values", "Custom"),
#"Expanded Custom1" = Table.ExpandRecordColumn(#"Expanded Custom", "Custom", {"k"}, {"k"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom1", each ([k] <> null)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index"})
in
#"Removed Columns"
Hi, pal, ur dummy dataset benefits all others 😂
I knew List.Generate is overwhelmingly powerful; but ever since I managed to understand the intricacy of List.Accumulate/List.TransformMany, List.Generate became a benchwarmer on my end.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
Cols = List.Transform(Table.ToColumns(Source), each List.Select(_, each _<>"")),
Comparison =
let l = {0..List.Count(Cols)-1}
in
List.TransformMany(
l,
each List.RemoveItems(l, {_}),
(x,y) =>
{Text.Combine(Cols{x}, ","), Text.Combine(Cols{y}, ","), Text.Format("Column#{0} has #{1} matche(s) with Column#{2}", {x+1, List.Count(List.Intersect({Cols{x}, Cols{y}})), y+1})}
),
Result = Table.FromRows(Comparison, {"ColX", "ColY", "Res"})
in
Result
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
Hey @AlexisOlson Thank you once again sir. Yes I want to consolidate and reduce the data sets by adding some elements, Well we could combine all and have one data sets but the deciding factor would be data set differences. That would be not my call, I am only asked to find how many elements are same /different between column.
Thanks a lot as always.
Can you link to a file or at least post some data in a format that doesn't need to be manually keyed in?
@AlexisOlson oops the file was done in home computer and I can upload to git and give you a link. Unfortunately my bad luck cannot be done with restrictions in office and I will upload today evening or find some ways later in the day. Have a great day
What are you ultimately trying to get to? It's easier to consolidate all of the columns than to find the intersection count for each pair.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.