Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

comparing each column with every column to find matching number

Hi Experts,

I am consolidating the data and need your ideas and how to do it

Problem statement: I have about 65+ columns of data (not ordered) and I need to consolidate to group them and the way is to find each column /List with every list and get a matching elements number and by analyzing each column, would give some decision making input what data source can be eliminated by bringing the missing elements and thus my idea to compress these 65 to as less as possible.

 

I would have to do vlookup/xlookup() against half the columns say about 40 columns to get this data manually.

Here is the sample in the picutre. Please can you give me ideas / M code which can do this

consolidate columns.PNG

 

Thanks a lot

16 REPLIES 16
v-angzheng-msft
Community Support
Community Support

Hi, @Anonymous 

 

Any update?

 

May I ask if your problem has been solved? Is the above post helpful to you?

If  it does, could you please mark the post which help as Answered? It will help the others in the community find the solution easily if they face the same problem with you. Thank you.

 

 

 

Best Regards,
Community Support Team _ Zeon Zheng

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Anonymous
Not applicable

@wdx223_Daniel @smpa01 @CNENFRNL Thank you ALL, Worked on my work file but this one stripped to 4 columns, Please I get help to change to process the data to AK columns. so I changed the Source to reflect 40 columns and I could not understand what else can I change. 
Only if the loop can read number of columns, this is exact solution I was hoping for. 

Again thank you every one each one is tested all good, can you please give me hint what else to change to make the code working. 

here is the changed Source data. 

Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t, Column6 = _t, Column7 = _t, Column8 = _t, Column9 = _t, Column10 = _t, Column11 = _t, Column12 = _t, Column13 = _t, Column14 = _t, Column15 = _t, Column16 = _t, Column17 = _t, Column18 = _t, Column19 = _t, Column20 = _t,Column21 = _t, Column22 = _t, Column23 = _t, Column24 = _t, Column25 = _t,Column26 = _t, Column27 = _t, Column28 = _t, Column29 = _t, Column30 = _t,Column31 = _t, Column32 = _t, Column33 = _t, Column34 = _t, Column35 = _t,Column36 = _t, Column37 = _t, Column38 = _t, Column39 = _t, Column40 = _t]),


Thanks a ton.

@Anonymous my solution is based on loop; so whether the data set has 40 or 400 columns the loop will work. I am not sure if you had a chance to try out the solution with the real data yet. 

 

If you prefer mine, once you defined Source, copy everything from CT to 

#"Removed Columns"
Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs
Anonymous
Not applicable

@smpa01  Thank you but I tried all the solutions 1st and last one works with many columns but strips to 4 columns as I see the follow up code of x and y etc but your code throws up error may be easy fix in the step AI. Please can you have a look.

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t, Column6 = _t, Column7 = _t, Column8 = _t, Column9 = _t, Column10 = _t, Column11 = _t, Column12 = _t, Column13 = _t, Column14 = _t, Column15 = _t, Column16 = _t, Column17 = _t, Column18 = _t, Column19 = _t, Column20 = _t,Column21 = _t, Column22 = _t, Column23 = _t, Column24 = _t, Column25 = _t,Column26 = _t, Column27 = _t, Column28 = _t, Column29 = _t, Column30 = _t,Column31 = _t, Column32 = _t, Column33 = _t, Column34 = _t, Column35 = _t,Column36 = _t, Column37 = _t, Column38 = _t, Column39 = _t, Column40 = _t]),
    CT = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}}),
    DH = Table.DemoteHeaders(CT),
    CN = Record.ToList(DH{0}),
    src=List.Generate(
                ()=>[i=0,j=List.RemoveItems(Table.Column(CT,CN{i}),{""})],
                   each [i]<List.Count(CN),
                   each [i=[i]+1,j=List.RemoveItems(Table.Column(CT,CN{i}),{""})],
                   each [j]
    ),
    C2T = Table.FromList(src, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    AI = Table.AddIndexColumn(C2T, "Index", 0, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(AI, "Custom", each let x = [Index],
    a= [Column1],
    c = CN{x},
y=
List.Generate(
           ()=>[i=0,j=x,k=src{i},l=List.Count(List.Intersect({a,k})),m=c&" has "&Text.From(l)&" matches with "&CN{i}],
           each [i]<List.Count(src),
           each [i=[i]+1,j=x,k=src{i},l=List.Count(List.Intersect({a,k})),m=c&" has "&Text.From(l)&" matches with "&CN{i}],
           each [k= if [i]<>[j] then [m] else null] 
    ) in y),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Column1", each Text.Combine(List.Transform(_, Text.From), ","), type text}),
    #"Expanded Custom" = Table.ExpandListColumn(#"Extracted Values", "Custom"),
    #"Expanded Custom1" = Table.ExpandRecordColumn(#"Expanded Custom", "Custom", {"k"}, {"k"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Custom1", each ([k] <> null)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index"})
in
    #"Removed Columns"

AI_error.PNG

 

Thank you

@Anonymous  change to

Table.AddIndexColumn(C2T, "Index", 0, 1)

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs
Anonymous
Not applicable

Thank you, it went past that error and struck at list.(Extracted Values)

Also the CT step is showing errors as I have 700 rows, it is showing error in every row with only 7 rows. 

mythbusternz_0-1639520843304.png

 

Table.AddIndexColumn(table as table, newColumnName as text, optional initialValue as nullable number, optional increment as nullable number, optional columnType as nullable type) as table

 

It went past this, syntax says you are right, not sure why that error has come 

but now I have error saying "40 keys were specified, but 4 values were provided" details: List 

@Anonymous provide a dummy dataset with 40 columns with maybe 5 rows each

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs
Anonymous
Not applicable

Sure @smpa01  Thank you. But unfortunately today whole is in some meetings. I will do this later today. Thanks a ton again

@Anonymous  no worries. Tag me back with @ when you can make that available.

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs
wdx223_Daniel
Super User
Super User

wdx223_Daniel_0-1639449184376.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
    Custom1 = let
                 a=Table.ColumnNames(Source)
              in
                 Table.FromColumns(
                                   List.Transform(
                                                  a,
                                                  each List.Transform(
                                                                      List.RemoveItems(a,{_}),
                                                                      (x)=>
                                                                           let
                                                                              aa=List.RemoveItems(Table.Column(Source,x),{"",null}),
                                                                              bb=List.RemoveItems(Table.Column(Source,_),{"",null}),
                                                                              cc=List.Count(
                                                                                            List.PositionOfAny(
                                                                                                               aa,
                                                                                                               bb,
                                                                                                               2
                                                                                                              )
                                                                                           )
                                                                           in
                                                                             Number.ToText(cc,x&":0"&(if List.Count(aa)=cc or List.Count(bb)=cc then " & all match" else ""))
                                                                     )
                                                 ),
                                   a
                                  )
in
    Custom1
smpa01
Super User
Super User

@Anonymous  can you try this

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
    CT = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}}),
    DH = Table.DemoteHeaders(CT),
    CN = Record.ToList(DH{0}),
    src=List.Generate(
                ()=>[i=0,j=List.RemoveItems(Table.Column(CT,CN{i}),{""})],
                   each [i]<List.Count(CN),
                   each [i=[i]+1,j=List.RemoveItems(Table.Column(CT,CN{i}),{""})],
                   each [j]
    ),
    C2T = Table.FromList(src, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    AI = Table.AddIndexColumn(C2T, "Index", 0, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(AI, "Custom", each let x = [Index],
    a= [Column1],
    c = CN{x},
y=
List.Generate(
           ()=>[i=0,j=x,k=src{i},l=List.Count(List.Intersect({a,k})),m=c&" has "&Text.From(l)&" matches with "&CN{i}],
           each [i]<List.Count(src),
           each [i=[i]+1,j=x,k=src{i},l=List.Count(List.Intersect({a,k})),m=c&" has "&Text.From(l)&" matches with "&CN{i}],
           each [k= if [i]<>[j] then [m] else null] 
    ) in y),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Column1", each Text.Combine(List.Transform(_, Text.From), ","), type text}),
    #"Expanded Custom" = Table.ExpandListColumn(#"Extracted Values", "Custom"),
    #"Expanded Custom1" = Table.ExpandRecordColumn(#"Expanded Custom", "Custom", {"k"}, {"k"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Custom1", each ([k] <> null)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index"})
in
    #"Removed Columns"
Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs
CNENFRNL
Community Champion
Community Champion

Hi, pal, ur dummy dataset benefits all others 😂

 

I knew List.Generate is overwhelmingly powerful; but ever since I managed to understand the intricacy of List.Accumulate/List.TransformMany, List.Generate became a benchwarmer on my end.

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSAWMnIHZWitWJBrMgPB0lF7CIM5yno+QKFgGxgMg/LxVIhpTngwUhYiAekMwoSk1FFgXzdZTc8kuLkIRBIpllQIWxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
    Cols = List.Transform(Table.ToColumns(Source), each List.Select(_, each _<>"")),
    Comparison = 
        let l = {0..List.Count(Cols)-1}
        in 
            List.TransformMany(
                l,
                each List.RemoveItems(l, {_}),
                (x,y) => 
                    {Text.Combine(Cols{x}, ","), Text.Combine(Cols{y}, ","), Text.Format("Column#{0} has #{1} matche(s) with Column#{2}", {x+1, List.Count(List.Intersect({Cols{x}, Cols{y}})), y+1})}
            ),
    Result = Table.FromRows(Comparison, {"ColX", "ColY", "Res"})
in
    Result

 

 

 


Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension!

DAX is simple, but NOT EASY!

Anonymous
Not applicable

Hey @AlexisOlson  Thank you once again sir. Yes I want to consolidate and reduce the data sets by adding some elements, Well we could combine all and have one data sets but the deciding factor would be data set differences. That would be not my call, I am only asked to find how many elements are same /different between column. 

Thanks a lot as always. 

Can you link to a file or at least post some data in a format that doesn't need to be manually keyed in?

Anonymous
Not applicable

@AlexisOlson oops the file was done in home computer and I can upload to git and give you a link. Unfortunately my bad luck cannot be done with restrictions in office and I will upload today evening or find some ways later in the day. Have a great day

AlexisOlson
Super User
Super User

What are you ultimately trying to get to? It's easier to consolidate all of the columns than to find the intersection count for each pair.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors