Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
jnixon
Helper III
Helper III

Comparing lists

Hi everyone!  I have a table that has two columns that contain lists of text items and would like to add a third column that would display whether the lists match:

  Table.JPG

The [List X]{0} differs from [List X]{1} and [List X]{2}, same for [List Y] records.

 

So I am trying to create a third column that returns for each record "True" if each list member in List X matches the corresponding list member in List Y. I'm not trying to compare the Distinct Members (e.g. using List.Compare).  Rather, if [List X]{0} = {1,2,3} and [List Y]{0} = {3,2,1}, then the [New Column]{0} should have "False".

 

Can someone advise the M code to add this column?  I've tried List.Accumulate, but could not figure it out.

Thanks!

Jeff

1 ACCEPTED SOLUTION
Jimmy801
Community Champion
Community Champion

Hello

as lists can be compared by themselves,  meaning {}={} you can use the following syntax

=Table.AddColumn(YourTable, "ListX=ListY", each [List X] = [List Y])

 

have fun

Jimmy

View solution in original post

16 REPLIES 16
Jimmy801
Community Champion
Community Champion

Hello

as lists can be compared by themselves,  meaning {}={} you can use the following syntax

=Table.AddColumn(YourTable, "ListX=ListY", each [List X] = [List Y])

 

have fun

Jimmy

Wow, worked like a charm!  i was definitely overthinking things.  I guess I'll put off fully understanding List.Accumulate to another day - no worries!

Thanks. 

Anonymous
Not applicable

Well that was easy. I did not even think to do that. Thanks for the tip!

Anonymous
Not applicable

The code below should handle it.

 

List.Accumulate(List.Positions([List X]), true, (s,c) => s and ([List X]{c} = [List Y]{c}))

Unfortunately, i'm getting this error:

"Expression.Error: There is an unknown identifier. Did you use the [field] shorthand for a _[field] outside of an 'each' expression?"

 

This happens with or without a Table.Addcolumn wrapper. I get it when i run any of the nested functions.  When I adapt the code to this:

 

    PositionList = List.Positions(Table.Column(ListInputTable,"List X"){0}),   
    AddCheckCol = Table.AddColumn(ListInputTable,"Duplicate", each
        List.Accumulate(PositionList, true, (s,c) => s and (Table.Column(ListInputTable,"List X"){c} =
        Table.Column(ListInputTable,"List Y"){c}))
        ),

 

I get a table of the desired structure:, but the result is not correct (should be 'TRUE' in Row 2): 

 

Table2.JPG

 

Any ideas?

 

Thanks for the assist,

Jeff

Anonymous
Not applicable

What is the data in the lists?

The table is a cross join of three lists: DataColA, B and C below.  List "DataColA" = List "DataColC" <> List "DataColB".  Essentially, this is supposed to check to see if any column in the input table is a duplicate of any other column in that table.

So there should be a True for Row 2 record and a False for the others:

Lists:

DataColADataColBDataColC
BenTestBen
DanDanDan
NormNormNorm
ValerieValerieValerie
   

References in data table:

 

Record

List XList Y

1

DataColADataColB
2DataColADataColC
3DataColBDataColC

 

Anonymous
Not applicable

Let me know if this is what you are looking for. I am sure that my creation of the cross lists is too cumbersome, but it shows the list.accumulate in action.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WckrNU9JRCkktLgFSIE6sTrSSSyJIEEGCxPzyi3KBXGQKJByWmJNalJkKFMJkxcYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [DataColA = _t, DataColB = _t, DataColC = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"DataColA", type text}, {"DataColB", type text}, {"DataColC", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {}, {{"A", each List.Combine({[DataColA]}), type list}, {"B", each List.Combine({[DataColB]}), type list},{"C", each List.Combine({[DataColC]}), type list}}),
    RemoveC = Table.RemoveColumns(#"Grouped Rows",{"C"}),
    ListAB = Table.RenameColumns(RemoveC,{{"A", "List X"}, {"B", "List Y"}}),
    RemoveB = Table.RemoveColumns(#"Grouped Rows",{"B"}),
    ListAC = Table.RenameColumns(RemoveB,{{"A", "List X"}, {"C", "List Y"}}),
    RemoveA = Table.RemoveColumns(#"Grouped Rows",{"A"}),
    ListBC = Table.RenameColumns(RemoveA,{{"B", "List X"}, {"C", "List Y"}}),
    Combine = ListBC & ListAC & ListAB,
    AddCheckColumn = Table.AddColumn(Combine, "Custom", each List.Accumulate(List.Positions([List X]), true, (s,c) => s and ([List X]{c} = [List Y]{c})))
in
    AddCheckColumn

 

 

For the record, this indeed worked, using my "tblInput" as the Source.  And I'm going to have to understand what you did with that initial JSON function!  My guess is that you scraped the post and used some sort of converter to generate the binary?  Or perhaps it was a pointer to the post itself?  In any case, thanks for the help...

Anonymous
Not applicable

You are giving me too much credit. I just copied and pasted the data from the browser into Power BI as a separate table 😊.

 

Step1.PNG

 

Paste into here

Step2.PNG

 

From here, the rest of the query turned this data into your lists.

 

Regards,

Mike

OK Gentlemen, this is my final note on the post.  Thanks again for all your help.

 

Problem:  Dimensional entities in data warehouse have fields/columns that seem to be duplicates but can't say for sure.  How can we test?  Can be hard to say when table runs into thousands or more rows with few and subtle differences.

Solution:  (Thanks to you) Connect to the data, remove all columns except for those of interest, then use this function:

 
(tblInput)=>
let
    BufferedTable = Table.Buffer(tblInput),
    lstColNames = Table.ColumnNames(BufferedTable),
    Column1 = Table.FromColumns({lstColNames},{"Column 1"}),
    CrossJoin = Table.AddColumn(Column1,"Column2", each Column1),
    ExpandCrossJoin = Table.ExpandTableColumn(CrossJoin, "Column2", {"Column 1"}, {"Column 2"}),
    ChangeToText = Table.TransformColumnTypes(ExpandCrossJoin,{{"Column 2", type text}}),
    CompMatrix = Table.SelectRows(ChangeToText, each [Column 1] < [Column 2]),
    AddListX = Table.AddColumn(CompMatrix,"List X", each Table.Column(BufferedTable,[Column 1])),
    AddListY = Table.AddColumn(AddListX,"List Y", each Table.Column(BufferedTable,[Column 2])),
    AddCheckCol = Table.AddColumn(AddListY,"Duplicate?",each [List X]=[List Y]),
    OutputTable = Table.SelectColumns(AddCheckCol,{"Column 1", "Column 2", "Duplicate?"})
in
    OutputTable
 
Which turns the top table into the bottom table:
Finals.JPG

 

 

 

Jimmy801
Community Champion
Community Champion

Hello

don't get exactly what you want.

You have a list and want to check wheter in the list are duplicated entries?

What is your data structure? Maybe it would best to create a new thread.

Jimmy

Hi Jimmy, i was simply posting my original problem and the solution that i developed with your help.  Just to provide some context, code and example input/output for anyone that googles the blog with a similar need.  No need for any further assistance.  Thanks again,
Jeff

Jimmy801
Community Champion
Community Champion

okay.. that is nice @jnixon .... fine to see the result 🙂

have a nice evening

Jimmy

oops... 

for this thread I didn't use any Binary-function. 

simple wrote the sulution without testing. 

It was another user that posted the binary-solution 🙂

Jimmy801
Community Champion
Community Champion

Exactly... used the build-in data converter for tables. 

I could have use the #table function as well to enable data changes in the query itself

have a nice day

Jimmy

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors