cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
RokuCap
Frequent Visitor

Comparing Multiple Columns for Duplicate Rows to Identify Mismatching Data

I have appended tables from 2 databases only kept the rows with duplicate values (A & B). The goal is to compare the values (Field1, Field2 and Field3) for the duplicate values in the "ID" column and identify discrepancies. Then have a conditional formula to describe which columns do not match. In the case where there are multiple mismatching columns, then this will be concatenated in the "Comment": column text e.g. "Field1 Mismatch; Field2 Mismatch".

 

My initial attempt involved keeping the tables separate, merging them based on the ID value and then adding conditional check columns for each field. However, due to how large these tables are, I'm trying to minimize my use of the "Merge" function.

 
SourceIDField1Field2Field3Comment
Database1A1JapanYesField2 Mismatch
Database2A1CanadaYesField2 Mismatch
Database1B3AustraliaNoField1 Mismatch; Field2 Mismatch
Database2B1New ZealandNoField1 Mismatch; Field2 Mismatch
 

Thanks in advance!

 

1 ACCEPTED SOLUTION
ImkeF
Super User
Super User

Hi @RokuCap ,
you can try to give it a go with grouping instead:

let
  Source = Table.FromRows(
    Json.Document(
      Binary.Decompress(
        Binary.FromText(
          "i45WckksSUxKLE41VNJRcgRiEO2VWJCYB6QjU4uVYnUQaoyQ1Dgn5iWmJGJRBJJ0AmJjkOLS4pKixJxMkDq/fAyznKBm+aWWK0SlJuYk5qVAFcYCAA==", 
          BinaryEncoding.Base64
        ), 
        Compression.Deflate
      )
    ), 
    let
      _t = ((type nullable text) meta [Serialized.Text = true])
    in
      type table [Source = _t, ID = _t, Field1 = _t, Field2 = _t, Field3 = _t]
  ), 
  #"Changed Type" = Table.TransformColumnTypes(
    Source, 
    {
      {"Source", type text}, 
      {"ID", type text}, 
      {"Field1", Int64.Type}, 
      {"Field2", type text}, 
      {"Field3", type text}
    }
  ), 
  #"Grouped Rows" = Table.Group(
    #"Changed Type", 
    {"ID"}, 
    {
      {
        "Mismatches", 
        each Text.Combine(
          List.Select(
            List.Transform(
              {"Field1", "Field2", "Field3"}, 
              (l) =>
                if List.Count(List.Distinct(Table.Column(_, l))) > 1 then "Mismatch " & l else null
            ), 
            (s) => s <> null
          ), 
          ", "
        )
      }, 
      {"All", each _}
    }
  ), 
  #"Expanded All" = Table.ExpandTableColumn(
    #"Grouped Rows", 
    "All", 
    {"Source", "Field1", "Field2", "Field3"}
  )
in
  #"Expanded All"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

1 REPLY 1
ImkeF
Super User
Super User

Hi @RokuCap ,
you can try to give it a go with grouping instead:

let
  Source = Table.FromRows(
    Json.Document(
      Binary.Decompress(
        Binary.FromText(
          "i45WckksSUxKLE41VNJRcgRiEO2VWJCYB6QjU4uVYnUQaoyQ1Dgn5iWmJGJRBJJ0AmJjkOLS4pKixJxMkDq/fAyznKBm+aWWK0SlJuYk5qVAFcYCAA==", 
          BinaryEncoding.Base64
        ), 
        Compression.Deflate
      )
    ), 
    let
      _t = ((type nullable text) meta [Serialized.Text = true])
    in
      type table [Source = _t, ID = _t, Field1 = _t, Field2 = _t, Field3 = _t]
  ), 
  #"Changed Type" = Table.TransformColumnTypes(
    Source, 
    {
      {"Source", type text}, 
      {"ID", type text}, 
      {"Field1", Int64.Type}, 
      {"Field2", type text}, 
      {"Field3", type text}
    }
  ), 
  #"Grouped Rows" = Table.Group(
    #"Changed Type", 
    {"ID"}, 
    {
      {
        "Mismatches", 
        each Text.Combine(
          List.Select(
            List.Transform(
              {"Field1", "Field2", "Field3"}, 
              (l) =>
                if List.Count(List.Distinct(Table.Column(_, l))) > 1 then "Mismatch " & l else null
            ), 
            (s) => s <> null
          ), 
          ", "
        )
      }, 
      {"All", each _}
    }
  ), 
  #"Expanded All" = Table.ExpandTableColumn(
    #"Grouped Rows", 
    "All", 
    {"Source", "Field1", "Field2", "Field3"}
  )
in
  #"Expanded All"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
November 2022 Update

Check it Out!

Click here to read more about the November 2022 updates!

Microsoft 365 Conference â__ December 6-8, 2022

Microsoft 365 Conference - 06-08 December

Join us in Las Vegas to experience community, incredible learning opportunities, and connections that will help grow skills, know-how, and more.

Power BI Dev Camp Session 27

Ted's Dev Camp

This session walks through creating a new Azure AD B2C tenant and configuring it with user flows and custom policies.

Top Solution Authors
Top Kudoed Authors