cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
RokuCap
Frequent Visitor

Comparing Multiple Columns for Duplicate Rows to Identify Mismatching Data

I have appended tables from 2 databases only kept the rows with duplicate values (A & B). The goal is to compare the values (Field1, Field2 and Field3) for the duplicate values in the "ID" column and identify discrepancies. Then have a conditional formula to describe which columns do not match. In the case where there are multiple mismatching columns, then this will be concatenated in the "Comment": column text e.g. "Field1 Mismatch; Field2 Mismatch".

 

My initial attempt involved keeping the tables separate, merging them based on the ID value and then adding conditional check columns for each field. However, due to how large these tables are, I'm trying to minimize my use of the "Merge" function.

 
SourceIDField1Field2Field3Comment
Database1A1JapanYesField2 Mismatch
Database2A1CanadaYesField2 Mismatch
Database1B3AustraliaNoField1 Mismatch; Field2 Mismatch
Database2B1New ZealandNoField1 Mismatch; Field2 Mismatch
 

Thanks in advance!

 

1 ACCEPTED SOLUTION
ImkeF
Super User
Super User

Hi @RokuCap ,
you can try to give it a go with grouping instead:

let
  Source = Table.FromRows(
    Json.Document(
      Binary.Decompress(
        Binary.FromText(
          "i45WckksSUxKLE41VNJRcgRiEO2VWJCYB6QjU4uVYnUQaoyQ1Dgn5iWmJGJRBJJ0AmJjkOLS4pKixJxMkDq/fAyznKBm+aWWK0SlJuYk5qVAFcYCAA==", 
          BinaryEncoding.Base64
        ), 
        Compression.Deflate
      )
    ), 
    let
      _t = ((type nullable text) meta [Serialized.Text = true])
    in
      type table [Source = _t, ID = _t, Field1 = _t, Field2 = _t, Field3 = _t]
  ), 
  #"Changed Type" = Table.TransformColumnTypes(
    Source, 
    {
      {"Source", type text}, 
      {"ID", type text}, 
      {"Field1", Int64.Type}, 
      {"Field2", type text}, 
      {"Field3", type text}
    }
  ), 
  #"Grouped Rows" = Table.Group(
    #"Changed Type", 
    {"ID"}, 
    {
      {
        "Mismatches", 
        each Text.Combine(
          List.Select(
            List.Transform(
              {"Field1", "Field2", "Field3"}, 
              (l) =>
                if List.Count(List.Distinct(Table.Column(_, l))) > 1 then "Mismatch " & l else null
            ), 
            (s) => s <> null
          ), 
          ", "
        )
      }, 
      {"All", each _}
    }
  ), 
  #"Expanded All" = Table.ExpandTableColumn(
    #"Grouped Rows", 
    "All", 
    {"Source", "Field1", "Field2", "Field3"}
  )
in
  #"Expanded All"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

1 REPLY 1
ImkeF
Super User
Super User

Hi @RokuCap ,
you can try to give it a go with grouping instead:

let
  Source = Table.FromRows(
    Json.Document(
      Binary.Decompress(
        Binary.FromText(
          "i45WckksSUxKLE41VNJRcgRiEO2VWJCYB6QjU4uVYnUQaoyQ1Dgn5iWmJGJRBJJ0AmJjkOLS4pKixJxMkDq/fAyznKBm+aWWK0SlJuYk5qVAFcYCAA==", 
          BinaryEncoding.Base64
        ), 
        Compression.Deflate
      )
    ), 
    let
      _t = ((type nullable text) meta [Serialized.Text = true])
    in
      type table [Source = _t, ID = _t, Field1 = _t, Field2 = _t, Field3 = _t]
  ), 
  #"Changed Type" = Table.TransformColumnTypes(
    Source, 
    {
      {"Source", type text}, 
      {"ID", type text}, 
      {"Field1", Int64.Type}, 
      {"Field2", type text}, 
      {"Field3", type text}
    }
  ), 
  #"Grouped Rows" = Table.Group(
    #"Changed Type", 
    {"ID"}, 
    {
      {
        "Mismatches", 
        each Text.Combine(
          List.Select(
            List.Transform(
              {"Field1", "Field2", "Field3"}, 
              (l) =>
                if List.Count(List.Distinct(Table.Column(_, l))) > 1 then "Mismatch " & l else null
            ), 
            (s) => s <> null
          ), 
          ", "
        )
      }, 
      {"All", each _}
    }
  ), 
  #"Expanded All" = Table.ExpandTableColumn(
    #"Grouped Rows", 
    "All", 
    {"Source", "Field1", "Field2", "Field3"}
  )
in
  #"Expanded All"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
Carousel_PBI_Wave1

2023 Release Wave 1 Plans

Power BI release plans for 2023 release wave 1 describes all new features releasing from April 2023 through September 2023.

Power BI Dev Camp Session 30

Ted's Dev Camp - January 26, 2023

This session will provide guidance and teach campers the skills required to build Power BI reports that support multiple languages.

Power BI Summit Carousel 2

Global Power BI Training

Make sure you register today for the Power BI Summit 2023. Don't miss all of the great sessions and speakers!

Thank you 2022 Review

2022 Monthly Feature Releases

We had a great 2022 with a ton of feature releases to help you drive a data culture.

Top Solution Authors
Top Kudoed Authors