Showing results for 
Search instead for 
Did you mean: 
vjnvinod Member

Power query tricky question

Dear community.


I have a table whose coloumns are below


Source.NameFiscal yearAccountAccountChannelAccountSegmentAccountSubSegmentAccountSectorAccountSubSector1Area1RegionGCSP1CountryServiceLineSubServiceLineTER_FYTD_USDTER_FYTD_ConstantRevenuePlanFY_ConstantNER_FYTD_ConstantMarginFYTD_ConstantMarginPFYTD_ConstantTER_PFYTD_ConstantSaleFYTD_constantPipelineWeightedOpen_constantRegion


I need to create additional coloumn with a condition that, if all of my row data matches except for category, then i want that to show it as duplicate

For instance, if there are  3 duplicate rows, then it should show 2 rows as duplicate and 1 row as original.


let me know how to achieve this in power query

adetogni Senior Member
Senior Member

Re: Power query tricky question

As a rule of thumb, each "row" in powerquery cannot reference to "other rows". Especially as there's not the concept of "next" and "previous" rows.
BUT if your objective is to remove duplicate rows, you can simply remove the Category column and then group data.

vjnvinod Member

Re: Power query tricky question



so you are telling there is no way we can achieve this?

my objective is to identify those rows which are duplicate and show that in different coloumn as "Duplicate" or "Original" but  i don't want to remove it for obvious reasons

adetogni Senior Member
Senior Member

Re: Power query tricky question

Well, then for doing something like that I suggest DAX as you can easily create a calculated column with a FILTER on all the rows of a group and then use a RANK to sort them. I don't know exactly how you can decide which one of the three is the original and which one is a duplicate, though.

ElenaN Member

Re: Power query tricky question



A solution would be the one attached below, however I am not that sure how doable it is considering you have a lot of columns by which to group by.


Steps would be to :

1. create an index for the rows in your table (ideal for the rows to be sorted so that the 1st row is the original one - in case it matters or there is a rule for this)

2. group by all columns except the category one and return count rows and also the previously created index from the new table created - so you know which rows are duplicated (the ones that have count > 1)

3. group again by the new count column and all the column except the category one and create min index inside each duplicated rows. 

4. create a new column by comparing initial index with min index inside a group to check which one is the 1st duplicated row and assign text either Duplicate or Original.

5. Remove not needed fields


    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUUoEYkelWJ1oJSMgKwmIncA8YyArGYidwTyYSicUHlAuFgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}, {"Column2", type text}, {"Column3", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    #"Grouped Rows" = Table.Group(#"Added Index", {"Column1", "Column2"}, {{"Count", each Table.RowCount(_), type number}, {"Details", each _, type table}}),
    #"Subtracted from Column" = Table.TransformColumns(#"Grouped Rows", {{"Count", each _ - 1, type number}}),
    #"Expanded Details" = Table.ExpandTableColumn(#"Subtracted from Column", "Details", {"Column3", "Index"}, {"Details.Column3", "Details.Index"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Details",{{"Details.Index", "Duplicates Index"}}),
    #"Grouped Rows1" = Table.Group(#"Renamed Columns", {"Column1", "Column2", "Count"}, {{"Min", each List.Min([Duplicates Index]), type number}, {"Details", each _, type table}}),
    #"Expanded Details1" = Table.ExpandTableColumn(#"Grouped Rows1", "Details", {"Details.Column3", "Duplicates Index"}, {"Details.Details.Column3", "Details.Duplicates Index"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Expanded Details1",{{"Min", "Min Index"}, {"Details.Duplicates Index", "Index"}}),
    #"Added Conditional Column" = Table.AddColumn(#"Renamed Columns1", "Is Duplicate", each if [Min Index] = [Index] then "Original" else "Duplicate"),
    #"Removed Columns" = Table.RemoveColumns(#"Added Conditional Column",{"Index", "Min Index", "Count"})
    #"Removed Columns"


I tested on the following data set:







where Column1 and Column 2 can be duplicated rows, while Column3 can have different values (like your mentioned category column)


Hope it helps,


vjnvinod Member

Re: Power query tricky question



Can you help with the DAX? i am not familiar with DAX.

also on the duplicate, it doesn't matter at all, for instance you have 3 rows repeating the values, we can mark 2 rows as duplicate & it doesn't matter which row

vjnvinod Member

Re: Power query tricky question



Can you share the PBIX file?

ElenaN Member

Re: Power query tricky question

adetogni Senior Member
Senior Member

Re: Power query tricky question

Maybe Elena's solutions will work, anyway


in DAX create a calclulated column


Ordinal =
VAR thisColumn1=YourTable[Column1]
VAR thisColumn2=YourTable[Column2]
-- create a var for each column that you have to group by


     YourTable[Column1]=thisColumn1 &&

     -- && repeat for each variable 
     ), YourTable[Category]

This will create a number from 1 to N for each row of the same "group". Then you add a secondary conditional column where if this column is 1 then is the valid one otherwise is duplicate.


There is also another solution that should provide exactly your result

IsDuplicate =
VAR thisColumn1 = YourTable[Column1]
VAR thisColumn2 = YourTable[Column2] 
-- create a var for each column that you have to group by
    IF (
        CALCULATE (
            MIN ( YourTable[Category] ),
            FILTER (
                ALL ( YourTable ),
                YourTable[Column1] = thisColumn1
                    && YourTable[Column2] = thisColumn2
  -- repeat for all variables
            = MIN ( YourTable[Category] ),


These solutions however won't probably work if your categories might be the same as rankx and max will find the same values. If so follow Elena's suggestion and create an Index column, and RANKX on that one instead of the category

vjnvinod Member

Re: Power query tricky question

@ElenaN @adetogni


superhelpful & both the solution works


i have another  tricky question on the same topic


i have a table "cross-Domain extract file" and  3 coloumns stated below

1)  “ClientSectorDesc” 

2)   "Sector as per Account list"

3)   "Account Name_Lookup"


In the table "cross-Domain extract file"

Filter Banking &Capital market in coloumn “ClientSectorDesc”  &

Select Null in “Sector as per Account list” column - Replace Null with BCM

and also With same filters on Replace null in  coloumn“Account Name_Lookup” with Others.

after the changes made, unfilter everything (like we do in excel)


how do i achieve this, i wasn't sucessfull using this find and replace method



Helpful resources

Can You Solve These Challenge

Challenge: Can You Solve These?

Find out how to participate in the first Power BI 'Can You Solve These?' challenge.

Community News & Announcements

Community News & Announcements

Get your latest community news and announcements.

Virtual Launch Event

Microsoft Business Applications October Virtual Launch Event

Join us for an in-depth look at the new innovations across Dynamics 365 and the Microsoft Power Platform.

Community Kudopalooza

Win Power BI Swag with Community Kudopalooza!

Each week, complete activities and be qualified in the drawing for cool Power BI Swag.

Users Online
Currently online: 44 members 846 guests
Please welcome our newest community members: