cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Regular Visitor

Remove duplicate rows based on max value of a different column

Hi All,

 

I'm attempting to filter some of my data and I'm having problems getting it correct.  Through SQL I'm pulling some sales data in to a table.  Each sales order has it's own unique number (Sales Order Number) which can have multiple rows of data (a row for each item on the sales order).  As sales orders are updated our system will archive off the old version, applying a version number to each row of data, and create a new version number and apply that to each row of data.

 

I need to be able to extract all rows of a Sales Order Number which match the max value of the Version.  

 

An example of the data is below.

 

SALES ORDER NUMBERVERSIONITEM
100001Item 1
100001Item 2
100001Item 3
100002Item 1
100002Item 2
100002Item 3
100003Item 1
100003Item 2
100003Item 3
100011Item 1
100012Item 1
100021Item 3
100031Item 2
100031Item 4
100032Item 2
100032Item 4

 

Any help is appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
Microsoft
Microsoft

Hi @CoreX,

 

You want all the rows. So we need to create a new table. There are two options. Please have a try.

Option 1: (Data source: Sales_Option1)

1. Add one column to the table, which will be the Max version.

LatestVersion =
CALCULATE (
    MAX ( 'Sales_Option1'[VERSION] ),
    ALLEXCEPT ( Sales_Option1, Sales_Option1[SALES ORDER NUMBER] )
)

2. Create a new table.

 

Result_Option1 =
FILTER ( Sales_Option1, Sales_Option1[VERSION] = Sales_Option1[LatestVersion] )

Remove duplicate rows based on max value of a different column.jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Option 2: (Data source: Sales_Option2)

One step:

 

Result_Option2 =
FILTER (
    ADDCOLUMNS (
        Sales_Option2,
        "latestV", CALCULATE (
            MAX ( Sales_Option2[VERSION] ),
            ALLEXCEPT ( Sales_Option2, Sales_Option2[SALES ORDER NUMBER] )
        )
    ),
    Sales_Option2[VERSION] = [latestV]
)

Remove duplicate rows based on max value of a different column2.jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Best Regards!

Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

4 REPLIES 4
Highlighted
Microsoft
Microsoft

Hi @CoreX,

 

You want all the rows. So we need to create a new table. There are two options. Please have a try.

Option 1: (Data source: Sales_Option1)

1. Add one column to the table, which will be the Max version.

LatestVersion =
CALCULATE (
    MAX ( 'Sales_Option1'[VERSION] ),
    ALLEXCEPT ( Sales_Option1, Sales_Option1[SALES ORDER NUMBER] )
)

2. Create a new table.

 

Result_Option1 =
FILTER ( Sales_Option1, Sales_Option1[VERSION] = Sales_Option1[LatestVersion] )

Remove duplicate rows based on max value of a different column.jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Option 2: (Data source: Sales_Option2)

One step:

 

Result_Option2 =
FILTER (
    ADDCOLUMNS (
        Sales_Option2,
        "latestV", CALCULATE (
            MAX ( Sales_Option2[VERSION] ),
            ALLEXCEPT ( Sales_Option2, Sales_Option2[SALES ORDER NUMBER] )
        )
    ),
    Sales_Option2[VERSION] = [latestV]
)

Remove duplicate rows based on max value of a different column2.jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Best Regards!

Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

Highlighted

Option #1 worked like a charm.  Thank you so much!

Highlighted

Hey @v-jiascu-msft,

 

Do you have another way of doing this withing the query editor? I have a set of data that I am trying to remove duplicates based on the number of filled cells per row and I was able to get a number (like the version number) for each of my rows which will give me a max. But, once I get rid of the duplicates I want to be able to append this query with other ones, so I need it de-duped within the query editor. 

 

Any way you could help me with that? 

Highlighted

Hi @heimk008,

 

I would suggest you create a new thread in this forum to discuss your needs that is different from this one.

 

Best Regards,

Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
Community Conference

Power Platform Community Conference

Check out the on demand sessions that are available now!

Community Conference

Microsoft Power Platform Communities

Check out the Winners!

secondImage

Create an end-to-end data and analytics solution

Learn how Power BI works with the latest Azure data and analytics innovations at the digital event with Microsoft CEO Satya Nadella.

Top Solution Authors
Top Kudoed Authors