Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
CoreX
Regular Visitor

Remove duplicate rows based on max value of a different column

Hi All,

 

I'm attempting to filter some of my data and I'm having problems getting it correct.  Through SQL I'm pulling some sales data in to a table.  Each sales order has it's own unique number (Sales Order Number) which can have multiple rows of data (a row for each item on the sales order).  As sales orders are updated our system will archive off the old version, applying a version number to each row of data, and create a new version number and apply that to each row of data.

 

I need to be able to extract all rows of a Sales Order Number which match the max value of the Version.  

 

An example of the data is below.

 

SALES ORDER NUMBERVERSIONITEM
100001Item 1
100001Item 2
100001Item 3
100002Item 1
100002Item 2
100002Item 3
100003Item 1
100003Item 2
100003Item 3
100011Item 1
100012Item 1
100021Item 3
100031Item 2
100031Item 4
100032Item 2
100032Item 4

 

Any help is appreciated!

1 ACCEPTED SOLUTION
v-jiascu-msft
Employee
Employee

Hi @CoreX,

 

You want all the rows. So we need to create a new table. There are two options. Please have a try.

Option 1: (Data source: Sales_Option1)

1. Add one column to the table, which will be the Max version.

LatestVersion =
CALCULATE (
    MAX ( 'Sales_Option1'[VERSION] ),
    ALLEXCEPT ( Sales_Option1, Sales_Option1[SALES ORDER NUMBER] )
)

2. Create a new table.

 

Result_Option1 =
FILTER ( Sales_Option1, Sales_Option1[VERSION] = Sales_Option1[LatestVersion] )

Remove duplicate rows based on max value of a different column.jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Option 2: (Data source: Sales_Option2)

One step:

 

Result_Option2 =
FILTER (
    ADDCOLUMNS (
        Sales_Option2,
        "latestV", CALCULATE (
            MAX ( Sales_Option2[VERSION] ),
            ALLEXCEPT ( Sales_Option2, Sales_Option2[SALES ORDER NUMBER] )
        )
    ),
    Sales_Option2[VERSION] = [latestV]
)

Remove duplicate rows based on max value of a different column2.jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Best Regards!

Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

4 REPLIES 4
v-jiascu-msft
Employee
Employee

Hi @CoreX,

 

You want all the rows. So we need to create a new table. There are two options. Please have a try.

Option 1: (Data source: Sales_Option1)

1. Add one column to the table, which will be the Max version.

LatestVersion =
CALCULATE (
    MAX ( 'Sales_Option1'[VERSION] ),
    ALLEXCEPT ( Sales_Option1, Sales_Option1[SALES ORDER NUMBER] )
)

2. Create a new table.

 

Result_Option1 =
FILTER ( Sales_Option1, Sales_Option1[VERSION] = Sales_Option1[LatestVersion] )

Remove duplicate rows based on max value of a different column.jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Option 2: (Data source: Sales_Option2)

One step:

 

Result_Option2 =
FILTER (
    ADDCOLUMNS (
        Sales_Option2,
        "latestV", CALCULATE (
            MAX ( Sales_Option2[VERSION] ),
            ALLEXCEPT ( Sales_Option2, Sales_Option2[SALES ORDER NUMBER] )
        )
    ),
    Sales_Option2[VERSION] = [latestV]
)

Remove duplicate rows based on max value of a different column2.jpg

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Best Regards!

Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hey @v-jiascu-msft,

 

Do you have another way of doing this withing the query editor? I have a set of data that I am trying to remove duplicates based on the number of filled cells per row and I was able to get a number (like the version number) for each of my rows which will give me a max. But, once I get rid of the duplicates I want to be able to append this query with other ones, so I need it de-duped within the query editor. 

 

Any way you could help me with that? 

Hi @heimk008,

 

I would suggest you create a new thread in this forum to discuss your needs that is different from this one.

 

Best Regards,

Dale

Community Support Team _ Dale
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Option #1 worked like a charm.  Thank you so much!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.