Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Count unique values in a filtered column with earlier

Hey everybody, 

 

I don´t know why, but the forum decided to delete my post, so I am just typing it again... 

 

I am stuck with this problem since a while now, have googled and read at least 50 posts, but have not found help so far. I hope, somebody has an idea here.

 

The question I want to answer is quite simple. How many shipments are needed to fulfil one order. 
The dataset I am working with contains information about orders. If you order different product at the same time, for each of these product there will be a record - an order line. But they will get the same order id. Now due to stock levels and other reasons not all order lines of one order are shipped at the same time. Each order lines has an shipping id, which shows in which shipment the order was sent. 

 

So in a simplified version: 

order_id    shipping_id
1a
1b
1c
2a
2b
3

e

3e
3e
3e

 

The outcome should look like: 

order id    shipping id     shipping ids per order id
1a3
1b3
1c3
2a2
2b2
3e1
3e1
3e1

 

First, I tried to generate a table for each row which only shows the data with the exact same order id. To check, if my formula is correct, I counted the rows and controlled it manually. That worked: 

=COUNTX(FILTER('Outbound_3_ESC',EARLIER('Outbound_3_ESC'[order_id]) = 'Outbound_3_ESC'[order_id]),'Outbound_3_ESC'[order_id])

 

Then, I tried to count the distinct values for the shipping id in those tables: 

Shipments per order_id = CALCULATE(DISTINCTCOUNT(Outbound_3_ESC[shipment_id]),FILTER('Outbound_3_ESC',EARLIER(Outbound_3_ESC[order_id]) = 'Outbound_3_ESC'[order_id]))

The formula works, however the outcomes are wrong. I don´t know what is counted there.... I read in several posts, that this could happen, because I am doing it in a calculated column. And the solution in the forum always was to use a measure. However, in a measure the earlier function is not working... And there I am stucked.

Comments, feedback and solutions are well appreciated. I also am happy to answer any questions, which might appear regarding my problem description. 

Thanks in advance for any help! 
Jakob
1 ACCEPTED SOLUTION

Hi @Anonymous ,

Please try to create a calculated column as below to replace that measure and check if that is what you want:

Column = CALCULATE (
    DISTINCTCOUNT ( 'Outbound_3_ESC'[shipment_id] ),
    FILTER (
      ALLSELECTED(    'Outbound_3_ESC' ),
        'Outbound_3_ESC'[order_id] = EARLIER( 'Outbound_3_ESC'[order_id] )
    )
)

Best Regards

Rena

Community Support Team _ Rena
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

12 REPLIES 12
amitchandak
Super User
Super User

@Anonymous , Try as new column

calculate(distinctcount(Table[shipment id]), filter(Table, [order id] = earlier([order id])))

Anonymous
Not applicable

Hey 

 

Hi @Anonymous ,

Please try to create measures as below to replace the original calculated column:

Measure =
CALCULATE (
    DISTINCTCOUNT ( 'Outbound_3_ESC'[shipment_id] ),
    FILTER (
        ALLSELECTED ( 'Outbound_3_ESC' ),
        'Outbound_3_ESC'[order_id] = MAX ( 'Outbound_3_ESC'[order_id] )
    )
)
Measure 2 = SUMX(VALUES('Outbound_3_ESC'[order_id]),[Measure])

Count unique values in a filtered column with earlier.JPG

Best Regards

Rena

Community Support Team _ Rena
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Hey Rena, 

 

thanks for your help. That worked! 

I want to show the average of the ratio. How could I bring that into a visual? 
Do I understand your formula right - Can I simply change the SUMX into AVERAGEX to get the averages into the visual? 

AVERAGEX(VALUES('Outbound_3_ESC'[order_id]),[Measure])

 

Hi @Anonymous ,

Whether the following table is what you want? If no, please provide more details and examples about your expected result.

desired result.JPG

Best Regards

Rena

Community Support Team _ Rena
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Hey Rena,

 

The idea is to show the average number of shipments per order id in a column chart. Something like this: 

 Capture.PNG

 

 

 

If I enter the measure as value for the column chart it shows the sum of all measures separated into years. So I get the total number and not the average in the visualization. 

 

Thanks and best regards
Jakob

Hi @Anonymous ,

Sorry that I still not clear about your requirement. Could you please provide the correct average number of shipments per order id in the form of table just like the one in your original post? As checked your column chart, the date field be displayed as Axis fields. So where this date field is from? Whether it is from the same table with order id field?

The outcome should look like: 

order id    shipping id     shipping ids per order id
1a3
1b3
1c3
2a2
2b2
3e1
3e1
3e1


Best Regards

Rena

Community Support Team _ Rena
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Hi @v-yiruan-msft , 

 

The column chart was just used to give an idea of how the visualization should look like in the end. 
The calculation for the measure works perfectly. However, I have problems to bring it into a visualization. 
For a calculated column and I can simply choose show as average. However, this is not possible for a measure. 

Basically, in the visualization, the shipping ids per order id should be shown as an overall average in a column chart. For the numbers here the average is 2. This should then be represented by one column with the value 2. 
Capture.PNG

The original data set contains more information about year and categories. I am planning to use the year as axis to compare how it developed over time. 

Thanks again for your help.
Best Regards

staryuz

Hi @Anonymous ,

Please try to create a calculated column as below to replace that measure and check if that is what you want:

Column = CALCULATE (
    DISTINCTCOUNT ( 'Outbound_3_ESC'[shipment_id] ),
    FILTER (
      ALLSELECTED(    'Outbound_3_ESC' ),
        'Outbound_3_ESC'[order_id] = EARLIER( 'Outbound_3_ESC'[order_id] )
    )
)

Best Regards

Rena

Community Support Team _ Rena
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Hey @v-yiruan-msft , 

 

this is exactly what I wanted. Great! Thank you so much. 

 

I don´t understand exactly, why the other calculations came to false results. However, with this formula everything is correct and I can use it as planned in the charts. 

 

Thanks for your help and patience! 

Best Regards

staryuz

camargos88
Community Champion
Community Champion

Hi @Anonymous ,

 

Have you tried this:

Shipping id per order id = CALCULATE(DISTINCTCOUNT('Table'[shipping_id]), ALLEXCEPT('Table','Table'[order_id ]))



Did I answer your question? Mark my post as a solution!

Proud to be a Super User!



Anonymous
Not applicable

Hey camargos88,

 

thanks for the fast reply. 
The outcome is the same than with my approach - the calculated column shows wrong numbers. 
For example there is an order with 130 order lines, but only 4 shipment ids are used in these order lines. However, yours and my formula show 59.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.