cancel
Showing results for
Did you mean:
Microsoft

## DISTINCT over SUMMARIZE

I have a measure that follows the pattern as mentioned below

``````Test Adds =
CALCULATE (
SUMX (
DISTINCT ( SUMMARIZE ( Sales, Sales[Order Date], Sales[Order Number] ) ),
IF ( Sales[Sales Amount] > 0, 1, 0 )
)
)``````

I was wondering if I really need the DISTINCT over a SUMMARIZE call as the latter would already return unique combinations of order date and order number. Also, is there a better way to calculate the total number of orders for each day for which the sales amount > 0 ?

1 ACCEPTED SOLUTION
Super User
I do not see the need for DISTINCT, SUMMARIZE already makes things distinct.

Maybe, but don't know your data to be sure. Should just be able to use COUNTROWS and FILTER the table returned from SUMMARIZE but not sure that is more efficient or not.

Check out my External Tool for Power BI Desktop! Microsoft Hates Greg's Quick Measures
Check out my latest book!

12 REPLIES 12
Super User

Hi,

Here's what i would do:

1. Create a Calendar Table and build a relationship from the Order date column of your Sales Table to the Date column of your Calendar Table.  Write calculated column formlas to extract various time dimensions such as Year, Month, Quarter
2. Drag time dimensions from the Calendar Table.
3. Write these measures

Total sales = SUM(Sales[Sales amount])

Orders which recorded sales = SUMX(FILTER(SUMMARIZE(Sales,Sales[Order Number],Calendar[Date],"Sale",[Total Sales]),[Sale]>0),[Sale])

Hope this helps.

Regards,
Ashish Mathur
http://www.ashishmathur.com
Microsoft

@Ashish_MathurWhy would I want to do a sum of [Sales] measure ? I am just trying to get a count of number of orders for each day where sales > 0. I shouldn't have to do a sum on Sales , right ?

Super User

Hi,

Does this work?

=COUNTROWS(FILTER(SUMMARIZE(Sales,Sales[Order Number],Calendar[Date],"Sale",[Total Sales]),[Sale]>0))

Regards,
Ashish Mathur
http://www.ashishmathur.com
Microsoft

@Ashish_Mathur  Yes, that is matching with my current metric results, I have two other versions of the same metric, I am going to test the performance of each metric and get back

Super User

I doubt you need to summarize there. Also, the formula should one of the two as per need

CALCULATE (
SUMX (
SUMMARIZE ( Sales, Sales[Order Date], Sales[Order Number],"_1",sumx ( Sales,if(Sales[Sales Amount] > 0, 1, 0 ) )),
[_1]
)
)

CALCULATE (
SUMX (
SUMMARIZE ( Sales, Sales[Order Date], Sales[Order Number],"_1",sum ( Sales[Sales Amount]) ),
if([_1] > 0, 1, 0 )
)
)

Dashboard of My Blogs !! Connect on Linkedin

Learn Power BI Beginners !! Advance Power BI Concepts !! Power BI For Tableau User !! Learn Power BI in Hindi !!
Proud to be a Super User!
Microsoft

@amitchandak  Would that have an performance improvement over my current metric definition ? If so, can you please help me understand how it would improve things ?

Super User
I do not see the need for DISTINCT, SUMMARIZE already makes things distinct.

Maybe, but don't know your data to be sure. Should just be able to use COUNTROWS and FILTER the table returned from SUMMARIZE but not sure that is more efficient or not.

Check out my External Tool for Power BI Desktop! Microsoft Hates Greg's Quick Measures
Check out my latest book!

Microsoft

@Greg_Deckler  Yes, that's what I thought. I removed DISTINCT from the measure definition and the numbers seem to match. I am just trying to think of a case where I would need a DISTINCT over SUMMARIZE. It's impossible for me to test all possible combinations considering the dimensions I have in my dataset. In your experience, was there ever a case where you had to apply DISTINCT over a SUMMARIZE statement ? I am yet to do a performance comparison between both the measures.

Super User

No on DISTINCT with SUMMARIZE, because when you SUMMARIZE, whatever your are grouping on will be distinct. DISTINCT will make sure that entire rows are distinct in a table versus just a particular column, but the fact that you are using SUMMARIZE ensures that the table rows are distinct, it's built into the grouping.

@Ashish_Mathur put into a formula what I was discussing with COUNTROWS and FILTER. Now, whether that is an improvment? You'd have to test it. See my 4 part series on Performance Tuning DAX: https://community.powerbi.com/t5/Community-Blog/Performance-Tuning-DAX-Part-1/ba-p/976275

Check out my External Tool for Power BI Desktop! Microsoft Hates Greg's Quick Measures
Check out my latest book!

Microsoft

@Ashish_Mathur  and @Greg_Deckler  So I had three copies of the measures for which I tested the performance on my Power BI desktop. SUMX() was consistently better performing than COUNTROWS(FILTER() >0) pattern

And removing DISTINCT from SUMMARIZE also had some minor improvements on the run time. Maybe it would change in a huge dataset and the time difference would be considerable.

Community Support

Hi @karun_r ,

Thanks for sharing the results. And it seems that you have got the answer of the thread.  Can you please accept the helpful answers as solutions?  More people will benefit here.

Best Regards,
Xue Ding
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Super User
Thanks for the information! That does not surprise me, the "X" functions seem to be very performant overall, as I note in my To *Bleep* with RANKX Quick Measure, https://community.powerbi.com/t5/Quick-Measures-Gallery/To-Bleep-with-RANKX/td-p/1042520

Check out my External Tool for Power BI Desktop! Microsoft Hates Greg's Quick Measures
Check out my latest book!

Announcements

#### Launching new user group features

Learn how to create your own user groups today!