Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi all
I don't really know how to go about searching for the solution for this as I'm not quite sure what the keywords should be...
I'm designing a model based on customer payments and am looking to produce figures for customers who have had a payment fall through ('payment reversed'), but have also made a payment successfully in the same month ('payment not reversed'). Not bothered about the order of events at this stage. I have measures to calculate counts for each separately, but I'm not sure how to go about isolating the inner portion of the Venn diagram!
I have a solution which filters based on measure outputs but the performance is horrible (as I knew it would be). I would have thought I should be
1. pulling a list of account IDs that have a reversal in the month
2. storing that in a VAR
3. applying that list of account IDs as a filter in a CALCULATE(count of successful payments)
Unfortunately my career in DAX is pretty young so I'm getting coder's block when I try to take that many steps in one fell swoop! I'm getting a mental block particularly around populating a VAR with a column of reversed Account IDs rather than pulling a full table, and it seems like a PITA to try and take a particular column out of a table VAR.
Help!
Solved! Go to Solution.
[Accounts with both Successful and Failed Direct Debits in Month] := CALCULATE( var __accountsWithSuccessfulPayment = calculatetable ( summarize ( TRANSACTIONS, ACCOUNTS[AccountID], DIMDATES[MonthEndDate] ), TRANSACTIONS[HasBeenReversed_Int] = 0 ) var __accountsWithFailedPayment = calculatetable ( summarize ( TRANSACTIONS, ACCOUNTS[AccountID], DIMDATES[MonthEndDate] ), TRANSACTIONS[HasBeenReversed_Int] = 1 ) var __accountsWithBoth = intersect ( __accountsWithSuccessfulPayment, __accountsWithFailedPayment ) return countrows ( __accountsWithBoth ), KEEPFILTERS(DIMTRANSACTIONTYPES[PaymentEntryTypeName] = "Payment Received"), KEEPFILTERS(DIMTRANSACTIONTYPES[PaymentMethodGroup]= "Direct Debit"), KEEPFILTERS(DIMTRANSACTIONCodes[GlobalAccountType] = "Mortgage"), USERELATIONSHIP(DIMDATES[DateKey], TRANSACTIONS[SKeyDatePeriodEnd]) )
Here's you measure withouth GROUPBY.
Best
Darek
Hi
I've written replies to this twice and both have never actually been posted on here. My account has a post count of 2 but is only showing 1 that I can view. I even have a "First Reply" badge, despite no reply ever going public!
Really, REALLY annoying as I put quite a lot of effort into the 2nd attempt. So I'm just going to post this, and then edit it if it actually makes it into the thread...
Thanks
Edit - HA! I'm in.
Firstly, thanks for replying and apologies for being vague, not supplying code / a file. It was half 6 on a monday evening and I just wanted shot of work for the day after toiling with this for a good few hours, so I hoped my explanation would suffice for the timebeing. I cannot supply a file as this pulls from a tabular cube and the sheer quantity of VERY sensitive customer data involved makes anonymisation more of an undertaking than building the report in the first place. I could have thrown together a very basic equivalent Power BI sample though.
Secondly - @Cmcmahan has understood what I'm asking correctly. I don't want to know who has made more payments than failures, I want to know who has made both a payment AND a failed payment in the month, so we can isolate cases where they should probably be changing their payment date but are not being proactive about it.
This works -
DDs Success And Fail in Month:= CALCULATE( COUNTX( ADDCOLUMNS( VALUES(TRANSACTIONS[AccountID]) , "BothSuccessAndFail", CALCULATE(IF( 1 IN VALUES(TRANSACTIONS[HasBeenReversed_Int]) && 0 IN VALUES (TRANSACTIONS[HasBeenReversed_Int]) , 1 ) ) ) , [BothSuccessAndFail] ) , USERELATIONSHIP(DIMDATES[DateKey], TRANSACTIONS[KeyDatePeriodEnd]) , KEEPFILTERS(DIMPAYMENTTYPES[PaymentEntryTypeName] = "Payment Received") , KEEPFILTERS(DIMPAYMENTTYPES[PaymentMethod]= "Direct Debit") , KEEPFILTERS(DIMPAYMENTCODES[AccountType] = "Mortgage") , DATESMTD(DIMDATES[Date])
DDs Success And Fail in Month 2:= COUNTX( SUMMARIZECOLUMNS( VALUES(TRANSACTIONS[AccountID]), VALUES('TRANSACTIONS'[AccountID]) , CALCULATETABLE( VALUES('TRANSACTIONS'[AccountID]) , USERELATIONSHIP(DIMDATES[DateKey], TRANSACTIONS[KeyDatePeriodEnd]) , KEEPFILTERS(DIMPAYMENTTYPES[PaymentEntryTypeName] = "Payment Received") , KEEPFILTERS(DIMPAYMENTTYPES[PaymentMethod]= "Direct Debit") , KEEPFILTERS(DIMPAYMENTCODES[AccountType] = "Mortgage") ) , "BothSuccessAndFail", IF( 1 IN VALUES(TRANSACTIONS[HasBeenReversed_Int]) && 0 IN VALUES (TRANSACTIONS[HasBeenReversed_Int]) , 1 ) ) , [BothSuccessAndFail] ) )
Sorry again, I know this is a wall of text. In short - why is the summarizecolumns not working, and how do I build monthly constraints within the measure so I that a yearly figure = month 1 + month 2 + month 3 etc?
Thanks, this has been a huge help and my late reply is not intentional!!
Hi Darek
Thanks for your response. In which case can you advise how you would go about removing the IF logic from this measure? I would have thought the best approach would be to create a list of AccountIDs that have failed in the month and then use that to filter a list of AccountIDs that have had a payment in the month. No IF statements, no heavy formula work, just taking the inner part of the Venn diagram.
I'm not really sure how the model would be negatively affecting this situation, the model itself is fairly robustly designed albeit more for a SQL & SSRS environment than the fairly straightforward tabular cube I have built. If we can build a solution that is storage engine focussed then maybe we can better judge whether there is indeed a shortcoming in the model design.
Thanks and looking forward to hearing more from you!
[# Accounts (Pymnt & Rvrsl)] = CALCULATE( var __accountsWithPymnt = CALCULATE( VALUES( Transactions[AccountID] ), KEEPFILTERS( TRANSACTIONS[HasBeenReversed_Int] = 0 ) ) var __accountsWithPymntAndRvrsl = CALCULATE( VALUES( Transactions[AccountID] ) __accountsWithPymnt, KEEPFILTERS( TRANSACTIONS[HasBeenReversed_Int] = 1 ) ) var __countOfAcconts = COUNTROWS( __accountsWithPymntAndRvrsl ) return __countOfAccounts, KEEPFILTERS ( DIMPAYMENTTYPES[PaymentEntryTypeName] = "Payment Received" ), KEEPFILTERS ( DIMPAYMENTTYPES[PaymentMethod] = "Direct Debit" ), KEEPFILTERS ( DIMPAYMENTCODES[AccountType] = "Mortgage" ), USERELATIONSHIP ( DIMDATES[DateKey], TRANSACTIONS[KeyDatePeriodEnd] ) )
Try this one... You might need to re-consider the use of KEEPFILTERS depending on what you want to achieve. From this measure, it's then easy to create derivatives like, for instance, month-to-date:
[# Accounts (Pymnt & Rversl) MTD] = CALCULATE( [# Accounts (Pymnt & Rvrsl)], DATESMTD( DIMDATES[DateKey] ) )
Best
Darek
Thanks Darek.
I'd managed to put together something along the same lines, albeit messier, and the performance is far better than the IF statement equivalent. I'll give your version a go too, though I think it'll hit the same issues where it doesn't isolate payments & fails to the same month.
Would GROUPBY hit the same issues as with SUMMARIZE? I guess I could use SELECTCOLUMNS instead in the variable, I just naturally tend towards getting distinct values but there's no benefit.
Accounts with both Successful and Failed Direct Debits in Month:= VAR FailAccounts = CALCULATETABLE( GROUPBY(TRANSACTIONS, ACCOUNTS[AccountID], DIMDATES[MonthEndDate]), TRANSACTIONS[HasBeenReversed_Int]=0, KEEPFILTERS(DIMTRANSACTIONTYPES[PaymentEntryTypeName] = "Payment Received"), KEEPFILTERS(DIMTRANSACTIONTYPES[PaymentMethodGroup]= "Direct Debit"), KEEPFILTERS(DIMTRANSACTIONCodes[GlobalAccountType] = "Mortgage") , USERELATIONSHIP(DIMDATES[DateKey], TRANSACTIONS[SKeyDatePeriodEnd]) /*,DATESMTD(DIMDATES[DateValue])*/ ) RETURN CALCULATE(COUNTROWS(GROUPBY(TRANSACTIONS, ACCOUNTS[AccountID], DIMDATES[MonthEndDate])) , USERELATIONSHIP(DIMDATES[DateKey], TRANSACTIONS[SKeyDatePeriodEnd]) , TRANSACTIONS[HasBeenReversed_Int]=1 , KEEPFILTERS(DIMTRANSACTIONTYPES[PaymentEntryTypeName] = "Payment Received") , KEEPFILTERS(DIMTRANSACTIONTYPES[PaymentMethodGroup]= "Direct Debit") , KEEPFILTERS(DIMTRANSACTIONCodes[GlobalAccountType] = "Mortgage") /*, DATESMTD(DIMDATES[DateValue])*/ , FailAccounts )
[Accounts with both Successful and Failed Direct Debits in Month] := CALCULATE( var __accountsWithSuccessfulPayment = calculatetable ( summarize ( TRANSACTIONS, ACCOUNTS[AccountID], DIMDATES[MonthEndDate] ), TRANSACTIONS[HasBeenReversed_Int] = 0 ) var __accountsWithFailedPayment = calculatetable ( summarize ( TRANSACTIONS, ACCOUNTS[AccountID], DIMDATES[MonthEndDate] ), TRANSACTIONS[HasBeenReversed_Int] = 1 ) var __accountsWithBoth = intersect ( __accountsWithSuccessfulPayment, __accountsWithFailedPayment ) return countrows ( __accountsWithBoth ), KEEPFILTERS(DIMTRANSACTIONTYPES[PaymentEntryTypeName] = "Payment Received"), KEEPFILTERS(DIMTRANSACTIONTYPES[PaymentMethodGroup]= "Direct Debit"), KEEPFILTERS(DIMTRANSACTIONCodes[GlobalAccountType] = "Mortgage"), USERELATIONSHIP(DIMDATES[DateKey], TRANSACTIONS[SKeyDatePeriodEnd]) )
Here's you measure withouth GROUPBY.
Best
Darek
Thanks Darek. I'll go with this as the solution.
To be clear about what I wanted, I want to calculate figures which are accurate at month granularity, but which roll up so that a 12 month figure would equate to Month 1 + Month 2 + Month 3 etc.
Hence the need for the month end date in here (we do have a monthkey too, which is a six digit integer, so I've swapped to that though I don't think it makes much difference).
This has been a huge help, I've learnt a lot of really useful concepts and functions from this discussion. Especially about using VAR inside a calculate to maintain any filters throughout, and the use of INTERSECT too!
Thanks @Anonymous and @Cmcmahan
You should have posted some data... and pics...
This is the best I can do without almost any information:
[# Customers With Pmnt & Rvrsl] = var __accountsWithReversal = CALCULATETABLE ( SUMMARIZE ( Transactions, Accounts[AccountID] ), Transactions[Type] = "reversal" ) -- To figure out if a real payment -- was made during the period it's -- not enough to filter by Type = "payment". -- One has to make sure that the payment -- did not have a reversal in the period. -- So, we have to pull out the accounts -- where the number of "payments" is greater -- than the number of "reversals". This, of course, -- is a simplification. You have to figure out -- what it means for an account "to have a real -- payment within a selected period of time." var __accountsWithPaymentAndReversal = FILTER ( ADDCOLUMNS( __accountsWithReversal, "Diff", CALCULATE( COUNTROWS ( Transactions ), Transactions[Type] = "payment" ) - CALCULATE ( COUNTROWS ( Transactions ), Transactions[Type] = "reversal" ) ), [Diff] > 0 ) var __accountCount = COUNTROWS ( __accountsWithPaymentAndReversal ) RETURN __accountCount
Best
Dare
So to start with, can you share some sample of your data? We can make guesses about how it would look, but if you share what you actually have, that would be very useful, and get you a more specific answer faster. I'm guessing your table has columns similar to: {PaymentID, CustomerID, Date, Amount, Status}, but status might be a true/false flag for reversed payments, or have multiple text categories. If you could share that info, we can give you a better answer.
Also, when you say that you want to "produce figures" for these customers, what do you mean? Do you just want a count of customers that meet both these criteria in a given month? Do you just want a list of CustomerIDs? Do you want to do more advanced calculations based on other data associated with those payments?
Here's a measure that would give a count of IDs that have both a 1 and a 0 value in the Status column for the currently sliced time period:
CountFailAndSuccess =
COUNTX( SUMMARIZE( Payments, Payments[CustomerID], "BothSuccessAndFail", IF(1 IN VALUES(Payments[Status]) && 0 IN VALUES(Payments[Status]), 1) ), [BothSuccessAndFail] )
Please do not use SUMMARIZE to calculate values. SUMMARIZE is a VERY COMPLEX function (with bugs into the bargain) that should only be used for grouping. ONLY. If you want to suffer and scratch your head later down the line for hours on end, then you can ignore this advice.
Please read this to know WHY YOU SHOULD NEVER DO WHAT YOU'VE SUGGESTED: https://www.sqlbi.com/articles/all-the-secrets-of-summarize/
Best
Darek
While I appreciate the chance to learn, there's no need to be so hostile about it. It would also help if you could re-write the DAX to not use SUMMARIZE in a way that angers you so much.
After reading the article linked, here's a happier version of the DAX that doesn't include SUMMARIZE:
CountFailAndSuccess = COUNTX( ADDCOLUMNS( VALUES(Payments[CustomerID]), "BothSuccessAndFail", CALCULATE(IF(1 IN VALUES(Payments[Status]) && 0 IN VALUES(Payments[Status]), 1))), [BothSuccessAndFail] )
It looks like the issues with unpredictable results you would run into while using SUMMARIZE only really occured when you also used a CALCULATE expression within the calculated values, which my previous answer didn't.
In addition to all of this, in the article you linked, the VERY FIRST SENTENCE is:
The content of this article is obsolete as of January 2018. Recent versions of Excel 2016, Power BI, and Analysis Services have a SUMMARIZE behavior that is different from the one described in this article.
Since we're in a post-January 2018 world, the behavior of SUMMARIZE has changed and the entire article is a neat history lesson. However, we can still follow their current advice and use SUMMARIZECOLUMNS with newly calculated results just like we used SUMMARIZE before:
CountFailAndSuccess = COUNTX( SUMMARIZECOLUMNS(Payments[CustomerID], "BothSuccessAndFail", IF(1 IN VALUES(Payments[Status]) && 0 IN VALUES(Payments[Status]), 1)), [BothSuccessAndFail] )
Hi - sorry for the late reply!
Thanks very much for your help everyone, appreciate it a lot.
I know my original request was a bit vague, I'm working with a tabular cube (SSAS 2017) that contains a LOT of sensitive data and so knocking together a fake data / anonymized equivalent would be a huge piece of work.
@Cmcmahan's approach is what I'm looking for, rather than counts of payments minus counts of cancellations. It's a count of accounts which have both payments and cancellations in the same month, regardless of volumes of each.
I've posted the code I'm using at the moment, which is getting towards what I want...
DDs Success And Fail in Month:= CALCULATE( COUNTX( ADDCOLUMNS( VALUES(TRANSACTIONS[AccountID]) , "BothSuccessAndFail", CALCULATE(IF( 1 IN VALUES(TRANSACTIONS[HasBeenReversed_Int]) && 0 IN VALUES (TRANSACTIONS[HasBeenReversed_Int]) , 1 ) ) ) , [BothSuccessAndFail] ) , USERELATIONSHIP(DIMDATES[DateKey], TRANSACTIONS[KeyDatePeriodEnd]) , KEEPFILTERS(DIMPAYMENTTYPES[PaymentEntryTypeName] = "Payment Received") , KEEPFILTERS(DIMPAYMENTTYPES[PaymentMethod]= "Direct Debit") , KEEPFILTERS(DIMPAYMENTCODES[AccountType] = "Mortgage") , DATESMTD(DIMDATES[DateValue]) )
(Edit: it only causes an error when I try to show values over time, rather than as a standalone unfiltered figure. The number it calculates is almost 3 times as high as it should be, so adding the date within the SummarizeColumns is obviously not having the desired effect)
DDs Success And Fail in Month 2:= COUNTX( SUMMARIZECOLUMNS( 'TRANSACTIONS'[AccountID], 'TRANSACTIONS'[KeyDatePeriodEnd] , CALCULATETABLE( VALUES('TRANSACTIONS'[AccountID]) , USERELATIONSHIP(DIMDATES[DateKey], TRANSACTIONS[KeyDatePeriodEnd]) , KEEPFILTERS(DIMPAYMENTTYPES[PaymentEntryTypeName] = "Payment Received") , KEEPFILTERS(DIMPAYMENTTYPES[PaymentMethod]= "Direct Debit") , KEEPFILTERS(DIMPAYMENTCODES[AccountType] = "Mortgage") ) , "BothSuccessAndFail", IF( 1 IN VALUES(TRANSACTIONS[HasBeenReversed_Int]) && 0 IN VALUES (TRANSACTIONS[HasBeenReversed_Int]) , 1 ) ) , [BothSuccessAndFail] ) )
"While I appreciate the chance to learn, there's no need to be so hostile about it." Hostile??? I don't have time for arguments, mate. Once again, please read the article in its entirety and also the discussions below (some are as young as 2 months) because you have clearly not understood it.
UPDATE 2018-01-24 : The content of this article is obsolete as of January 2018. Recent versions of Excel 2016, Power BI, and Analysis Services have a SUMMARIZE behavior that is different from the one described in this article. As noted below, using SUMMARIZE should be deprecated for aggregations and you should use SUMMARIZECOLUMNS instead. Read more in Introducing SUMMARIZECOLUMNS.
Please, when you do something, do it correctly. ALWAYS. Why is this important? Because if others follow your advice and the advice is wrong, they'll have a hard time understanding what's going on. Please save them grief and frustration.
Thanks.
Best
Darek
This does not sound overly complex 🙂 Nice exercise, in fact 🙂
Please share the file via OneDrive or Google Drive (set perms accordingly so I can access). If the file is too big, please supply a sample that is good enough to design a solution and illustrate any issues. If any data is sensitive, obfuscate.
Your calculations can be FAST but you have to build the right model. A good model is CRUCIAL. If it's **bleep**ty, your calcs will be **bleep**ty as well 🙂 Let me have a look at it.
Thanks and waiting for your file.
Best
Darek
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
42 | |
23 | |
21 | |
15 | |
15 |
User | Count |
---|---|
45 | |
31 | |
30 | |
18 | |
17 |