Solved: Re: DAX Question - Customers who purchase again af...

kcantor · ‎02-17-2016

I have a calculation I would like to build from two fact tables. I have a table that lists returns made and a table that lists purchases made. We can call them returns and sales for simplicity. My look up tables include a date table, item table, customer table, and a sales person table. The customer table ties both fact tables together by "sell-to #", the date table uses shipment date, and the item table links by product #.

I would like to determine the percent of customers who have a return who make a purchase (or purchases) after they make a return. I am looking to determine if we are keeping the customer or forever losing them. Can anyone offer any guidance?

I would like to build this using DAX as I am not a Query person but I do use the Query editor as necessary. My dates will only go back to the beginning of 2013 but I have some future dates that require my date table to go through the end of this year.

You help is greatly appreciated.

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Greg_Deckler · ‎02-17-2016

Likely more elegant solutions than this, but you could create a calculated column:

SaleAfterReturn = IF(MAXX(FILTER(sales,sales[Customer]=customers[customer]),sales[Date])>MAXX(FILTER(returns,returns[Customer]=customers[customer]),returns[Date]),1,0)

And then a measure:

% Customers After Return = COUNTX(FILTER(customers,customers[SaleAfterReturn]=1),customers[SaleAfterReturn])/COUNT(customers[SaleAfterReturn])

I did this with a customer table:

customers table

customer

1

2

3

and a sales and returns table with Date and Customer. returns and sales and both related to the customers table.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

Greg_Deckler · ‎02-17-2016

Likely more elegant solutions than this, but you could create a calculated column:

SaleAfterReturn = IF(MAXX(FILTER(sales,sales[Customer]=customers[customer]),sales[Date])>MAXX(FILTER(returns,returns[Customer]=customers[customer]),returns[Date]),1,0)

And then a measure:

% Customers After Return = COUNTX(FILTER(customers,customers[SaleAfterReturn]=1),customers[SaleAfterReturn])/COUNT(customers[SaleAfterReturn])

I did this with a customer table:

customers table

customer

1

2

3

and a sales and returns table with Date and Customer. returns and sales and both related to the customers table.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

kcantor · ‎02-17-2016

@Greg_Deckler

Compared to the direction I was headed (using calculated columns) your solution is much more elegant and it works very will with all of my slicers. After a week full of Mondays, it was very refreshing to be able to plug this in and watch it work. Sometimes we need to work smarter instead of harder. PowerBI and PowerPivot are meant to enable that but today I was on the hard track and you simplified it for me. As soon as my head stops hurting, I will have to read through this again to be able to explain this to those who requested the report.

Thank you for saving me from yet another Monday.

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Greg_Deckler · ‎02-17-2016

@kcantor - One thought I had on this is that you might want to replace the second MAXX with a MINX, that would give you the customers who made a purchase after their first return versus their last return.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

kcantor · ‎02-18-2016

@Greg_Deckler Once I got caught up, I did use MINX as well to compare the numbers returned. When sorting by return cause the difference is quite significant while returns per vendor are pretty similar. I will be persuing this in more depth today. Thank you again for the assistance.

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

greggyb · ‎02-17-2016

I'm not sure if this is feasible. My first thought reading your original post was around the model. Specifically, would it make sense to combine the sale and return facts to a single table with a transaction type? This depends on the structure of these tables and what other data you might include.

My thought is that with both in one table, you can just sort the table by date and check the type of the latest transaction. I have some thoughts around the column and measure definitions, but I'd like to know if this is feasible before chasing those down.

Check out the Power BI User Group of Philadelphia
Next meeting: October 5, 2016 Virtual meeting

kcantor · ‎02-18-2016

@greggyb, while it certainly is feasible, it would make for a very large table. More specifically, I was only pulling in the return table to answer this single question. The results of this experiment have created some other questions, however, so I may go ahead and combine the two tables and see what other infomation I can find to benefit the team.

Currently, I am mining for correlation between cause of return and impact on repurchase potential as well as repurchase tendencies per product line and vendor.

Any ideas you might have would be greatly appreciated.

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Greg_Deckler · ‎02-18-2016

One thought would be to toss the data into Azure ML and see if you can create an experiment around a 2 category hypothesis of "purchased after return" yes or no. You might quickly gain some valuable insights by looking at the weighting criteria that Azure ML generates when you do something like a 2 class linear support vector. You could also try K-Means Clustering to see if it can give you any insights as well.

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

greggyb · ‎02-18-2016

@kcantor, regarding table size, a good heuristic when thinking about the model is that vertical expansion (more rows to existing table) is much cheaper (storage space, RAM pressure, query efficiency) than horizontal expansion (adding columns to a table, or adding additional tables, especially similarly structured tables).

We can dive really deep on this sort of thing if you'd like, but the short version is that each column gets a dictionary. That dictionary maps the minimum sized pointer to the actual values that exist in that column. If you have a field with only two values, the size of an individual row will be 1 bit, and the value of that bit will map to the appropriate value for that field. Adding additional rows only impacts the size of this dictionary if new distinct values are added. So, if you have a sale table and a return table that share a bunch of fields, and also share values in those fields (mapping to the same dimensions), you're probably better off combining them, because then you only maintain one copy of each dictionary, rather than two. This is also why additional columns are expensive.

The above is simplified, but an accurate approximation of the primary storage/compression technology in the Tabular engine powering Power BI.

I don't know if this is helpful or not, but that's what led to my question.

Check out the Power BI User Group of Philadelphia
Next meeting: October 5, 2016 Virtual meeting

kcantor · ‎02-18-2016

@greggybUnfortunately, my return table only shares one field with the sales table and I use them to link the customer look up table. Mostly I cut more columns out than I actually bring in as data. The single calculated column suggested by @Greg_Deckler was more elegant than my previous line of reasoning which was creating 3 calculated columns prior to adding a measure. On the up side, IT is on board with PowerBI and PowerPivot so our tables are evolving to be a better match to the tools.

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

greggyb · ‎02-18-2016

Gotcha, I was thinking that they'd share a number of dimensions and have similar fact fields, e.g. ProductKey, CustomerKey, DateKey, Quantity, UnitPrice all in common and maybe a couple fields that don't match between the two. Sounds like this isn't the case. I'd stick with smoupre's solution then.

Check out the Power BI User Group of Philadelphia
Next meeting: October 5, 2016 Virtual meeting

DAX Question - Customers who purchase again after a return.

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly