cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
thomasbassett Frequent Visitor
Frequent Visitor

Calculated Column based on two columns in table

Hello! I'm new to the community so apologies if this has already been answered.

 

I have a dataset containing lots of information on customer orders. I need to identify which customers are repeat purchasers. Customers are identified by their email address, and there is also a transaction ID column (as a customers email will appear multiple times not only for repeat purchases but also for transactions involving different products).

 

I want to create a calcaulated column in my table that says something like:

If (distinct count) of email address has equal to or more than 2 transaction ID (distinct count), call that customer "Repeat", and if not call them "One-Time".

 

If this is possible, I'd also then like to add another column that shows how many times these repeat customers have purchased.

 

Many thanks,

Tom

1 ACCEPTED SOLUTION

Accepted Solutions
Super User
Super User

Re: Calculated Column based on two columns in table

@thomasbassett Thanks for that. I am just looking for sample data (not actual data) to know exactly about your requirement rather than assuming. But anyway, I got what you are looking for and I've created sample data (as I can't copy the data that you posted as image)

 

Please try this as a "New Column" 

 

Flag = 
VAR _Cnt = CALCULATE(DISTINCTCOUNT(Test104CntCheck[TransactionId]),ALLEXCEPT(Test104CntCheck,Test104CntCheck[Email]))
RETURN IF(_Cnt>1,"Repeat","OneTime")

image.png



Did I answer your question? Mark my post as a solution !

Proud to be a Datanaut !





13 REPLIES 13
Super User
Super User

Re: Calculated Column based on two columns in table

@thomasbassett Please post the sample data to work on, it will really help to suggest an accurate solution.


Did I answer your question? Mark my post as a solution !

Proud to be a Datanaut !





thomasbassett Frequent Visitor
Frequent Visitor

Re: Calculated Column based on two columns in table

hi @PattemManohar. I'm afraid I can't post the sample data as it contains real customer details. I have instead created a very basic example of the data with fake emails and transaction IDs, as well as the desired outcomes, in excel and attached a screengrab of it. I hope this helps! And thanks for the quick reply!

 

Repeat Customers Example Data.PNG

 

 

Super User
Super User

Re: Calculated Column based on two columns in table

@thomasbassett Thanks for that. I am just looking for sample data (not actual data) to know exactly about your requirement rather than assuming. But anyway, I got what you are looking for and I've created sample data (as I can't copy the data that you posted as image)

 

Please try this as a "New Column" 

 

Flag = 
VAR _Cnt = CALCULATE(DISTINCTCOUNT(Test104CntCheck[TransactionId]),ALLEXCEPT(Test104CntCheck,Test104CntCheck[Email]))
RETURN IF(_Cnt>1,"Repeat","OneTime")

image.png



Did I answer your question? Mark my post as a solution !

Proud to be a Datanaut !





thomasbassett Frequent Visitor
Frequent Visitor

Re: Calculated Column based on two columns in table

@PattemManohar that seems to have worked perfectly! Thank you so much!

 

I also wanted to build a count column to show how many times these repeat purchases have happened for each customer. But on second though, I don't think that is possible as there will end up being multiple counts against each email, so when I try to use that data it will be inflated. Unless you know a way around this?

Super User
Super User

Re: Calculated Column based on two columns in table

@thomasbassett I guess you are looking one of these counts, if not could you please provide me an example of what counts you are after...

 

CntEmail = CALCULATE(COUNTROWS(Test104CntCheck),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email])))
CntUniqTrans = CALCULATE(DISTINCTCOUNT(Test104CntCheck[TransactionId]),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email])))
CntTrans = CALCULATE(COUNTROWS(Test104CntCheck),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email]) && Test104CntCheck[TransactionId] = EARLIER(Test104CntCheck[TransactionId])))

image.png

 



Did I answer your question? Mark my post as a solution !

Proud to be a Datanaut !





thomasbassett Frequent Visitor
Frequent Visitor

Re: Calculated Column based on two columns in table

Hi @PattemManohar ! THANK YOU SO MUCH!! The second count worked!!

 

I'm still working on the same dataset and have encounted another issue I can't resolve!

 

For each customer, I need to caculate the number of days between each transaction. As the customer identifier (email address) will appear multiple time for transactions involving different products, this would need to be calculated as difference between transaction dates for each customer.

 

I'd therefore expect a customer's first transaction (and one-time purchasers) days between transactions to appear as zero. Then subsequent transactions should be the difference between the the date of that transaction and the previous one...

 

I've created another dummy data set again and attached the image here (apologies - I wasn't sure how to share the file with you!). Is this possible? I'm aware that the numbers will be duplicated, but hoping I can then use the average per transaction to clearly see how often customers are purchasing....

 

Repeat Customers Example Data - Copy.PNG

Super User
Super User

Re: Calculated Column based on two columns in table

@thomasbassett Please try this as a "New Column" for the TransactionDate difference between each transaction ID

 

TransDateDiff = 
VAR _CurrDate = Test104CntCheck[TransactionDate]
VAR _PrevDate = CALCULATE(MIN(Test104CntCheck[TransactionDate]),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email]) && Test104CntCheck[TransactionId] = EARLIER(Test104CntCheck[TransactionId])-1))
RETURN DATEDIFF(IF(_PrevDate=BLANK(),_CurrDate,_PrevDate),_CurrDate,DAY)

image.png



Did I answer your question? Mark my post as a solution !

Proud to be a Datanaut !





Community Support Team
Community Support Team

Re: Calculated Column based on two columns in table

Hi @thomasbassett,

 

However, if the [transaction ID] is not ordered in sequence, you should insert an index column in Query Editor mode.

 

Group data based on email and transaction ID.

1.PNG

 

Add an index column.

2.PNG

 

Expand the Table and remove unnecessary fields.

3.PNG

 

Then, the formula provided by PattemManohar should be modified to:

TransDateDiff = 
VAR _CurrDate = Test104CntCheck[TransactionDate]
VAR _PrevDate = CALCULATE(MIN(Test104CntCheck[TransactionDate]),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email]) && Test104CntCheck[Index] = EARLIER(Test104CntCheck[Index])-1)) 
RETURN DATEDIFF(IF(_PrevDate=BLANK(),_CurrDate,_PrevDate),_CurrDate,DAY)

By the way, since the original requirement has been worked out, would you please kindly mark the helpful reply as an answer so that it can benefit more users? If you still have any other question, please post a new thread.

 

Best regards,

Yuliana Gu

Community Support Team _ Yuliana Gu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
thomasbassett Frequent Visitor
Frequent Visitor

Re: Calculated Column based on two columns in table

Hi @v-yulgu-msft ! Apolgies - I've now marked @PattemManohar's earlier answer as the solution and will make sure all future queries that differ are posted in a new thread.

 

I believe the transaction IDs are ordered in sequence, so the oringinal formula should do the trick. However, when I enter this (and changed the aspects so it matches my dataset) it does not work due to the following error....

 

"DAX comparison operations do not support comparing values of type Text with values of type Number. Consider using the VALUE or FORMAT function to convert one of the values."

 

Is this because the DAX includes both text fields (email and transaction ID) and number (transaction date)? I could creata a new date column that is formatted as text but wanted to check with you (both) before proceeding. Would the formula still work if the number field is changed to text?