Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Calculated Column based on two columns in table

Hello! I'm new to the community so apologies if this has already been answered.

 

I have a dataset containing lots of information on customer orders. I need to identify which customers are repeat purchasers. Customers are identified by their email address, and there is also a transaction ID column (as a customers email will appear multiple times not only for repeat purchases but also for transactions involving different products).

 

I want to create a calcaulated column in my table that says something like:

If (distinct count) of email address has equal to or more than 2 transaction ID (distinct count), call that customer "Repeat", and if not call them "One-Time".

 

If this is possible, I'd also then like to add another column that shows how many times these repeat customers have purchased.

 

Many thanks,

Tom

1 ACCEPTED SOLUTION

@Anonymous Thanks for that. I am just looking for sample data (not actual data) to know exactly about your requirement rather than assuming. But anyway, I got what you are looking for and I've created sample data (as I can't copy the data that you posted as image)

 

Please try this as a "New Column" 

 

Flag = 
VAR _Cnt = CALCULATE(DISTINCTCOUNT(Test104CntCheck[TransactionId]),ALLEXCEPT(Test104CntCheck,Test104CntCheck[Email]))
RETURN IF(_Cnt>1,"Repeat","OneTime")

image.png





Did I answer your question? Mark my post as a solution!

Proud to be a PBI Community Champion




View solution in original post

13 REPLIES 13
PattemManohar
Community Champion
Community Champion

@Anonymous Please post the sample data to work on, it will really help to suggest an accurate solution.




Did I answer your question? Mark my post as a solution!

Proud to be a PBI Community Champion




Anonymous
Not applicable

hi @PattemManohar. I'm afraid I can't post the sample data as it contains real customer details. I have instead created a very basic example of the data with fake emails and transaction IDs, as well as the desired outcomes, in excel and attached a screengrab of it. I hope this helps! And thanks for the quick reply!

 

Repeat Customers Example Data.PNG

 

 

@Anonymous Thanks for that. I am just looking for sample data (not actual data) to know exactly about your requirement rather than assuming. But anyway, I got what you are looking for and I've created sample data (as I can't copy the data that you posted as image)

 

Please try this as a "New Column" 

 

Flag = 
VAR _Cnt = CALCULATE(DISTINCTCOUNT(Test104CntCheck[TransactionId]),ALLEXCEPT(Test104CntCheck,Test104CntCheck[Email]))
RETURN IF(_Cnt>1,"Repeat","OneTime")

image.png





Did I answer your question? Mark my post as a solution!

Proud to be a PBI Community Champion




Anonymous
Not applicable

@PattemManohar that seems to have worked perfectly! Thank you so much!

 

I also wanted to build a count column to show how many times these repeat purchases have happened for each customer. But on second though, I don't think that is possible as there will end up being multiple counts against each email, so when I try to use that data it will be inflated. Unless you know a way around this?

@Anonymous I guess you are looking one of these counts, if not could you please provide me an example of what counts you are after...

 

CntEmail = CALCULATE(COUNTROWS(Test104CntCheck),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email])))
CntUniqTrans = CALCULATE(DISTINCTCOUNT(Test104CntCheck[TransactionId]),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email])))
CntTrans = CALCULATE(COUNTROWS(Test104CntCheck),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email]) && Test104CntCheck[TransactionId] = EARLIER(Test104CntCheck[TransactionId])))

image.png

 





Did I answer your question? Mark my post as a solution!

Proud to be a PBI Community Champion




Anonymous
Not applicable

Hi @PattemManohar ! THANK YOU SO MUCH!! The second count worked!!

 

I'm still working on the same dataset and have encounted another issue I can't resolve!

 

For each customer, I need to caculate the number of days between each transaction. As the customer identifier (email address) will appear multiple time for transactions involving different products, this would need to be calculated as difference between transaction dates for each customer.

 

I'd therefore expect a customer's first transaction (and one-time purchasers) days between transactions to appear as zero. Then subsequent transactions should be the difference between the the date of that transaction and the previous one...

 

I've created another dummy data set again and attached the image here (apologies - I wasn't sure how to share the file with you!). Is this possible? I'm aware that the numbers will be duplicated, but hoping I can then use the average per transaction to clearly see how often customers are purchasing....

 

Repeat Customers Example Data - Copy.PNG

Hi @Anonymous,

 

However, if the [transaction ID] is not ordered in sequence, you should insert an index column in Query Editor mode.

 

Group data based on email and transaction ID.

1.PNG

 

Add an index column.

2.PNG

 

Expand the Table and remove unnecessary fields.

3.PNG

 

Then, the formula provided by PattemManohar should be modified to:

TransDateDiff = 
VAR _CurrDate = Test104CntCheck[TransactionDate]
VAR _PrevDate = CALCULATE(MIN(Test104CntCheck[TransactionDate]),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email]) && Test104CntCheck[Index] = EARLIER(Test104CntCheck[Index])-1)) 
RETURN DATEDIFF(IF(_PrevDate=BLANK(),_CurrDate,_PrevDate),_CurrDate,DAY)

By the way, since the original requirement has been worked out, would you please kindly mark the helpful reply as an answer so that it can benefit more users? If you still have any other question, please post a new thread.

 

Best regards,

Yuliana Gu

Community Support Team _ Yuliana Gu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Anonymous
Not applicable

Hi @v-yulgu-msft ! Apolgies - I've now marked @PattemManohar's earlier answer as the solution and will make sure all future queries that differ are posted in a new thread.

 

I believe the transaction IDs are ordered in sequence, so the oringinal formula should do the trick. However, when I enter this (and changed the aspects so it matches my dataset) it does not work due to the following error....

 

"DAX comparison operations do not support comparing values of type Text with values of type Number. Consider using the VALUE or FORMAT function to convert one of the values."

 

Is this because the DAX includes both text fields (email and transaction ID) and number (transaction date)? I could creata a new date column that is formatted as text but wanted to check with you (both) before proceeding. Would the formula still work if the number field is changed to text?

@Anonymous Somewhere you are comparing Text and Number types. Make sure you are comparing Text with Text and Number with Number datatypes. Also, could you please post the expression that you are using (after changing as per your requirement). 





Did I answer your question? Mark my post as a solution!

Proud to be a PBI Community Champion




Anonymous
Not applicable

@PattemManohar I've created a new tran date column that is just a copy of the original tran date but changed the format to text. I'm still getting the same error. When the error shows, it appears to be highlighting the final part of the formula (see screenshot). I've also placed the expression with my changes below.

 

trandatediff error.PNG

 

TransDateDiff =
VAR _CurrDate = 'Pre Gateway'[Tran Date (for TransDateDiff)]
VAR _PrevDate = CALCULATE(MIN('Pre Gateway'[Tran Date (for TransDateDiff)]),FILTER(ALL('Pre Gateway'),'Pre Gateway'[Email]=EARLIER('Pre Gateway'[Email]) && 'Pre Gateway'[Tran ID] = EARLIER('Pre Gateway'[Tran ID])-1))
RETURN DATEDIFF(IF(_PrevDate=BLANK(),_CurrDate,_PrevDate),_CurrDate,DAY)

@Anonymous Could you please check and confirm that "Tran ID" column is of number datatype. If not, please change that to number data type and try. Leave other fields datatypes as is (Dates as date and email as text)





Did I answer your question? Mark my post as a solution!

Proud to be a PBI Community Champion




Anonymous
Not applicable

@PattemManohar Ah okay we might have an issue here. In the sample data I presented I showed "Tran ID" as a number, but this is in fact was a manual field that I had to create as it didn't exist in the data. It is a combination of a customers email address and the transaction date. Therefore, I cannot make the type Number.

 

Is there a way to create a unique number based on a text field? If so, I could then use that...

@Anonymous Please try this as a "New Column" for the TransactionDate difference between each transaction ID

 

TransDateDiff = 
VAR _CurrDate = Test104CntCheck[TransactionDate]
VAR _PrevDate = CALCULATE(MIN(Test104CntCheck[TransactionDate]),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email]) && Test104CntCheck[TransactionId] = EARLIER(Test104CntCheck[TransactionId])-1))
RETURN DATEDIFF(IF(_PrevDate=BLANK(),_CurrDate,_PrevDate),_CurrDate,DAY)

image.png





Did I answer your question? Mark my post as a solution!

Proud to be a PBI Community Champion




Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.