Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello! I'm new to the community so apologies if this has already been answered.
I have a dataset containing lots of information on customer orders. I need to identify which customers are repeat purchasers. Customers are identified by their email address, and there is also a transaction ID column (as a customers email will appear multiple times not only for repeat purchases but also for transactions involving different products).
I want to create a calcaulated column in my table that says something like:
If (distinct count) of email address has equal to or more than 2 transaction ID (distinct count), call that customer "Repeat", and if not call them "One-Time".
If this is possible, I'd also then like to add another column that shows how many times these repeat customers have purchased.
Many thanks,
Tom
Solved! Go to Solution.
@Anonymous Thanks for that. I am just looking for sample data (not actual data) to know exactly about your requirement rather than assuming. But anyway, I got what you are looking for and I've created sample data (as I can't copy the data that you posted as image)
Please try this as a "New Column"
Flag = VAR _Cnt = CALCULATE(DISTINCTCOUNT(Test104CntCheck[TransactionId]),ALLEXCEPT(Test104CntCheck,Test104CntCheck[Email])) RETURN IF(_Cnt>1,"Repeat","OneTime")
Proud to be a PBI Community Champion
Proud to be a PBI Community Champion
hi @PattemManohar. I'm afraid I can't post the sample data as it contains real customer details. I have instead created a very basic example of the data with fake emails and transaction IDs, as well as the desired outcomes, in excel and attached a screengrab of it. I hope this helps! And thanks for the quick reply!
@Anonymous Thanks for that. I am just looking for sample data (not actual data) to know exactly about your requirement rather than assuming. But anyway, I got what you are looking for and I've created sample data (as I can't copy the data that you posted as image)
Please try this as a "New Column"
Flag = VAR _Cnt = CALCULATE(DISTINCTCOUNT(Test104CntCheck[TransactionId]),ALLEXCEPT(Test104CntCheck,Test104CntCheck[Email])) RETURN IF(_Cnt>1,"Repeat","OneTime")
Proud to be a PBI Community Champion
@PattemManohar that seems to have worked perfectly! Thank you so much!
I also wanted to build a count column to show how many times these repeat purchases have happened for each customer. But on second though, I don't think that is possible as there will end up being multiple counts against each email, so when I try to use that data it will be inflated. Unless you know a way around this?
@Anonymous I guess you are looking one of these counts, if not could you please provide me an example of what counts you are after...
CntEmail = CALCULATE(COUNTROWS(Test104CntCheck),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email])))
CntUniqTrans = CALCULATE(DISTINCTCOUNT(Test104CntCheck[TransactionId]),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email])))
CntTrans = CALCULATE(COUNTROWS(Test104CntCheck),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email]) && Test104CntCheck[TransactionId] = EARLIER(Test104CntCheck[TransactionId])))
Proud to be a PBI Community Champion
Hi @PattemManohar ! THANK YOU SO MUCH!! The second count worked!!
I'm still working on the same dataset and have encounted another issue I can't resolve!
For each customer, I need to caculate the number of days between each transaction. As the customer identifier (email address) will appear multiple time for transactions involving different products, this would need to be calculated as difference between transaction dates for each customer.
I'd therefore expect a customer's first transaction (and one-time purchasers) days between transactions to appear as zero. Then subsequent transactions should be the difference between the the date of that transaction and the previous one...
I've created another dummy data set again and attached the image here (apologies - I wasn't sure how to share the file with you!). Is this possible? I'm aware that the numbers will be duplicated, but hoping I can then use the average per transaction to clearly see how often customers are purchasing....
Hi @Anonymous,
However, if the [transaction ID] is not ordered in sequence, you should insert an index column in Query Editor mode.
Group data based on email and transaction ID.
Add an index column.
Expand the Table and remove unnecessary fields.
Then, the formula provided by PattemManohar should be modified to:
TransDateDiff = VAR _CurrDate = Test104CntCheck[TransactionDate] VAR _PrevDate = CALCULATE(MIN(Test104CntCheck[TransactionDate]),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email]) && Test104CntCheck[Index] = EARLIER(Test104CntCheck[Index])-1)) RETURN DATEDIFF(IF(_PrevDate=BLANK(),_CurrDate,_PrevDate),_CurrDate,DAY)
By the way, since the original requirement has been worked out, would you please kindly mark the helpful reply as an answer so that it can benefit more users? If you still have any other question, please post a new thread.
Best regards,
Yuliana Gu
Hi @v-yulgu-msft ! Apolgies - I've now marked @PattemManohar's earlier answer as the solution and will make sure all future queries that differ are posted in a new thread.
I believe the transaction IDs are ordered in sequence, so the oringinal formula should do the trick. However, when I enter this (and changed the aspects so it matches my dataset) it does not work due to the following error....
"DAX comparison operations do not support comparing values of type Text with values of type Number. Consider using the VALUE or FORMAT function to convert one of the values."
Is this because the DAX includes both text fields (email and transaction ID) and number (transaction date)? I could creata a new date column that is formatted as text but wanted to check with you (both) before proceeding. Would the formula still work if the number field is changed to text?
@Anonymous Somewhere you are comparing Text and Number types. Make sure you are comparing Text with Text and Number with Number datatypes. Also, could you please post the expression that you are using (after changing as per your requirement).
Proud to be a PBI Community Champion
@PattemManohar I've created a new tran date column that is just a copy of the original tran date but changed the format to text. I'm still getting the same error. When the error shows, it appears to be highlighting the final part of the formula (see screenshot). I've also placed the expression with my changes below.
TransDateDiff =
VAR _CurrDate = 'Pre Gateway'[Tran Date (for TransDateDiff)]
VAR _PrevDate = CALCULATE(MIN('Pre Gateway'[Tran Date (for TransDateDiff)]),FILTER(ALL('Pre Gateway'),'Pre Gateway'[Email]=EARLIER('Pre Gateway'[Email]) && 'Pre Gateway'[Tran ID] = EARLIER('Pre Gateway'[Tran ID])-1))
RETURN DATEDIFF(IF(_PrevDate=BLANK(),_CurrDate,_PrevDate),_CurrDate,DAY)
@Anonymous Could you please check and confirm that "Tran ID" column is of number datatype. If not, please change that to number data type and try. Leave other fields datatypes as is (Dates as date and email as text)
Proud to be a PBI Community Champion
@PattemManohar Ah okay we might have an issue here. In the sample data I presented I showed "Tran ID" as a number, but this is in fact was a manual field that I had to create as it didn't exist in the data. It is a combination of a customers email address and the transaction date. Therefore, I cannot make the type Number.
Is there a way to create a unique number based on a text field? If so, I could then use that...
@Anonymous Please try this as a "New Column" for the TransactionDate difference between each transaction ID
TransDateDiff =
VAR _CurrDate = Test104CntCheck[TransactionDate]
VAR _PrevDate = CALCULATE(MIN(Test104CntCheck[TransactionDate]),FILTER(ALL(Test104CntCheck),Test104CntCheck[Email]=EARLIER(Test104CntCheck[Email]) && Test104CntCheck[TransactionId] = EARLIER(Test104CntCheck[TransactionId])-1))
RETURN DATEDIFF(IF(_PrevDate=BLANK(),_CurrDate,_PrevDate),_CurrDate,DAY)
Proud to be a PBI Community Champion
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
112 | |
100 | |
76 | |
74 | |
49 |
User | Count |
---|---|
146 | |
108 | |
106 | |
90 | |
62 |