Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

returning customer count

Hey folks,

 

I know that this topic was already discussed, but I didn't get any of the solutions to work that were posted here before. I was able to solve my problem with 2 dax tables + measure, but I'm not happy with my solution and I guess that there is a better / simpler way to do it.

 

So let's take a look:

 

Imagine you have this type of data: an id, longitude, latitude (both does not really matter here) and date + time stamp.

 

did	lon	lat	ltime
1	8.89314365386963	49.2425270080566	2018-08-02 20:58:12
1	8.89325141906738	49.2425804138184	2018-08-02 20:53:54
1	8.89325523376465	49.2425384521484	2018-08-02 21:00:50
1	8.89333343505859	49.2424774169922	2018-08-04 18:26:42
2	8.89333820343018	49.2424545288086	2018-07-30 20:08:41
3	8.89305019378662	49.2425384521484	2018-08-03 13:31:26
3	8.8930549621582	49.2425498962402	2018-08-03 13:30:27
3	8.89305782318115	49.242618560791	2018-08-03 13:28:25
3	8.89305782318115	49.2426338195801	2018-08-03 13:27:24
3	8.89306926727295	49.2425994873047	2018-08-03 13:29:26
3	8.89320182800293	49.2427368164063	2018-08-03 13:26:24
3	8.89355945587158	49.2428245544434	2018-08-03 13:25:32
3	8.89356231689453	49.2427139282227	2018-08-03 13:23:31
3	8.8935661315918	49.2427444458008	2018-08-03 13:24:22

 

What I want to do is: count how many people (distinct did) do exist with a different timestamp (based on 2 different days). If we take a look at did 1 - the first 3 rows count as one visit, the 4th row is a different day (timestamp does not matter) so it also counts as a visit. So this guy visited 2 times ( or >1) and represtens a returning customer. did 2 visited only once, so it does not count as a returning customer. did 3 visited the place several times (or stayed there a bit longer), but it was on the same day, so it does not count as a returning customer. So basically for that given data the measure would return 1. We have only 1 returning customer. I hope you did understand the problem.

 

My solution: first dax table

 

countPerDay = CALCULATETABLE (
 SUMMARIZE (
 KraichgauNord;
 KraichgauNord[did];
 'Date'[Date];
 "returning customer"; COUNTROWS ( KraichgauNord )
 )
) 
the table returns the did, the date and the count of visists during that single day. So it's basically a group by did and date.
 
2nd step was to create the 2nd dax table:
 
returner = 
    SUMMARIZE (
       countPerDay ;
        countPerDay[did];
        "real returning customers"; COUNTROWS (countPerDay)
    )
The 2nd table groups the 1st table by did. If you have a count of 1, the person has been once there on a single day. If it has a value of > 1, the person visited twice or even more on DIFFERENT days.

And finally I've made a measure which looks like this:
ct_returner = 
var ct = CALCULATE(COUNTROWS(returner); returner[real returning customers] > 1)
return IF (ct = 0; 0; ct)
It seems to be so complicated.. I had to create 2 dax tables + measure for the simple count if a did exists more than once per day in a table.
 
Any ideas?
1 ACCEPTED SOLUTION
tex628
Community Champion
Community Champion

This is really confusing. 

I've made my own example dataset:

 

image.png

The calculated column "Column" is simply:

Column = 
VAR ID_ = [did_groups]
VAR DATE_ = [ltime]

return

IF(CALCULATE( SUM([lon]);
ALL(Table1);
Table1[did_groups] = ID_;
table1[ltime] < DATE_
)
>0
;1;0)

Which gives me 7.

Then i use the measure:

Measure = Calculate(distinctcount([did_groups]);Table1[Column]=1)

Which removes 1 and give's me 6. Which should be the correct number of returning customers in the dataset that I made.

😆


Connect on LinkedIn

View solution in original post

15 REPLIES 15
tex628
Community Champion
Community Champion

Convert ltime to date instead of datetime.
Calculated column: 

Column = 

VAR ID = [did]
VAR DATE = [ltime]

return

IF(CALCULATE( SUM([lon]);
ALL("YOURTABLE");
[did] = ID;
[ltime] < DATE
)
>0
;1:0)

Measure:

Measure = Calculate(distinctcount([did]);column=1)

Connect on LinkedIn
Anonymous
Not applicable

Thanks for your reply, but the calculated column doesn't seem to work. error.PNG

Edit: I've corrected ":" to ";", but there still seems to be a problem with the definition of variables.

tex628
Community Champion
Community Champion

Right, it doesnt let you name things to those specific names 😛 Just rename them and change the ":" at the end!


Connect on LinkedIn
Anonymous
Not applicable

Thanks, there are no syntax errors now., but the result is wrong so I guess that it does not work this way.. It counts way too many as returning customer.

tex628
Community Champion
Community Champion

Is it possible to for you take a picture of the timestamps and the calculated column? 

Like this? 

image.png
Edit: Ignore the 2's in the picture!


Connect on LinkedIn
Anonymous
Not applicable

Yea sure, it's either 0 or 1, there are no other numbers

 

Unbenannt2.PNG

tex628
Community Champion
Community Champion

Thats really weird. 

If you filter on a single returning [did] and then sort by acending dates, how does it look?


Connect on LinkedIn
Anonymous
Not applicable

I forgot to mention that there are "groups" like "Mc Donalds", "KFC" etc. So if a person (did) visits a KFC and next day Mc Donalds, with the same ID, its not a returning customer. It has to be in the same group within 2 different days. I can't take a sceenshot since the table is very big and the columns are too far away.

 

But this is how you can imagine in.

 

only did 1 would count as ONE returning customer in this table. So in this case the result of the measure would be 1 for the given data.

 

 

did     lon     lat     ltime     location
1       8.89314365386963        49.2425270080566        2018-08-02 20:58:12        KFC
1       8.89325141906738        49.2425804138184        2018-08-02 20:53:54        KFC
1       8.89325523376465        49.2425384521484        2018-08-02 21:00:50        KFC
1       8.89333343505859        49.2424774169922        2018-08-04 18:26:42        KFC
2       8.89333820343018        49.2424545288086        2018-07-30 20:08:41        KFC
3       8.89305019378662        49.2425384521484        2018-08-03 13:31:26        KFC
3       8.8930549621582          49.2425498962402        2018-08-03 13:30:27        KFC
3       8.89305782318115        49.242618560791          2018-08-03 13:28:25        KFC
3       8.89305782318115        49.2426338195801        2018-08-03 13:27:24        KFC
3       8.89306926727295        49.2425994873047        2018-08-03 13:29:26        KFC
3       8.89320182800293        49.2427368164063        2018-08-03 13:26:24        KFC
3       8.89355945587158        49.2428245544434        2018-08-03 13:25:32        KFC
3       8.89356231689453        49.2427139282227        2018-08-03 13:23:31        KFC
3       8.8935661315918          49.2427444458008        2018-08-03 13:24:22        KFC
4       8.8935661315918          49.2427444458008        2018-08-01 13:24:22        KFC
4       8.7635661315918          49.7827444458008        2018-08-04 13:24:22        Mc Donalds

 

 

tex628
Community Champion
Community Champion

In that case, make a new column with [did] and [groups]:

did_groups = [did] & [Groups]


Then replace [did] with the new column in the measure below.

This should reduce the amount of returning customers that ur getting.

Column = 

VAR ID = [did_group]
VAR DATE = [ltime]

return

IF(CALCULATE( SUM([lon]);
ALL("YOURTABLE");
[did_group] = ID;
[ltime] < DATE
)
>0
;1:0)

Connect on LinkedIn
Anonymous
Not applicable

It doesn't work...Still way too many returners. I also don't think that is that easy.

tex628
Community Champion
Community Champion

did     lon                     lat                     ltime                      location   Column
1       8.89314365386963        49.2425270080566        2018-08-02 20:58:12        KFC        0
1       8.89325141906738        49.2425804138184        2018-08-02 20:53:54        KFC        0
1       8.89325523376465        49.2425384521484        2018-08-02 21:00:50        KFC        0
1       8.89333343505859        49.2424774169922        2018-08-04 18:26:42        KFC        1
2       8.89333820343018        49.2424545288086        2018-07-30 20:08:41        KFC        0
3       8.89305019378662        49.2425384521484        2018-08-03 13:31:26        KFC        0
3       8.8930549621582         49.2425498962402        2018-08-03 13:30:27        KFC        0
3       8.89305782318115        49.242618560791         2018-08-03 13:28:25        KFC        0
3       8.89305782318115        49.2426338195801        2018-08-03 13:27:24        KFC        0
3       8.89306926727295        49.2425994873047        2018-08-04 13:29:26        KFC        1
3       8.89320182800293        49.2427368164063        2018-08-04 13:26:24        KFC        1
3       8.89355945587158        49.2428245544434        2018-08-04 13:25:32        Mc Donalds 0
3       8.89356231689453        49.2427139282227        2018-08-04 13:23:31        KFC        1
3       8.8935661315918         49.2427444458008        2018-08-04 13:24:22        Mc Donalds 1
4       8.8935661315918         49.2427444458008        2018-08-01 13:24:22        KFC        0
4       8.7635661315918         49.7827444458008        2018-08-04 13:24:22        Mc Donalds 0


The column should return the above, if it doesn't then i've done something wrong 😆

Edit: I changed the dates around a little, and this example should have 3 returning customers if i understand you correctly!


Connect on LinkedIn
Anonymous
Not applicable

The resultcolumn is wrong - e.g. DID 2, Group "Mc Donalds" is not a returning customer, since he was there on the same day. DID 3 has 3 times the flag 1, but it should count as 1 returning customer, because it does not matter how often he was there. It only matters that he was there on >1 different days. Then this did counts as ONE returning customer overall. No matter if he was there on 10 or 50 different days.

tex628
Community Champion
Community Champion

This is really confusing. 

I've made my own example dataset:

 

image.png

The calculated column "Column" is simply:

Column = 
VAR ID_ = [did_groups]
VAR DATE_ = [ltime]

return

IF(CALCULATE( SUM([lon]);
ALL(Table1);
Table1[did_groups] = ID_;
table1[ltime] < DATE_
)
>0
;1;0)

Which gives me 7.

Then i use the measure:

Measure = Calculate(distinctcount([did_groups]);Table1[Column]=1)

Which removes 1 and give's me 6. Which should be the correct number of returning customers in the dataset that I made.

😆


Connect on LinkedIn
Anonymous
Not applicable

I appreciate your help. Im going to validate it with my data. Unfortunately I don't have time at the moment, but will mark your answer as the solution since it seems to work with your example and will take a deeper look when I've got time 🙂

tex628
Community Champion
Community Champion

Just give me shout if it doesn't resolve itself! 🙂


Connect on LinkedIn

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.