Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hey folks,
I know that this topic was already discussed, but I didn't get any of the solutions to work that were posted here before. I was able to solve my problem with 2 dax tables + measure, but I'm not happy with my solution and I guess that there is a better / simpler way to do it.
So let's take a look:
Imagine you have this type of data: an id, longitude, latitude (both does not really matter here) and date + time stamp.
did lon lat ltime 1 8.89314365386963 49.2425270080566 2018-08-02 20:58:12 1 8.89325141906738 49.2425804138184 2018-08-02 20:53:54 1 8.89325523376465 49.2425384521484 2018-08-02 21:00:50 1 8.89333343505859 49.2424774169922 2018-08-04 18:26:42 2 8.89333820343018 49.2424545288086 2018-07-30 20:08:41 3 8.89305019378662 49.2425384521484 2018-08-03 13:31:26 3 8.8930549621582 49.2425498962402 2018-08-03 13:30:27 3 8.89305782318115 49.242618560791 2018-08-03 13:28:25 3 8.89305782318115 49.2426338195801 2018-08-03 13:27:24 3 8.89306926727295 49.2425994873047 2018-08-03 13:29:26 3 8.89320182800293 49.2427368164063 2018-08-03 13:26:24 3 8.89355945587158 49.2428245544434 2018-08-03 13:25:32 3 8.89356231689453 49.2427139282227 2018-08-03 13:23:31 3 8.8935661315918 49.2427444458008 2018-08-03 13:24:22
What I want to do is: count how many people (distinct did) do exist with a different timestamp (based on 2 different days). If we take a look at did 1 - the first 3 rows count as one visit, the 4th row is a different day (timestamp does not matter) so it also counts as a visit. So this guy visited 2 times ( or >1) and represtens a returning customer. did 2 visited only once, so it does not count as a returning customer. did 3 visited the place several times (or stayed there a bit longer), but it was on the same day, so it does not count as a returning customer. So basically for that given data the measure would return 1. We have only 1 returning customer. I hope you did understand the problem.
My solution: first dax table
countPerDay = CALCULATETABLE ( SUMMARIZE ( KraichgauNord; KraichgauNord[did]; 'Date'[Date]; "returning customer"; COUNTROWS ( KraichgauNord ) ) )the table returns the did, the date and the count of visists during that single day. So it's basically a group by did and date.
returner = SUMMARIZE ( countPerDay ; countPerDay[did]; "real returning customers"; COUNTROWS (countPerDay) )
ct_returner = var ct = CALCULATE(COUNTROWS(returner); returner[real returning customers] > 1) return IF (ct = 0; 0; ct)
Solved! Go to Solution.
This is really confusing.
I've made my own example dataset:
The calculated column "Column" is simply:
Column = VAR ID_ = [did_groups] VAR DATE_ = [ltime] return IF(CALCULATE( SUM([lon]); ALL(Table1); Table1[did_groups] = ID_; table1[ltime] < DATE_ ) >0 ;1;0)
Which gives me 7.
Then i use the measure:
Measure = Calculate(distinctcount([did_groups]);Table1[Column]=1)
Which removes 1 and give's me 6. Which should be the correct number of returning customers in the dataset that I made.
😆
Convert ltime to date instead of datetime.
Calculated column:
Column = VAR ID = [did] VAR DATE = [ltime] return IF(CALCULATE( SUM([lon]); ALL("YOURTABLE"); [did] = ID; [ltime] < DATE ) >0 ;1:0)
Measure:
Measure = Calculate(distinctcount([did]);column=1)
Thanks for your reply, but the calculated column doesn't seem to work.
Edit: I've corrected ":" to ";", but there still seems to be a problem with the definition of variables.
Right, it doesnt let you name things to those specific names 😛 Just rename them and change the ":" at the end!
Thanks, there are no syntax errors now., but the result is wrong so I guess that it does not work this way.. It counts way too many as returning customer.
Is it possible to for you take a picture of the timestamps and the calculated column?
Like this?
Edit: Ignore the 2's in the picture!
Yea sure, it's either 0 or 1, there are no other numbers
Thats really weird.
If you filter on a single returning [did] and then sort by acending dates, how does it look?
I forgot to mention that there are "groups" like "Mc Donalds", "KFC" etc. So if a person (did) visits a KFC and next day Mc Donalds, with the same ID, its not a returning customer. It has to be in the same group within 2 different days. I can't take a sceenshot since the table is very big and the columns are too far away.
But this is how you can imagine in.
only did 1 would count as ONE returning customer in this table. So in this case the result of the measure would be 1 for the given data.
did lon lat ltime location 1 8.89314365386963 49.2425270080566 2018-08-02 20:58:12 KFC 1 8.89325141906738 49.2425804138184 2018-08-02 20:53:54 KFC 1 8.89325523376465 49.2425384521484 2018-08-02 21:00:50 KFC 1 8.89333343505859 49.2424774169922 2018-08-04 18:26:42 KFC 2 8.89333820343018 49.2424545288086 2018-07-30 20:08:41 KFC 3 8.89305019378662 49.2425384521484 2018-08-03 13:31:26 KFC 3 8.8930549621582 49.2425498962402 2018-08-03 13:30:27 KFC 3 8.89305782318115 49.242618560791 2018-08-03 13:28:25 KFC 3 8.89305782318115 49.2426338195801 2018-08-03 13:27:24 KFC 3 8.89306926727295 49.2425994873047 2018-08-03 13:29:26 KFC 3 8.89320182800293 49.2427368164063 2018-08-03 13:26:24 KFC 3 8.89355945587158 49.2428245544434 2018-08-03 13:25:32 KFC 3 8.89356231689453 49.2427139282227 2018-08-03 13:23:31 KFC 3 8.8935661315918 49.2427444458008 2018-08-03 13:24:22 KFC 4 8.8935661315918 49.2427444458008 2018-08-01 13:24:22 KFC 4 8.7635661315918 49.7827444458008 2018-08-04 13:24:22 Mc Donalds
In that case, make a new column with [did] and [groups]:
did_groups = [did] & [Groups]
Then replace [did] with the new column in the measure below.
This should reduce the amount of returning customers that ur getting.
Column = VAR ID = [did_group] VAR DATE = [ltime] return IF(CALCULATE( SUM([lon]); ALL("YOURTABLE"); [did_group] = ID; [ltime] < DATE ) >0 ;1:0)
It doesn't work...Still way too many returners. I also don't think that is that easy.
did lon lat ltime location Column 1 8.89314365386963 49.2425270080566 2018-08-02 20:58:12 KFC 0 1 8.89325141906738 49.2425804138184 2018-08-02 20:53:54 KFC 0 1 8.89325523376465 49.2425384521484 2018-08-02 21:00:50 KFC 0 1 8.89333343505859 49.2424774169922 2018-08-04 18:26:42 KFC 1 2 8.89333820343018 49.2424545288086 2018-07-30 20:08:41 KFC 0 3 8.89305019378662 49.2425384521484 2018-08-03 13:31:26 KFC 0 3 8.8930549621582 49.2425498962402 2018-08-03 13:30:27 KFC 0 3 8.89305782318115 49.242618560791 2018-08-03 13:28:25 KFC 0 3 8.89305782318115 49.2426338195801 2018-08-03 13:27:24 KFC 0 3 8.89306926727295 49.2425994873047 2018-08-04 13:29:26 KFC 1 3 8.89320182800293 49.2427368164063 2018-08-04 13:26:24 KFC 1 3 8.89355945587158 49.2428245544434 2018-08-04 13:25:32 Mc Donalds 0 3 8.89356231689453 49.2427139282227 2018-08-04 13:23:31 KFC 1 3 8.8935661315918 49.2427444458008 2018-08-04 13:24:22 Mc Donalds 1 4 8.8935661315918 49.2427444458008 2018-08-01 13:24:22 KFC 0 4 8.7635661315918 49.7827444458008 2018-08-04 13:24:22 Mc Donalds 0
The column should return the above, if it doesn't then i've done something wrong 😆
Edit: I changed the dates around a little, and this example should have 3 returning customers if i understand you correctly!
The resultcolumn is wrong - e.g. DID 2, Group "Mc Donalds" is not a returning customer, since he was there on the same day. DID 3 has 3 times the flag 1, but it should count as 1 returning customer, because it does not matter how often he was there. It only matters that he was there on >1 different days. Then this did counts as ONE returning customer overall. No matter if he was there on 10 or 50 different days.
This is really confusing.
I've made my own example dataset:
The calculated column "Column" is simply:
Column = VAR ID_ = [did_groups] VAR DATE_ = [ltime] return IF(CALCULATE( SUM([lon]); ALL(Table1); Table1[did_groups] = ID_; table1[ltime] < DATE_ ) >0 ;1;0)
Which gives me 7.
Then i use the measure:
Measure = Calculate(distinctcount([did_groups]);Table1[Column]=1)
Which removes 1 and give's me 6. Which should be the correct number of returning customers in the dataset that I made.
😆
I appreciate your help. Im going to validate it with my data. Unfortunately I don't have time at the moment, but will mark your answer as the solution since it seems to work with your example and will take a deeper look when I've got time 🙂
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
114 | |
99 | |
82 | |
70 | |
60 |
User | Count |
---|---|
149 | |
114 | |
107 | |
89 | |
67 |