Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello community,
I manage the data of an app and would need some help to optimize the classical running total
Sign up count =
CALCULATE (
DISTINCTCOUNT( 'Global table'[email] );
FILTER( ALL( 'Global table' ); 'Global table'[date] <= MAX( 'Global table'[date] ) )
)
I have a huge table that records important actions performed in the app and lots of table that details those events (1M5 rows growing every day), with the above formula, i get the right result however it is taking a very long time to load as I iterate through 1M5 rows.
The 'Global Table'
Event ID | Date | action | |
663 | rick | 14-juil | send |
2805 | rick | 14-juil | receive |
1034 | lea | 14-juil | send |
3008 | rick | 15-juil | Send |
9987 | paul | 16-juil | schedule |
8070 | daniel | 16-juil | send |
7987 | rick | 16-juil | send |
1419 | daniel | 17-juil | schedule |
9103 | paul | 17-juil | send |
6702 | rick | 17-juil | receive |
7184 | rick | 17-juil | receive |
3802 | daniel | 17-juil | send |
4326 | lea | 17-juil | schedule |
I wanted to change the context and optimize this therefore I created a new table ('Users Info') that summarizes the emails and takes MIN(DATE) (which corresponds to the first date we see the person in our DataBase, EG the "sign_up" date).
'Users Info' Table
Sign_Up | |
Rick | 14-juil |
Lea | 14-juil |
Paul | 16-juil |
Daniel | 16-juil |
I feel this table is the right way to go to optimize the running total but I'm running through several issues:
using:
Sign up =
CALCULATE(
COUNTROWS( 'Users Info' );
FILTER( 'Users Info'; 'Users Info'[Sign_up] <= MAX( 'Users Info'[Sign_up] )
)
)
My issue is linked to the X Axis I use:
If use the dates from my 'Global Table' on the X Axis, I simply get the amount of people that were in the DB on each day (I get 'active users' instead of 'Sign up' users), No matter the hierarhcy (I got some weekdate groups, formatted day+month etc...)
In my table I typically get 2 people on the 14th, 1 on the 15th, 3 on the 16th, 4 on the 17th
If I use the dates from my 'Users Info' table (therefore using the Sign_Up date column on the X axis), I get the right numbers for dates where some people signed up and some holes on the dates where no one signed up (as PBI does not find the date in the 'Users Info' table it ignores the formula...)
In my tables above I would typically get 2 people on the 14th of July, Blank on 15th, 3 people on 16th, 4 on the 17th.
I hope there is a solution or a getaround to my issue, I hope the post is clear enough, let me know if I need to precise some things, and feel free if you have totally different solution for my issue!
Best,
Leo
Solved! Go to Solution.
Hi @Anonymous,
Try to change you measure to make the filter on the Global Table [Date]
Sign up =
CALCULATE(
COUNTROWS( 'Users Info' );
FILTER( 'Users Info'; 'Users Info'[Sign_up] <= MAX( 'Global Table'[Date] )
)
)
Then place the 'Global Table'[Date] on the x-axis should work as expected.
Regards,
MFelix
Regards
Miguel Félix
Proud to be a Super User!
Check out my blog: Power BI em PortuguêsHi @Anonymous,
You can have a better performance making an adjustment to your current formula. Having a ALL statement for the full table you will get all columns and all rows try to only add the columns you need to interact with:
Sign up count = CALCULATE ( DISTINCTCOUNT( 'Global table'[email] ); FILTER( ALL( 'Global table'[date] ); 'Global table'[date] <= MAX( 'Global table'[date] ) ) )
This should work faster, try it and then tell me the result.
Regards,
MFelix
Regards
Miguel Félix
Proud to be a Super User!
Check out my blog: Power BI em PortuguêsHey @MFelix,
Thank you for your answer, selecting only one column is indeed better but I is still taking ages to load the graph (10-20 secs) whereas it is alone in a page... And still crushes my poor RAM and even power BI cloud returns error message as it takes too long to load..
I feel I'm very close to getting something with the summarized table, the calculation is performed well when someone signed up during the week and I just have to return a value when nobody signed up in the week...
Here is a visual example of my issue... A Company where we launched our product early in march (that explains the peak) and who then had quite a few additional users so most weeks are filled, in some cases, no one installed during the week and the result returned is blank.
The Date axis which I take from 'Users Info'[Sign_up] with a 7 days group (for weeks) is a bit of a cheat but not really viable, I feel like I have to user the date from 'Global Table'[Date] which I use in most graph and works but it returns this:
A not so running total yet some regular DISTINCTCOUNT for each week....
Hi @Anonymous,
Try to change you measure to make the filter on the Global Table [Date]
Sign up =
CALCULATE(
COUNTROWS( 'Users Info' );
FILTER( 'Users Info'; 'Users Info'[Sign_up] <= MAX( 'Global Table'[Date] )
)
)
Then place the 'Global Table'[Date] on the x-axis should work as expected.
Regards,
MFelix
Regards
Miguel Félix
Proud to be a Super User!
Check out my blog: Power BI em PortuguêsHey @MFelix,
Wow... Quite frustrating as I've spent a day scooting the internet to see if I could find someone with the same pattern.... I feel a bit dumb..
Thanks so much it worked, I have to precise for the people who might encounter the same pattern I had to remove the Bi-directionnal Cross-filtering between the two table (Users Info & Main table). Now it's one way: Users Info -> Main Table.
Thanks Felix,
L
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
114 | |
101 | |
78 | |
75 | |
49 |
User | Count |
---|---|
145 | |
108 | |
107 | |
89 | |
61 |