Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
I apologize if this is a duplicate post, but I'm pretty sure my first attempt got lost in the ether. Also, this ultimately is a DAX/Excel question, not a DAX/PowerBI question. I hope that's not a problem. This seems to be the place to go for DAX help.
I've been working quite hard today to coax my data into the right format for me to then wrap in a GROUPBY producing the results I'm looking for. I finally succeeded, but realized I have no good sense of how to integrate this results table into my existing Data Model. Something like LOOKUPVALUE would be appropriate, but I have no idea how to pull that out of a results set.
My data consists of pipeline history data linked to related event history data, all pulled using SQL. However, since not every pipeline history row relates to every distinct event it could be related to, I had to generate the missing event rows for each pipeline row and tell it to use the previous non-blank value for the Revenue column. Hope that makes sense.
So I'm starting with something like this:
pipeline_id | pipeline_event_id | amount | ||
PI-U00026907 | 6/5/2020 | 118344 | ||
6/5/2020 | 84097 | |||
PE-U00033145 | 11247 | |||
PE-U00033146 | 8000 | |||
PE-U00033147 | 14450 | |||
PE-U00033148 | 14450 | |||
PE-U00033149 | 8000 | |||
PE-U00033150 | 13500 | |||
PE-U00033151 | 14450 | |||
7/15/2020 | 19247 | |||
PE-U00033145 | 11247 | |||
PE-U00033146 | 8000 | |||
PE-U00033147 | 0 | |||
PE-U00033151 | 0 | |||
10/21/2020 | 15000 | |||
PE-U00033145 | 10000 | |||
PE-U00033146 | 5000 | |||
PE-U00033147 | 0 | |||
PE-U00033148 | 0 | |||
PE-U00033149 | 0 | |||
PE-U00033150 | 0 | |||
PE-U00033151 | 0 | |||
PI-U00027090 | 7/30/2020 | 4000 | ||
7/30/2020 | 2000 | |||
PE-U00033477 | 2000 | |||
11/2/2020 | 2000 | |||
PE-U00033477 | 2000 |
Here's my DAX code:
EVALUATE
VAR tbl =
CALCULATETABLE (
ADDCOLUMNS (
NATURALLEFTOUTERJOIN (
SUMMARIZE (
qryPipelineUS,
qryPipelineUS[pipeline_id],
qryPipelineUS[pipeline_event_id]
),
SUMMARIZE (
qryPipelineUS,
qryPipelineUS[pipeline_id],
qryPipelineUS[day_last_mod]
)
),
"Rev",
CALCULATE (
FIRSTNONBLANK ( qryPipelineUS[amount], 1 ),
FILTER (
ALLEXCEPT ( qryPipelineUS, qryPipelineUS[pipeline_event_id] ),
qryPipelineUS[day_last_mod] = CALCULATE ( MAX ( qryPipelineUS[day_last_mod] ) )
&& NOT ( ISBLANK ( qryPipelineUS[amount] ) )
&& qryPipelineUS[day_last_mod] <= EARLIEST ( qryPipelineUS[day_last_mod] )
)
)
),
qryPipelineUS[pipeline_id] IN { "PI-U00026907", "PI-U00027090" }
)
RETURN
GROUPBY (
tbl,
[pipeline_id],
[day_last_mod],
"Revenue", SUMX ( CURRENTGROUP (), [Rev] )
)
ORDER BY
[pipeline_id],
[day_last_mod]
It produces this:
pipeline_id | day_last_mod | Revenue | ||
PI-U00026907 | 6/5/2020 12:00:00 AM | 84097 | ||
PI-U00026907 | 7/15/2020 12:00:00 AM | 55197 | ||
PI-U00026907 | 10/21/2020 12:00:00 AM | 15000 | ||
PI-U00027090 | 7/30/2020 12:00:00 AM | 2000 | ||
PI-U00027090 | 11/2/2020 12:00:00 AM | 2000 |
I'm at a loss for what to do now, though. If I wrap my GROUPBY in a SUMX, DAX Studio complains about not being able to locate the [Revenue] column. Excel accepts it and cheerfully produces the wrong results, like this:
pipeline_id | pipeline_event_id | amount | ||
PI-U00026907 | 6/5/2020 | 2,777,292 | ||
6/5/2020 | 588,679 | |||
PE-U00033145 | 11247 | |||
PE-U00033146 | 8000 | |||
PE-U00033147 | 14450 | |||
PE-U00033148 | 14450 | |||
PE-U00033149 | 8000 | |||
PE-U00033150 | 13500 | |||
PE-U00033151 | 14450 | |||
7/15/2020 | 76,988 | |||
PE-U00033145 | 11247 | |||
PE-U00033146 | 8000 | |||
PE-U00033147 | 0 | |||
PE-U00033151 | 0 | |||
10/21/2020 | 105,000 | |||
PE-U00033145 | 10000 | |||
PE-U00033146 | 5000 | |||
PE-U00033147 | 0 | |||
PE-U00033148 | 0 | |||
PE-U00033149 | 0 | |||
PE-U00033150 | 0 | |||
PE-U00033151 | 0 | |||
PI-U00027090 | 7/30/2020 | 16,000 | ||
7/30/2020 | 6,000 | |||
PE-U00033477 | 4,000 | |||
11/2/2020 | 2000 | |||
PE-U00033477 | 2000 |
How should I go about using the data from my results set? I have no real sense of how to link it into my Data Model. Thank you in advance for any help you can offer.
Solved! Go to Solution.
I finally solved this. It was dirt simple. I'm embarrassed I didn't come up with it sooner. I just had to alter my SQL query to return a difference between the current and previous row and sum all that up in DAX. My measure wound up just being:
Revenue:=VAR MaxDate =
MAX ( qryPipelineUS[day_last_mod] )
RETURN
CALCULATE (
SUM ( qryPipelineUS[eveAmtDiff] ),
qryPipelineUS[day_last_mod] <= MaxDate
)
The T-SQL's a little more out of the ordinary, as you have to do partitioning and ROW windows, but it's simple enough once you get used to it.
MAX(eh.amount) OVER (
PARTITION BY ph.pipeline_id, eh.pipeline_event_id
ORDER BY ph.pipeline_id, eh.pipeline_event_id, eh.date_rec_copy
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
)
Well it's a powerbi forum but if you make it clear you're not using it but would like some help with DAX, you'll probably get help. I still don't know what the desired result is. Please show us.
After a little Photoshop work... here is the desired result. The 'Revenue' column is what the data should look like. 'TestRev' is what I'm actually getting. As you can see, 'Revenue' is showing the most recent value for any given 'pipeline_event_id' and it's that number which should roll up to the 'day_last_mod' row and the 'pipeline_id' row. So, just to walk through it, the first pipeline starts out with a total value of 84,097. The next update zeros out two of the events under that pipeline, so the total goes down to 55,197. (It also shows two events with the same value they had before. Just a quirk of the data.) Finally, all but two of the events get zero'd out and the two remaining events' values are reduced, resulting in a final value of 15,000. This is the number that rolls up to the pipeline_id level.
I finally solved this. It was dirt simple. I'm embarrassed I didn't come up with it sooner. I just had to alter my SQL query to return a difference between the current and previous row and sum all that up in DAX. My measure wound up just being:
Revenue:=VAR MaxDate =
MAX ( qryPipelineUS[day_last_mod] )
RETURN
CALCULATE (
SUM ( qryPipelineUS[eveAmtDiff] ),
qryPipelineUS[day_last_mod] <= MaxDate
)
The T-SQL's a little more out of the ordinary, as you have to do partitioning and ROW windows, but it's simple enough once you get used to it.
MAX(eh.amount) OVER (
PARTITION BY ph.pipeline_id, eh.pipeline_event_id
ORDER BY ph.pipeline_id, eh.pipeline_event_id, eh.date_rec_copy
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
)
"How should I go about using the data from my results set?"
- the question is what is the desired result?
Basically, I'd like to be able to see the data in my Pivot Table the same way as it is in the small results table I posted, aggregating the way I described at the [day_last_mod] and [pipeline_id] levels, and even higher up, i.e., showing the sum of only the most recent values for each pipeline_event_id. Is there anything I can provide that would help?
I'm not at all sure, by the way, that I'm not over-complicating this. Maybe there's a simple measure like
CALCULATE (
LASTNONBLANK ( qryPipelineUS[amount], 1 ),
VAR earlyMod = MAX( qryPipelineUS[day_last_mod] )
RETURN
FILTER (
ALLEXCEPT ( qryPipelineUS, qryPipelineUS[pipeline_event_id] ),
qryPipelineUS[day_last_mod] <= earlyMod
)
)
...that would do the job. I think I may not have my data sorted properly for LASTNONBLANK to correctly, so I'm trying that this morning.
Resorting the data wasn't the problem. Oh well.
You've mentioned DAX Studio, Excel. Are you actually using powerbi?
No, I was under the impression this forum covered DAX usage in general. I'm very sorry if I'm mistaken. Can you suggest a more suitable forum?
User | Count |
---|---|
47 | |
27 | |
23 | |
18 | |
15 |
User | Count |
---|---|
55 | |
34 | |
18 | |
17 | |
15 |