Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
volfied
Frequent Visitor

Integrating DAX result sets into Data Model

I apologize if this is a duplicate post, but I'm pretty sure my first attempt got lost in the ether. Also, this ultimately is a DAX/Excel question, not a DAX/PowerBI question. I hope that's not a problem. This seems to be the place to go for DAX help.

 

I've been working quite hard today to coax my data into the right format for me to then wrap in a GROUPBY producing the results I'm looking for. I finally succeeded, but realized I have no good sense of how to integrate this results table into my existing Data Model. Something like LOOKUPVALUE would be appropriate, but I have no idea how to pull that out of a results set.

 

My data consists of pipeline history data linked to related event history data, all pulled using SQL. However, since not every pipeline history row relates to every distinct event it could be related to, I had to generate the missing event rows for each pipeline row and tell it to use the previous non-blank value for the Revenue column. Hope that makes sense.

 

So I'm starting with something like this:

 

pipeline_id pipeline_event_id amount
PI-U00026907 6/5/2020 118344
  6/5/2020 84097
  PE-U00033145 11247
  PE-U00033146 8000
  PE-U00033147 14450
  PE-U00033148 14450
  PE-U00033149 8000
  PE-U00033150 13500
  PE-U00033151 14450
  7/15/2020 19247
  PE-U00033145 11247
  PE-U00033146 8000
  PE-U00033147 0
  PE-U00033151 0
  10/21/2020 15000
  PE-U00033145 10000
  PE-U00033146 5000
  PE-U00033147 0
  PE-U00033148 0
  PE-U00033149 0
  PE-U00033150 0
  PE-U00033151 0
PI-U00027090 7/30/2020 4000
  7/30/2020 2000
  PE-U00033477 2000
  11/2/2020 2000
  PE-U00033477 2000

 

Here's my DAX code:

 

 

 

 

 

 

EVALUATE
VAR tbl =
    CALCULATETABLE (
        ADDCOLUMNS (
            NATURALLEFTOUTERJOIN (
                SUMMARIZE (
                    qryPipelineUS,
                    qryPipelineUS[pipeline_id],
                    qryPipelineUS[pipeline_event_id]
                ),
                SUMMARIZE (
                    qryPipelineUS,
                    qryPipelineUS[pipeline_id],
                    qryPipelineUS[day_last_mod]
                )
            ),
            "Rev",
                CALCULATE (
                    FIRSTNONBLANK ( qryPipelineUS[amount], 1 ),
                    FILTER (
                        ALLEXCEPT ( qryPipelineUS, qryPipelineUS[pipeline_event_id] ),
                        qryPipelineUS[day_last_mod] = CALCULATE ( MAX ( qryPipelineUS[day_last_mod] ) )
                            && NOT ( ISBLANK ( qryPipelineUS[amount] ) )
                                && qryPipelineUS[day_last_mod] <= EARLIEST ( qryPipelineUS[day_last_mod] )
                    )
                )
        ),
        qryPipelineUS[pipeline_id] IN { "PI-U00026907", "PI-U00027090" }
    )
RETURN
    GROUPBY (
        tbl,
        [pipeline_id],
        [day_last_mod],
        "Revenue", SUMX ( CURRENTGROUP (), [Rev] )
    )
ORDER BY
    [pipeline_id],
    [day_last_mod]

 

 

 

 

 

 

It produces this:

 

pipeline_id day_last_mod Revenue
PI-U00026907 6/5/2020 12:00:00 AM 84097
PI-U00026907 7/15/2020 12:00:00 AM 55197
PI-U00026907 10/21/2020 12:00:00 AM 15000
PI-U00027090 7/30/2020 12:00:00 AM 2000
PI-U00027090 11/2/2020 12:00:00 AM 2000

 

I'm at a loss for what to do now, though. If I wrap my GROUPBY in a SUMX, DAX Studio complains about not being able to locate the [Revenue] column. Excel accepts it and cheerfully produces the wrong results, like this:

 

pipeline_id pipeline_event_id amount
PI-U00026907 6/5/2020 2,777,292
  6/5/2020 588,679
  PE-U00033145 11247
  PE-U00033146 8000
  PE-U00033147 14450
  PE-U00033148 14450
  PE-U00033149 8000
  PE-U00033150 13500
  PE-U00033151 14450
  7/15/2020 76,988
  PE-U00033145 11247
  PE-U00033146 8000
  PE-U00033147 0
  PE-U00033151 0
  10/21/2020 105,000
  PE-U00033145 10000
  PE-U00033146 5000
  PE-U00033147 0
  PE-U00033148 0
  PE-U00033149 0
  PE-U00033150 0
  PE-U00033151 0
PI-U00027090 7/30/2020 16,000
  7/30/2020 6,000
  PE-U00033477 4,000
  11/2/2020 2000
  PE-U00033477 2000

 

How should I go about using the data from my results set? I have no real sense of how to link it into my Data Model. Thank you in advance for any help you can offer.

1 ACCEPTED SOLUTION

I finally solved this. It was dirt simple. I'm embarrassed I didn't come up with it sooner. I just had to alter my SQL query to return a difference between the current and previous row and sum all that up in DAX. My measure wound up just being:

Revenue:=VAR MaxDate =
            MAX ( qryPipelineUS[day_last_mod] )
        RETURN
            CALCULATE (
                SUM ( qryPipelineUS[eveAmtDiff] ),
                qryPipelineUS[day_last_mod] <= MaxDate
            )

 

The T-SQL's a little more out of the ordinary, as you have to do partitioning and ROW windows, but it's simple enough once you get used to it. 

MAX(eh.amount) OVER (
PARTITION BY ph.pipeline_id, eh.pipeline_event_id
ORDER BY ph.pipeline_id, eh.pipeline_event_id, eh.date_rec_copy
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
)

 

View solution in original post

8 REPLIES 8
HotChilli
Super User
Super User

Well it's a powerbi forum but if you make it clear you're not using it but would like some help with DAX, you'll probably get help. I still don't know what the desired result is. Please show us.

After a little Photoshop work... here is the desired result. The 'Revenue' column is what the data should look like. 'TestRev' is what I'm actually getting. As you can see, 'Revenue' is showing the most recent value for any given 'pipeline_event_id' and it's that number which should roll up to the 'day_last_mod' row and the 'pipeline_id' row. So, just to walk through it, the first pipeline starts out with a total value of 84,097. The next update zeros out two of the events under that pipeline, so the total goes down to 55,197. (It also shows two events with the same value they had before. Just a quirk of the data.) Finally, all but two of the events get zero'd out and the two remaining events' values are reduced, resulting in a final value of 15,000. This is the number that rolls up to the pipeline_id level.

 

pipeline_bad.png

I finally solved this. It was dirt simple. I'm embarrassed I didn't come up with it sooner. I just had to alter my SQL query to return a difference between the current and previous row and sum all that up in DAX. My measure wound up just being:

Revenue:=VAR MaxDate =
            MAX ( qryPipelineUS[day_last_mod] )
        RETURN
            CALCULATE (
                SUM ( qryPipelineUS[eveAmtDiff] ),
                qryPipelineUS[day_last_mod] <= MaxDate
            )

 

The T-SQL's a little more out of the ordinary, as you have to do partitioning and ROW windows, but it's simple enough once you get used to it. 

MAX(eh.amount) OVER (
PARTITION BY ph.pipeline_id, eh.pipeline_event_id
ORDER BY ph.pipeline_id, eh.pipeline_event_id, eh.date_rec_copy
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
)

 

HotChilli
Super User
Super User

"How should I go about using the data from my results set?"

- the question is     what is the desired result? 

Basically, I'd like to be able to see the data in my Pivot Table the same way as it is in the small results table I posted, aggregating the way I described at the [day_last_mod] and [pipeline_id] levels, and even higher up, i.e., showing the sum of only the most recent values for each pipeline_event_id. Is there anything I can provide that would help?

 

I'm not at all sure, by the way, that I'm not over-complicating this. Maybe there's a simple measure like 

 

    CALCULATE (
        LASTNONBLANK ( qryPipelineUS[amount], 1 ),
        VAR earlyMod = MAX( qryPipelineUS[day_last_mod] )
        RETURN
            FILTER (
                ALLEXCEPT ( qryPipelineUS, qryPipelineUS[pipeline_event_id] ),
                        qryPipelineUS[day_last_mod] <= earlyMod
            )
    )

...that would do the job. I think I may not have my data sorted properly for LASTNONBLANK to correctly, so I'm trying that this morning.

Resorting the data wasn't the problem. Oh well.

You've mentioned DAX Studio, Excel.  Are you actually using powerbi?

No, I was under the impression this forum covered DAX usage in general. I'm very sorry if I'm mistaken. Can you suggest a more suitable forum?

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors