Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Simeon
Frequent Visitor

Power BI DAX Measure - Summarize Across multiple Dimensions and Facts.

Hi everyone, hope you're well.

 

I have a general question regarding the best way to write a Power BI DAX mesure in a specific set of cases. I've been looking around online and can't quite find a solution to my problem.

 

Say that we have a hypothetical Star schema as per the image below with 3 dimensions for a hypothetical retail company.

The retail company sells Items (SKUs) through its Stores (Locations).

Therefore we have 3 dimensions: Date, SKU and Location.

We also have 2 facts: Daily Opening Inventory (Fact_Inventory), and Daily Minimum Required Stock (Fact_Inventory_Req).

All relationships are One-Directional, One-to-Many.

 

Basic Star Schema.jpg

 

I want to write a measure which counts the number of occurences where an Item in a Store had an Inventory Level below the Required Safety Stock (each day). I am struggling to find the best and most optimal way to write such a measure.

 

At first glance, I would use SUMMARIZECOLUMNS, however it turns out that this cannot be used in Power BI measures.

 

Therefore, currently my solution is to write something as per below:

 

Measure =

VAR SummarizedTable =

    SUMMARIZE(
        CROSSJOIN(
            VALUES( 'Dim_Date'[Date] )
            ,VALUES( 'Dim_SKU'[SKU Key] )
            ,VALUES( 'Dim_Location'[Location Key] )
        )
        ,"Opening Inventory", SUM( 'Fact_Inventory'[Inventory_Qty] )
        ,"Required Inventory", SUM( 'Fact_Inventory_Req'[Inventory_Qty_Req] )
    )
 
VAR Result = 
    COUNTROWS(
        FILTER(
            SummarizedTable
            ,[Opening Inventory] > [Required Inventory]
        )
    )
 
RETURN Result
 
This measure is accurate, but it's too slow, especially if we have a large number of items and stores, and we're evaluating the measure over a long period like a year.
 
One workaround I've found is to combine the two fact tables into one fact table, and use an interative function over the single fact table, but in other similar cases I have fact tables in different grains. Also, I want to avoid using workarounds as a crutch for poor DAX skills.
Another thing I've done is add a boolean calculated column to the Dim_Item table which returns TRUE if a SKU exists in the Fact_Inventory_Req table, and added this as a page filter (= TRUE), so the measure only evaluates for SKUs in the Fact_Inventory_Req fact table, but it's still slow.
 
My main questions are:
Is there a more optimal way of writing such a masure?
Is perhaps my star schema not correct?
Will SUMMARIZECOLUMNS become available for measures in the future?
 
Many thanks for your help! 🙂
2 ACCEPTED SOLUTIONS
OwenAuger
Super User
Super User

Hi @Simeon 

Interesting scenario, and you have hit on some good ideas

If the two fact tables are aggregated to the same grain (e.g. Date/Location/SKU) you could indeed precompute the result for each row as a binary value and sum that. However it sounds like this is not always the case.

 

To answer your specific questions:

  • To more optimally write such a measure that has to iterate over combinations of dimensions, I would generally use SUMMARIZE to group the fact table by the required dimensions (rather than CROSSJOIN), then iterate over the result. SUMMARIZE produces these comabinations very efficiently with the storage engine. Also, I would not suggest using SUMMARIZE to create extension columns (see this article).
  • Your star schema looks logical and correct to me!
  • Last I checked, SUMMARIZECOLUMNS cannot generally be used in measures (in particular within iterators) but it has been suggested that this is going to change (see this video featuring Jeffrey Wang).

Here is how I would suggest writing the measure, based on what you've posted, including two measures for the inventory sums that I would create for readability.

 

Oh, I have assumed that you would only count combinations that exist in 'Fact_Inventory_Req' - is this a correct assumption or not? If not, some adjustments would be needed.

 

Opening Inventory Sum =
SUM ( 'Fact_Inventory'[Opening Inventory] )
Required Inventory Sum =
SUM ( 'Fact_Inventory_Req'[Inventory_Qty_Rec] )
Measure =
-- Combinations of Date/SKU/Location existing in Fact_Inventory_Req
VAR SummarizedTable =
    SUMMARIZE (
        'Fact_Inventory_Req',
        'Dim_Date'[Date],
        'Dim_SKU'[SKU Key],
        'Dim_Location'[Location Key]
    )
VAR Result =
    SUMX (
        SummarizedTable,
        IF ( [Opening Inventory Sum] > [Required Inventory Sum], 1 )
    )
RETURN
    Result

 

Interested in whether this is an improvement or not. It may depend on how sparse combinations in your fact tables are.

 

Regards


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

View solution in original post

OwenAuger
Super User
Super User

Glad that was some help 🙂

 

For the multi-fact version, I would go with this pattern:

DISTINCT (
    UNION (
        SUMMARIZE (
            'Fact_Inventory_Req',
            'Dim_Date'[Date],
            'Dim_SKU'[SKU Key],
            'Dim_Location'[Location Key]
        ),
        SUMMARIZE (
            'Fact_Inventory',
            'Dim_Date'[Date],
            'Dim_SKU'[SKU Key],
            'Dim_Location'[Location Key]
        )
    )
)

It will be easier when SUMMARIZECOLUMNS is (reliably) available for use in measures.

 

All the best!


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

View solution in original post

5 REPLIES 5
OwenAuger
Super User
Super User

Glad that was some help 🙂

 

For the multi-fact version, I would go with this pattern:

DISTINCT (
    UNION (
        SUMMARIZE (
            'Fact_Inventory_Req',
            'Dim_Date'[Date],
            'Dim_SKU'[SKU Key],
            'Dim_Location'[Location Key]
        ),
        SUMMARIZE (
            'Fact_Inventory',
            'Dim_Date'[Date],
            'Dim_SKU'[SKU Key],
            'Dim_Location'[Location Key]
        )
    )
)

It will be easier when SUMMARIZECOLUMNS is (reliably) available for use in measures.

 

All the best!


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

Hi Owen,

I really cannot thank you enough for this!!  I've been struggling for a full day with virtually the same scenario - a measure using inputs from 2 fact tables linked to the same two dimension tables - and your solution worked like a charm.
Thank-you so much (and thanks also to Simeon for posting the question in the first place!)

😊

Ah yes, the Union makes sense. Indeed, Looking forward to getting SUMMARIZECOLUMNS.

Thanks a lot for your help! 🙂

OwenAuger
Super User
Super User

Hi @Simeon 

Interesting scenario, and you have hit on some good ideas

If the two fact tables are aggregated to the same grain (e.g. Date/Location/SKU) you could indeed precompute the result for each row as a binary value and sum that. However it sounds like this is not always the case.

 

To answer your specific questions:

  • To more optimally write such a measure that has to iterate over combinations of dimensions, I would generally use SUMMARIZE to group the fact table by the required dimensions (rather than CROSSJOIN), then iterate over the result. SUMMARIZE produces these comabinations very efficiently with the storage engine. Also, I would not suggest using SUMMARIZE to create extension columns (see this article).
  • Your star schema looks logical and correct to me!
  • Last I checked, SUMMARIZECOLUMNS cannot generally be used in measures (in particular within iterators) but it has been suggested that this is going to change (see this video featuring Jeffrey Wang).

Here is how I would suggest writing the measure, based on what you've posted, including two measures for the inventory sums that I would create for readability.

 

Oh, I have assumed that you would only count combinations that exist in 'Fact_Inventory_Req' - is this a correct assumption or not? If not, some adjustments would be needed.

 

Opening Inventory Sum =
SUM ( 'Fact_Inventory'[Opening Inventory] )
Required Inventory Sum =
SUM ( 'Fact_Inventory_Req'[Inventory_Qty_Rec] )
Measure =
-- Combinations of Date/SKU/Location existing in Fact_Inventory_Req
VAR SummarizedTable =
    SUMMARIZE (
        'Fact_Inventory_Req',
        'Dim_Date'[Date],
        'Dim_SKU'[SKU Key],
        'Dim_Location'[Location Key]
    )
VAR Result =
    SUMX (
        SummarizedTable,
        IF ( [Opening Inventory Sum] > [Required Inventory Sum], 1 )
    )
RETURN
    Result

 

Interested in whether this is an improvement or not. It may depend on how sparse combinations in your fact tables are.

 

Regards


Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
Twitter
LinkedIn

Hi @Owen , thanks for replying to quickly!

 

Yes, it's much faster indeed! I didn't realise that in SUMMARIZE the "Group By" columns could come from other tables - I (wrongly) assumed that it could only summarize the selected table (in the first parameter).

 

Actually now that you mention it, there are a few cases where the combinations from multiple fact tables need to be considered. Could you please show me the required adjustments? Let me know if I should write a new thread for it.

 

Much appreciated 🙂

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.