cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
florinbarnea
Frequent Visitor

Optimize SUMX with SUMMARIZE

Hello! 

I have a Pricing analysis report where I need to calculate the impact of pricing changes. The dax formulas work and produce the desired result however performance is really bad. 

My Fact table is a simple table with orders at the product level. I have two years worth of data. 

The idea is to calculate the average unit price of this year and last year at an aggregated level (Region, MaterialId, Market etc. )

I have to do it at this level otherwise the calculation doesn't provide the correct values at subtotal levels. This measure runs in about 2-3 seconds in a simple visual. When plotted against Month, the visual takes about 10-15 seconds to load. I can provide Server Timings data if needed. The question is very similar to this one: Solved: Optimize SUMX / SUMMARIZE Measure - Microsoft Power BI Community however I have tried the suggestions in that post and it hasn't helped. 

 

I have also tried with ADDCOLUMNS(SUMMARIZE()), SUMMARIZE() with variables, and the latest below with GROUPBY(). I don't understand context transition so I don't know if it's even possible to optimize this. The Fact table has 1.4 million rows and the biggest Dim table is the Material one with 760,000 rows. 

 

The model is in SSAS and I am connecting from PBI with direct query. 

 

Below is the price impact measure:

 

 

 

PriceImpactLC:=
VAR maxDate = MAX( Dim_Date[FiscalDate] )
VAR maxYear = YEAR( maxDate )
VAR _datesytd =
    FILTER(ALL(Dim_Date),YEAR(Dim_Date[FiscalDate])=maxYear && Dim_Date[FiscalDate]<=maxDate)
VAR _py =
    FILTER(ALL(Dim_Date), YEAR(Dim_Date[FiscalDate])=maxYear-1)

RETURN SUMX(
        GROUPBY(
            CALCULATETABLE(Fact_Pricing, FILTER(ALL(Dim_Date), Dim_Date[FiscalDate]<=maxDate)),
            Fact_Pricing[Matched],
		    Dim_Company[RegionTxt],
            Fact_Pricing[CorporateMarket],
            Fact_Pricing[CompanySurrId],
            Fact_Pricing[MaterialSurrId],
            Fact_Pricing[DocumentCurrency],
            Fact_Pricing[LocalCurrency],
            Dim_ExchangeRate[BIRate]
        ),
		VAR _YTDBillingQuantity = CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_datesytd)
		VAR _NetBillingUnitPriceLC = DIVIDE ( CALCULATE( SUM(Fact_Pricing[NetBillingRevenueLC]), _datesytd ), _YTDBillingQuantity, 0 )
		VAR _NetBillingUnitPriceDC = DIVIDE ( CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _datesytd ), _YTDBillingQuantity, 0 )
		VAR _PYNetBillingUnitPriceDC = DIVIDE ( CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _py ), CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_py), 0 )
		RETURN (_NetBillingUnitPriceLC-DIVIDE(_NetBillingUnitPriceLC,(1 + DIVIDE((_NetBillingUnitPriceDC - _PYNetBillingUnitPriceDC),_PYNetBillingUnitPriceDC,0)))) * _YTDBillingQuantity
        )

 

 

 

 I would really appreciate any help, I'm hoping I'm missing something obvious, or I need to change the approach completely. 

Fact Table extract (I removed a bunch of surrogate id's.):

PostingDateIdSalesOrderNumberSalesOrderItemNumberSalesQuantityLocalCurrencySalesBaseLCDocumentCurrencyNetBillingRevenueLCNetBillingRevenueDCNetBillingRevenueUSDCorporateMarket
202001213439732703CAD1,655.10CAD1,655.101655.11655.1MarketExample1
202001213439732701CAD551.7CAD551.7551.7551.7MarketExample1
202001273475186301USD47.01USD47.0147.0147.01MarketExample1
202001283489573302INR6,000.00INR6,000.0060006000MarketExample1
202001283445313604CAD2,206.80CAD2,206.802206.82206.8MarketExample1
2020012834630183032USD14,105.92USD14,105.9214105.9214105.92MarketExample1
202001313495473108USD34.4USD34.434.434.4MarketExample1
202002203463601503EUR99.36EUR99.3699.3699.36MarketExample1
2020030435234701101EUR258.32EUR257.35257.35257.35MarketExample1
1 ACCEPTED SOLUTION

Thank you for your input. I have optimised to the below. This took it from 11-15 seconds to 6-7 seconds. I will try to factor out  DIVIDE([_YTDLC],[_YTDQ],0) and DIVIDE([_PYDC],[_PYQ],0) in separate columns as they happen twice in the final calculation, but I was hoping my entire approach was wrong and I can get some major performance increase. When checking the Server Timings output, I have a lot of rows (50k) materialised multiple times (5 times). 

PriceImpactLC:=
VAR maxDate = MAX( Dim_Date[FiscalDate] )
VAR maxYear = YEAR( maxDate )
VAR _datesytd =
    DATESYTD(Dim_Date[FiscalDate])
VAR _py =
    PARALLELPERIOD(Dim_Date[FiscalDate],-1,YEAR)

RETURN SUMX(
        ADDCOLUMNS(
            ADDCOLUMNS(
                CALCULATETABLE(
                SUMMARIZE(
                    Fact_Pricing,
                    Fact_Pricing[Matched],
		            Dim_Company[RegionTxt],
                    Fact_Pricing[CorporateMarket],
                    Fact_Pricing[CompanySurrId],
                    Fact_Pricing[MaterialSurrId],
                    Fact_Pricing[DocumentCurrency],
                    Fact_Pricing[LocalCurrency]
                ), FILTER(ALL(Dim_Date), Dim_Date[FiscalDate]<=maxDate)),
		    "_YTDQ", CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_datesytd),
            "_PYQ",CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_py),
            "_YTDLC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueLC]), _datesytd),
            "_YTDDC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _datesytd),
            "_PYDC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _py )
            ),
         "_PriceImpact", (DIVIDE([_YTDLC],[_YTDQ],0)-DIVIDE(DIVIDE([_YTDLC],[_YTDQ],0),1+DIVIDE(DIVIDE([_YTDDC],[_YTDQ],0)-DIVIDE([_PYDC],[_PYQ],0),DIVIDE([_PYDC],[_PYQ],0),0),0))*[_YTDQ]
        ),
		[_PriceImpact]
        )

 

 

 

View solution in original post

2 REPLIES 2
v-easonf-msft
Community Support
Community Support

Hi,  @florinbarnea 

The content after 'Return' in the code seems too long and it is recommended to use multiple variables for encapsulation.

I don't know much about performance optimization. You can check if these posts could help:

GROUPBY vs SUMMARIZE in #dax #powerbi #powerpivot 

https://docs.microsoft.com/en-us/dax/groupby-function-dax 

 

Best Regards,
Community Support Team _ Eason

Thank you for your input. I have optimised to the below. This took it from 11-15 seconds to 6-7 seconds. I will try to factor out  DIVIDE([_YTDLC],[_YTDQ],0) and DIVIDE([_PYDC],[_PYQ],0) in separate columns as they happen twice in the final calculation, but I was hoping my entire approach was wrong and I can get some major performance increase. When checking the Server Timings output, I have a lot of rows (50k) materialised multiple times (5 times). 

PriceImpactLC:=
VAR maxDate = MAX( Dim_Date[FiscalDate] )
VAR maxYear = YEAR( maxDate )
VAR _datesytd =
    DATESYTD(Dim_Date[FiscalDate])
VAR _py =
    PARALLELPERIOD(Dim_Date[FiscalDate],-1,YEAR)

RETURN SUMX(
        ADDCOLUMNS(
            ADDCOLUMNS(
                CALCULATETABLE(
                SUMMARIZE(
                    Fact_Pricing,
                    Fact_Pricing[Matched],
		            Dim_Company[RegionTxt],
                    Fact_Pricing[CorporateMarket],
                    Fact_Pricing[CompanySurrId],
                    Fact_Pricing[MaterialSurrId],
                    Fact_Pricing[DocumentCurrency],
                    Fact_Pricing[LocalCurrency]
                ), FILTER(ALL(Dim_Date), Dim_Date[FiscalDate]<=maxDate)),
		    "_YTDQ", CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_datesytd),
            "_PYQ",CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_py),
            "_YTDLC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueLC]), _datesytd),
            "_YTDDC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _datesytd),
            "_PYDC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _py )
            ),
         "_PriceImpact", (DIVIDE([_YTDLC],[_YTDQ],0)-DIVIDE(DIVIDE([_YTDLC],[_YTDQ],0),1+DIVIDE(DIVIDE([_YTDDC],[_YTDQ],0)-DIVIDE([_PYDC],[_PYQ],0),DIVIDE([_PYDC],[_PYQ],0),0),0))*[_YTDQ]
        ),
		[_PriceImpact]
        )

 

 

 

View solution in original post

Helpful resources

Announcements
2022 Release Wave 1 760x460.png

2022 Release Wave 1 Plan

Power Platform release plan for the 2022 release wave 1 describes all new features releasing from April 2022 through September 2022.

Power BI December 2021 Update_carousel 768x460.jpg

Check it Out!

Click here to read more about the December 2021 Updates!

Jan 2022 Dev Camp 768x460 copy.png

Power BI Dev Camp- January 27th, 2022

Mark your calendars and join us for our next Power BI Dev Camp!

Top Solution Authors