Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Optimize SUMX with SUMMARIZE

Hello! 

I have a Pricing analysis report where I need to calculate the impact of pricing changes. The dax formulas work and produce the desired result however performance is really bad. 

My Fact table is a simple table with orders at the product level. I have two years worth of data. 

The idea is to calculate the average unit price of this year and last year at an aggregated level (Region, MaterialId, Market etc. )

I have to do it at this level otherwise the calculation doesn't provide the correct values at subtotal levels. This measure runs in about 2-3 seconds in a simple visual. When plotted against Month, the visual takes about 10-15 seconds to load. I can provide Server Timings data if needed. The question is very similar to this one: Solved: Optimize SUMX / SUMMARIZE Measure - Microsoft Power BI Community however I have tried the suggestions in that post and it hasn't helped. 

 

I have also tried with ADDCOLUMNS(SUMMARIZE()), SUMMARIZE() with variables, and the latest below with GROUPBY(). I don't understand context transition so I don't know if it's even possible to optimize this. The Fact table has 1.4 million rows and the biggest Dim table is the Material one with 760,000 rows. 

 

The model is in SSAS and I am connecting from PBI with direct query. 

 

Below is the price impact measure:

 

 

 

PriceImpactLC:=
VAR maxDate = MAX( Dim_Date[FiscalDate] )
VAR maxYear = YEAR( maxDate )
VAR _datesytd =
    FILTER(ALL(Dim_Date),YEAR(Dim_Date[FiscalDate])=maxYear && Dim_Date[FiscalDate]<=maxDate)
VAR _py =
    FILTER(ALL(Dim_Date), YEAR(Dim_Date[FiscalDate])=maxYear-1)

RETURN SUMX(
        GROUPBY(
            CALCULATETABLE(Fact_Pricing, FILTER(ALL(Dim_Date), Dim_Date[FiscalDate]<=maxDate)),
            Fact_Pricing[Matched],
		    Dim_Company[RegionTxt],
            Fact_Pricing[CorporateMarket],
            Fact_Pricing[CompanySurrId],
            Fact_Pricing[MaterialSurrId],
            Fact_Pricing[DocumentCurrency],
            Fact_Pricing[LocalCurrency],
            Dim_ExchangeRate[BIRate]
        ),
		VAR _YTDBillingQuantity = CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_datesytd)
		VAR _NetBillingUnitPriceLC = DIVIDE ( CALCULATE( SUM(Fact_Pricing[NetBillingRevenueLC]), _datesytd ), _YTDBillingQuantity, 0 )
		VAR _NetBillingUnitPriceDC = DIVIDE ( CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _datesytd ), _YTDBillingQuantity, 0 )
		VAR _PYNetBillingUnitPriceDC = DIVIDE ( CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _py ), CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_py), 0 )
		RETURN (_NetBillingUnitPriceLC-DIVIDE(_NetBillingUnitPriceLC,(1 + DIVIDE((_NetBillingUnitPriceDC - _PYNetBillingUnitPriceDC),_PYNetBillingUnitPriceDC,0)))) * _YTDBillingQuantity
        )

 

 

 

 I would really appreciate any help, I'm hoping I'm missing something obvious, or I need to change the approach completely. 

Fact Table extract (I removed a bunch of surrogate id's.):

PostingDateIdSalesOrderNumberSalesOrderItemNumberSalesQuantityLocalCurrencySalesBaseLCDocumentCurrencyNetBillingRevenueLCNetBillingRevenueDCNetBillingRevenueUSDCorporateMarket
202001213439732703CAD1,655.10CAD1,655.101655.11655.1MarketExample1
202001213439732701CAD551.7CAD551.7551.7551.7MarketExample1
202001273475186301USD47.01USD47.0147.0147.01MarketExample1
202001283489573302INR6,000.00INR6,000.0060006000MarketExample1
202001283445313604CAD2,206.80CAD2,206.802206.82206.8MarketExample1
2020012834630183032USD14,105.92USD14,105.9214105.9214105.92MarketExample1
202001313495473108USD34.4USD34.434.434.4MarketExample1
202002203463601503EUR99.36EUR99.3699.3699.36MarketExample1
2020030435234701101EUR258.32EUR257.35257.35257.35MarketExample1
1 ACCEPTED SOLUTION
Anonymous
Not applicable

Thank you for your input. I have optimised to the below. This took it from 11-15 seconds to 6-7 seconds. I will try to factor out  DIVIDE([_YTDLC],[_YTDQ],0) and DIVIDE([_PYDC],[_PYQ],0) in separate columns as they happen twice in the final calculation, but I was hoping my entire approach was wrong and I can get some major performance increase. When checking the Server Timings output, I have a lot of rows (50k) materialised multiple times (5 times). 

PriceImpactLC:=
VAR maxDate = MAX( Dim_Date[FiscalDate] )
VAR maxYear = YEAR( maxDate )
VAR _datesytd =
    DATESYTD(Dim_Date[FiscalDate])
VAR _py =
    PARALLELPERIOD(Dim_Date[FiscalDate],-1,YEAR)

RETURN SUMX(
        ADDCOLUMNS(
            ADDCOLUMNS(
                CALCULATETABLE(
                SUMMARIZE(
                    Fact_Pricing,
                    Fact_Pricing[Matched],
		            Dim_Company[RegionTxt],
                    Fact_Pricing[CorporateMarket],
                    Fact_Pricing[CompanySurrId],
                    Fact_Pricing[MaterialSurrId],
                    Fact_Pricing[DocumentCurrency],
                    Fact_Pricing[LocalCurrency]
                ), FILTER(ALL(Dim_Date), Dim_Date[FiscalDate]<=maxDate)),
		    "_YTDQ", CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_datesytd),
            "_PYQ",CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_py),
            "_YTDLC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueLC]), _datesytd),
            "_YTDDC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _datesytd),
            "_PYDC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _py )
            ),
         "_PriceImpact", (DIVIDE([_YTDLC],[_YTDQ],0)-DIVIDE(DIVIDE([_YTDLC],[_YTDQ],0),1+DIVIDE(DIVIDE([_YTDDC],[_YTDQ],0)-DIVIDE([_PYDC],[_PYQ],0),DIVIDE([_PYDC],[_PYQ],0),0),0))*[_YTDQ]
        ),
		[_PriceImpact]
        )

 

 

 

View solution in original post

2 REPLIES 2
v-easonf-msft
Community Support
Community Support

Hi,  @Anonymous 

The content after 'Return' in the code seems too long and it is recommended to use multiple variables for encapsulation.

I don't know much about performance optimization. You can check if these posts could help:

GROUPBY vs SUMMARIZE in #dax #powerbi #powerpivot 

https://docs.microsoft.com/en-us/dax/groupby-function-dax 

 

Best Regards,
Community Support Team _ Eason

Anonymous
Not applicable

Thank you for your input. I have optimised to the below. This took it from 11-15 seconds to 6-7 seconds. I will try to factor out  DIVIDE([_YTDLC],[_YTDQ],0) and DIVIDE([_PYDC],[_PYQ],0) in separate columns as they happen twice in the final calculation, but I was hoping my entire approach was wrong and I can get some major performance increase. When checking the Server Timings output, I have a lot of rows (50k) materialised multiple times (5 times). 

PriceImpactLC:=
VAR maxDate = MAX( Dim_Date[FiscalDate] )
VAR maxYear = YEAR( maxDate )
VAR _datesytd =
    DATESYTD(Dim_Date[FiscalDate])
VAR _py =
    PARALLELPERIOD(Dim_Date[FiscalDate],-1,YEAR)

RETURN SUMX(
        ADDCOLUMNS(
            ADDCOLUMNS(
                CALCULATETABLE(
                SUMMARIZE(
                    Fact_Pricing,
                    Fact_Pricing[Matched],
		            Dim_Company[RegionTxt],
                    Fact_Pricing[CorporateMarket],
                    Fact_Pricing[CompanySurrId],
                    Fact_Pricing[MaterialSurrId],
                    Fact_Pricing[DocumentCurrency],
                    Fact_Pricing[LocalCurrency]
                ), FILTER(ALL(Dim_Date), Dim_Date[FiscalDate]<=maxDate)),
		    "_YTDQ", CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_datesytd),
            "_PYQ",CALCULATE(SUM(Fact_Pricing[SalesQuantity]),_py),
            "_YTDLC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueLC]), _datesytd),
            "_YTDDC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _datesytd),
            "_PYDC", CALCULATE( SUM(Fact_Pricing[NetBillingRevenueDC]), _py )
            ),
         "_PriceImpact", (DIVIDE([_YTDLC],[_YTDQ],0)-DIVIDE(DIVIDE([_YTDLC],[_YTDQ],0),1+DIVIDE(DIVIDE([_YTDDC],[_YTDQ],0)-DIVIDE([_PYDC],[_PYQ],0),DIVIDE([_PYDC],[_PYQ],0),0),0))*[_YTDQ]
        ),
		[_PriceImpact]
        )

 

 

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.