Hello!
I have a Pricing analysis report where I need to calculate the impact of pricing changes. The DAX formulas work and produce the desired result; however, performance is really bad.
My Fact table is a simple table with orders at the product level. I have two years' worth of data.
The idea is to calculate the average unit price of this year and last year at an aggregated level (Region, MaterialId, Market, etc.).
I have to do it at this level, otherwise the calculation doesn't produce the correct values at subtotal levels. This measure runs in about 2-3 seconds in a simple visual. When plotted against Month, the visual takes about 10-15 seconds to load. I can provide Server Timings data if needed. The question is very similar to this one: Solved: Optimize SUMX / SUMMARIZE Measure - Microsoft Power BI Community; however, I have tried the suggestions in that post and they haven't helped.
I have also tried ADDCOLUMNS(SUMMARIZE()), SUMMARIZE() with variables, and, most recently, the GROUPBY() version below. I don't understand context transition, so I don't know whether it's even possible to optimize this. The Fact table has 1.4 million rows, and the biggest Dim table is the Material one with 760,000 rows.
The model is in SSAS, and I am connecting from Power BI with DirectQuery.
Below is the price impact measure:
PriceImpactLC :=
VAR maxDate = MAX ( Dim_Date[FiscalDate] )
VAR maxYear = YEAR ( maxDate )
VAR _datesytd =
    FILTER ( ALL ( Dim_Date ), YEAR ( Dim_Date[FiscalDate] ) = maxYear && Dim_Date[FiscalDate] <= maxDate )
VAR _py =
    FILTER ( ALL ( Dim_Date ), YEAR ( Dim_Date[FiscalDate] ) = maxYear - 1 )
RETURN
    SUMX (
        -- group the fact rows up to the selected date at the required granularity
        GROUPBY (
            CALCULATETABLE ( Fact_Pricing, FILTER ( ALL ( Dim_Date ), Dim_Date[FiscalDate] <= maxDate ) ),
            Fact_Pricing[Matched],
            Dim_Company[RegionTxt],
            Fact_Pricing[CorporateMarket],
            Fact_Pricing[CompanySurrId],
            Fact_Pricing[MaterialSurrId],
            Fact_Pricing[DocumentCurrency],
            Fact_Pricing[LocalCurrency],
            Dim_ExchangeRate[BIRate]
        ),
        -- per group: YTD quantity and unit prices (context transition via CALCULATE)
        VAR _YTDBillingQuantity = CALCULATE ( SUM ( Fact_Pricing[SalesQuantity] ), _datesytd )
        VAR _NetBillingUnitPriceLC = DIVIDE ( CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueLC] ), _datesytd ), _YTDBillingQuantity, 0 )
        VAR _NetBillingUnitPriceDC = DIVIDE ( CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueDC] ), _datesytd ), _YTDBillingQuantity, 0 )
        -- prior-year unit price in document currency
        VAR _PYNetBillingUnitPriceDC = DIVIDE ( CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueDC] ), _py ), CALCULATE ( SUM ( Fact_Pricing[SalesQuantity] ), _py ), 0 )
        RETURN
            ( _NetBillingUnitPriceLC
                - DIVIDE ( _NetBillingUnitPriceLC, 1 + DIVIDE ( _NetBillingUnitPriceDC - _PYNetBillingUnitPriceDC, _PYNetBillingUnitPriceDC, 0 ) ) )
                * _YTDBillingQuantity
    )
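To spell out the intent, for each group the inner RETURN works out to:

PriceImpact = ( _NetBillingUnitPriceLC - _NetBillingUnitPriceLC / ( 1 + ( _NetBillingUnitPriceDC - _PYNetBillingUnitPriceDC ) / _PYNetBillingUnitPriceDC ) ) * _YTDBillingQuantity

i.e. the YTD quantity multiplied by the part of the local-currency unit price that corresponds to the change in the document-currency unit price versus last year.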
I would really appreciate any help, I'm hoping I'm missing something obvious, or I need to change the approach completely.
Fact table extract (I removed a bunch of surrogate IDs):
| PostingDateId | SalesOrderNumber | SalesOrderItemNumber | SalesQuantity | LocalCurrency | SalesBaseLC | DocumentCurrency | NetBillingRevenueLC | NetBillingRevenueDC | NetBillingRevenueUSD | CorporateMarket |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 20200121 | 3439732 | 70 | 3 | CAD | 1,655.10 | CAD | 1,655.10 | 1655.1 | 1655.1 | MarketExample1 |
| 20200121 | 3439732 | 70 | 1 | CAD | 551.7 | CAD | 551.7 | 551.7 | 551.7 | MarketExample1 |
| 20200127 | 3475186 | 30 | 1 | USD | 47.01 | USD | 47.01 | 47.01 | 47.01 | MarketExample1 |
| 20200128 | 3489573 | 30 | 2 | INR | 6,000.00 | INR | 6,000.00 | 6000 | 6000 | MarketExample1 |
| 20200128 | 3445313 | 60 | 4 | CAD | 2,206.80 | CAD | 2,206.80 | 2206.8 | 2206.8 | MarketExample1 |
| 20200128 | 3463018 | 30 | 32 | USD | 14,105.92 | USD | 14,105.92 | 14105.92 | 14105.92 | MarketExample1 |
| 20200131 | 3495473 | 10 | 8 | USD | 34.4 | USD | 34.4 | 34.4 | 34.4 | MarketExample1 |
| 20200220 | 3463601 | 50 | 3 | EUR | 99.36 | EUR | 99.36 | 99.36 | 99.36 | MarketExample1 |
| 20200304 | 3523470 | 110 | 1 | EUR | 258.32 | EUR | 257.35 | 257.35 | 257.35 | MarketExample1 |
Hi, @Anonymous
The expression after RETURN in the code seems quite long; it is recommended to split it into multiple variables for encapsulation.
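For example (untested, just to illustrate the idea; the _PriceChangePct and _DeflatedUnitPriceLC names are placeholders), the row expression inside SUMX could be split into named steps like this:

-- inside SUMX, replacing the long RETURN expression (illustrative sketch only)
VAR _YTDBillingQuantity = CALCULATE ( SUM ( Fact_Pricing[SalesQuantity] ), _datesytd )
VAR _NetBillingUnitPriceLC = DIVIDE ( CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueLC] ), _datesytd ), _YTDBillingQuantity, 0 )
VAR _NetBillingUnitPriceDC = DIVIDE ( CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueDC] ), _datesytd ), _YTDBillingQuantity, 0 )
VAR _PYNetBillingUnitPriceDC = DIVIDE ( CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueDC] ), _py ), CALCULATE ( SUM ( Fact_Pricing[SalesQuantity] ), _py ), 0 )
VAR _PriceChangePct = DIVIDE ( _NetBillingUnitPriceDC - _PYNetBillingUnitPriceDC, _PYNetBillingUnitPriceDC, 0 )
VAR _DeflatedUnitPriceLC = DIVIDE ( _NetBillingUnitPriceLC, 1 + _PriceChangePct )
RETURN
    ( _NetBillingUnitPriceLC - _DeflatedUnitPriceLC ) * _YTDBillingQuantity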
I don't know much about performance optimization. You can check if these posts could help:
GROUPBY vs SUMMARIZE in #dax #powerbi #powerpivot
https://docs.microsoft.com/en-us/dax/groupby-function-dax
Best Regards,
Community Support Team _ Eason
Thank you for your input. I have optimised it to the version below, which took it from 11-15 seconds down to 6-7 seconds. I will try to factor out DIVIDE([_YTDLC],[_YTDQ],0) and DIVIDE([_PYDC],[_PYQ],0) into separate columns, as each appears twice in the final calculation (a rough sketch of that refactor is at the end of this post), but I was hoping my entire approach was wrong and I could get a major performance increase. When checking the Server Timings output, I see a lot of rows (50k) materialised multiple times (5 times).
PriceImpactLC :=
VAR maxDate = MAX ( Dim_Date[FiscalDate] )
VAR maxYear = YEAR ( maxDate )
VAR _datesytd =
    DATESYTD ( Dim_Date[FiscalDate] )
VAR _py =
    PARALLELPERIOD ( Dim_Date[FiscalDate], -1, YEAR )
RETURN
    SUMX (
        ADDCOLUMNS (
            ADDCOLUMNS (
                -- group rows up to the selected date at the required granularity
                CALCULATETABLE (
                    SUMMARIZE (
                        Fact_Pricing,
                        Fact_Pricing[Matched],
                        Dim_Company[RegionTxt],
                        Fact_Pricing[CorporateMarket],
                        Fact_Pricing[CompanySurrId],
                        Fact_Pricing[MaterialSurrId],
                        Fact_Pricing[DocumentCurrency],
                        Fact_Pricing[LocalCurrency]
                    ),
                    FILTER ( ALL ( Dim_Date ), Dim_Date[FiscalDate] <= maxDate )
                ),
                -- YTD and prior-year aggregates per group
                "_YTDQ", CALCULATE ( SUM ( Fact_Pricing[SalesQuantity] ), _datesytd ),
                "_PYQ", CALCULATE ( SUM ( Fact_Pricing[SalesQuantity] ), _py ),
                "_YTDLC", CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueLC] ), _datesytd ),
                "_YTDDC", CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueDC] ), _datesytd ),
                "_PYDC", CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueDC] ), _py )
            ),
            -- price impact per group
            "_PriceImpact",
                ( DIVIDE ( [_YTDLC], [_YTDQ], 0 )
                    - DIVIDE (
                        DIVIDE ( [_YTDLC], [_YTDQ], 0 ),
                        1 + DIVIDE ( DIVIDE ( [_YTDDC], [_YTDQ], 0 ) - DIVIDE ( [_PYDC], [_PYQ], 0 ), DIVIDE ( [_PYDC], [_PYQ], 0 ), 0 ),
                        0
                    ) )
                * [_YTDQ]
        ),
        [_PriceImpact]
    )
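For completeness, the refactor I have in mind looks roughly like this (untested; the _YTDUnitLC / _YTDUnitDC / _PYUnitDC column names are just placeholders), so each unit price is computed once per group and the final expression only references the pre-computed columns:

PriceImpactLC :=
VAR maxDate = MAX ( Dim_Date[FiscalDate] )
VAR _datesytd =
    DATESYTD ( Dim_Date[FiscalDate] )
VAR _py =
    PARALLELPERIOD ( Dim_Date[FiscalDate], -1, YEAR )
RETURN
    SUMX (
        ADDCOLUMNS (
            ADDCOLUMNS (
                ADDCOLUMNS (
                    CALCULATETABLE (
                        SUMMARIZE (
                            Fact_Pricing,
                            Fact_Pricing[Matched],
                            Dim_Company[RegionTxt],
                            Fact_Pricing[CorporateMarket],
                            Fact_Pricing[CompanySurrId],
                            Fact_Pricing[MaterialSurrId],
                            Fact_Pricing[DocumentCurrency],
                            Fact_Pricing[LocalCurrency]
                        ),
                        FILTER ( ALL ( Dim_Date ), Dim_Date[FiscalDate] <= maxDate )
                    ),
                    "_YTDQ", CALCULATE ( SUM ( Fact_Pricing[SalesQuantity] ), _datesytd ),
                    "_PYQ", CALCULATE ( SUM ( Fact_Pricing[SalesQuantity] ), _py ),
                    "_YTDLC", CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueLC] ), _datesytd ),
                    "_YTDDC", CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueDC] ), _datesytd ),
                    "_PYDC", CALCULATE ( SUM ( Fact_Pricing[NetBillingRevenueDC] ), _py )
                ),
                -- unit prices computed once per group (placeholder column names)
                "_YTDUnitLC", DIVIDE ( [_YTDLC], [_YTDQ], 0 ),
                "_YTDUnitDC", DIVIDE ( [_YTDDC], [_YTDQ], 0 ),
                "_PYUnitDC", DIVIDE ( [_PYDC], [_PYQ], 0 )
            ),
            -- price impact per group, now referencing the pre-computed unit prices
            "_PriceImpact",
                ( [_YTDUnitLC]
                    - DIVIDE ( [_YTDUnitLC], 1 + DIVIDE ( [_YTDUnitDC] - [_PYUnitDC], [_PYUnitDC], 0 ), 0 ) )
                * [_YTDQ]
        ),
        [_PriceImpact]
    )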