Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
LasseL
Helper I
Helper I

Compute effective approach; sumx, summarize, sum - case of transactional sales data and exch. rates

Dear community,

 

Main issue; finding the most compute effective approach.

 

Current Datamodel;

LasseL_0-1603097456015.png

 

Background info;

  • Sales Revenue and Cost is transactional data, each line representing information about actual sales at a particular date.
  • Sales Revenue lines are in different currencies.
  • Table has approx. 11 million lines.
  • Exchange Rates contains information about Currency Exchange Rates from and to currency at particular dates. Date is a "valid from date", and is not consistent in terms of an exchange rate for all dates in Sales Revenue and Cost (most commonly only last date of each month).

 

Task;

  1. Establish a consolidated report in a consolidate currency by choice of the user.
  2. Use latest valid currency exchange rate "before" Sales Revenue and Cost" Date.

 

My first approach;

  • Solution; Using Power Query with Table.SelectRows to add columns to Sales Revenue... table with selected reporting currencies and calculated amount based on that. I.e. perform the transformation at the time of loading the data.

 

    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Table.Max(Table.SelectRows(Rates, (Rate) => Rate[Date]<=[Date] and Rate[Currency]=[Currency]),"Date")
),
    #"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom", "Custom", {"Rate"}, {"Rate"})​

 

  • Problem; with a clean load Power BI services could load around 100k lines per second, with above solution it was reduced to only 10 lines per second and with 11 million lines it would take days to finish.

Second approach;

  • Solution; Using DAX for the transformation via pure SUMX(Sales Revenue...) with calculate conditions for the conversion

 

Sales Revenue = 
SUMX(
    'Sales Revenue and Cost',
    'Sales Revenue and Cost'[Line Amount] * 
    Calculate(
        min('Exchange Rates'[Rate]),
        filter('Exchange Rates', 
        'Exchange Rates'[From Currency]='Sales Revenue and Cost'[Company Currency] && 
        'Exchange Rates'[Date]<='Sales Revenue and Cost'[Date] &&
        'Exchange Rates'[To Currency]=SELECTEDVALUE('Reporting Currency'[Currency])
        )   
    )
)​

 

  • Problem; Solutions provides the perfect answer and best feature of being dynamically able to select "another" reporting currency, however, publishing this to services it is evident that it is very ineffective, slow and at times power bi services post an error stating too much memory consumption - i.e. not a "production able" solution.

Third approach, and where I am stuck:

  • Solution; I've read that SUMX is not to be used like I did cause it does a calculation for every single row every time, and that people have come around it by using SUMX VALUES, summarize or a combination. I am at the point of trying out SUMX VALUES but can't get it to play out correct and am wondering if it is the right solution at all, 2 measures combined to "Sales Revenue";
  • Sales Revenue = 
    SUMX(
        VALUES('Sales Revenue and Cost'[Currency]),
        [Sales Revenue LCY] * [Sales Revenue Reporting Currency Rate]
    )
    
    Sales Revenue LCY = SUM('Sales Revenue and Cost'[Line Amount])
    
    Sales Revenue Reporting Currency Rate = 
    Calculate(
            min('Exchange Rates'[Rate]),
            filter('Exchange Rates', 
            'Exchange Rates'[From Currency]=MAX('Sales Revenue and Cost'[Currency]) && 
            'Exchange Rates'[Date]<=MAX('Sales Revenue and Cost'[Date]) &&
            'Exchange Rates'[To Currency]=SELECTEDVALUE('Reporting Currency'[Currency])
            )   
        )
  • Also I am concerned if the method secures that I am using the correct currency exchange rate "before" the Sales Revenue posting date - as a minimum I would like that a monthly level of summarized amount it is translated by the exchange rate "before".

 

I am stuck, quite beginners level and have spent so many hours to the point of much frustration - any hint of the right direction would be greatly appreciated!

 

Best regards,

Lasse

11 REPLIES 11
LasseL
Helper I
Helper I

I continued a bit on my own.

 

Current trial;

Sales Revenue = 
VAR ExchangeRate =  Calculate(
                        min('Exchange Rates'[Rate]),
                        filter('Exchange Rates', 
                        'Exchange Rates'[From Currency]=MIN('Sales Revenue and Cost'[Currency]) && 
                        'Exchange Rates'[Date]<=MAX('Sales Revenue and Cost'[Date]) &&
                        'Exchange Rates'[To Currency]=SELECTEDVALUE('Reporting Currency'[Currency])
                        )   
                    )

RETURN

SUMX(
    SUMMARIZE('Sales Revenue and Cost','Sales Revenue and Cost'[Currency]),
    [Sales Revenue LCY] * ExchangeRate
)

With [Sales Revenue LCY] being;

Sales Revenue LCY = SUM('Sales Revenue and Cost'[Line Amount])

 

It seems to perform better compute and memorize-wise, and a visual with Year-Month calculates correct Reporting Currency Amount but totals the columns incorrectly with total sum of LCY times "Last Exchange Rate"...

And I still have my doubts if this is best practice at all 😞

 

Hi @LasseL ,

Based on your description, the current trial will have a better performance beacuse it uses variable and summazire a table in the sumx() function.

 "but totals the columns incorrectly with total sum of LCY times "Last Exchange Rate""

What is your expected output? Could you please consider sharing more details like sample file, screenshots etc. about this issue for further discussion? 

Sample data and expected output would help tremendously.
Please see this post regarding How to Get Your Question Answered Quickly:
https://community.powerbi.com/t5/Community-Blog/How-to-Get-Your-Question-Answered-Quickly/ba-p/38490

 

Best Regards,

Yingjie Li

 

Dear Yingjie,

 

Thank you so much for getting back.

 

For confidential reasons I can't share the actual PBIX and data that I am working in, but I tried to reproduce a demo dataset (excel) and a PBIX fil with similar setup.

 

Attached;

What I would expect is the totals of Sales to be correct with latest exchange rate when choosing either DKK or EUR as a reporting currency;

Screenshot 2020-10-22 134815.png

 

Does this help?

Hi @LasseL ,

Seems like I could not download the sample due to the access. Perhaps you can modify the visit access and consider re-sharing the sample file.

 

Best Regards,

Yingjie Li

Hi again Yingjie,

 

My sincere appologies, I will try another means; https://we.tl/t-264shFE8H6

 

Does this work?

 

Best regards,

Lasse

Hi @LasseL ,

So your expected result is like this?

cy.png

If so, try to modify like this:

Sales Revenue =
VAR ExchangeRate =
    CALCULATE (
        MIN ( Rates[Rate] ),
        FILTER (
            Rates,
            Rates[From Currency] = MIN ( Orders[Currency] )
                && Rates[Date] <= MAX ( Orders[Date] )
                && Rates[To Currency] IN DISTINCT ( 'ReportingCur'[Currency] )
        )
    )
RETURN
    SUMX (
        SUMMARIZE ( Orders, Orders[Currency] ),
        [Sales Revenue LCY] * ExchangeRate
    )

 

Best Regards,
Yingjie Li

If this post helps then please consider Accept it as the solution to help the other members find it more quickly.

Wow, what a quick response, and what a small correct with big impact!

 

Thank you so much, helped a lot.

 

I still see some issues though;

 

1) It seems it is not consistent in picking the last valid exchange rate from "before" the order date, sometimes yes, a few times it jumps?

 

LasseL_1-1603435174133.png

 

2) and more important, if I change reporting currency from EUR to DKK i get some strange summarizations;

 

LasseL_2-1603435399082.png

 

Any ideas on why, what did the IN DISTINCT exactly do 🙂

Hi @LasseL ,

I find that you have disabled the interactions between the slicer and the two tables under the sample file so the value looks like 'strange', just resume the interaction and the value shoule be normal.

interaction.png

About IN statement, it creates a logical OR condition between each row being compared to a table and the Distinct statement removes dupulicate values and only return unique values.

You can refer these articles if you are interested about them:

  1. The IN operator in DAX 
  2. Using “IN” Operator in DAX 
  3. DISTINCT (column) 

 

Attached the modified sample file in the below, hopes to help you.

 

Best Regards,
Yingjie Li

If this post helps then please consider Accept it as the solution to help the other members find it more quickly.

Thanks a lot for filter-correction, the explanation and definition!

 

How come the sum of sales value does not reflect the actual sum of each line, in this case DKK sum should be ~14+311 = ~325DKK

 

LasseL_0-1603452603233.png

 

Hi @LasseL ,

I re-write the measure like this:

Sales Revenue = 
VAR tab =
    SUMMARIZE (
        'Orders',
        'Calendar'[Date],
        'Orders'[Currency],
        'Orders'[Price],
        "REp",
            CALCULATE (
                MIN ( Rates[Rate] ),
                FILTER (
                    Rates,
                    Rates[From Currency] IN DISTINCT ( Rates[From Currency] )
                        && 'Rates'[To Currency] IN DISTINCT ( ReportingCur[Currency] )
                )
            )
    )
RETURN
    SUMX ( tab, [REp] * [Price] )

 Now the sum value should be correct:

sum.png

See the sample file in the below, hopes to help you.

 

Best Regards,
Yingjie Li

If this post helps then please consider Accept it as the solution to help the other members find it more quickly.

Hi again @v-yingjl ,

 

On the positive side, yes, now it gives an (almost) perfect calculation (missing the original correct look up of the exchange rate  closest (before) to the transaction date - but I can fix that.

 

On the negative side, then I believe we are back to the poor performance of the original solution of a line by line calculation (SUMX) which was the approach that proved ineffective from the beginning when working on the big dataset, i.e. original measure was;

Sales Revenue = 
SUMX(
    'Sales Revenue and Cost',
    'Sales Revenue and Cost'[Line Amount] * 
    Calculate(
        min('Exchange Rates'[Rate]),
        filter('Exchange Rates', 
        'Exchange Rates'[From Currency]='Sales Revenue and Cost'[Company Currency] && 
        'Exchange Rates'[Date]<='Sales Revenue and Cost'[Date] &&
        'Exchange Rates'[To Currency]=SELECTEDVALUE('Reporting Currency'[Currency])
        )   
    )
)​

 

I just tested your measure against the big dataset, and the performance is dreadfully slow and I hit again memory errors from the MS datacenter.

 

However, I feel you are on to "something", is there a way we can get around the line by line calculation for each transaction and aggregate to a higher level, e.g. grouped by Calendar[Year-Month] and Orders[Currency]; thinking next evolution of your last measure to something like;

Sales Revenue = 
VAR tab =
    SUMMARIZE (
        'Orders',
        'Calendar'[Year-Month],
        'Orders'[Currency],
        'Orders'[Price],
        "REp",
            CALCULATE (
                MIN ( Rates[Rate] ),
                FILTER (
                    Rates,
                    Rates[From Currency] IN DISTINCT ( Rates[From Currency] )
                        && 'Rates'[To Currency] IN DISTINCT ( ReportingCur[Currency] )
                        && 'Rates'[Date] <= MAX(Orders[Date])
                        && MAX('Rates'[Date])
                )
            )
    )
RETURN
    SUMX ( tab, [REp] * [Price] )

 

Still, can't hit the correct exchange rate and sum of converted price is still not right - but at least it seems to run faster as the sumx is performed on a higher aggregated level?

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.