Otavio_Hurtado
Frequent Visitor

Optimize Query with data calculations

I have a database of sales data grouped by location, month, product_id, and sales_segment.
I want to compare against the previous year's sales; in SQL I would normally use a subquery to populate new columns on my original table with the prior-year values.
Since I couldn't get my query to look values up in itself the way a SUMIFS would in Excel, I took the same approach I would in SQL: I duplicated my database and used Table.SelectRows to search the duplicate for the previous year's sales data, like this:
PS: PYDATE is a column holding the previous year's date to filter on. For example, if the current month is "SEP22", PYDATE contains "SEP21" for that row.

// For every row, scan DuplicateDatabase for the matching prior-year
// rows and sum their Sales (Q).
Table.AddColumn(
    #"Renamed Columns1",
    "Sales (Q) PY",
    each List.Sum(
        Table.SelectRows(
            DuplicateDatabase,
            (NestedTable) =>
                NestedTable[Country] = [Country] and
                NestedTable[DATE] = [PYDATE] and
                NestedTable[Sales_Segment] = [Sales_Segment] and
                NestedTable[ProductID] = [ProductID] and
                NestedTable[Location] = [Location]
        )[#"Sales (Q)"]
    )
)

The problem is that, while my solution works, it takes forever to run because I have a large dataset.
I imagine running this as a measure in Power BI would be even slower than my current approach.
Is there a more efficient way to calculate this data? (One merge-based idea is sketched after the dataset example below.)

DATASET EXAMPLE:

COUNTRY | DATE | SALES_SEGMENT | PRODUCT_ID | LOCATION | SALES(QTY) | PYDATE | PY_SALES(QTY)
AB | OCT22 | D | A1 | ABC | 100 | OCT21 | ?
AB | OCT21 | D | A1 | ABC | 50 | OCT20 | ?
CD | SEP22 | O | B2 | DEF | 90 | SEP21 | ?
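
One way to cut the runtime inside Power Query itself is to replace the per-row Table.SelectRows scan with a single self-merge on the prior-year keys. This is only a rough sketch: it reuses the step and table names from the query above (#"Renamed Columns1", DuplicateDatabase) and assumes the column names used in that code, not necessarily the exact schema.

// Join to the duplicate table once, matching PYDATE against DATE
// on the same five key columns, instead of scanning it per row.
Merged = Table.NestedJoin(
    #"Renamed Columns1",
    {"Country", "PYDATE", "Sales_Segment", "ProductID", "Location"},
    DuplicateDatabase,
    {"Country", "DATE", "Sales_Segment", "ProductID", "Location"},
    "PYRows",
    JoinKind.LeftOuter
),
// Sum the matched rows' sales; List.Sum returns null when no
// prior-year row exists for the key.
#"Added PY" = Table.AddColumn(
    Merged,
    "Sales (Q) PY",
    each List.Sum([PYRows][#"Sales (Q)"]),
    type number
),
#"Removed PYRows" = Table.RemoveColumns(#"Added PY", {"PYRows"})

The merge is evaluated once for the whole table rather than once per row, which is usually far cheaper on a large dataset.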



3 REPLIES
Daryl-Lynch-Bzy
Resident Rockstar

Hi @Otavio_Hurtado - 500 million rows will be okay as long as the cardinality of the foreign-key dimension columns is small, so don't load too many unique values! In the example above, if you have a row ID for each transaction, remove it and aggregate in SQL before loading to Power BI (adding a count of transaction rows instead).

You may also want to limit the history in the query to reduce the model size, and consider incremental refresh to improve load times.
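
As a minimal sketch of that history limit (the Source step and [Date] column names are illustrative assumptions, not from the thread):

// Keep only the last three years of history, applied as early as
// possible so later steps work on fewer rows.
FilteredHistory = Table.SelectRows(
    Source,
    each [Date] >= Date.AddYears(Date.From(DateTime.LocalNow()), -3)
)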

Daryl-Lynch-Bzy
Resident Rockstar

Hi @Otavio_Hurtado - I think DAX can handle this much better than SQL or Power Query. But how large is your dataset?

To achieve this, you need to create the following table:

COUNTRY | DATE | SALES_SEGMENT | PRODUCT_ID | LOCATION | SALES(QTY)
AB | 01/10/2022 | D | A1 | ABC | 100.00
AB | 01/10/2022 | D | A1 | ABC | 50.00
CD | 01/09/2022 | O | B2 | DEF | 90.00
AB | 01/10/2021 | D | A1 | ABC | ??
AB | 01/10/2021 | D | A1 | ABC | ??
CD | 01/09/2021 | O | B2 | DEF | ??

The trick is to replace the month with a start-of-month date. That lets you add a Calendar table and use the time-intelligence functions in DAX, which avoids the expensive joins required by SQL and Power Query.
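
As a rough sketch of what that looks like in DAX (assuming a Calendar table marked as a date table and related to the fact table on the start-of-month date; the Sales table and measure names here are illustrative):

Sales Qty = SUM ( Sales[SALES(QTY)] )

Sales Qty PY =
CALCULATE (
    [Sales Qty],
    SAMEPERIODLASTYEAR ( 'Calendar'[Date] )
)

With the Calendar relationship in place, the prior-year value comes from the same fact rows at query time, so no PYDATE column or self-join is needed.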

About 24 million rows. A colleague also recommended that approach, so I'll try it.
