I have a dataset of sales data grouped by location, month, product_id and sales_segment.
I want to compare each row against the previous year's sales. In SQL I would normally do this with a subquery that joins the table to itself and populates new columns with the prior year's values. Since I couldn't get my query to look itself up the way a SUMIFS does in Excel, I took the same approach I would in SQL: I duplicated my table and used Table.SelectRows to search the duplicate for the previous year's sales, like this:
PS: PYDATE is a column holding the previous year's date to filter on; for example, if the current month is "SEP22", PYDATE for that row contains "SEP21".
Table.AddColumn(#"Renamed Columns1", "Sales (Q) PY", each List.Sum(
    Table.SelectRows(
        DuplicateDatabase,
        (NestedTable) =>
            NestedTable[Country] = [Country] and
            NestedTable[DATE] = [PYDATE] and
            NestedTable[Sales_Segment] = [Sales_Segment] and
            NestedTable[ProductID] = [ProductID] and
            NestedTable[Location] = [Location]
    )[#"Sales (Q)"]))
The problem is that this solution works, but it takes forever to run because the dataset is large.
I imagine running it as a measure in Power BI would be even slower than my current approach.
Is there a more efficient way to calculate this?
DATASET EXAMPLE:
COUNTRY | DATE | SALES_SEGMENT | PRODUCT_ID | LOCATION | SALES(QTY) | PYDATE | PY_SALES(QTY)
AB | OCT22 | D | A1 | ABC | 100 | OCT21 | ?
AB | OCT21 | D | A1 | ABC | 50 | OCT20 | ?
CD | SEP22 | O | B2 | DEF | 90 | SEP21 | ?
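One alternative that usually scales much better in Power Query than a per-row Table.SelectRows scan is a single self-merge: join the table to itself on the key columns, matching the current row's PYDATE to the prior row's DATE, then aggregate the matched sales. A minimal sketch, assuming the step follows #"Renamed Columns1" and uses the column names from the query above (the step names are illustrative):

```m
// Sketch: self-merge on the grouping keys, matching PYDATE to DATE.
// Step and column names are assumptions based on the query above.
let
    Source = #"Renamed Columns1",
    // Left-outer join the table to itself; each current row finds its
    // prior-year counterpart(s) in one join instead of a row-by-row scan.
    Merged = Table.NestedJoin(
        Source,
        {"Country", "PYDATE", "Sales_Segment", "ProductID", "Location"},
        Source,
        {"Country", "DATE", "Sales_Segment", "ProductID", "Location"},
        "PriorYear",
        JoinKind.LeftOuter
    ),
    // Sum the matched rows' Sales (Q), mirroring the original List.Sum.
    Result = Table.AggregateTableColumn(
        Merged,
        "PriorYear",
        {{"Sales (Q)", List.Sum, "Sales (Q) PY"}}
    )
in
    Result
```

If the source is a database, an equality merge like this may also fold back into a SQL join; for in-memory sources, wrapping Source in Table.Buffer before the merge can help.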
Hi @Otavio_Hurtado - 500 million rows will be okay as long as the cardinality of the foreign-key dimension columns is small, so don't load too many unique values! In the example above, if you have a Row ID for each transaction, remove it and aggregate in SQL before loading to Power BI (adding a count of transaction rows).
You may also want to limit the history in the query to reduce size, and consider incremental refresh to improve load times.
Hi @Otavio_Hurtado - I think DAX can handle this much better than SQL or Power Query. But how large is your dataset?
To achieve this you need to create the following table:
COUNTRY | DATE | SALES_SEGMENT | PRODUCT_ID | LOCATION | SALES(QTY) |
AB | 01/10/2022 | D | A1 | ABC | 100.00 |
AB | 01/10/2022 | D | A1 | ABC | 50.00 |
CD | 01/09/2022 | O | B2 | DEF | 90.00 |
AB | 01/10/2021 | D | A1 | ABC | ?? |
AB | 01/10/2021 | D | A1 | ABC | ?? |
CD | 01/09/2021 | O | B2 | DEF | ?? |
The trick is to replace the month label with a start-of-month date. That lets you add a Calendar table and use the time-intelligence functions in DAX, avoiding the expensive joins that SQL and Power Query require.
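With a real date column and a related Calendar table marked as a date table, the prior-year figure becomes a short measure. A sketch, assuming the fact table is named Sales and the calendar's date column is 'Calendar'[Date] (both names are assumptions):

```dax
PY Sales (Qty) :=
CALCULATE (
    SUM ( Sales[SALES(QTY)] ),
    SAMEPERIODLASTYEAR ( 'Calendar'[Date] )
)
```

Note that SAMEPERIODLASTYEAR requires a contiguous calendar; DATEADD ( 'Calendar'[Date], -1, YEAR ) is an equivalent alternative.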
About 24 million rows. A colleague also recommended that approach; I'll try it.