Hello guys!
I've recently entered the big data BI world and have a very specific question.
This is my model:
I have a fact table with the last 4 years (2020-2023) of data (2 billion rows) and an aggregated table with the last 2 years (2022-2023) of data (20M rows). So the aggregated table has fewer years of data than the fact table.
When I use the date slicer and pick a date prior to 2022, I get a blank visual because the measure always hits the AGG table (which does not have data for this time period) and not the fact table as expected. Have you guys faced a similar issue? Is there a workaround?
Thank you in advance.
Should I then have hot and cold logic?
Fact_hot_table: last 2 years of data
AGG_hot_table: last 2 years of data
Fact_cold_table: all the data prior to the last 2 years
Afterwards, I would need to change the measures according to this logic. So if I pick a date prior to 2022 in the slicer, for instance, the measure will be calculated on the COLD table.
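To make the proposed routing concrete, here is a minimal sketch in Python (not DAX, and not something Power BI does for you). The table names come from the post above; the `HOT_START` boundary and the `tables_for_range` helper are assumptions made up for illustration.

```python
from datetime import date

# Assumed boundary: anything before this date lives only in the cold table.
HOT_START = date(2022, 1, 1)


def tables_for_range(start, end):
    """List the tables a measure would have to read for the selected dates."""
    tables = []
    if start < HOT_START:
        # Part (or all) of the selection is older than the hot window.
        tables.append("Fact_cold_table")
    if end >= HOT_START:
        # Recent dates can be served by the aggregate; Fact_hot_table
        # would only be needed for columns the aggregate does not carry.
        tables.append("AGG_hot_table")
    return tables
```

A selection that spans the boundary returns both tables, so a measure built on this idea would have to add the cold result and the hot result together.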
You can do that, but it defeats the purpose of having an aggregate table. The aggregate table is there to make DAX measures faster to aggregate (SUM, COUNT, CALCULATE(), etc.). Rather than being evaluated on a table with millions or billions of rows, the measure is evaluated on a pre-aggregated table, so visuals display faster.
You work out which visuals your users rely on most and build the aggregate to cover those. Visuals that go to the detail table will take longer; the end user knows this and accepts it.
You could just add those years to the aggregate table to help speed those up 🙂
Thank you for your suggestion, but the problem is that the granularity needed for the use cases is very fine. That means that if I import all the data, I max out the 1 GB storage limit of the PBI Pro license.
I am also looking at future states, because the table tends to receive a higher volume of records per week as time goes by. If needed, I will create a perspective where users can see data aggregated by month instead of by day, which would make your suggestion possible.
The engine does not support this scenario. You must have the same data in the aggregate and detail tables. The engine looks at the attributes you group by, not at the volume of data.
Thank you for your answer @3CloudThomas. I'm grouping by the Date field, so I expected the engine to recognize the lack of data in the AGG table and hit the fact table instead. Is that assumption wrong?
Correct, that assumption does not hold. The idea with aggregate data is to group by the lower-cardinality dimensions and leave the high-cardinality attributes (or data that is aggregated less often) to the detail rows. The engine does not check whether a dimension value has no rows in the aggregate and then go to the detail table.
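The behaviour described in this reply can be illustrated with a toy model. This is a rough sketch in Python under simplified assumptions, not the actual engine logic: the sample rows, the `AGG_COLUMNS` set, and the `total_amount` function are all hypothetical. The point it shows is that the choice of table depends only on the columns requested, so a date range the aggregate does not cover comes back blank.

```python
from datetime import date

# Toy data: detail rows cover 2020-2023, the aggregate only 2022-2023.
DETAIL = [
    {"date": date(2020, 5, 1), "customer": "A", "amount": 10},
    {"date": date(2021, 5, 1), "customer": "B", "amount": 20},
    {"date": date(2022, 5, 1), "customer": "A", "amount": 30},
    {"date": date(2023, 5, 1), "customer": "B", "amount": 40},
]
AGG = [
    {"date": date(2022, 5, 1), "amount": 30},
    {"date": date(2023, 5, 1), "amount": 40},
]
AGG_COLUMNS = {"date", "amount"}  # columns the aggregate can answer for


def total_amount(group_by, start, end):
    """Return (table_used, total); total is None (a blank) when no rows match."""
    # The table is chosen from the requested columns alone; whether the
    # aggregate actually holds rows for the date range is never checked.
    use_agg = (set(group_by) | {"amount"}) <= AGG_COLUMNS
    table_name, rows = ("AGG", AGG) if use_agg else ("DETAIL", DETAIL)
    matched = [r["amount"] for r in rows if start <= r["date"] <= end]
    return table_name, (sum(matched) if matched else None)
```

Grouping by `date` for 2021 picks the aggregate and returns a blank, while adding a column the aggregate lacks (here `customer`) sends the same date range to the detail rows.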
Should I then have hot and cold logic?
--> I have never done this, so I can't say if it would work as designed. Aggregate tables were not created for this scenario, sorry.
The idea of the aggregate table is to reduce the size of the data in a table so that it can quickly return the calculated measures that users need to group by and filter.