Hello
I have been developing a number of fairly large datasets, with tables typically containing 20M+ rows.
I often need to reconcile data between tables.
For example, I have an item master with columns like owner / source / part number (and others), and a pricing table with owner / source / part (and others).
Two methods are currently working for me:
1- Create a calculated column in DAX to concatenate the three columns into a key. Then make a relationship between the two tables and use the RELATED function.
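A minimal sketch of option 1, with hypothetical table and column names (Pricing, ItemMaster, Owner, Source, Part are stand-ins for the real model):

```dax
// Calculated column on the pricing table; a separator such as "|" avoids
// ambiguous concatenations (e.g. "AB" & "C" vs "A" & "BC")
Key = Pricing[Owner] & "|" & Pricing[Source] & "|" & Pricing[Part]
```

The same Key column is created on the item master, a relationship is made between the two Key columns (assuming the key is unique on the pricing side), and a column such as `RELATED ( Pricing[Price] )` then pulls the price into the item master.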
2- Use LOOKUPVALUE in a calculated column expression.
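Option 2 would look something like this as a calculated column on the item master (again, table and column names are hypothetical):

```dax
Price =
LOOKUPVALUE (
    Pricing[Price],           // result column
    Pricing[Owner],  ItemMaster[Owner],
    Pricing[Source], ItemMaster[Source],
    Pricing[Part],   ItemMaster[Part]
)
```

No relationship or key column is needed, which is why the file stays smaller, but each such column is evaluated row by row during refresh.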
My opinion is that option 1 is better for performance. With many LOOKUPVALUE columns the refresh takes longer and can fail due to memory constraints. I upgraded from 16 GB to 32 GB to 64 GB of RAM and I don't want to go further. On the Service, however, it refreshes fine in under 40 minutes.
The drawback of option 1 is the size of the pbix.
I did some tests. My original pbix is 1 GB; one of the tables contains 23M rows and 40 columns. As soon as I create a key column (owner & source & part), the size goes to 1.5 GB. When I create the second key on the other table, it goes to 1.9 GB.
Loading to the Service is not really the issue, since I am using parameters or ALM Toolkit, but still I don't want to end up with a 10 GB pbix.
The advantage of option 2 is that the pbix size remains quite reasonable, but with a lot of LOOKUPVALUE columns the refresh fails due to memory issues.
On the Service, once loaded, the refresh times are quite similar from what I have seen, which is a little surprising.
I would appreciate some guidance from any of you, as I believe many people must have the same headache.
Thanks in advance
Stéphane
More details needed: are you on shared capacity or Premium? If Premium, which SKU? Gen1 or Gen2?
My opinion is that option 1 is better for performance
I would agree. Beware, though: relationships on high-cardinality columns can be extremely costly, especially in composite data models.
Thanks for your reply
I'm not sure I can answer your question. I was referring to my pbix in Power BI Desktop, which I afterwards publish to the Service.
I am clearly not linking on high-cardinality columns in a composite model, as that doesn't work.
If you do this only on your desktop, then you are limited by the available memory, disk space, and processing power of your PC. Power BI uses the VertiPaq engine, which is a columnar data store with in-column compression. The final size of your PBIX will depend on the compressibility of the source data.