Solved: Relationship Performance: Mapping based on String ...

LeoST · ‎05-03-2022

Hi all,

we've a big dataset (10GB) using the star schema and are trying to optimize the performance as much as we can.

Right now the tables are all mapped based on a key that is a string and a combination of various different fields.

Now the question is: would the report perform any better if we convert that relativly long string into a number? We're getting right now some out of memory errors when we have a visual with fields from different tables of teh dataset.

Thanks!

SpartaBI · ‎05-04-2022

@LeoST it's not relevant for the relationship itself becasue the columns in the model are anyway stored with indexes that replace the original values in the column (hash encoding) and the relationship is based on those indexes. The original columns values are stored in dictionary tables that maps to the indexes. There are also more techniqes the vertipac engine use to compress the data like run lengh encoding and value encoding. The latter is not relevant for string columns.
However, your situation applies that this big string key column also exist in both the dim and fact, which means this is a very heavy column that takes a lot of memory. It will anyway better to replace this column with a number to reduce the size of the model, which will of course help in dodging memory limit sceanrios and could help also the performance in general.
To coclude, there are probably other major reasons you have performanc issues like how you created your model and measures. It's a journey to find them all, and you of course need to start fixing the ones that cost the major impact and move to the others. This is a discipline of its own.

View solution in original post

SpartaBI · ‎05-04-2022

@LeoST it's not relevant for the relationship itself becasue the columns in the model are anyway stored with indexes that replace the original values in the column (hash encoding) and the relationship is based on those indexes. The original columns values are stored in dictionary tables that maps to the indexes. There are also more techniqes the vertipac engine use to compress the data like run lengh encoding and value encoding. The latter is not relevant for string columns.
However, your situation applies that this big string key column also exist in both the dim and fact, which means this is a very heavy column that takes a lot of memory. It will anyway better to replace this column with a number to reduce the size of the model, which will of course help in dodging memory limit sceanrios and could help also the performance in general.
To coclude, there are probably other major reasons you have performanc issues like how you created your model and measures. It's a journey to find them all, and you of course need to start fixing the ones that cost the major impact and move to the others. This is a discipline of its own.

Relationship Performance: Mapping based on String or Number?

Helpful resources

New forum boards available in Real-Time Intelligence.

Power BI Monthly Update - May 2024

Jumpstart your career with the Fabric Career Hub

Relationship Performance: Mapping based on String or Number?

Helpful resources

New forum boards available in Real-Time Intelligence.

Power BI Monthly Update - May 2024