Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
LeoST
Frequent Visitor

Relationship Performance: Mapping based on String or Number?

Hi all,

 

we've a big dataset (10GB) using the star schema and are trying to optimize the performance as much as we can.

Right now the tables are all mapped based on a key that is a string and a combination of various different fields.

Now the question is: would the report perform any better if we convert that relativly long string into a number? We're getting right now some out of memory errors when we have a visual with fields from different tables of teh dataset.

 

Thanks!

1 ACCEPTED SOLUTION
SpartaBI
Community Champion
Community Champion

@LeoST it's not relevant for the relationship itself becasue the columns in the model are anyway stored with indexes that replace the original values in the column (hash encoding) and the relationship is based on those indexes. The original columns values are stored in dictionary tables that maps to the indexes. There are also more techniqes the vertipac engine use to compress the data like run lengh encoding and value encoding. The latter is not relevant for string columns. 
However, your situation applies that this big string key column also exist in both the dim and fact, which means this is a very heavy column that takes a lot of memory. It will anyway better to replace this column with a number to reduce the size of the model, which will of course help in dodging memory limit sceanrios and could help also the performance in general.
To coclude, there are probably other major reasons you have performanc issues like how you created your model and measures. It's a journey to find them all, and you of course need to start fixing the ones that cost the major impact and move to the others. This is a discipline of its own. 

View solution in original post

1 REPLY 1
SpartaBI
Community Champion
Community Champion

@LeoST it's not relevant for the relationship itself becasue the columns in the model are anyway stored with indexes that replace the original values in the column (hash encoding) and the relationship is based on those indexes. The original columns values are stored in dictionary tables that maps to the indexes. There are also more techniqes the vertipac engine use to compress the data like run lengh encoding and value encoding. The latter is not relevant for string columns. 
However, your situation applies that this big string key column also exist in both the dim and fact, which means this is a very heavy column that takes a lot of memory. It will anyway better to replace this column with a number to reduce the size of the model, which will of course help in dodging memory limit sceanrios and could help also the performance in general.
To coclude, there are probably other major reasons you have performanc issues like how you created your model and measures. It's a journey to find them all, and you of course need to start fixing the ones that cost the major impact and move to the others. This is a discipline of its own. 

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel1

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

Top Solution Authors