cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
CristianAv
Resolver II
Resolver II

Replace a Long Text Key Field with an Integer

Hi

 

I'm new to power bi, and I have previous experience with Qlikview.

In Qlikview, there is a function called Autonumber. Basically, what is do is to replace a long key with a number, reducing the amount of memory required to store a long key.

So I think it could be replicated in PowerBi doing something like this:

a) Reference all the tables that has the LongTextKey. (Table1 and Table2)

b) Remove all the other fields that are not the LongTextKey in the referenced tables (Table1Ref and Table2Ref)

c) Append all this tables into one new table: NewKeyTable 

d) Remove duplicated rows in the NewKeyTable

e) Add an index (IndexKey) to this new table NewKeyTable.

f) In Table1 and Table2, do a merge with the LongTextKey

g) In Table1 and Table2, expand the merged table, but only the created field IndexKey in step e).

h) In Table1 and Table2, drop the LongTextKey

k) In the model, use the new indexKey instead of the droped LongTextKey

h) Disable load for all the created tables

 

Is this feasible in PowerBi, does it makes sense?

If it is feasible, I think this should be implemented in PowerBi as a Easy and Quick option, as it is in Qlikview

Also, if feasible this concepts allows to hide sensitive keys to the final user. It is also specially useful when the fact table or some big table don't have a key and it must be constructed based on many others fields (sometimes 8+ text fields)

 

I've done a test with about 100.000 records, and this are the results of the size of fields, that I've got from DaxStudio:

TableFieldTypeSize (kb)
NewKeyTableLongTextKeyDBTYPE_WSTR1026,943359375
NewKeyTableIndexKeyDBTYPE_R80,1171875

 

Thanks!

 

4 REPLIES 4
ImkeF
Super User II
Super User II

Hi @CristianAv ,

Power BI is doing memory optimization automatically in the background (the VertiPaq engine uses dictionary encoding for long fields). 

However, this doesn't cover the problem of sensitive keys. Therefore you'd have to implement this logic in the query editor and make sure that only the surrogate keys are loaded into the data model.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Thanks

As you told, it uses a dictionary encoding internally, but in this dictionary encoding, PowerBi still needs to retain all the text distinct values for the long field.

In this proposed way, PowerBi don't need to store this dictinct values, as we don't care the values, only the relationships between the tables, reducing filesize.

 

Regards!

True, but in the light of all the other improvements to be made, I doubt that this feature will get any priority without many votes for it in the idea section: https://ideas.powerbi.com/forums/265200-power-bi-ideas 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
MBAS on Demand

2021 Release Wave 2 Plan

Power Platform release plan for the 2021 release wave 2 describes all new features releasing from October 2021 through March 2022.

July 2021 Update 768x460.png

Check it out!

Click here to read more about the July 2021 Updates

Power Query PA Forum 768x460.png

Check it out!

Did you know that you can visit the Power Query Forum in Power BI and now Power Apps

Top Solution Authors
Top Kudoed Authors