I'm working on a dataset whose size has grown to about 600 MB and which consumes roughly three times that much RAM when loaded. That's not what I expected, so I downloaded the great VertiPaq Analyzer tool (from @AlbertoFerrari, @marcorusso) and tried to find out which table/column consumes the most RAM. Until then it was boring stuff, but then I saw that Int64 columns don't use value encoding but dictionary encoding, which means the size used by these columns explodes.
After some experiments I found out that with an Int64 column of 225,000 rows VertiPaq still uses value encoding, but at roughly 240,000+ rows it switches to dictionary encoding and the consumed memory is 8x bigger than expected; see the screenshot below.
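If you want to verify which encoding VertiPaq actually chose for each column, you can query the $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMNS DMV (this is the same DMV that VertiPaq Analyzer reads). Below is a minimal C# sketch using ADOMD.NET; the port in the connection string is a placeholder for your local Power BI Desktop instance, so treat the whole thing as an illustration rather than a finished tool:

// Requires the Microsoft.AnalysisServices.AdomdClient NuGet package.
using System;
using Microsoft.AnalysisServices.AdomdClient;

class EncodingCheck
{
    static void Main()
    {
        // Placeholder port of the local Power BI Desktop instance; adjust as needed.
        using var conn = new AdomdConnection("Data Source=localhost:12345");
        conn.Open();

        using var cmd = conn.CreateCommand();
        cmd.CommandText =
            "SELECT DIMENSION_NAME, COLUMN_ID, COLUMN_ENCODING, DICTIONARY_SIZE " +
            "FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMNS " +
            "WHERE COLUMN_TYPE = 'BASIC_DATA'";

        using var reader = cmd.ExecuteReader();
        while (reader.Read())
        {
            // COLUMN_ENCODING: 1 = hash (dictionary) encoding, 2 = value encoding
            Console.WriteLine($"{reader["DIMENSION_NAME"]}.{reader["COLUMN_ID"]}: " +
                $"encoding={reader["COLUMN_ENCODING"]}, dictionary={reader["DICTIONARY_SIZE"]} bytes");
        }
    }
}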
You can't force it, but you can give the VertiPaq engine a hint that it should use Value encoding instead of Hash (dictionary) encoding. The hint can be set with Tabular Editor on a PBIT file. You can also write a simple C# script that applies the hint to all columns that support Value encoding and so on; see the extended sketch after the snippet below. I will write a blog post about that, the topic is waiting in a queue 😉
You can use, for example, the following snippet in Tabular Editor:
// For all Int64 columns, set the encoding hint to Value
foreach (var column in Model.Tables.SelectMany(t => t.Columns))
{
    if (column.DataType == DataType.Int64)
        column.EncodingHint = EncodingHintType.Value;
}
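And as a sketch of the "all columns which support Value encoding" variant mentioned above, the script below extends the same idea to other data types that are stored internally as integers. The exact type list (Int64, Decimal, DateTime) is my assumption here, not something confirmed by the original post, so adjust it for your model:

// Sketch: apply the Value encoding hint to all columns whose data type
// can plausibly be value encoded. The type list is an assumption; tweak it as needed.
var valueEncodable = new[] { DataType.Int64, DataType.Decimal, DataType.DateTime };
foreach (var column in Model.Tables.SelectMany(t => t.Columns))
{
    if (valueEncodable.Contains(column.DataType))
        column.EncodingHint = EncodingHintType.Value;
}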