I have a data source with a table that is ~15 [GB] in size (uncompressed). Unfortunately, I need to pull the entire table. My pbix file has somehow compressed the data down to 150 [MB]. I published the pbix file to the service successfully, but when I attempt to refresh the corresponding dataset in the service, I get the following error message:
The amount of uncompressed data on the gateway client has exceeded the limit of 10 GB for a single table. Please consider reducing the use of highly repetitive strings values through normalized keys, removing unused columns, or upgrading to Power BI Premium.
I understand that I need to limit the result set of my query to <= 10 [GB]. However, that means that, moving forward, I need to determine the size of my result set before I publish: even though the compressed file (150 [MB]) publishes successfully, the subsequent refresh in the service will fail. Is there some feature in Power BI Desktop that can show me the uncompressed data size per table? Or do I have to determine this myself, outside of the Desktop client?
I am not aware of anything in the desktop that will give you the uncompressed size.
I know that I need to do something to resolve the uncompressed data volume obstacle. However, that is not my issue. My issue is that I want something in the Desktop client that warns me when a resulting uncompressed table is > 10 [GB]. Except in the rare scenario where I exceed the 1 [GB] compressed limit, I can generally publish a pbix file successfully even when an uncompressed table is > 10 [GB], only to be burned later when the dataset refreshes. It would be nice to know in advance, in the Desktop client, that I'm going to have this problem.

Otherwise, I have to convert the M code to SQL, run the SQL in a SQL client, and check whether the result set (the table in the pbix file) exceeds the 10 [GB] uncompressed limit. Since I work on data warehouse sources, my result sets are generally going to be > 10 [GB], and I don't want to purchase Premium just to use the incremental refresh feature.
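For anyone hitting the same wall: rather than converting the M to SQL and materializing the whole result set, you can get a rough pre-publish estimate by multiplying the row count by an estimated bytes-per-row derived from the column types. This is a minimal sketch, not an official Power BI feature — the type widths are the usual SQL Server fixed sizes, variable-length string columns are approximated by an average character length you supply, and all column definitions in the example are hypothetical:

```python
# Rough estimate of uncompressed result-set size before publishing.
# Fixed widths mirror common SQL Server types; varchar is approximated
# by an average length, so treat the result as a ballpark, not a guarantee.
BYTES_PER_TYPE = {
    "int": 4,
    "bigint": 8,
    "datetime": 8,
    "float": 8,
    "varchar": None,  # variable width: caller supplies an average length
}

def estimated_uncompressed_gb(row_count, columns):
    """columns: list of (sql_type, avg_len_or_None) tuples."""
    bytes_per_row = 0
    for sql_type, avg_len in columns:
        width = BYTES_PER_TYPE[sql_type]
        bytes_per_row += width if width is not None else avg_len
    return row_count * bytes_per_row / (1024 ** 3)

# Hypothetical example: 50M rows with an int key, a datetime,
# and two string columns averaging 40 characters each.
cols = [("int", None), ("datetime", None), ("varchar", 40), ("varchar", 40)]
size_gb = estimated_uncompressed_gb(50_000_000, cols)
print(f"Estimated uncompressed size: ~{size_gb:.1f} GB")
if size_gb > 10:
    print("Refresh in the service will likely hit the 10 GB per-table limit.")
```

You can pull the row count and column definitions from the warehouse's information schema, so this check runs in seconds against metadata instead of streaming 15 [GB] through a SQL client.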