Coffeeaddict
Frequent Visitor

dataflows and service storage query

Hi,

 

I have a question about what is stored within the Power BI service storage space and what is processed in memory but not stored.

 

I'm in the process of creating several large sequential dataflows.

The first dataflow is an upload from a source (let's call this "input").

After this there can be multiple dataflows feeding from the input dataflow, and I want to understand exactly what data the storage space is used for.

 

For example, let's say "input" has 3 entities of 5 GB each.

 

Transform 1 is fed from input using 2 of the entities (entity_1 + entity_2); these are merged into another entity, "tranform_1_output", which is 7 GB in size.

 

Transform 2 is fed from the transform 1 output and input entity 3. In this dataflow, input_entity_3 is referenced and grouped into a new entity (trans_2_groupings), which is 3 GB in size. This is merged as new with the transform 1 output to create a final entity (Desired_output), which is 5 GB in size.

 

My question is how much storage space is used in the service.

 

I am thinking there are 2 options.

 

Option 1, where all entities have enable load ticked:

             Input - 3 entities of 5 GB each = 15 GB

             Transform 1 - 2 initial entities (5 GB each, 10 GB) + output entity (7 GB) = 17 GB

             Transform 2 - transform 1 output (7 GB) + input_entity_3 (5 GB) + trans_2_groupings (3 GB) + Desired_output (5 GB) = 20 GB

       Total Storage = 52 GB
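The arithmetic for this first option can be checked with a minimal Python sketch. It only adds up the example sizes from this post under the hypothesis that every loaded entity is stored in full, including entities referenced from an upstream dataflow. That hypothesis is an assumption of the question, not confirmed service behavior, and the helper name `dataflow_totals` is made up for illustration.

```python
# Sizes in GB, taken from the example figures in this post.
# Assumption (unconfirmed): every entity with enable load ticked is stored
# in full, even when it is only a reference to an upstream entity.
all_loaded = {
    "Input": {"entity_1": 5, "entity_2": 5, "entity_3": 5},
    "Transform 1": {"entity_1": 5, "entity_2": 5, "tranform_1_output": 7},
    "Transform 2": {
        "tranform_1_output": 7,
        "input_entity_3": 5,
        "trans_2_groupings": 3,
        "Desired_output": 5,
    },
}


def dataflow_totals(dataflows):
    """Return the summed entity size (GB) for each dataflow."""
    return {name: sum(entities.values()) for name, entities in dataflows.items()}


totals = dataflow_totals(all_loaded)
grand_total = sum(totals.values())

print(totals)       # per-dataflow subtotals in GB
print(grand_total)  # overall total in GB
```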

 

Option 2, where apart from the input dataflow, all other dataflows only have enable load on the final entity:

             Input = 15 GB

             Transform 1 = 7 GB

             Transform 2 = 5 GB

        Total Storage = 27 GB
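To make the second hypothesis concrete, here is a small Python sketch that models each entity with an enable-load flag and counts only the flagged entities toward storage. The `Entity` class and `stored_gb` function are hypothetical names for illustration, and the rule "only loaded entities consume storage" is the assumption being asked about, not documented behavior.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    size_gb: int
    enable_load: bool  # hypothetical flag mirroring the "enable load" tick box


def stored_gb(entities):
    """Sum sizes of entities that are loaded (assumed to be the only ones stored)."""
    return sum(e.size_gb for e in entities if e.enable_load)


# Example sizes from this post; only final entities are loaded downstream.
dataflows = {
    "Input": [
        Entity("entity_1", 5, True),
        Entity("entity_2", 5, True),
        Entity("entity_3", 5, True),
    ],
    "Transform 1": [
        Entity("entity_1", 5, False),
        Entity("entity_2", 5, False),
        Entity("tranform_1_output", 7, True),
    ],
    "Transform 2": [
        Entity("tranform_1_output", 7, False),
        Entity("input_entity_3", 5, False),
        Entity("trans_2_groupings", 3, False),
        Entity("Desired_output", 5, True),
    ],
}

per_dataflow = {name: stored_gb(entities) for name, entities in dataflows.items()}
total = sum(per_dataflow.values())

print(per_dataflow)  # per-dataflow stored size in GB
print(total)         # overall stored size in GB
```

Flipping any `enable_load` flag to `True` shows how much each intermediate entity would add if it were stored.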

 

Then again, both of these may be wrong, but if someone knows, can you let me know? Thanks.

 

(This is just an example; the actual dataflow sizes and complexity are much bigger.)

 

2 REPLIES
v-easonf-msft
Community Support

Hi, @Coffeeaddict

Azure Data Lake storage is Microsoft cloud storage that can store structured data (like tables) and unstructured data (like files).

Maybe you can get some information about Power BI dataflow storage in Azure Data Lake Storage resources by using Storage Explorer.

 

You can refer to these related posts.

power-bi-dataflows-faq 

common-data-model/model-json 

https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-in-storage-explorer

 

Best Regards,
Community Support Team _ Eason

 

Thanks for the reply, 

 

I was unable to find any information about which elements of a dataflow increase storage size, and I do not have access to Azure Storage Explorer to test whether unticking enable load just processes those entities in memory or still stores the data within the Azure storage environment.

 

 
