Frequent Visitor

dataflows and service storage query

Hi,

 

I have a question about what is stored in the Power BI service storage space and what is only processed in memory without being stored.

 

I'm in the process of creating several large sequential dataflows.

The first dataflow is an upload from a source (let's call this "input").

After this there can be multiple dataflows feeding from the input dataflow, and I want to understand exactly what data the storage space is used for.

 

For example, let's say "input" has 3 entities of 5 GB each.

 

Transform 1 is fed from input using 2 of the entities (entity_1 + entity_2). These are merged into another entity, "transform_1_output", which is 7 GB in size.

 

Transform 2 is fed from the transform 1 output and input entity 3. In this dataflow, input_entity_3 is referenced and grouped into a new entity (trans_2_groupings), which is 3 GB in size. This is then merged with the transform 1 output to create a final entity (Desired_output), which is 5 GB in size.

 

My question is how much storage space is used in the service.

 

I am thinking there are 2 options.

 

Option 1, where all entities have "enable load" ticked:

             Input - 3 entities of 5 GB each = 15 GB

             Transform 1 - 2 initial entities of 5 GB each (10 GB) + output entity (7 GB) = 17 GB

             Transform 2 - transform 1 output (7 GB) + input_entity_3 (5 GB) + trans_2_groupings (3 GB) + Desired_output (5 GB) = 20 GB

       Total storage = 52 GB

 

Option 2, where apart from the input dataflow, all other dataflows have "enable load" ticked only on the final entity:

             Input = 15 GB

             Transform 1 = 7 GB

             Transform 2 = 5 GB

        Total storage = 27 GB
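To make the arithmetic behind the two options easy to re-run with real sizes, here is a minimal Python sketch. It only reproduces the sums above and does not confirm how the Power BI service actually stores data. The entity names and the 5 GB input size come from the totals in this post; the dictionary layout and the function name `total_storage_gb` are made up for illustration.

```python
# Entity sizes in GB, taken from the example totals in this post.
SIZES_GB = {
    "input_entity_1": 5,
    "input_entity_2": 5,
    "input_entity_3": 5,
    "transform_1_output": 7,
    "trans_2_groupings": 3,
    "Desired_output": 5,
}

# Option 1 (assumption): every entity in every dataflow is stored,
# including entities referenced from an upstream dataflow.
OPTION_1 = {
    "Input": ["input_entity_1", "input_entity_2", "input_entity_3"],
    "Transform 1": ["input_entity_1", "input_entity_2", "transform_1_output"],
    "Transform 2": [
        "transform_1_output",
        "input_entity_3",
        "trans_2_groupings",
        "Desired_output",
    ],
}

# Option 2 (assumption): downstream dataflows store only their final entity.
OPTION_2 = {
    "Input": ["input_entity_1", "input_entity_2", "input_entity_3"],
    "Transform 1": ["transform_1_output"],
    "Transform 2": ["Desired_output"],
}


def total_storage_gb(stored_entities):
    """Sum the sizes of all entities assumed to be stored, per dataflow."""
    return sum(
        SIZES_GB[name]
        for entities in stored_entities.values()
        for name in entities
    )


option_1_total = total_storage_gb(OPTION_1)
option_2_total = total_storage_gb(OPTION_2)
print(f"Option 1: {option_1_total} GB")
print(f"Option 2: {option_2_total} GB")
```

Swapping in the real entity sizes gives the upper and lower estimates for the two scenarios, whichever one the service turns out to follow.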

 

Then again, both of these may be wrong, but if someone knows, could you let me know? Thanks.

 

(This is just an example; the real dataflow sizes and complexity are much bigger.)

 

Microsoft

Re: dataflows and service storage query

Hi @Coffeeaddict,

Azure Data Lake Storage is Microsoft cloud storage that can store structured data (like tables) and unstructured data (like files).

You may be able to get some information about Power BI dataflow storage in Azure Data Lake Storage resources by using Storage Explorer.

 

You can refer to these related posts.

power-bi-dataflows-faq 

common-data-model/model-json 

https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-in-storage-explorer

 

Best Regards,
Community Support Team _ Eason

 

Frequent Visitor

Re: dataflows and service storage query

Thanks for the reply.

 

I was unable to find any information about which elements of a dataflow increase storage size, and I do not have access to Azure Storage Explorer to test whether unticking "enable load" just processes those entities in memory or still stores the data within the Azure storage environment.

 

 
