gborkowski
New Member

Pricing in detail

Hello,

 

I'm not sure if this is the correct forum (please feel free to move my post if needed), but I was wondering about Microsoft Fabric pricing in detail.

 

I want to create a pretty lightweight data engineering process. For this, I need a Data Factory (to copy data) and Databricks (for data science purposes, plus data cleaning and transformation). From a pricing perspective, it seems that:

- F2 costs $0.36/hour and is billed per second

- This price covers EVERY Fabric item (such as Data Factory, Synapse Data Engineering, Data Science, etc.)

- It can be paused and resumed

 

With that in mind, is it true that if Fabric runs for 10 minutes each day, I will pay 10 * (0.36/60) = 6 cents per day for the whole process (plus perhaps a lakehouse storage cost, which isn't part of the capacity price)? A quick back-of-envelope check is sketched after this list. I am amazed because:

- Data Factory/Synapse + Databricks alone would cost more

- In Synapse Data Engineering I can specify Spark node pools: what's the point of doing that if I am billed for Microsoft Fabric use, not Spark cluster compute? 🤔
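
A quick back-of-envelope version of the calculation above (a sketch only, assuming the $0.36/hour F2 pay-as-you-go rate quoted above is right, and ignoring storage):

```python
# Back-of-envelope daily cost for an F2 capacity that is resumed for
# 10 minutes a day and paused the rest of the time, using the $0.36/hour
# pay-as-you-go rate quoted above (storage not included).
F2_RATE_PER_HOUR = 0.36   # USD, pay-as-you-go
MINUTES_PER_DAY = 10

daily_cost = MINUTES_PER_DAY / 60 * F2_RATE_PER_HOUR
print(f"~${daily_cost:.3f} per day")         # ~$0.060, i.e. about 6 cents
print(f"~${daily_cost * 30:.2f} per month")  # ~$1.80 over a 30-day month
```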

 

I would be glad for any details/confirmation, as I am trying to get a realistic estimate of how much my application will cost to run.


12 REPLIES
AndyDDC
Solution Sage

Hi, you'll be billed for how long the Fabric capacity is running overall and at what SKU; it isn't flexible in the sense of only running (and billing) while a job is running (e.g. a Data Factory or Synapse pipeline). You could, however, set up a process to start/stop the Fabric capacity using the API (a rough sketch follows below). If you only need the pipeline functionality, then Data Factory on its own seems like a good choice.
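
For example, such a start/stop process could be scripted against the Azure management API. This is a hypothetical sketch only: the Microsoft.Fabric/capacities suspend/resume actions and the api-version shown are assumptions to verify against the current Azure documentation.

```python
# Hypothetical sketch: suspend/resume a Fabric capacity via the Azure
# management REST API. The provider path and api-version are assumptions --
# confirm them against the current Azure docs before relying on this.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
CAPACITY_NAME = "<fabric-capacity-name>"
API_VERSION = "2023-11-01"  # assumed; check the currently supported version

def set_capacity_state(action: str) -> None:
    """Invoke the 'suspend' or 'resume' action on the capacity resource."""
    token = DefaultAzureCredential().get_token(
        "https://management.azure.com/.default"
    ).token
    url = (
        f"https://management.azure.com/subscriptions/{SUBSCRIPTION_ID}"
        f"/resourceGroups/{RESOURCE_GROUP}"
        f"/providers/Microsoft.Fabric/capacities/{CAPACITY_NAME}"
        f"/{action}?api-version={API_VERSION}"
    )
    requests.post(url, headers={"Authorization": f"Bearer {token}"}).raise_for_status()

# Resume before the daily run, suspend once the jobs have finished
set_capacity_state("resume")
# ... trigger pipelines / notebooks here ...
set_capacity_state("suspend")
```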

Hi,

 

That's what I asked in the post: if the Microsoft Fabric capacity (not just a pipeline) runs for 10 minutes a day (as I will pause and resume it), will it cost me 6 cents per day? From what @v-gchenna-msft said, no - there will be some additional cost, e.g. based on the Spark pool nodes.

 

I know the alternatives to Microsoft Fabric, but since we will be doing a big transformation using Fabric, I wanted to test it out on my own project as well.

There is no additional cost for Spark compute in Fabric. You pay a single fee for the F SKU size (or a Premium capacity), plus the storage cost (priced in line with Azure Data Lake Storage Gen2).
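
To put a rough number on that storage component (a sketch only; the per-GB rate used here is an assumed placeholder, not an official OneLake price - check the Fabric pricing page for your region):

```python
# Illustrative storage estimate only. The per-GB rate below is an assumed
# placeholder in the ballpark of ADLS Gen2 list prices; check the official
# Fabric/OneLake pricing page for your region before relying on it.
ASSUMED_RATE_PER_GB_MONTH = 0.023   # USD, assumption for illustration
lakehouse_size_gb = 50              # example: a small lakehouse

monthly_storage = lakehouse_size_gb * ASSUMED_RATE_PER_GB_MONTH
print(f"~${monthly_storage:.2f} per month for {lakehouse_size_gb} GB of lakehouse storage")
```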

 

"so you are billed for the “spark compute”, but in CU not in terms of spark cluster" is saying that you are billed by the CUs (capacity units) which is based on the Fabric sku used.  If you run an F2 then you're billed for the whole amount of that F2 sku for the time it runs.

 

 

I can't really understand the Spark pools part. If I have an F2 SKU and can choose between 1 and 10 Spark nodes (as an example), and I pay the same amount for both 1 and 10 nodes, doesn't it make sense to always choose 10 nodes instead of 1? In Azure Databricks, if you use 10 nodes you pay 10x more than for 1 node, which is why this doesn't make much sense to me.

But you won't be able to configure a Spark cluster in a Fabric workspace that's allocated to an F2 SKU to use 10 nodes. Any workload in an F2 will be constrained to a single node running 4 v-cores.

 

In the image below, I have a workspace allocated to an F2 SKU. Even though the default Spark cluster is a medium, it can only run 4 v-cores (rather than the 8 that a medium can run). I'm also not able to change the maximum number of nodes to anything greater than 1. When I ran a notebook, it could only run on a single node with 4 v-cores.

 

[Screenshot: Spark pool settings for a workspace on an F2 SKU, capped at 1 node / 4 v-cores]

 

If I change the SKU to an F8, I can now increase the node count, but only to 2 nodes (each medium node is 8 v-cores, so 2 nodes is 16 v-cores, which is the maximum an F8 SKU can use).

 

[Screenshot: Spark pool settings for the same workspace on an F8 SKU, allowing a maximum of 2 nodes]

 

Does that help clarify?

FYI, as a bit of a test I quickly changed an F SKU to F256 (!!!) and I could have a maximum of 64 nodes! As an F64 is allowed 128 v-cores, an F256 is allowed 512 v-cores (i.e. 64 medium-size nodes at 8 v-cores per node).

 

Spark compute for Data Engineering and Data Science - Microsoft Fabric | Microsoft Learn
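
Putting the numbers quoted in this thread together, the scaling seems to follow a simple pattern. Treat this as a rule of thumb inferred from the thread, not an official formula; the Learn link above has the authoritative sizing table:

```python
# Rule of thumb pieced together from the numbers in this thread (not an
# official formula): the F SKU number is its capacity units (CUs), each CU
# maps to ~2 Spark v-cores, and a "medium" Spark node uses 8 v-cores.
VCORES_PER_CU = 2
VCORES_PER_MEDIUM_NODE = 8

def spark_limits(sku: str) -> tuple[int, int]:
    """Return (max Spark v-cores, max medium nodes) for an F SKU like 'F64'."""
    capacity_units = int(sku.lstrip("F"))
    max_vcores = capacity_units * VCORES_PER_CU
    # An F2 still gets one node, just capped at 4 v-cores instead of 8
    max_nodes = max(1, max_vcores // VCORES_PER_MEDIUM_NODE)
    return max_vcores, max_nodes

for sku in ("F2", "F8", "F64", "F256"):
    vcores, nodes = spark_limits(sku)
    print(f"{sku}: up to {vcores} v-cores, {nodes} medium node(s)")
# Matches the thread: F2 -> 4 v-cores / 1 node, F8 -> 16 / 2, F256 -> 512 / 64
```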

 

[Screenshot: Spark pool settings on an F256 SKU, allowing a maximum of 64 nodes]

 

Okay, so for the capacity/SKU I choose there is a maximum number of nodes and of v-cores per node. I get it now. Thank you for the great checks!

FYI just to add a bit more to this...

 

Apparently you will be able to use more than 4 cores in an F2 SKU. This uses the "bursting" functionality, which gives you more compute than you currently have allocated. You will still only be able to use 4 cores for a single Spark job, but you'll be able to run (I believe) up to 3x more, i.e. 3 parallel workloads, each using 4 cores.

 

The only problem is that "smoothing" is then used to average out your usage while your capacity is not doing anything. Basically, when you burst you "borrow" extra compute, and "smoothing" is how you give it back while the capacity is idle. However, if you plan to pause as soon as your jobs complete, you won't be able to "smooth" and you'll be billed for the extra compute you used (see the simplified sketch below).
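
A very simplified illustration of that trade-off, with made-up numbers (the actual smoothing windows and overage rules are described in the Fabric capacity documentation):

```python
# Very simplified toy model of burst + smoothing (made-up numbers; the real
# rules -- smoothing windows, overage billing -- are in the Fabric capacity docs).
BURST_FACTOR = 3               # e.g. 3 parallel Spark jobs on an F2
burst_minutes = 10             # time spent running at 3x the baseline
idle_minutes_before_pause = 5  # idle time before the capacity is paused

consumed = burst_minutes * BURST_FACTOR                # capacity-minutes actually used
available = burst_minutes + idle_minutes_before_pause  # capacity-minutes the F2 provides
overage = max(0, consumed - available)

print(f"Used {consumed} capacity-minutes, {available} provided before pausing")
print(f"Excess that smoothing could not absorb: {overage} capacity-minutes")
# In this toy model, leaving the capacity idle for another 15 minutes would
# absorb the whole burst; pausing immediately leaves it as billable overage.
```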

gborkowski
New Member

Hmm, will it be something like CU pricing (for using Fabric) + Synapse-style pricing (for Spark pools, cluster sizes, Data Factory integration runtimes, etc.)?

 

I will be looking forward to this, as it will surely be helpful! And of course thank you for the quick response.

v-gchenna-msft
Community Support

Hi @gborkowski ,

Thanks for using Fabric Community.

It's a bit more complicated than that. We will publish a document with more guidance on pricing; that will be coming soon.

 

For the question about Spark pools: first, you can have pools with specific libraries preloaded, a particular node size, etc. So you are billed for the "Spark compute", but in CUs, not in terms of the Spark cluster. Depending on the type of work you're doing, choosing the right node size is still relevant. There are also high-concurrency nodes that are helpful in some situations.


How about the data warehouse? Do I need to pay extra for the data warehouse / Azure SQL Database?

If Microsoft Fabric includes the data warehouse, can our other workloads in Azure connect to the data warehouse?

 

Currently we are using Data Factory, Azure SQL Database, and Power BI, and we are exploring whether there are any benefits to moving the workload to Microsoft Fabric.

Hi @jeffocs ,

Thanks for using Fabric Community.
I suggest you raise a request here to get a detailed explanation - Microsoft Fabric - Pricing | Microsoft Azure



Hope this is helpful.
