huzi-m
New Member

Does it matter how granular we make our partitions?

I'm looking to set up incremental refresh for a fairly large dataset, about 6 GB in size. Does the number of partitions we create for the 'archived' data affect performance? For example, if we're looking to archive 5 years' worth of data, it could be partitioned by year, quarter, month, or even day. Choosing years would give us 5 partitions, whereas choosing months would give us 60.

Choosing months gives the advantage of being able to refresh a particular archived month if needed, rather than the whole year. But I'm not sure whether there are any disadvantages. Perhaps someone has done some testing? I'd be interested in seeing the results if so.
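For reference, the partition counts in the question can be checked with a small sketch. The function name `count_partitions` and the example date range are illustrative assumptions, and the sketch assumes the archive window starts and ends on a year boundary; it only does the arithmetic and says nothing about how Power BI performs at each granularity.

```python
from datetime import date


def count_partitions(start: date, end: date) -> dict:
    """Count partitions needed to cover [start, end) at each granularity.

    Assumes start and end fall on the first day of a year, so every
    period is fully covered and no partial partitions are needed.
    """
    months = (end.year - start.year) * 12 + (end.month - start.month)
    return {
        "year": end.year - start.year,
        "quarter": months // 3,
        "month": months,
        "day": (end - start).days,
    }


# Hypothetical 5-year archive window (the dates are an assumption).
counts = count_partitions(date(2019, 1, 1), date(2024, 1, 1))
print(counts)
```

For this window the sketch gives 5 yearly, 20 quarterly, and 60 monthly partitions, and well over a thousand daily ones.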

1 ACCEPTED SOLUTION
lbendlin
Super User

Make your partitions as big as you can, but not bigger. I have scenarios where the partition size is dictated by the source system's performance. For example, the source system conks out after 500M rows, which covers about 2.5 months. So monthly partitions with about 200M rows each it is (to be on the safe side). Yes, it's 60 partitions, but they are guaranteed to work. Quarterly partitions would have been risky/pointless.

Your situation may vary, but it will most likely also be dictated by the capabilities of the source system. If it can easily handle yearly queries, then use yearly partitions.
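The "as big as you can, but not bigger" rule above can be sketched as a small selection routine. This is only an illustration under stated assumptions: `coarsest_safe_granularity`, `rows_per_month`, and `source_row_limit` are hypothetical names, and the row limit is something you would have to measure against your own source system.

```python
# Granularities ordered from coarsest to finest, with months per partition.
GRANULARITIES = [("year", 12), ("quarter", 3), ("month", 1)]


def coarsest_safe_granularity(rows_per_month: int, source_row_limit: int):
    """Return the coarsest granularity whose partition fits the source limit.

    Returns None if even a monthly partition exceeds the limit, in which
    case a finer grain (for example daily) would have to be considered.
    """
    for name, months in GRANULARITIES:
        if rows_per_month * months <= source_row_limit:
            return name
    return None


# Figures from the answer: about 200M rows per month, source fails past 500M.
choice = coarsest_safe_granularity(200_000_000, 500_000_000)
print(choice)
```

With those figures a quarter would hold about 600M rows, over the 500M limit, so the sketch settles on monthly partitions, matching the choice described above.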


5 REPLIES
Right, but do you have any evidence of that being so? For example, say my source can handle yearly partitions. Why should I choose yearly over monthly? Perhaps there's a blog post I can refer to.

You should choose the minimum possible number of partitions.

Please can you provide some evidence for this?

Occam's Razor.
