Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
RickScanlon
Frequent Visitor

Guidance for Huge Datasets in 2022, directquery with SQL Server, Datamart, etc?

Hi, would love some guidance.  I'm working with very large datasets (from 30 to 500GB). Because there are so many new developments at Azure and Power Platform, it's getting a little confusing ...

 

I have a Power BI Premium (PAYG) account.  Generally, I use a composite model.  I'll use Directquery to Azure SQL Database for the huge tables, which I optimize with columnstore clustered index in SSMS.  I'll use import for the smaller files so I can transform in the desktop Power Query window if necessary.  Although, I could store all tables in Azure SQL Database if required.  

 

I wonder: Assuming clean, indexed data and simple queries and schema, what is the best path for optimal data model performance?  For instance:

 

  • Should I keep large tables in SQL Server and increase compute (e.g., more DTUs) in my Azure account and rely on columnstore?
  • Should I import data to Datamart?
  • Should I import into an Azure blob container and then to PBI Dataflows
  • Other?

I have full admin control over Azure SQL DB and the PBI Premium accounts, so anything is possible.  Don't want to overspend but have some budget.  Any guidance greatly appreciated. 

1 ACCEPTED SOLUTION

@RickScanlon ,  Please consider following

Datamart is limited in terms of space by your Power bi premium type. Please check limit might be 100GB.

Datamart -SQL access is read-only, Transformation can only be done using Power query.

Recently released, so have to wait how it goes.

 

View solution in original post

3 REPLIES 3
amitchandak
Super User
Super User

@RickScanlon , If the zipped data size is below 100GB in power bi, you can consider import mode.

 

You can consider a mixed mode, where you can have both imported tables and a direct query tables. Direct query tables can be supported with aggregated tables or dual storage.

 

Azure SQL should do a good job. If you opt for a direct query, make sure there is no need for a calculated column of complex measure.

Thanks Amit!  This makes sense.  I need to enable auto aggregations.  Is Datamart not a good option in this use case? 

@RickScanlon ,  Please consider following

Datamart is limited in terms of space by your Power bi premium type. Please check limit might be 100GB.

Datamart -SQL access is read-only, Transformation can only be done using Power query.

Recently released, so have to wait how it goes.

 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.