rlussky
Helper I

Best Practices for Large Data Importing from SQL

Hello, I have experience with Power BI, but I'm working with importing huge data sets from SQL for the first time. My question is: what is the best practice for working with multiple huge sets of data (transactional data with hundreds of millions of rows)? I haven't been able to find a resource that explicitly talks about importing this much data.

 

1) Write a long SQL statement with many joins on these tables, then load using the statement?

2) Bring in the tables, filter in Power Query as best I can, and map in the Modeling section of Desktop?

3) Bring in the tables, filter in Power Query as best I can, and merge in Power Query?

 

Thanks!

3 REPLIES

AlexisOlson
Super User
(Accepted Solution)

You'll still likely want to load your fact and dimension tables into your model separately as a star schema, rather than merging them into one big monster table. One option when working with large tables is to use DirectQuery rather than Import to load your tables.

 

From your three listed options, #2 is probably the closest to best practice, assuming you don't have to do lots of data manipulation other than some filtering. If you do need lots of data manipulation, then you may want to do that in SQL first and have clean tables (or views) to load into Power BI.
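As a rough illustration of the clean-views approach (the table and column names below are hypothetical, not from the original post), the fact and dimension sources could be exposed as pre-filtered views that Power BI then loads as separate star-schema tables:

-- Hypothetical example: expose clean, pre-filtered sources for Power BI.
-- dbo.Sales and dbo.Customer stand in for the real transactional tables.
CREATE VIEW dbo.vFactSales AS
SELECT
    s.SalesID,
    s.CustomerID,
    s.OrderDate,
    s.Quantity,
    s.Amount
FROM dbo.Sales AS s
WHERE s.OrderDate >= '2020-01-01';  -- keep only the rows the report needs
GO

CREATE VIEW dbo.vDimCustomer AS
SELECT
    c.CustomerID,
    c.CustomerName,
    c.Region
FROM dbo.Customer AS c;
GO

In Power BI you would then connect to vFactSales and vDimCustomer as two separate tables and relate them on CustomerID in the model, rather than joining them in the source query.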

rlussky

Thanks for your response! There is a good amount of manipulation to be done. I tried loading with one long SQL query and the refresh was awful. I also tried loading multiple tables and filtering as best I could, and that still took a long time prior to any manipulation.

 

So if I just created a new table or view in SSMS based on the query I was trying to run, I could then connect to that from Power BI? That makes sense, since the query only took about 6 minutes to run in SSMS compared to 4+ hours in Power BI (I canceled it).

AlexisOlson

Yes. Power Query is powerful, but it's often better to push SQL manipulations upstream before loading the data into Power BI.
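A minimal sketch of that idea, again with hypothetical names: run the expensive query once in SSMS, materialize the result into a table (or wrap the same SELECT in a view if you'd rather not store a copy), and point Power BI at that:

-- Hypothetical example: materialize the cleaned-up result once in SSMS
-- so Power BI only has to read a ready-made table.
SELECT
    s.SalesID,
    s.CustomerID,
    s.OrderDate,
    s.Quantity * s.UnitPrice AS Amount   -- manipulation pushed into SQL
INTO dbo.FactSalesClean                  -- creates and populates the new table
FROM dbo.Sales AS s
WHERE s.OrderDate >= '2020-01-01';

-- Optional: index the column the reports filter on most.
CREATE CLUSTERED INDEX IX_FactSalesClean_OrderDate
    ON dbo.FactSalesClean (OrderDate);

The trade-off versus a view is that the table holds a static copy of the data, so it needs to be reloaded (for example by a scheduled job) when the source changes.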
