Jkaelin
Resolver I

Can Power Query handle this much data?

Good morning,

 

I have one dimension table and 8 fact tables. Each table has approximately 300,000 records.

 

Before I create this data model: can Power Query efficiently join (left outer join) all 8 tables into 1 large flattened table? Or should I expect immediate performance issues, timeouts, etc.?

 

(I'm flattening to work around Power Pivot performance issues I've run into.)
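For context, what I have in mind is a chain of left outer merges in Power Query M, roughly like the sketch below. The table names (Fact1, Fact2) and key columns (StockID, Date) are just placeholders for my real ones; the join/expand pair would repeat for each remaining fact table.

let
    // Placeholder sources; the real queries load each fact table
    Fact1 = Excel.CurrentWorkbook(){[Name = "Fact1"]}[Content],
    Fact2 = Excel.CurrentWorkbook(){[Name = "Fact2"]}[Content],
    // Left outer join Fact2 onto Fact1 on the shared keys
    Merged = Table.NestedJoin(Fact1, {"StockID", "Date"}, Fact2, {"StockID", "Date"}, "Fact2", JoinKind.LeftOuter),
    // Expand only the columns that are actually needed from Fact2
    Expanded = Table.ExpandTableColumn(Merged, "Fact2", {"PE", "PS"})
in
    Expanded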

 

My computer has an i7-3770K processor, 16 GB of RAM, and Excel 2016 Professional Plus.

 

Thank you,

James K

 

 


7 REPLIES
vanessafvg
Super User

@Jkaelin, one dimension and 8 facts? Sounds like the issue might be with the modeling, although I might be wrong.

 

It really depends on a variety of things:

1) Your hardware (you seem to have sufficient hardware).

2) The modeling of your data. What is in your facts from a data type perspective? How wide are they? Do you need all the columns? 300,000 rows is not a lot in my opinion; it sounds like the design is the issue here.

 

Are you doing this in Power BI or Power Pivot?









@vanessafvg

 

What is in your facts?

I am pulling in various stock characteristic items: P/E, P/S, Total Return, etc., for various indexes across time (1990 to the present).

 

How wide are they?

They are about 10 columns wide for each fact table.  

 

Do I need all the columns?

Not all at once; probably 20-30 columns at any given time.

 

I'm doing the work in Power Pivot, using various DAX measures (percentiles, PRODUCTX, etc.) that have to iterate over the data set to find the top third, middle third, etc., and calculate stock performance.

 

How could I improve the data modeling?

The data feed limits how many columns I can pull at once, so I've had to harvest them separately into various fact tables. Which stinks.

 

Any thoughts would be very much appreciated, because I've been banging my head against a wall trying to optimize the data performance.

vanessafvg
Super User

ACCEPTED SOLUTION

@Jkaelin, what are your data types in your facts? I really don't see an issue in combining them if they are mostly numerical; why don't you just try it and see what happens? Essentially it's mostly up to your machine. And because this is in-memory technology, you want to standardize your data as much as possible: the fewer unique values you have, the better the compression you'll get. Start with only what you need rather than everything.
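As a rough illustration only (the column names are made up), rounding high-precision decimals in Power Query is one way to cut down the unique values before the data hits the in-memory engine:

let
    // Tiny stand-in for the combined fact table (placeholder data)
    FlattenedFacts = #table({"StockID", "PE", "TotalReturn"}, {{"A", 15.2379, 0.081542}, {"B", 22.9114, -0.034217}}),
    // Fewer distinct values compress better in the in-memory engine
    Rounded = Table.TransformColumns(FlattenedFacts, {{"PE", each Number.Round(_, 2), type number}, {"TotalReturn", each Number.Round(_, 4), type number}})
in
    Rounded

Only round away precision you can afford to lose, of course; the point is just that lower cardinality means better compression.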









@vanessafvg

 

Data types are all whole numbers & decimals.  

 

Is flattening okay? Sometimes I have to create a financial ratio such as Sales / Assets, where Sales is in fact table 1 and Assets is in fact table 2. I need that data already combined into a ratio before I can calculate statistics on it, such as top third, middle third, etc.
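For example, once Sales and Assets land on the same row after the merge, the ratio could be added as a custom column in Power Query; this is just a sketch with stand-in data:

let
    // Tiny stand-in for the merged table: one row per stock per period
    Merged = #table({"StockID", "Sales", "Assets"}, {{"A", 100, 400}, {"B", 50, 0}}),
    // Guard against null or zero denominators before dividing
    WithRatio = Table.AddColumn(Merged, "SalesToAssets", each if [Assets] = null or [Assets] = 0 then null else [Sales] / [Assets], type number)
in
    WithRatio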

I really appreciate your help & thank you!

@Jkaelin, like I said, I don't think there is an issue with combining them. As long as it's all at the same grain, it might even make sense to do so: it's not your typical relational data, and you have only separated it out due to source constraints. Your fields are mostly numeric, so no issue there, and 300,000 rows is not a biggie. Denormalizing makes sense, but like I say, take out everything you don't need (see the sketch below). Leave the dimension as a dimension and only combine the facts. With my limited knowledge of what you're doing, I don't see an issue.
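Something along these lines trims a fact table down before you combine it; the query and column names here are examples only:

let
    Source = Excel.CurrentWorkbook(){[Name = "Fact3"]}[Content],
    // Keep only the key columns and the measures actually used downstream
    Trimmed = Table.SelectColumns(Source, {"StockID", "Date", "TotalReturn"})
in
    Trimmed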









KHorseman
Community Champion

Not sure why you would want to flatten something like that. That's like...the opposite of how this whole program is supposed to work.

 

What specific performance issues are you talking about?









@KHorseman

 

I currently have a snowflake schema. I've read that poor performance can be caused by the lack of a star schema, and that star schemas perform best with flattened tables and fewer relationships. So I'm trying to flatten a few tables to help build an optimized star schema.

 

Am I doing it all wrong? I need help! I'm more of a finance guy trying to thrive in a data analyst world some days.

 

Thank you!

James
