AlexMB
Regular Visitor

Much more data when connecting to dataflow than directly to SQL

I've set up a dataflow that pulls selected tables from our ERP's SQL database.

 

When I connect to the dataflow from Power BI Desktop, it loads several GB of data even though I'm selecting only a few hundred rows from two tables.

 

If I connect directly to the SQL database, the rows load in seconds. Through the dataflow, it takes several minutes.

 

Am I doing something wrong?

 

My approach in both cases is the same:

  • Get Data (SQL vs Dataflow)
  • Select tables (two, in this case)
  • Transform data (filtered down to a few hundred rows in each table)
1 ACCEPTED SOLUTION

Yes. When you use a Dataflow as a source, to get folding, the Enhanced Compute Engine has to be turned on (premium capacity) and you need to use the Dataflows connector (not the Power BI Dataflows one).

Pat

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube



4 REPLIES 4
AlexMB
Regular Visitor

Thanks @mahoneypat 

 

Could you point me one step further?

 

Google is giving me lots of contradictory information about query folding with dataflows. I don't know the terminology, so I'm not sure what I should be looking for.

 

One thing I see mentioned is the "Enhanced Compute Engine". The dataflow doesn't sit within a premium capacity. Is this the root of my issue, perhaps?

Yes. When you use a Dataflow as a source, to get folding, the Enhanced Compute Engine has to be turned on (premium capacity) and you need to use the Dataflows connector (not the Power BI Dataflows one).

Pat

 







Thanks. So outside of Premium, dataflows aren't really a viable option, since any dataset will have to pull entire tables on every refresh.

mahoneypat
Employee

There is likely something breaking "query folding" in your query. The use of a SQL statement returns only the desired rows, but you should be able to get similar refresh time if you maintain query folding. There are indicators in the query editor for Dataflows to show if it is in place or not, and you can modify/rearrange your steps (e.g., do filtering and column selection first) to potentially maintain it.
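To make the term concrete, here is a rough sketch of what query folding buys you. It is not Power BI or Power Query code, just a Python analogy using an in-memory SQLite table. The `orders` table, the `region` column, and the function names are made up for illustration. The "folded" version pushes the filter into the SQL sent to the source, while the "unfolded" version pulls every row and filters locally, which is roughly what happens when a step breaks folding.

```python
import sqlite3


def build_source(row_count=10_000):
    """Create a hypothetical in-memory source table standing in for the ERP data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT)")
    conn.executemany(
        "INSERT INTO orders (id, region) VALUES (?, ?)",
        ((i, "EU" if i % 100 == 0 else "US") for i in range(row_count)),
    )
    return conn


def load_without_folding(conn):
    """Pull the whole table, then filter on the client side."""
    rows = conn.execute("SELECT id, region FROM orders").fetchall()
    kept = [row for row in rows if row[1] == "EU"]
    return len(rows), kept  # rows transferred, rows kept


def load_with_folding(conn):
    """Let the source apply the filter, so only matching rows are transferred."""
    rows = conn.execute(
        "SELECT id, region FROM orders WHERE region = ?", ("EU",)
    ).fetchall()
    return len(rows), rows  # rows transferred, rows kept


conn = build_source()
transferred_unfolded, result_unfolded = load_without_folding(conn)
transferred_folded, result_folded = load_with_folding(conn)

print("without folding, rows transferred:", transferred_unfolded)
print("with folding, rows transferred:", transferred_folded)
print("same final result:", sorted(result_unfolded) == sorted(result_folded))
```

Both paths end with the same rows, but the unfolded one moves the entire table first. That matches the symptom in the original post: a few hundred rows in the final result, but gigabytes transferred along the way.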

 

Pat






