Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

how to architect developer environment for fast data loads in Desktop

First off, this isn't strictly a Desktop question.  But, I don't know where else to ask this question.  So, you'll have to forgive me if this isn't the right forum.

 

I have PBI developers spread across the country (USA).  Most of the data that we load into PBI comes from a single location in the country (i.e., I don't have data centers distributed across the country).  Most of our PBI reports have lots of complicated data load transformations, but the data volumes aren't gigantic (~2M rows for our largest table).  We have been finding that, in developer work in Desktop (not report refreshes in the Service):

1) when the transformations are written in M (instead of directly in the data source language), the data loads are really slow (sometimes taking about 45 minutes)

2) developers that are physically near our data center experience much faster data loads than those that are further away (exacerbated by the fact that we're all working from home now because of the virus, i.e., we're using VPN)

 

To resolve issue #1, we have implemented a design standard to write the transformations in the data source language, not in M / Power Query.  To resolve issue #2, we have begun standing up machines near our data center for developers who are far away from the data center to RDP into.

 

I understand that our solution for issue #1 isn't recommended in general (e.g., because it prevents all subsequent query-folding, should we ever want to do M transformations after the source step).  And, I understand that data load performance in Desktop may be very different than in the Service.  (For example, a SQL query in Desktop runs twice, whereas it only runs once in the Service.)  However, I'm very uncomfortable with our design standard for issue #1.  Is slowness of data load in Desktop for complex M transformations typical?  If not, what is the best solution for this?

 

Also, is it typical for companies to stand up machines near data centers for PBI developers to RDP into in order to overcome Desktop load performance challenges?  If not, how is this problem solved?

3 REPLIES 3
Greg_Deckler
Super User
Super User

@Anonymous Well, putting in Views in your source data system is the recommended fastest way to load data. It is when you put the SQL code into a query statement that can cause problems. @edhans has specific opinions on this if memory serves. If the M query transformations can fold into the database that is the 2nd best option beyond putting the views in the database itself.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

@Greg_Deckler 

 

I will consider moving the heavy transformation logic into SQL Server as views.

 

How about for my second question?  To overcome data speed challenges, is it typical to put "developer" machines near data centers and have PBI developers RDP into them for PBI development?  If so, are there any gotcha's or best practices?  For example, should I assign 1 machine per developer (since PBI saves credentials into the Windows profile)?

@Anonymous Yeah, distance creates latency, no question if you could sit your developers on top of the servers that would be optimal! :D. I don't see why developers couldn't share jump boxes, that's pretty common but I'm sure they would be much happier not having to share, I mean, they are developers after all!! 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.