cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
carlpaul153
Helper I
Helper I

Python script in a function query becomes slow after using it multiple times.

I have a function type query called fxScript, which after making several modifications, finally as the last step invokes the following Export2Csv function that exports the table as a .csv file:

 

(SOURCE as table, PATH as text) =>
let
    Export = Python.Execute("# 'dataset' holds the input data for this script#(lf)import pandas#(lf)dataset = pandas.DataFrame(dataset)#(lf)dataset.to_csv(r'"&PATH&"', index=False)", [dataset=SOURCE])
in
    Export

 

 

I have used the function effectively on 34 files of exactly 100mb each (about 2.3 million rows each). The resulting .csv files are between 35-40mb each (approximately 0.5 million rows each).

Originally every 1-2 minutes each of the 34 .csv files were finished being created through the fxScript function. This was 3 days ago.

Since yesterday I am trying to use the same function * on the same input files *, but each file takes between 3 and 30 minutes (most usually take between 10-15 minutes).
I have verified that the bottleneck is not in any of the first steps of fxScript, which run quickly, but in the python script. At the bottom right of Power Bi the message "Waiting for Python" appears for a long time.

I have tried using only the first few rows of the input files. For example, with the first 200 lines the script runs relatively fast. By increasing the lines the delay time also increases (at least linearly).

Reading this article I have tried deleting the PythonScriptWrapper / s folder, but no has fixed the problem.

Copying and pasting my .pbix file also didn't work, i.e. creating a duplicate.

Since the script originally worked fast and now it doesn't, I suspect there must be some kind of temporary file that slows down the process.

Someone who can think of what could be could help me out?

3 REPLIES 3
V-lianl-msft
Community Support
Community Support

Hi @carlpaul153 ,

 

Try not to call the file as a function.

Please refer to the fourth suggestion in the blog you mentioned:


4) Use one R or Python script per query

Consolidate custom code into a single R or Python query. Rather than use “Run R script” or “Run Python script” numerous times in the same Power Query query, use a single script to perform multiple tasks on the same data frame. This will also save on local IO to the R or Python working directory and should help your Power Query query run faster.


 

Best Regards,
Liang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

mahoneypat
Super User IV
Super User IV

Are your files stored on your local computer?  Or online (SharePoint/OneDrive)?  If the latter, could that connection speed be the issue?

 

Regards,

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


Excuse me, I forgot to clarify that. The files are stored on my pc 😕

Helpful resources

Announcements
PBI User Groups

Welcome to the User Group Public Preview

Check out new user group experience and if you are a leader please create your group!

MBAS on Demand

Microsoft Business Applications Summit sessions

On-demand access to all the great content presented by the product teams and community members! #MSBizAppsSummit #CommunityRocks

MBAS Attendee Badge

Claim Your Badge & Digital Swag!

Check out how to claim yours today!

secondImage

Are You Ready?

Test your skills now with the Cloud Skills Challenge.

Top Solution Authors