cancel
Showing results for 
Search instead for 
Did you mean: 

Schedule Automated Data Exports from Power BI using a simple R/Python Script

Today in this blog we will talk about scheduling the data exports from Power BI. I have recently come across a requirement where I need to send regular data exports of a dataset from one of my Power BI report. It is quite easy to just export this dataset from Power BI Desktop file itself by copying the data table or by exporting it to a csv file. But this exporting task becomes quite cumbersome when it is required every now and then on a regular basis. So, I tried to come up with a way to create an automated scheduled process for this task. I am sure there will be other ways as well to perform this similar task but let us see how we can achieve this using a simple R/Python script. In this article we will also see the steps to install a On-premises Gateway (personal mode) for setting by the scheduled refresh for the dataset.

 

Before we start this process, let us make sure we have everything set-up at our end to use R and Python in Power BI:

  1. Make sure R is installed on the machine where Power BI is being used. If not refer following link for the same: https://docs.microsoft.com/en-us/power-bi/create-reports/desktop-r-visuals
  2. Make sure Python is installed on the machine where Power BI is being used. If not refer following link for the same: https://docs.microsoft.com/en-gb/power-bi/connect-data/desktop-python-scripts
  3. Once R and Python are installed on the machine, make sure Power BI is pointed to the right directory of R and Python by navigating to the following paths in Power BI Desktop:
    1. For R: File->Options and Settings->Options->R scripting (point to R path as highlighted below): rp1.png
    2. For Python: File->Options and Settings->Options->Python scripting (point to Python path as highlighted below): rp2.png

Once the above steps are done let us now start with an example where we use R script to set up our dataset export from Power BI. Consider the following sample dataset:

rp3.png

 

Our task is to create a scheduled export process for this dataset on weekly basis. Navigate to Transform Data section in Power BI as shown below:

rp4.png

 

The following window opens:

rp5.png

 

Now navigate to R-script option using Transform option as shown in below and a new window appears: (Marked steps 1 to 3)

rp6.png

 

The R script window that opens just write the following single line code in there and hit OK:

rp7.png

 

The code above is just taking your table which you want to export as a file as an input (referred to as ‘dataset’) and writes it to a path specified on your machine. ("C:/testExports/exported_data.csv")

 

Now as you hit OK, you end up with the following warning on your screen:

rp8.png

 

Just click ‘Continue’ and it will ask you for privacy level setting for your dataset. Just make the privacy level as ‘Public’ as shown below and hit SAVE:

rp9.png

 

Once this Privacy setting is saved, you end up with empty table as shown below:

rp10.png

 

The reason for this is R generally identifies the dataset as ‘dataframe’ and we have not included that in our R-script. Let us just modify our R-script to the one as shown below:

rp11.png

 

You will see that I have added a new step before writing the dataset as a table. In this step, we are reading our data table as a ‘dataframe’ which R understands. Once we hit ok, we end up with following:

rp12.png

We see that an “output” step is added under applied steps section. Now we see our dataset. Just apply query changes and save your Power BI file.

 

The next step is to publish this Power BI file to Power BI Service:

rp13.png

 

Once it is published in Power BI Service to the relevant workspace, we will set up a scheduled refresh for our dataset.

rp14.png

 

To set-up a scheduled refresh for our dataset we will need to set up a Power BI On-premises Gateway in personal mode as running R/Python scripts is supported by this type of gateway only.

Let us now see how we install Power BI personal on premises gateway. In Power BI Service there is an option to download the gateway as shown below:

rp15.png

This will navigate to a download link for gateways, just make sure to download the gateway in personal mode: https://powerbi.microsoft.com/en-us/gateway/

Download the installer for the gateway in personal model and execute it by double-clicking on the downloaded executable file.

rp16.png

Following window appears. Enter you email address and Sign in to register the gateway.

rp17.png

Once you finish the sign in process, the following window appears mentioning that the gateway is running and is ready to be used:

rp18.png

You can close the above window and go to Power BI Service under Settings->Datasets. Here you will see that you have a gateway running: (I have also switch on the use of this gateway as shown below)

rp19.png

 

If we expand the Data Source Credentials section for this dataset, we see following warnings regarding dataset credentials:

rp20.png

We will edit the credentials and the warning message disappears:

rp21.png

 

Now let us set up a Scheduled refresh for this dataset: (You can set what-ever frequency you want here – Daily, Weekly and also mention a particular time)

rp22.png

Once we finish all the above steps, Power BI automatically downloads the data to the specified location.

My data got downloaded at the following location on my machine:

rp23.png

 

Similar steps can be used to perform this task using Python script. Just that in Query Editor we will use Run Python script in place of Run R script and the following code:

rp24.png

The script in Python:

rp25.png

So, this is a straightforward way of using R/Python script in Power BI to automate our data exports. This approach has got few limitations along with few benefits as well. Let us see them to conclude this article.

Limitations:

  1. This approach should not be used if the data is highly sensitive. Running R-script/Python-script in Query editor in Power BI is only supported for the datasets that can have a privacy level set to ‘public’.
  2. Scheduled refresh can only be set-up using On-premises Gateway in personal mode, so that R/Python scripts can be used.

Benefits:

  1. This approach avoids any extra effort of exporting data manually from a report.
  2. Once the approach is set up in Power BI Service, it is automated and scheduled accordingly to export data to a destination folder on a regular basis.
  3. This approach does not require any in-depth knowledge of R/Python coding. It uses a simple code to write data to a file.

Hope this article helps everyone out there.

 

- Pragati

 

Comments

Very will written and explained. It is something new to me. But I do get such requirement where I just need to export the data to CSV/excel file. thanks for sharing it