Today in this blog we will talk about scheduling the data exports from Power BI. I have recently come across a requirement where I need to send regular data exports of a dataset from one of my Power BI report. It is quite easy to just export this dataset from Power BI Desktop file itself by copying the data table or by exporting it to a csv file. But this exporting task becomes quite cumbersome when it is required every now and then on a regular basis. So, I tried to come up with a way to create an automated scheduled process for this task. I am sure there will be other ways as well to perform this similar task but let us see how we can achieve this using a simple R/Python script. In this article we will also see the steps to install a On-premises Gateway (personal mode) for setting by the scheduled refresh for the dataset.
Before we start this process, let us make sure we have everything set-up at our end to use R and Python in Power BI:
Once R and Python are installed on the machine, make sure Power BI is pointed to the right directory of R and Python by navigating to the following paths in Power BI Desktop:
For R: File->Options and Settings->Options->R scripting (point to R path as highlighted below):
For Python: File->Options and Settings->Options->Python scripting (point to Python path as highlighted below):
Once the above steps are done let us now start with an example where we use R script to set up our dataset export from Power BI. Consider the following sample dataset:
Our task is to create a scheduled export process for this dataset on weekly basis. Navigate to Transform Data section in Power BI as shown below:
The following window opens:
Now navigate to R-script option using Transform option as shown in below and a new window appears: (Marked steps 1 to 3)
The R script window that opens just write the following single line code in there and hit OK:
The code above is just taking your table which you want to export as a file as an input (referred to as ‘dataset’) and writes it to a path specified on your machine. ("C:/testExports/exported_data.csv")
Now as you hit OK, you end up with the following warning on your screen:
Just click ‘Continue’ and it will ask you for privacy level setting for your dataset. Just make the privacy level as ‘Public’ as shown below and hit SAVE:
Once this Privacy setting is saved, you end up with empty table as shown below:
The reason for this is R generally identifies the dataset as ‘dataframe’ and we have not included that in our R-script. Let us just modify our R-script to the one as shown below:
You will see that I have added a new step before writing the dataset as a table. In this step, we are reading our data table as a ‘dataframe’ which R understands. Once we hit ok, we end up with following:
We see that an “output” step is added under applied steps section. Now we see our dataset. Just apply query changes and save your Power BI file.
The next step is to publish this Power BI file to Power BI Service:
Once it is published in Power BI Service to the relevant workspace, we will set up a scheduled refresh for our dataset.
To set-up a scheduled refresh for our dataset we will need to set up a Power BI On-premises Gateway in personal mode as running R/Python scripts is supported by this type of gateway only.
Let us now see how we install Power BI personal on premises gateway. In Power BI Service there is an option to download the gateway as shown below:
Download the installer for the gateway in personal model and execute it by double-clicking on the downloaded executable file.
Following window appears. Enter you email address and Sign in to register the gateway.
Once you finish the sign in process, the following window appears mentioning that the gateway is running and is ready to be used:
You can close the above window and go to Power BI Service under Settings->Datasets. Here you will see that you have a gateway running: (I have also switch on the use of this gateway as shown below)
If we expand the Data Source Credentials section for this dataset, we see following warnings regarding dataset credentials:
We will edit the credentials and the warning message disappears:
Now let us set up a Scheduled refresh for this dataset: (You can set what-ever frequency you want here – Daily, Weekly and also mention a particular time)
Once we finish all the above steps, Power BI automatically downloads the data to the specified location.
My data got downloaded at the following location on my machine:
Similar steps can be used to perform this task using Python script. Just that in Query Editor we will use Run Python script in place of Run R script and the following code:
The script in Python:
So, this is a straightforward way of using R/Python script in Power BI to automate our data exports. This approach has got few limitations along with few benefits as well. Let us see them to conclude this article.
This approach should not be used if the data is highly sensitive. Running R-script/Python-script in Query editor in Power BI is only supported for the datasets that can have a privacy level set to ‘public’.
Scheduled refresh can only be set-up using On-premises Gateway in personal mode, so that R/Python scripts can be used.
This approach avoids any extra effort of exporting data manually from a report.
Once the approach is set up in Power BI Service, it is automated and scheduled accordingly to export data to a destination folder on a regular basis.
This approach does not require any in-depth knowledge of R/Python coding. It uses a simple code to write data to a file.