Starting with R-Scripting - Power BI

by gastoncruz Frequent Visitor on ‎02-28-2017 12:25 PM

R Scripts and Power BI

 

 

What is R-script?

 

R is a language and environment used for statistical and graphic computing. It is a modification of the S language, so there is code written in S, which is capable of being implemented in R without modifications.

R provides a wide variety of statistical models (linear and nonlinear models, classical statistical tests, time series analysis, classification, grouping, etc.) and graphic techniques, which makes it a very extensible and manipulative tool on the part of user.

One of R’s strengths is the ease with which publication graphics can be produced with good quality and design, including mathematical symbols and formulas where necessary.

R is available as Free Software under the terms of the GNU General Public License of the Free Software Foundation in the form of source code. It compiles and runs on a wide variety of UNIX platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.

For more information about R and its capabilities refer to the official R-project page .

Enable R in Power BI

As a first step to have R in Power BI available, you will be performing your installation. As it is a free software, it can be obtained from multiple sources and repositories; Following the recommendations of Microsoft, it is recommended to obtain the executables from the Revolution Open page or the CRAN repository .

Installation from Revolution Open:

Download the latest stable version available:

Image 1.- Download R from Revolution Open.

Once the msi is downloaded, execute it and follow the following steps (marked with the cursor and the steps):

Image 2.- Installer of R.

Once the installer is started, accept the terms and conditions and continue the installation

Image 3.- License information for R.

This step is optional, although Intel MKL download is recommended , because it improves performance of calculations and measurements. If you want to download the component, leave the Install Math kernel Library (Intel ® MKL) option checked, otherwise uncheck the option and continue with the installation.

Image 3.- Window to select the installation of MKL.

If you chose to download MKL , accept the terms and conditions and proceed with the installation.

Image 4.- License Agreement for MKL.

Specify the destination folder of the installation (it is recommended to leave the one that comes by default, so that Power BI can automatically detect it)

Image 5.- Installation path of R.

After completing the previous steps, all the necessary options for the installation are configured.

Image 6.- Start of the installation process of R.

Once the installation is complete, close the installation dialog.

Image 7.- Completion of the installation of R

Referencing R from Power BI

Upon completion of the installation, Power BI will automatically identify the installed installation (as long as the destination folder has been left pre-defined). To verify the reference or change the destination folder, perform the following steps: 

·        Log in to Power BI Desktop.

Image 8.- Access to Power BI Desktop.

· Once the application go to File (File) -> Options and Settings -> Options .      

Image 9.- Access to the options of Power BI.

· Within the options go to the R script tab . In the tab we can see the destination folder of the R-script installation.      

• Within the options go to the R script tab. In the tab we can see the destination folder of the installation of Rs

Import modules

The default installation of R will include certain modules, but it will be imperative to get extra functionality. These imports can be made directly from the R query that is running through the use of the command:

Install.packages (‘[NAME MODULE]’)

All available R modules can be found in Available CRAN Packages By Name

Permission Issues

It is common to encounter permission problems when trying to install R modules directly from Power BI. Thesepermission errors are due to the fact that the library folder (folder in which imported module references are saved) is not enabled for writing by default. To resolve the incident, we must modify the security of the folder and enable the writing of the same.

Modules and sentence require

R queries often include require statements , which specify that to execute the contiguous portion of code it is necessary to use one or more specific modules.

In the following example, the require statement is used to import the timeDate module (module that enables date handling features).

Image 11.- Example of use of require to import a module.

If the required module does not exist or has not been imported, Power BI will display the following error in executing the script.

Image 12.- Error in Power BI if the module is not found.

In order to solve the incident it is necessary to add the installation statement of the module prior to the requestrequire statement , as shown below.

Image 13.- Solution to the error that occurs when the module is not found.

Using R-script

The R language can be used in Power BI for various functionalities, either to model data sources, create visualizations or simply create a zero data source from R commands. The possibility of using this language is what gives it a flexibility Extra to Power BI and is what most captivates programmers when it comes to seeing Power BI as something more than a reporting tool. 

R-script as input source

Within the data sources options is the option to use an R-script query. It is integrated by default into PowerBI Desktop and can be accessed from the Get Data option -> More

Image 14.- Access to the option to use an R query as data source of Power BI.

In the available sources we will have the option R script.

Image 15.- Data source R in Power BI.

Once the option is chosen we will have a box where you can specify the query R.

Image 16.- Window to specify the query R.

R script in information modeling

Power BI offers the possibility to model information from any source through the use of R. This option is available in the Power BI Queries section under the Transform tab . Starting from a report with a certain stipulated origin:

·        We must access Edit Querie:

Image 17.- Access to the Edit Query option.

· Under the Transform tab , in the Scripts section , we will find the option R      

Image 18.- Run R Script option on the Transform tab.

R script to create visuals

Without a doubt one of the most used options, due to the characteristics of the language, is to use it to create personalized visualizations within Power BI. The use of R within Power BI for visuals is very simple, you just have to choose the R visualization type and specify the data source, so that an embedded R IDE is enabled in the Power BI Desktop page.

Image 19.- Use of R for visual representations in Power BI.

Practical R-script applications in Power BI

Many examples can be found in the Power BI forum . We will focus on two personal experiences we have had, the first one when modeling text file entries with random formats over time and the use of R to export large volumes of data from a visual to a csv (Power BI only allows the export of up to 1500 entries)

Modeling a TXT

In this case we will reflect how to obtain information from a text file that generates an ERP system, which does not follow a pre-modeling, but rather the structure of the file depends on the amount of information exported. In order to take the entry to a specific format and thus create the necessary structures to create the requested reports, it was necessary to use scripts in R.

·        The first thing was to import the text file as data source and proceeded to edit the query through the insertion of an R-script.

Image 20.- Import of the text file with R-Script.

·        Power BI offers the input table as a dataset, which can be accessed via the dataset invocation . Through this the first thing to do is to pass the dataset to a frame which makes it possible to modify the columns and cells of the table as an array. This can be done through the use of the sentence:

Df1 <- data.frame (dataset)

In this step, we have converted the table into a frame which is stored in the variable df1. The next step will be to model the input information through the use of the gsub formula .

The gsub formula allows through the use of regex, replace sub strings of a string. The formula applies individually to each of the cells in the specified column as input. The format of the formula is as follows:

Gsub (“[REGEX PATTERN]”, “[REPLACEMENT STRING / PATTERN]”, [INPUT COLUMN])

Once the entries of the columns VALUE (accessible through df1 $ VALUE), QOH (accessible through df1 $ QOH) andPART (accessible through df1 $ PART) are modeled, we specify the return value df1 , so That the result table is generated with the inputs modeled and ready to be used in the model.

Exporting large records to csv

Power BI offers us the possibility of exporting to a csv the inputs used in a visual (the export will contain the filters and transformations made) in order to manipulate the entries outside the scope of the program.

To export the entries of a visual we must position ourselves on the top right corner and select the option Export Data

Image 21.- Export of the data of a visual.

This functionality has a limitation of records to export, if it exceeds the allowed threshold, we will be presented with the following message indicating that the information will be exported with errors.

Image 22.- Error shown when the limit of data to be exported is exceeded.

In these cases we can use an R script to export up to 150000 records to a csv. To do this we must choose the R visualization and use the following script.

Write.table (dataset, file = “[PATH OF FILE]”, sep = “[SEPARATOR]”, row.names = FALSE)
plot (dataset); conclusion
 

This is just a starting point to demonstrate the flexibility offered by Power BI through this language, we hope that this will motivate you to continue researching and inventing new functionalities with the R language and its application in Power BI.