As a data mashup, visualization and analytics tool, Power BI provides a lot of power and flexibility with regards to ingesting, transforming, visualizing and gaining insights from your data. However, in order to provide so much power and flexibiliy, Power BI integrates a number of different data languages including DAX, "M", SQL, MDX and R. Because of this, it can be confusing for new users to figure out when and where to use what language to most effectively accomplish their desired outcome. This article covers the various languages that the general business user (non-developer) will encounter when using Power BI.
If you are new to Power BI, the first language you will likely consciously encounter and use is Data Analysis Expressions (DAX). DAX is a library of functions and operators that can be combined to build formulas and expressions. DAX is very Excel formula like, meaning that it is the easiest for beginners to pickup and understand.
DAX is used after your data has been imported into your data model.
DAX can be used with Microsoft SQL Server Analysis Services (SSAS) Tabular models , Power Pivot in Excel, and Power BI Designer.
Use DAX to create Custom Columns and Measures within a data model.
The only real "tricky" thing about DAX is getting used to thinking about context. In other words, the DAX formula or expression that you create may give different results depending upon the context in which that formula or expression runs. This makes DAX very powerful but can add an element of confusion and forces a different way of thinking about solutions. For example, take a simple DAX measure like the following:
Total Sales = SUM([Sales])
Placing this measure into a column chart using Department as the x-axis will give different results versus placing the same measure in a line chart using Year as the x-axis.
To learn more, see the Data Analysis Expressions (DAX) Reference.
1 - Being very similar to Excel formulas, DAX has a sense of familiarity and is thus the easiest language for beginners to start using.
Power Query Formula Language "M"
The Power Query Formula Language (informally known as "M") is a powerful mashup query language optimized for building queries that mashup data. It is a functional, case sensitive language similar to F#. M will likely be the first language that new users actually use although it is unlikely that they are aware of the fact that they are using it. The reason is that when users are importing data into their data model, which is generally the first step in using Power BI Designer, the queries are most likely using M in the background. However, the Query Editor provides a powerful graphical interface that allows users to perform complex data mashups without ever having to look at the M code that the Query Editor is building behind the scenes.
M is used during your data import.
M can be used with Power Query in Excel and in the Query Editor of Power BI Designer. To see the M code behind a query in Power BI Designer, use the Advanced Editor in the Query Editor. The code displayed in the Advanced Editor is most likely M code. The exception being if you create a query based upon R.
Use M to transform and cleanup your data as well as create custom columns during data import.
M includes a common set of function definitions available for use from an expression called the standard library. M functions can be evaluated in the Query Editor in Power Query, by using the formula bar or advanced editor.
Here are some example M functions:
To learn more, see the Power Query Formula Language (informally known as "M").
3 - Unless you are already familiar with F#, and odds are you are not (see graphic of language popularity), you will likely find M a moderately challenging language to pickup and learn. M simply isn't "like" a lot of other languages, you have to think a little differently when using it rather than more traditional languages like Java, C, etc. I have found it most analogous in my background of programming to something like LISP, which also focuses heavily on sets/lists.
Regardless of flavor, T-SQL, PL-SQL, etc., the SQL language is a fairly old and standard special purpose programming language used to manage and manipulate data within relational databases.
SQL is used before your data import.
In general, SQL is pervasive throughout relational databases and can be used, with some dialect differences, with any relational database system. In Power BI, you can use SQL when using a data source connector for a relational database such as SQL Server, MySQL, PostGres, IBM DB2. From within the connector interface, choose "Advanced Options" and there will be an optional area into which you can paste SQL code.
Use SQL to pre-filter, join or transform your data before you start working with it. It will generally be more efficient to perform a complex join or selection of columns using SQL versus in M or within the data model.
In general, one will be using SQL Queries versus SQL Statements with Power BI. Queries retrieve data based on a specific criteria versus Statements, which generally have a persisent effect on the schema and data. The exception to this is when performing an EXEC function against a stored parameter. A fairly simple SQL query is something like the following:
SELECT column1,column2 FROM table1 WHERE column3='Some value';
This would return all the columns for "column1" and "column2" for all rows in the table "table1" where the value for those rows in "column3" equaled "Some value".
To learn more, see W3Schools SQL Tutorial.
2 - SQL is fairly pervasive and has a long history so it is generally familiar to a lot of people.
Multi-Dimensional Expressions (MDX) is the standard language defined by Microsoft to query multidimensional cubes, such as those created with Microsoft SQL Server Analysis Services. MDX might at first appear to be similar to SQL, however, MDX is a completely different language. Whereas SQL was designed to query relational databases composed of tables where data is organized in terms of columns and rows, MDX was designed to query OLAP cubes where data is organized in terms of measures, dimensions, hierarchies, and levels.
MDX is used before your data import.
MDX is used when querying SQL Server Analysis Services (SSAS), Excel when connecting to multidimensional cubes and in Power BI Designer. In Power BI Designer, you can use MDX when using a data source connector for a multidimensional cube such SSAS. Similar to how to use SQL above, from within the connector interface, choose "Advanced Options" and there will be an optional area into which you can paste MDX code.
Use SQL to pre-filter and join your data before you start working with it. It will generally be more efficient to perform a complex join or selection of columns using MDX versus in M or within the data model.
To learn more about MDX, see Microsoft's Multidimensional Data Access documentation
3 - MDX is more difficult to wrap your head around than SQL and is somewhat of a unique language as far as languages go.
While the other languages covered here are more special purpose programming languages, R, by contrast, is more of a full fledged, traditional programming language. Historically, R's use was very limited to a select few data scientists, data analysts and statisticians for doing heavy statistical computing and data visualizations. However, recently, R has become much more popular and is integrated into many of Microsoft's data platform components including SQL Server 2016 and Power BI.
R is unique within Power BI as it can be used before, during and after your import of data into your data model.
As a full fledged programming language, R has its own runtime and environment for coding. Therefore, you can use R to prep data before you ever consider importing it into your data model. R can also be used during data import as entire queries or steps within queries can be written in R. After importing the data into your model, R can be used in R visuals. To use R as the basis for an entire query, in Power BI Designer choose Get Data | More | Other | R Script. In the Power BI Query Editor, on the Transform tab, "Run R Script" is the last icon on the right and will insert an R script into your query as a step. When on the Report tab within Power BI Designer, one of the icons in the last row is the "R script visual". Perhaps unsurprisingly, its icon is a capital "R".
The R language has a huge number of modules that add a tremendous amount of capabilities to Power BI. There are some things, such as connecting Power BI to Azure Machine Learning, that can currently only be done in R. R's library of visualizations adds hundreds of possible visuals to Power BI's core set of 27 visuals. The uses for R are nearly endless.
To learn more, see An Introduction to R.
4 - R gets the highest ranking of the programming languages listed here based largely upon the facts that it is a full-fledged programming language as well as the vastness of the language itself when including all of the possible custom modules.
This article would be remiss if not at least mention another highly popular full-fledged, general purpose programming language used among data scientists and data wranglers, Python. Consistently listed annually in the top 5 for most popular programming languages and given its popularity among data scientists and data wranglers (essentially as popular as R), it is an obvious omission from Power BI's data language capabilities. Currently, Power BI does not support or otherwise integrate with Python at all and unlike Microsoft's rapid adoption of R throughout its data platform products, currently Microsoft's data platform does not include any meaningful integration of Python. If I were a betting man, I would bet that this will change at some point simply based upon Microsoft's recent adoption of open source languages like R and the popularity of Python within the data space. So, my recommendation is that if you have not already, you might want to at least write a Python "Hello world" program sooner rather than later.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.