As a data mashup, visualization and analytics tool, Power BI provides a lot of power and flexibility with regards to ingesting, transforming, visualizing and gaining insights from your data. However, in order to provide so much power and flexibiliy, Power BI integrates a number of different data languages including DAX, "M", SQL, MDX and R. Because of this, it can be confusing for new users to figure out when and where to use what language to most effectively accomplish their desired outcome. This article covers the various languages that the general business user (non-developer) will encounter when using Power BI.
If you are new to Power BI, the first language you will likely consciously encounter and use is Data Analysis Expressions (DAX). DAX is a library of functions and operators that can be combined to build formulas and expressions. DAX is very Excel formula like, meaning that it is the easiest for beginners to pickup and understand.
DAX is used afteryour data has been imported into your data model.
DAX can be used with Microsoft SQL Server Analysis Services (SSAS) Tabular models , Power Pivot in Excel, and Power BI Designer.
Use DAX to create Custom Columns and Measures within a data model.
The only real "tricky" thing about DAX is getting used to thinking about context. In other words, the DAX formula or expression that you create may give different results depending upon the context in which that formula or expression runs. This makes DAX very powerful but can add an element of confusion and forces a different way of thinking about solutions. For example, take a simple DAX measure like the following:
Total Sales = SUM([Sales])
Placing this measure into a column chart using Department as the x-axis will give different results versus placing the same measure in a line chart using Year as the x-axis.
1 - Being very similar to Excel formulas, DAX has a sense of familiarity and is thus the easiest language for beginners to start using.
Power Query Formula Language "M"
The Power Query Formula Language (informally known as "M") is a powerful mashup query language optimized for building queries that mashup data. It is a functional, case sensitive language similar to F#. M will likely be the first language that new users actually use although it is unlikely that they are aware of the fact that they are using it. The reason is that when users are importing data into their data model, which is generally the first step in using Power BI Designer, the queries are most likely using M in the background. However, the Query Editor provides a powerful graphical interface that allows users to perform complex data mashups without ever having to look at the M code that the Query Editor is building behind the scenes.
M is used during your data import.
M can be used with Power Query in Excel and in the Query Editor of Power BI Designer. To see the M code behind a query in Power BI Designer, use the Advanced Editor in the Query Editor. The code displayed in the Advanced Editor is most likely M code. The exception being if you create a query based upon R.
Use M to transform and cleanup your data as well as create custom columns during data import.
M includes a common set of function definitions available for use from an expression called the standard library. M functions can be evaluated in the Query Editor in Power Query, by using the formula bar or advanced editor.
3 - Unless you are already familiar with F#, and odds are you are not (see graphic of language popularity), you will likely find M a moderately challenging language to pickup and learn. M simply isn't "like" a lot of other languages, you have to think a little differently when using it rather than more traditional languages like Java, C, etc. I have found it most analogous in my background of programming to something like LISP, which also focuses heavily on sets/lists.
Regardless of flavor, T-SQL, PL-SQL, etc., the SQL language is a fairly old and standard special purpose programming language used to manage and manipulate data within relational databases.
SQL is used before your data import.
In general, SQL is pervasive throughout relational databases and can be used, with some dialect differences, with any relational database system. In Power BI, you can use SQL when using a data source connector for a relational database such as SQL Server, MySQL, PostGres, IBM DB2. From within the connector interface, choose "Advanced Options" and there will be an optional area into which you can paste SQL code.
Use SQL to pre-filter, join or transform your data before you start working with it. It will generally be more efficient to perform a complex join or selection of columns using SQL versus in M or within the data model.
In general, one will be using SQL Queries versus SQL Statements with Power BI. Queries retrieve data based on a specific criteria versus Statements, which generally have a persisent effect on the schema and data. The exception to this is when performing an EXEC function against a stored parameter. A fairly simple SQL query is something like the following:
SELECT column1,column2 FROM table1 WHERE column3='Some value';
This would return all the columns for "column1" and "column2" for all rows in the table "table1" where the value for those rows in "column3" equaled "Some value".
2 - SQL is fairly pervasive and has a long history so it is generally familiar to a lot of people.
Multi-Dimensional Expressions (MDX) is the standard language defined by Microsoft to query multidimensional cubes, such as those created with Microsoft SQL Server Analysis Services. MDX might at first appear to be similar to SQL, however, MDX is a completely different language. Whereas SQL was designed to query relational databases composed of tables where data is organized in terms of columns and rows, MDX was designed to query OLAP cubes where data is organized in terms of measures, dimensions, hierarchies, and levels.
MDX is used before your data import.
MDX is used when querying SQL Server Analysis Services (SSAS), Excel when connecting to multidimensional cubes and in Power BI Designer. In Power BI Designer, you can use MDX when using a data source connector for a multidimensional cube such SSAS. Similar to how to use SQL above, from within the connector interface, choose "Advanced Options" and there will be an optional area into which you can paste MDX code.
Use SQL to pre-filter and join your data before you start working with it. It will generally be more efficient to perform a complex join or selection of columns using MDX versus in M or within the data model.
3 - MDX is more difficult to wrap your head around than SQL and is somewhat of a unique language as far as languages go.
While the other languages covered here are more special purpose programming languages, R, by contrast, is more of a full fledged, traditional programming language. Historically, R's use was very limited to a select few data scientists, data analysts and statisticians for doing heavy statistical computing and data visualizations. However, recently, R has become much more popular and is integrated into many of Microsoft's data platform components including SQL Server 2016 and Power BI.
R can be used before, duringand afteryour import of data into your data model.
As a full fledged programming language, R has its own runtime and environment for coding. Therefore, you can use R to prep data before you ever consider importing it into your data model. R can also be used during data import as entire queries or steps within queries can be written in R. After importing the data into your model, R can be used in R visuals. To use R as the basis for an entire query, in Power BI choose Get Data | More | Other | R Script. In the Power BI Query Editor, on the Transform tab, "Run R Script" is the next to last icon on the right and will insert an R script into your query as a step. When on the Report tab within Power BI, one of the icons in the next to last row is the "R script visual". Perhaps unsurprisingly, its icon is a capital "R".
The R language has a huge number of modules that add a tremendous amount of capabilities to Power BI. R's library of visualizations adds hundreds of possible visuals to Power BI's core set of visuals. The uses for R are nearly endless.
4 - R gets a high ranking here based largely upon the facts that it is a full-fledged programming language as well as the vastness of the language itself when including all of the possible custom modules.
The original version of this article phophesized that support for Python would be coming to Power BI. Obviously this has come to pass.
While the other languages covered here are more special purpose programming languages, Python, by contrast, is more of a full fledged, traditional programming language. Python is a highly popular full-fledged, general purpose programming language used among data scientists and data wranglers. Python is consistently listed annually in the top 5 for most popular programming languages.
Python can be used before, duringand afteryour import of data into your data model.
As a full fledged programming language, Python has its own runtime and environment for coding. Therefore, you can use Python to prep data before you ever consider importing it into your data model. Python can also be used during data import as entire queries or steps within queries can be written in Python. After importing the data into your model, Python can be used in Python visuals. To use Python as the basis for an entire query, in Power BI choose Get Data | More | Other | Python Script. In the Power BI Query Editor, on the Transform tab, "Run Python Script" is the last icon on the right and will insert a Python script into your query as a step. When on the Report tab within Power BI, one of the icons in the next to last row is the "Python script visual". Perhaps unsurprisingly, its icon is "Py".
The Python language has a huge number of modules that add a tremendous amount of capabilities to Power BI. Python's library of visualizations adds hundreds of possible visuals to Power BI's core set of visuals. The uses for Python are nearly endless.
4 - Python gets a high ranking based largely upon the facts that it is a full-fledged programming language as well as the vastness of the language itself when including all of the possible custom modules.
This article would be remiss if not at least mention another highly popular full-fledged, general purpose programming language used among data scientists and data wranglers, Python. Consistently listed annually in the top 5 for most popular programming languages and given its popularity among data scientists and data wranglers (essentially as popular as R), it is an obvious omission from Power BI's data language capabilities. Currently, Power BI does not support or otherwise integrate with Python at all and unlike Microsoft's rapid adoption of R throughout its data platform products, currently Microsoft's data platform does not include any meaningful integration of Python. If I were a betting man, I would bet that this will change at some point simply based upon Microsoft's recent adoption of open source languages like R and the popularity of Python within the data space. So, my recommendation is that if you have not already, you might want to at least write a Python "Hello world" program sooner rather than later.