I’ve worked with many data analysts in my career, many times they have been my customers. They have been either the user of BI system I have developed for them, or they were looking for a quick and easy way of analyzing the existing data. For database professionals the best source of data is always from a database. However, in real world there are many data sources just sitting on an Excel file or series of CSV or text files in a shared folder. Data Analysts should be able to do a quick data analysis with this data. Power BI made things simple and easy.
Before going a step further, I have to mention that It would be always great to have a centralized data warehouse for all data that any reporting or analysis is sourcing from that. There is no doubt in it. But I have seen that in many cases the low budget defined for the data analysis solution or the short time span doesn’t allow such system to be created. So there has to be a middle way of quick and easy analysis for serving such requirements.
In this blog post I like to show you an example of getting data from a transactional SQL Server database. I want to work with Pubs database. Pubs is the database for book authoring and publications and their sales information. You can download Pubs database from here: http://www.microsoft.com/en-us/download/details.aspx?id=23654 . In future blog posts I’ll show you another example of working with CSV files, with an interesting data set.
Power BI can connect to heaps of data sources as well as SQL Server. When you use Get Data tab in Power BI Desktop you can simply choose SQL Server database as a source. This would usually ask for server and database. after entering valid information and also required credential then you will be directed to Navigator. In the Navigator you can select the database as you want (pubs in this example), and list of tables or views you want to be used for data analysis.
For this example, let’s get stores, sales and titles. We want to do an analysis later on for titles sold based on the store information. With choosing any of the tables you will see their preview of the columns and data values in the main pane.
After selecting tables, you can click on Load. The data will be loaded into the model, you can see some messages about number of data rows loaded. Let’s visualize the data. Yes, you can do it now! Without any hassles!
In the right hand pane expand titles, and select title. This will generate a list of titles in the main pane. Then from the Fields section under sales select Qty. this will show the quantity of each title sold in a tabular style view. In the Visualization Section click on the Clustered Bar Chart, and the table will be transformed into a nice bar chart. Easy, isn’t it?
Behind the Scene
Behind the scene for this nice bar chart is a set of queries and their relationships. Let’s look at them one by one. First Click on Edit Queries in the main menu. The Query Editor window will be open. Query Editor Window is the graphical user interface for Power Query element embedded in Power BI. You can do every data manipulation you want in this editor and prepare your data for later analysis.
In the query editor window as you see there are 4 main sections;
List of queries
Main menu for data transformations, adding columns and etc.
Query setting pane for name of the query, and list of applied steps on it
Main pane showing a view of data rows for the selected query on selected step
When we create a connection to a SQL Server database, Power BI is intelligence enough to understand the underlying relationship between data tables. At this moment we don’t want to make any changes in the Query Editor window here, so simply just click on Close & Apply button in the top menu in left hand side.
Now click on the relationship tab (in the left hand side down the latest icon), you will see that Power BI (as I mentioned earlier) fetched the relationship of tables automatically from the database. If you have transactional database tables they are usually related to each other with PK and FK and you don’t need to create relationship yourself. However if you want to do so or if you want to change the existing relationship you can do it in this window. Leave everything as it is at the moment.
Changes into Model
Let’s go through an example of a change into modelling. You can see that we have the Qty sold in sales query, but we don’t have the Sales Amount or the revenue of it. Revenue or the Sales amount is equal to quantity sold multiplied by the price of title (keep it simple for now, I know that there might be discounts and many other scenarios in between, but for simplicity we just go through a basic example of it). Price is field in titles query. So we need to add a Calculated Measure for Sales Amount then.
Go to Data tab (the second tab in let hand side of Power BI Desktop), then click on Sales query. And then add a column (the option is in the menu bar of this section). In the calculation pane change name of the column to Sales_Amount and write the expression as:
Sales_Amount = sales[qty]*RELATED(titles[price])
RELATED is a DAX function that brings a column from another related table to this table (the relationship is based on what defined in the relationship tab), you just need to mention which table and which field.
And now you have the sales amount column. Go back to report tab, and change the bar chart to show Sales_Amount instead of Qty.
Changes into Data and Query
Sometimes you get another part of the data later on, or you want to do some data transformation in the data set. Let’s say we want to add some more insight to this data set; we want to see Author information as well. And we have Authors table in the database, we have also another table called titleauthor, this is a many to many bridge table that saved information about authors who published titles. We might have more than one author for a title, or an author who wrote more than one title.
Go to Query Editor, and then add another set of data from the same SQL Server database, then choose two tables: authors, and titleauthor. Then after bringing these two dataset you would see five queries now in the query editor window. Click on Authors, you can see a view of data rows, there are columns for author first name and also last name, but it would be handy to have a column with their full name. So let’s create one.
Go to Add Column tab, and then click on Add Custom Column. Change the column name to “Author Full Name”, and write the expression as:
Note that this expression is written in M or Power Query Formula Language. This is different from DAX that you’ve experience earlier. DAX is data analysis expression language, which is better to be used for data modeling scenarios (such as adding year to date measure and so on), while M is data transformation language which is useful when you want to apply data transformations (such as joins, lookups, aggregate, concatenation, and many other data transformations). Always remember that you have three steps when you are working with Power BI;
Preparing the data: You should be using M and Query Editor to prepare data to be loaded into the model.
Modelling the data: You should be using DAX and Data and Relationship tabs to build the model that serve the report.
Visualizing the data: You should visualize data based on what model provided for you.
Now let’s apply changes and go to Power BI Desktop main window. Go to relationship tab first. You will see the two new tables added with their relationship. You probably won’t show titleauthor table into your report, because this is just for creating relationship, so you can hide it from report. Simply right click on it and hide it.
Go back to Report tab, and create a column chart with Author Full Name and the Sales amount. You can even do some formatting in the chart easily. I’ve created a stacked column chart with legend as type (from titles table). Then I set the legend to be down in the bottom and also center. I’ve also changed the title of chart and set a background color for it. And here is what I see then.
So feel free! And create whatever you want, there are many options in the formatting area of each chart and visualization element.
Long story short, I want to slice and dice this data by the publication year. In titles table there is a publication date. So I have to fetch year part of it. You can add a Year column with going to Query Editor, clicking on titles query, then selecting the pubdate column. Now click on Add column menu tab, and then add a date column of year type, and add year. This will automatically generate year of the selected date. Easy, in just few clicks!
Because Year is a numeric column, the default behavior of Power BI for numeric column is considering that as decimal value with decimal point and also summarize-able! Which is not what we want for Year. So go to Data Tab, and change data type of the Year column to Whole Number, and change Default Summarization to Do Not Summarize.
Now go to Report tab, and create Slicer (you will find it in visualization pane) with Year. You can also visualize the total sales based on publication year as well with a donut chart. Donut chart and pie chart won’t be great visualization when you have too many items with close values. But here we have only two years with very different sales amount. So here is what I got so far;
This can go on and on, I can now add a map to this report showing the sales amount by geo-location information of stores. In Stores table we have city, and address, and state. However none of these are enough by its own, because there might be multiple cities with the same name in the world! So I have to concatenate them all to build a full address. Here is how I build the full address column in Query Editor Window.
I’ve selected stores query, and then columns one by one (from address to state), and then added a column as merged column, I’ve set separator as comma, and named the column as Full Address. Now when I build a map with Full Address as the location and Sales Amount as the value I see exact locations.
I don’t want to go through all other steps that I’ve done. I’ve just added a Card visualization to show the grand total of sales in my database. And also changed my first chart (titles and sales amount) to a TreeMap with type (from titles) as Group, and Title itself as details. So at the end of the day I’ve build this report:
It was easy to build and I’ve built it on top of a normal database. I believe you agree with me. Now it’s your time to go through your own database and create visualization as you want, play with it, and let me know if you had any questions!
Oh, and almost I’ve forgot you can deploy this report into PowerBI.com website for free! You just need a Power BI account and then you are good to go.