Missed the introduction to this series? See Become your organization’s strategic advisor by using Machine Learning and Power BI
Reviewing individual business units and benchmarking them against one another is an important process to verify that investment levels are still appropriate. Working as a business controller I frequently facilitated management reviews. The goal then was to assess if teams had the appropriate amounts of resources. A business that was underperforming and that saw few growth opportunities might find their resources reallocated whilst a high growth area might have a business case for incremental investments. To normalize for variations in growth potential and risk various techniques were used. An internal finance function might develop this point of reference in the form of a forecast or “budget”. This number is allocated across all relevant business dimensions: customer segment, vertical, channel, product, geography and time. By then comparing actual results to the budget a single KPI, e.g. Variance to Budget could be used to focus the discussion. The complexity of the underlying assumptions however made the initial analysis time consuming, difficult to quickly understand and less informative of the root cause of the variance. A simple analysis that I found very effective both as a complement and as an input to a large normalization model was time series decomposition. It is easy to perform and explain, and it illustrates many of the opportunities of Machine Learning. We will also use it again later in this series as an input to more complex models.
The basic principles of time series decomposition are that your historic data, e.g. revenue by month, can be broken into a long-term trend, a recurring component and a remainder (anything that was not explained by the previous two). A popular method for this type of decomposition is STL.
Below is an example that I did using R visuals in Power BI. The data covers sales for five stores and six products over a four-year period. All charts in the page below shows the sum of revenue. The first bar chart by store and the second by product. In the top right is the STL analysis. The first line is the original data which is followed by the seasonal component (the amount of variation that comes with the month of the year), the third one is the trendline and the last one shows what the previous two vectors could not explain, i.e. the remainder. It is an additive decomposition, meaning the sum of these three components for a particular historic month will equal the original value (the first line equals the sum of the next three).
In the bottom left a single period’s seasonal values are isolated (the x-axis is the first letter of the month). The last chart finally shows revenue over time which is useful for highlighting or slicing a subset of the data.
These charts were all made with a categorical column and the sum of revenue. The time-series decomposition (top-right) was made with an R-visual using the below script.
Using the report
By clicking on and drilling into the different stores and products we can benchmark the businesses against one another and analyze their trends and seasonality. The trendline is a great way to compare performance between different stores and products with different seasonality. A store that has a positive trendline is doing better than one with a flat or negative trend, even if the latter is showing higher month over month growth. The reason for this is that the trend line takes the two stores’ different seasonality into account. Store 2 for instance has had a positive trend line (third line in the right most plot) since the middle of 2014:
Store 3 on the other hand saw flat followed by a declining trend in that same period:
Imaging that it is the beginning of September 2016 and you are reviewing these two stores. Store 3 would show higher month over month as well as quarter to date growth. By comparing the seasonality and trendline of the two you will see that store 3’s growth is actually just a difference in seasonality and it is still store 2 which outperforms. By clicking on the different stores in the bar chart in the top left you may also notice from the seasonality chart in the bottom left that store 3 is the only one with August being the second-best month and March, which has a large uplift for the other stores is not as impactful.
Average of all stores:
I’ve also found time series decomposition to be very valuable when deciding when to make ad hoc investments, e.g. running a marketing campaign – you probably don’t want it to start just before the end of the products in scope’s season – and to evaluate the efficacy of an investment. If we take the example of a marketing campaign that was run to boost performance of products 3 and 4 you would expect to see initially high positive residual values (not explained by the trend or seasonality) for those products as it kicks-off and depending on how long the effects are that may translate into a higher trend.
This type of break down is non-additive. What I mean with that is the sum of one vector, for instance the seasonality, for store 1 and 2 individually are not identical to the seasonality for these two stores in aggregate. It is therefore less useful to pre-calculate these types of values before bringing them into an analytics tool. With Power BI in combination with R we can interactively explore all tuples/permutations of the business and the R script will automatically re-calculate the relevant values.
How-to recreate the report
R visuals will automatically work in Power BI when the report is published to the service but to get them to work in Power BI Desktop you need to have R installed on your computer. Please see my previous post for details. You may also want an R dedicated script editor, e.g. RStudio (the Windows installation file is the first link under “Installers for Supported Platforms”).
Once you have R installed you can use R visuals just as you would use any other native visual in Power BI with the addition of an R script that you paste in the editor. My example file can be found attached to this post below.
To add an R visual to your report:
1. Add a visual to the canvas that lets you validate that you have the right columns and aggregations, e.g. “Table”. In this case I need to aggregate Revenue to have one and exactly one value per Date. I therefore verify that the aggregation method is “Sum”, and that there are no missing dates nor NULL values.
2. Convert the visual to an R visual by clicking on the R visualization icon
3. Add the script using the R script* editor (visible when you select the R visual directly in the canvas):
## Convert the column "Revenue" in the Power BI input "dataset" into a time series object with the right periodicity (12 for months per year). Here we manually define that the first year is 2013 but we will later calculate that on the fly to not have to update the script each year myts <- ts(dataset[,'Revenue'], start = 2013, frequency= 12) ## Decompose the timeseries object using the Loess function stl <- stl(myts, s.window="periodic", robust = TRUE) ## Plot the result plot(stl, col="blue", main = "Time series decomposition using Loess")
The second R visual can be created by repeating the above steps (or copy/pasting the previous visual) and changing the last row of the R script so that the full script looks like:
## Convert the column "Revenue" in the Power BI input "dataset" into a time series object with the right periodicity (12 for months per year). Here we manually define that the first year is 2013 but we will later calculate that on the fly to not have to update the script each year myts <- ts(dataset[,'Revenue'], start = 2013, frequency= 12) ## Decompose the timeseries object using the Loess function stl <- stl(myts, s.window="periodic", robust = TRUE) ## Plot the result monthplot(stl, choice = "seasonal", cex.axis = 0.8, main = "Seasonality (last period)")
* Third-party programs. This software enables you to obtain software applications from other sources.
Those applications are offered and distributed by third parties under their own license terms.
Microsoft is not developing, distributing or licensing those applications to you, but instead,
as a convenience, enables you to use this software to obtain those applications directly from
the application providers.
By using the software, you acknowledge and agree that you are obtaining the applications directly
from the third-party providers and under separate license terms, and that it is your responsibility to locate,
understand and comply with those license terms.
Microsoft grants you no license rights for third-party software or applications that is obtained using this software.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.