Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Microsoft Power BI Community
- Forums
- Get Help with Power BI
- Desktop
- Simple Linear Regression with DAX

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Simple Linear Regression with DAX

11-02-2015
06:21 AM

OK, related to forecasting, I went ahead and followed my own advice and built out a forecasting model in DAX using simple linear regression. Probably can be improved but here is how I did it:

CSV Files:

regression.csv

X,Y

60,3.1

61,3.6

62,3.8

63,4

65,4.1

estimation.csv:

X

64

75

58

In regression table, create the following columns:

X*Y = [X]*[Y]

X*X = [X]*[X]

In regression table, create the following measures:

Count = COUNTAX(ALL('regression'),[X])

SumX = SUMX(ALL('regression'),[X])

SumY = SUMX(ALL('regression'),[Y])

SumX*Y = SUMX(ALL('regression'),[X*Y])

SumX*X = SUMX(ALL('regression'),[X*X])

Slope = ([Count]*[SumX*Y] - [SumX]*[SumY]) / ([Count]*[SumX*X] - [SumX]*[SumX])

Intercept = ([SumY] - [Slope]*[SumX]) / [Count]

In regression and estimation tables create the following column:

Estimate = [Intercept] + [Slope]*[X]

You can now plot your original values and the linear regression estimation values as well as plot your X values for estimation and the linear regression estimates.

Latest book!:

DAX is easy, CALCULATE makes DAX hard...

Solved! Go to Solution.

1 ACCEPTED SOLUTION

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

11-02-2015
07:20 PM

In theory, I believe that you could use any filtering mechanism you wanted, you would not have to use ALL. The issue is that if you don't use some sort of aggregation function like SUMX and try to just use SUM, the process will not work because when you place those measures into a column formula, they become filtered by the row they are in and it jacks everything up (technical term). So, you HAVE to use some sort of FILTER. I used ALL as an example since I theorized that would be a common requirement to do a linear regression using all available data.

It is a good thing to call out, I originally tried to just use COUNT and SUM in my measures and was getting a lot of "Not a number" in my Estimate column until I realized that I needed to use COUNTAX and SUMX instead so it is definitely worth noting. Thanks for calling it out.

I think that a logical next step would be to come up with a Simple Rolling Linear Regression Pattern, that would be pretty spiffy. Probably could do that using a combination on the rolling aggregations patterns on daxpatterns.com and this pattern here.

Latest book!:

DAX is easy, CALCULATE makes DAX hard...

9 REPLIES 9

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

02-03-2022
02:58 PM

Worked like a charm. I just needed a simple trend line with a start X,Y value pair, and an end X,Y value pair in x increments of 100 and this worked perfect. I've been looking for the solution to get out of Excel to compute Y values and this was the solution.

Thanks!

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

12-19-2016
08:52 AM

Thank You very much for the solution.

As say a partner ....

"I was following along with this and think I understand the basic calculation. Do you mind providing an example for how X and Y relate to a real world scenario. Am I correct to assume X may be **time (measure agregate months, etc)** and Y a sales $ amount **(measure agregate ... "Amount / personal", ..... )? "**

Any Solution???

Thank you in advance

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

12-22-2015
03:44 PM

Thx for the example!

I was able to use your example to do a simple lineair regression forecast using a calender table. For "X" I used the date numbers (E.g.: 20150101), which is an extra column 'DateNumber' in my date table that displays the date as a number. Because I need future dates in my calender table to plot the forecast against, I had to add an extra FILTER to exclude future years from the sample being used in the calculation.

Count:=COUNTAX(FILTER(ALL('Calender'); YEAR(Calender[DATE]) < YEAR(NOW()));'Calender'[DateNumber])

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

12-09-2015
05:56 AM

I was following along with this and think I understand the basic calculation. Do you mind providing an example for how X and Y relate to a real world scenario. Am I correct to assume X may be time and Y a sales $ amount?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

11-02-2015
12:58 PM

Cool, thanks for the step-by-step.

With using "ALL" in the measures, does that limit filtering based on a date range for the time intelligence functions? I.E. - projecting next quarter based on only last quarter? Or is that intentional based on wanting the max available time horizon for analysis?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

11-02-2015
07:20 PM

In theory, I believe that you could use any filtering mechanism you wanted, you would not have to use ALL. The issue is that if you don't use some sort of aggregation function like SUMX and try to just use SUM, the process will not work because when you place those measures into a column formula, they become filtered by the row they are in and it jacks everything up (technical term). So, you HAVE to use some sort of FILTER. I used ALL as an example since I theorized that would be a common requirement to do a linear regression using all available data.

It is a good thing to call out, I originally tried to just use COUNT and SUM in my measures and was getting a lot of "Not a number" in my Estimate column until I realized that I needed to use COUNTAX and SUMX instead so it is definitely worth noting. Thanks for calling it out.

I think that a logical next step would be to come up with a Simple Rolling Linear Regression Pattern, that would be pretty spiffy. Probably could do that using a combination on the rolling aggregations patterns on daxpatterns.com and this pattern here.

Latest book!:

DAX is easy, CALCULATE makes DAX hard...

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

09-19-2016
09:44 PM

That's very useful I think. I want to add a couple of things:

I plotted the sales v profit graph for my data, added the trendline and then, using the results of your DAX, put the intercept and slope values in a Card ... very easy to do I am sure you agree!

To take this a step further, we should also look at the **residuals** so I created a new column for that, which in my case is

Residuals = sales_profit[Sales]-sales_profit[Profit]

I can then easily plot the estimates or fitted values on a new graph on the X axis and the residuals on the Y axis

I haven't done this yet but the next step might be to add another column for **Studentised Residuals**: residual/standard deviation of residuals ... then use that for further analysis.

I am aware that we can do all of this in R in Power BI but I still think you did us all a favour!

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

11-03-2015
12:37 AM

Or we can wait for R scripts within power BI and then be able to do any kind of statistical analysis 🙂 but good job.

Featured Topics

Top Solution Authors

User | Count |
---|---|

263 | |

81 | |

71 | |

65 | |

60 |

Top Kudoed Authors

User | Count |
---|---|

330 | |

95 | |

81 | |

76 | |

71 |