cancel
Showing results for 
Search instead for 
Did you mean: 

Simple Linear Regression

Member
12591 Views
Daniil Member
Member

Re: Simple Linear Regression

@tonymaclaren, it is likely that at the grand total level what you see is not a total, but the Intercept. This is because the following expression evaluates to Intercept:

Intercept + Slope * SELECTEDVALUE ( 'cetdata'[YEAR] )

At the grant total level, there is usually more than one 'cetdata'[YEAR], hence SELECTEDVALUE ( 'cetdata'[YEAR] ) returns BLANK, turning the multiplication into BLANK as well, leaving only Intercept.

 

I updated my blog post to deal with this kind of situations.

 

@rdodworth, glad you found this useful!

tonymaclaren Regular Visitor
Regular Visitor

Re: Simple Linear Regression

Dar Daniil

 

Thanks for the explanation.The same value also appears at the top of the [Estimated] measure, but only if the whole of the data for ('Ctedata'[YEAR])is displayed. is there a modification to eliminate this?

 

Tony

Daniil Member
Member

Re: Simple Linear Regression

Tony, it's a bit difficult to advise anything without seeing your report -- can you share a sample of your data?

tonymaclaren Regular Visitor
Regular Visitor

Re: Simple Linear Regression

Hi Daniil

 

Sure, no problem.What would you like me to supply and how.The full dataset is for 391 years.

 

Tony

Daniil Member
Member

Re: Simple Linear Regression

Wow, 391 years of data sounds exciting :-) If you could reduce your data to the minimum amount with which you could reproduce your formula problems, that would be great.

tonymaclaren Regular Visitor
Regular Visitor

Re: Simple Linear Regression

Thanks Daniil

I have attached a subset of the file that exhibits the exact same problem using the following measure:

Estimated tony Data =
VAR Known =
    FILTER (
        SELECTCOLUMNS (
            ALLSELECTED ( 'Tony_data'[YEAR] ),
            "Known[X]", 'Tony_data'[YEAR],
            "Known[Y]", [Averagetemp]
        ),
        AND (
            NOT ( ISBLANK ( Known[X] ) ),
            NOT ( ISBLANK ( Known[Y] ) )
        )
    )
VAR Count_Items =
    COUNTROWS ( Known )
VAR Sum_X =
    SUMX ( Known, Known[X] )
VAR Sum_X2 =
    SUMX ( Known, Known[X] ^ 2 )
VAR Sum_Y =
    SUMX ( Known, Known[Y] )
VAR Sum_XY =
    SUMX ( Known, Known[X] * Known[Y] )
VAR Average_X =
    AVERAGEX ( Known, Known[X] )
VAR Average_Y =
    AVERAGEX ( Known, Known[Y] )
VAR Slope =
    DIVIDE (
        Count_Items * Sum_XY - Sum_X * Sum_Y,
        Count_Items * Sum_X2 - Sum_X ^ 2
    )
VAR Intercept =
    Average_Y - Slope * Average_X
RETURN
    Intercept + Slope * SELECTEDVALUE ( 'Tony_data'[YEAR])

 

 

Averagetemp = SUM(Tony_data[AverageTemperature])

 

Tony

Daniil Member
Member

Re: Simple Linear Regression

Tony, I can't reproduce the problem, unfortunately. Can you please attach a pbix file and screenshots with the problem highlighted?

 

Also, have you followed my updated blog post? I don't use SELECTEDVALUE anymore: https://xxlbi.com/blog/simple-linear-regression-in-dax/

tonymaclaren Regular Visitor
Regular Visitor

Re: Simple Linear Regression

Hi Daniil

 

Thanks again for all your help. I have attached the whole file because it might interest you anyway. I sholuld stress it is a work in progress and my first use of Power BI. Please keep it to yourself.

 

The page labeled Tony Data with the table in it shows the problem clearly--although it does not reproduce in the chart on that page. It is not really a problem, but it would be nice for us to understand the behaviour.Play around with the date slider and you will notice that if you set the start date to 1960 only you see -32.72 at the top and bottom of the [Estimated Tony Data] column. Other start dates do not exhibit this problem.

 

I will take a look at the blog.

 

Tony

leanh Visitor
Visitor

Re: Simple Linear Regression

Aye Tony, hope my suggestion could work for you with such problem:

 

You can try replacing this part:

 

      RETURN
          Intercept + Slope * SELECTEDVALUE ( 'Tony_data'[YEAR])

 

with this one:

 

      RETURN
          SUMX( Tony_data ,
                      Intercept + Slope * SELECTEDVALUE ( 'Tony_data'[YEAR])
                    )

 

Btw, this one is the very first reply of mine in this place, so HELLO WORLD then :'>

jwjwjwjwjw Regular Visitor
Regular Visitor

Re: Simple Linear Regression

Thanks for the measure Daniil, it is extremely helpful.

 

I am attempting to apply it to a forecasting problem that is relatively simple in excel, but is causing some trouble for me with Dax/PowerBI. I have 19 different categories that I need to run the regression on, and then I need to sum those categories for use on further regressions. Could your measure be modified to accomplish this or do I need to create 19 different measures? 

 

Thanks for any help you could provide.