For the following regression model,

$$ y=\hat{\beta}_0+\hat{\beta}_1\cdot|m(a)-m(b)|+\sum_{k=0}^{12}\hat{\beta}_{k} 1[|m(a)-m(b)|=k] $$

the fitted plot is not linear.

Why is the blue line (which is fitted using a regression model that assumes a linear trend) not a straight line?

Hermi

1 Answer

"Linear" regression estimates coefficients in order to explain the response of a variable $Y$ to changes in variables $X_k$ using a linear equation of the form $Y=X\beta$. For example:

$$y_i = \beta_0 + \beta_1 x_{1, i} + \beta_2 x_{2, i} + \epsilon_i$$

"Linear" refers to the fact that $\mathbb{E}(y_i)$ is defined as a linear combination of the parameters $\beta$. It does not have to be linear in the underlying variables, because the columns of $X$ can be modified depending on your objectives and your interpretation of the data. Indeed, if some variables $X_k$ are "transformed" variables derived from your actual variable of interest, then the response will not be linear in that variable, but you are still using a linear regression model. For example:

$$\begin{split} y_i &= \beta_0 + \beta_1 x_{1, i} + \beta_2 x_{2, i} + \epsilon_i\\ &= \beta_0 + \beta_1 t_{i}^2 + \beta_2 \sqrt{t_{i}} + \epsilon_i \end{split}$$

With the equation above, $\mathbb{E}(y_i)$ is defined as a linear combination of parameters $\beta$ and of transformed variables $X_k$. But it is not a linear combination of the actual variable of interest, time $t$, which has a non-linear impact on your dependent variable $Y$.
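To make this concrete, here is a minimal sketch in Python with NumPy. The data are simulated, and the values in `beta_true`, the noise level and the range of `t` are assumptions chosen only for illustration. The model is fitted by ordinary least squares on the transformed columns $t^2$ and $\sqrt{t}$, and the resulting fitted curve is then compared with the best straight line in $t$.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only

# Hypothetical data: t is the variable of interest.
t = np.linspace(0.0, 4.0, 50)
beta_true = np.array([1.0, 0.5, 2.0])  # assumed beta_0, beta_1, beta_2

# Design matrix: intercept, x1 = t^2, x2 = sqrt(t).
X = np.column_stack([np.ones_like(t), t**2, np.sqrt(t)])
y = X @ beta_true + rng.normal(scale=0.1, size=t.size)

# Ordinary least squares: the model is linear in beta.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_fit = X @ beta_hat

# Compare the fitted curve with the best straight line in t.
line = np.polyfit(t, y_fit, deg=1)
max_gap = np.max(np.abs(y_fit - np.polyval(line, t)))
print("estimated coefficients:", beta_hat)
print("largest gap between fitted curve and a straight line:", max_gap)
```

The fit is a linear regression in every sense that matters for estimation, yet plotting `y_fit` against `t` gives a curve, because the columns of `X` are non-linear functions of `t`.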

This is what is happening in your case, where your variables $X_k$ are non-linear functions of time $t$, since you introduced absolute values of time differences and binary variables depending on the time of each observation.
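A rough sketch of a model shaped like yours, under stated assumptions: the variable `d` stands in for $|m(a)-m(b)|$ and is assumed to take every integer value from 0 to 12, and `y` is simulated, so none of the numbers below come from your data. With one indicator per value of `d`, the fitted value at each `d` is simply the mean of `y` at that value, so the fitted points have no reason to lie on a straight line.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, for reproducibility only

# Hypothetical stand-in for |m(a) - m(b)|: integers 0..12, five observations each.
d = np.repeat(np.arange(13), 5)
y = np.sin(d / 2.0) + rng.normal(scale=0.2, size=d.size)  # simulated response

# Design matrix: intercept, linear term in d, and one indicator per value k.
indicators = (d[:, None] == np.arange(13)[None, :]).astype(float)
X = np.column_stack([np.ones(d.size), d.astype(float), indicators])

# X is rank deficient (the intercept and d are combinations of the indicators),
# so lstsq returns the minimum-norm solution; the fitted values are still unique.
beta_hat, _, rank, _ = np.linalg.lstsq(X, y, rcond=1e-10)
y_fit = X @ beta_hat

# Mean of y within each value of d, for comparison with the fitted values.
group_means = np.array([y[d == k].mean() for k in range(13)])
print("rank of design matrix:", rank, "out of", X.shape[1], "columns")
print("fitted values equal group means:", np.allclose(y_fit, group_means[d]))
```

In this sketch the intercept and the linear term are redundant given the full set of indicators, which is why the individual coefficients are not uniquely determined even though the fitted values are.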

FP0
  • Thanks! The Wikipedia article at https://en.wikipedia.org/wiki/Linear_regression says "Linearity. This means that the mean of the response variable is a linear combination of the parameters (regression coefficients) and the predictor variables." What do you mean when you say that $E(y_i)$ is a linear combination of $\beta$ and $X$? – Hermi Jul 23 '22 at 14:36
  • You're welcome!

    Indeed, but in my second equation, the predictors of the linear regression are the variables $X_k$. So these 2 sentences have the same meaning.

    – FP0 Jul 23 '22 at 15:31