I've been building regression models for quite a while now, and I am comfortable with the idea that $$y = \beta_0 +\beta_1x^2 \text{ is linear and } y =\beta_0 +\beta_1^2x \text{ is not},$$ but I've never really understood why that is. In my mind, we're assuming $\beta_1$ is a fixed number, so why can't we just say $\beta_{new} = \beta_1^2$ and go from there? Additionally, when would it come up that I have a quadratic regression coefficient? Overall, I'm really curious about why this matters at all.
In your example, it seems $\beta_{new}$ is being forced to be non-negative, so a regression method which could produce negative estimates might not work. – Henry Jan 04 '24 at 15:37
3 Answers
In general, linearity of a function is defined (see Wikipedia) by the function satisfying two conditions:
- Additivity: $f(a+b)=f(a)+f(b)$
- Homogeneity of degree 1: $f(ac) = cf(a) \quad \forall c$.
When speaking about a linear regression model, linearity means linearity in the parameters ($\beta$).
So it can be seen that the function $\beta_0 + \beta_1 x^2$ satisfies the definition of linearity with respect to the parameter $\beta_1$, while the function $\beta_0 + \beta_1^{2} x$ does not. In both cases, we don't care about linearity with respect to $x$ or $x^2$, as these are simply constants to us.
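To make this concrete, here is a minimal sketch in Python (assuming NumPy; the data-generating values 1.0 and 3.0 are invented for illustration) of why $y = \beta_0 + \beta_1 x^2$ is still an ordinary least squares problem: the nonlinearity in $x$ is absorbed into the design matrix, and the usual closed-form solution goes through unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=100)
y = 1.0 + 3.0 * x**2 + rng.normal(scale=0.5, size=100)

# Design matrix with columns [1, x^2]: the model is linear in
# (beta_0, beta_1), so x^2 is treated as just another regressor.
X = np.column_stack([np.ones_like(x), x**2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # close to [1.0, 3.0]
```

There is no analogous design matrix for $\beta_0 + \beta_1^2 x$: no column of transformed data can make that model linear in $\beta_1$.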
$\beta_1$ isn't fixed until we estimate it. Once we estimate it, we could, of course, square the estimate, but plugging that square back in changes the fitted equation rather than re-expressing it. And if we instead reparameterize with $\beta_{new} = \beta_1^2$, we have silently constrained the coefficient to be non-negative.
I can't think, off the top of my head, why you would want this (and some quick Googling was not effective), but it is sometimes the case that you want the parameter as an exponent, $y = \beta_0 + x_1^{\beta_1}$; I've seen that. Here is an example from the Penn State class Stat 462.
Wikipedia gives an example where $f(x, \beta) = \frac{\beta_1 x}{\beta_2 + x}$, which is apparently used in enzyme kinetics (I don't even know what "enzyme kinetics" is).
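For what it's worth, here is a hedged sketch of fitting that model with `scipy.optimize.curve_fit` (the parameter values 2.0 and 0.5 are invented for illustration). Unlike OLS, there is no closed-form solution: the fit is iterative and needs a starting guess.

```python
import numpy as np
from scipy.optimize import curve_fit

def michaelis_menten(x, beta1, beta2):
    # f(x, beta) = beta1 * x / (beta2 + x): nonlinear in beta2
    return beta1 * x / (beta2 + x)

rng = np.random.default_rng(1)
x = np.linspace(0.1, 5, 50)
y = michaelis_menten(x, 2.0, 0.5) + rng.normal(scale=0.05, size=50)

# Iterative nonlinear least squares; p0 is the starting guess that
# a closed-form OLS solution would not need.
params, _ = curve_fit(michaelis_menten, x, y, p0=[1.0, 1.0])
print(params)  # roughly [2.0, 0.5]
```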
Linear neural networks are an example where the model generates polynomials in the coefficients: https://arxiv.org/abs/1312.6120. In that paper they explain that, though representationally equivalent to a linear regression, the optimisation is non-convex and shows some of the same difficulties in optimisation (i.e., finding a solution) as nonlinear neural networks.
Note that linear neural networks are proposed in the paper as a way of studying the optimisation behaviour of (nonlinear) neural networks; they have the same representational power as linear regression.
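As an illustration (a toy sketch of my own, not code from the cited paper), a two-layer linear network $y = w_2(w_1 x)$ represents exactly $y = \beta x$ with $\beta = w_2 w_1$, yet the squared-error loss is a polynomial in $(w_1, w_2)$ and non-convex, with a saddle at $w_1 = w_2 = 0$:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, size=100)
y = 1.5 * x + rng.normal(scale=0.1, size=100)  # true effective slope 1.5

w1, w2 = 0.1, 0.1  # start near the saddle at the origin
lr = 0.5
for _ in range(500):
    resid = w2 * w1 * x - y
    g = np.mean(2 * resid * x)                   # gradient w.r.t. the product w2*w1
    w1, w2 = w1 - lr * g * w2, w2 - lr * g * w1  # chain rule per factor
print(w1 * w2)  # close to 1.5; the individual factors are not identified
```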
So to answer your question: you cannot find the solution with the standard least-squares linear algebra when the model is a polynomial in its coefficients.
Note that if your coefficient is $\beta_1^2$ then it is forced to be non-negative, so in your example a solution may not even exist.
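To see this last point, here is a toy sketch (all values invented) of fitting $y = \beta_0 + \beta_1^2 x$ by gradient descent when the true slope is negative; the effective slope $\beta_1^2$ cannot go below zero, so the fit gets stuck there:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=200)
y = 0.5 - 2.0 * x + rng.normal(scale=0.1, size=200)  # true slope is -2

b0, b1 = 0.0, 1.0
lr = 0.1
for _ in range(2000):
    resid = b0 + b1**2 * x - y
    b0 -= lr * np.mean(2 * resid)               # d(MSE)/d(b0)
    b1 -= lr * np.mean(2 * resid * 2 * b1 * x)  # chain rule: extra 2*b1
print(b0, b1**2)  # b1^2 is driven to ~0, the closest it can get to -2
```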