How can I derive a formula and the regression coefficients for a regression model of the form $y(x)= A + B\, x + C\, \cos (2 \pi x) + D\, \sin (2 \pi x)$? I know there are automatic tools that can do this if I provide the data, but I need a formula and a procedure. Thank you in advance.
2 Answers
You simply compute $x_c=\cos(2\pi x)$ and $x_s=\sin(2\pi x)$ and perform a plain multiple linear regression of $y$ on $x, x_c,$ and $x_s$.
That is, you supply the original $x$ and the two calculated predictors as if you had three independent variables in your regression, so your now-linear model is:
$$Y = \alpha + \beta x +\gamma x_c + \delta x_s+\varepsilon$$
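As a minimal sketch of this in Python/NumPy (the data-generating coefficients here are made up purely for illustration):

```python
import numpy as np

# Illustrative data drawn from the model in the question (coefficients invented).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 3.0, 50)
y = (1.0 + 0.5 * x + 2.0 * np.cos(2 * np.pi * x) - 1.5 * np.sin(2 * np.pi * x)
     + rng.normal(scale=0.1, size=x.size))

# The two computed predictors described above.
x_c = np.cos(2 * np.pi * x)
x_s = np.sin(2 * np.pi * x)

# Design matrix: an intercept column plus the three predictors x, x_c, x_s.
X = np.column_stack([np.ones_like(x), x, x_c, x_s])

# Ordinary least squares; the entries of coef are alpha, beta, gamma, delta
# (i.e. A, B, C, D of the original model), in that order.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha, beta, gamma, delta = coef
```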
This same idea applies to any transformation of the predictors. You can fit a regression of the form $y = \beta_0 + \beta_1 s_1(x_1) + \beta_2 s_2(x_2) + \dots + \beta_k s_k(x_k) + \varepsilon$ for transformations $s_1, \dots, s_k$ by supplying $s_1(x_1), s_2(x_2), \dots, s_k(x_k)$ as predictors.
So for example, $y = \beta_0 + \beta_1 \log(x_1) + \beta_2 \exp(x_1) + \beta_3 (x_2\log x_2) + \beta_4 \sqrt{x_3x_4} +\varepsilon$ would be fitted by supplying $\log(x_1),$ $\exp(x_1),$ $(x_2\log x_2),$ and $\sqrt{x_3x_4}$ as predictors (IVs) to linear regression software.
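A hypothetical sketch of that example (the data ranges and coefficients are invented so that $\log$, $\exp$, and the square root are all well defined):

```python
import numpy as np

# Invented positive data; coefficients below are arbitrary illustration values.
rng = np.random.default_rng(2)
n = 40
x1, x2, x3, x4 = (rng.uniform(0.5, 2.0, n) for _ in range(4))
y = (0.3 + 1.2 * np.log(x1) - 0.4 * np.exp(x1)
     + 0.8 * x2 * np.log(x2) + 0.5 * np.sqrt(x3 * x4)
     + rng.normal(scale=0.05, size=n))

# Supply the transformed predictors as the columns of the design matrix.
Z = np.column_stack([np.ones(n), np.log(x1), np.exp(x1),
                     x2 * np.log(x2), np.sqrt(x3 * x4)])
b0, b1, b2, b3, b4 = np.linalg.lstsq(Z, y, rcond=None)[0]
```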
The regression is then fitted as usual to the new set of predictors, and the fitted coefficients are those of the original equation.
See, for example, the answer here: regression that creates $x\log(x)$ functions, which works through a different specific example.
Thank you for answering, but unfortunately I still don't know how to solve it. Does this mean that I have to use the least squares method with three predictors instead of one? In the link that you provided the regression curve is calculated with a MATLAB built-in function, so I can't see the derivation of the formulas for the regression coefficients, or whether it is simpler than without transforming the predictors. – Kata Rina Aug 01 '15 at 14:49
The answer to your specific problem is in the first sentence above. [The rest of my answer deals with the general case of arbitrary transformed predictors; you can safely ignore it if you wish.] So yes, exactly as that first sentence explicitly states, you supply those 3 predictors to the regression. What are you using to compute your regression? – Glen_b Aug 02 '15 at 00:51
You can find a list of methods used for solving linear regression problems in this article by Do Q Lee:
Numerically efficient methods for solving Least-Squares problems
The most commonly used methods for this kind of problem are:
Normal equations method using the Cholesky factorization. It is the fastest method but numerically unstable. The normal equations are simply a system of linear equations: you obtain it by taking the partial derivative of the sum of squared errors with respect to each coefficient and setting each derivative to zero, which corresponds to finding the global minimum of the error term.
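For illustration, a minimal NumPy/SciPy sketch of this method (the data and design matrix are invented, matching the model from the question):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Illustrative data for y = A + B x + C cos(2*pi*x) + D sin(2*pi*x).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 3.0, 50)
y = (1.0 + 0.5 * x + 2.0 * np.cos(2 * np.pi * x) - 1.5 * np.sin(2 * np.pi * x)
     + rng.normal(scale=0.1, size=x.size))
X = np.column_stack([np.ones_like(x), x, np.cos(2 * np.pi * x), np.sin(2 * np.pi * x)])

# Normal equations (X^T X) beta = X^T y, solved via a Cholesky factorization.
c, low = cho_factor(X.T @ X)          # requires X to have full column rank
beta = cho_solve((c, low), X.T @ y)   # beta = (A, B, C, D)
```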
QR factorization. More accurate and more broadly applicable, but it may fail when the design matrix is nearly rank-deficient.
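A corresponding sketch, reusing the illustrative `X` and `y` from the normal-equations example above:

```python
import numpy as np

# X = Q R with Q having orthonormal columns; R beta = Q^T y is then a small
# triangular system, avoiding the explicit (and worse-conditioned) X^T X.
Q, R = np.linalg.qr(X)
beta = np.linalg.solve(R, Q.T @ y)
```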
Singular value decomposition. It is expensive to compute, but it is numerically stable and can handle rank deficiency. You can use a tool like MATLAB to compute the SVD of the chosen matrix. If you are deploying a customized solution, you can use a software package like LAPACK, or Intel's clone of it, which is heavily optimised using x86 assembly and, since September 2015, completely free for everyone.
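Again with the same illustrative `X` and `y` as above, a minimal sketch:

```python
import numpy as np

# X = U diag(s) V^T; singular values below a tolerance are zeroed rather than
# inverted, which is what makes the SVD approach robust to (near) rank deficiency.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
tol = max(X.shape) * np.finfo(float).eps * s[0]
s_inv = np.where(s > tol, 1.0 / s, 0.0)
beta = Vt.T @ (s_inv * (U.T @ y))
```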
In all three cases you need to find the solution of a system of linear equations. There are no simple closed-form scalar formulas for the individual regression coefficients except in very simple cases, such as fitting a straight line.
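For reference, though, the least-squares solution for the model in the question can still be written compactly in matrix form. With $n$ observations $(x_i, y_i)$, stack the predictors into the design matrix

$$X = \begin{pmatrix} 1 & x_1 & \cos(2\pi x_1) & \sin(2\pi x_1)\\ \vdots & \vdots & \vdots & \vdots\\ 1 & x_n & \cos(2\pi x_n) & \sin(2\pi x_n) \end{pmatrix},$$

and then, provided $X^\top X$ is invertible,

$$\hat{\beta} = (A, B, C, D)^\top = (X^\top X)^{-1} X^\top y.$$

The three methods above are simply different numerical routes to this same solution.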
I took a look at the question "Fit a sinusoidal term to a data" before posting my question, but it didn't satisfy what I need, because it has only trigonometric terms, unlike my curve, so I don't think it's a duplicate. In any case, thank you both for offering help. – Kata Rina Aug 01 '15 at 14:51