
A simple sine curve can be written as $\text{amplitude}\cdot\sin(x+\text{phase})$. It can also be written in linear form as $a \cdot \sin(x) + b \cdot \cos(x)$.

I ran my analysis in R as follows:

    fit.lm2 <- lm(temperature~sin(2*pi*Time/366) + cos(2*pi*Time/366))
    summary(fit.lm2)

    Coefficients:
                            Estimate Std. Error t value Pr(>|t|)
    (Intercept)              26.9188     0.1005  267.87  < 2e-16
    sin(2 * pi * Time/366)    1.7468     0.1390   12.56  < 2e-16
    cos(2 * pi * Time/366)    1.2077     0.1485    8.13 6.94e-11

The general form of the equation is $y = b_0 + b_1x_1 + b_2x_2$, where $x_1 = \sin(2\pi\,\text{Time}/366)$ and $x_2 = \cos(2\pi\,\text{Time}/366)$. In my case it can therefore be written as $y = 26.9188 + 1.7468x_1 + 1.2077x_2$.

If I were to write it back to the simple sine form, $\text{amplitude}\cdot\sin(x+\text{phase})$, is it correct to say that:

$\text{amplitude} = b_0 = 26.9188$

$\text{phase} = \arctan\left(\frac{b_1}{b_2}\right)$

Is this the correct way to do it?

User1865345
Eddie
  • Assuming daily data, a divisor of 365.25 or so will be nearer the mark than 366 for a period of several years. You should not notice the difference, but still, there it is. – Nick Cox Nov 27 '13 at 16:36
  • Thanks for adding that @Nick Cox. I will experiment with it on my dataset. – Eddie Nov 27 '13 at 22:41

1 Answer

Writing $x = 2\pi\,\text{Time}/366$, the fit is

$$y = 26.9188 + 1.7468\sin(x) + 1.2077\cos(x).$$

Consider a general (non-zero) linear combination $\alpha \sin(x) + \beta\cos(x).$ Viewing $(\alpha, \beta)$ as a vector and writing it in polar coordinates $(r, \phi)$ yields

$$\alpha = r \cos(\phi),\quad \beta = r \sin(\phi), \quad r = \sqrt{\alpha^2+\beta^2}$$

whence

$$\alpha\sin(x) + \beta\cos(x) = r\cos(\phi)\sin(x) + r\sin(\phi)\cos(x) = r\sin(x+\phi).$$
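The identity can be spot-checked numerically. The following is a minimal sketch in R; the helper name `to_polar` and the coefficient pairs are hypothetical and chosen only for illustration (they are not the fitted values). It takes one pair from each quadrant and also shows what goes wrong if the phase is computed with a single-argument arctangent of a ratio when $\alpha < 0$.

```r
# Convert (alpha, beta) to amplitude r and phase phi (polar coordinates).
# to_polar is a hypothetical helper name used only for this illustration.
to_polar <- function(alpha, beta) {
  c(r = sqrt(alpha^2 + beta^2), phi = atan2(beta, alpha))
}

x <- seq(0, 2 * pi, length.out = 201)

# Illustrative coefficient pairs (not from the fitted model), one per quadrant.
pairs <- list(c(1.5, 0.8), c(-1.5, 0.8), c(-1.5, -0.8), c(1.5, -0.8))

# Largest absolute gap between the linear form and the single-sine form.
max_err <- sapply(pairs, function(p) {
  pol <- to_polar(p[1], p[2])
  max(abs(p[1] * sin(x) + p[2] * cos(x) - pol[["r"]] * sin(x + pol[["phi"]])))
})
print(max_err)    # floating-point noise in every quadrant

# A single-argument arctangent of the ratio loses the quadrant when alpha < 0.
alpha <- -1.5
beta  <- 0.8
naive_phi <- atan(beta / alpha)
naive_err <- max(abs(alpha * sin(x) + beta * cos(x) -
                     sqrt(alpha^2 + beta^2) * sin(x + naive_phi)))
print(naive_err)  # large: the naive phase is off by pi, which flips the sign
```

With the two-argument form the gap stays at the level of rounding error, whereas the naive phase produces a curve with the opposite sign, so the gap is roughly twice the amplitude.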

$r$ is the amplitude and $\phi$ is the phase. In the present case $\alpha=1.7468$ and $ \beta=1.2077$ entailing

$$r = \sqrt{ 1.7468^2+1.2077^2 } = 2.123641$$

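and, using the two-argument arctangent (atan2 in R) so that the quadrant of $(\alpha, \beta)$ is respected,

$$\phi = \operatorname{atan2}(\beta, \alpha) = 0.6049163.$$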

Consequently

$$y = 26.9188 + 2.123641 \sin(x + 0.6049163).$$

This can be checked by plotting. Here is R code to do it:

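    # Extract the fitted coefficients: intercept, sine term, cosine term.
    b0 <- coef(fit.lm2)[1]
    alpha <- coef(fit.lm2)[2]
    beta <- coef(fit.lm2)[3]

    # Amplitude and phase of the equivalent single-sine form.
    r <- sqrt(alpha^2 + beta^2)
    phi <- atan2(beta, alpha)

    # Left panel: overplot the two forms; they should coincide.
    par(mfrow=c(1,2))
    curve(b0 + r * sin(x + phi), 0, 2*pi, lwd=3, col="Gray",
          main="Overplotted Graphs", xlab="x", ylab="y")
    curve(b0 + alpha * sin(x) + beta * cos(x), lwd=3, lty=3, col="Red", add=TRUE)

    # Right panel: their difference, which should be floating-point noise.
    curve(b0 + r * sin(x + phi) - (b0 + alpha * sin(x) + beta * cos(x)),
          0, 2*pi, n=257, lwd=3, col="Gray", main="Difference", xlab="x", ylab="")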

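[Figure: two panels, "Overplotted Graphs" (the two curves) and "Difference" (their pointwise difference).]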

The two formulas agree to sixteen significant figures in double-precision arithmetic. The difference reflects pseudo-random floating point errors. (Because my data are not exactly the same as the original data, the "difference" plot will differ in its details but will still exhibit only tiny variations.)
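For readers without the original data, the whole procedure can be tried on simulated data. This is only a sketch under stated assumptions: the "true" mean, amplitude, phase, and noise level below are made-up values for illustration, and the period of 365.25 days follows the suggestion in the comments rather than the 366 used in the question.

```r
set.seed(1)                         # reproducible synthetic data

# Hypothetical "true" parameters, chosen for illustration only.
period    <- 365.25
mean_temp <- 27
amp_true  <- 2
phi_true  <- 0.6

Time <- 1:730                       # two years of daily observations
x    <- 2 * pi * Time / period
temperature <- mean_temp + amp_true * sin(x + phi_true) +
  rnorm(length(Time), sd = 0.5)     # assumed noise level

# Same model form as in the question: linear in the sine and cosine terms.
fit <- lm(temperature ~ sin(x) + cos(x))
b   <- unname(coef(fit))            # intercept, sine coefficient, cosine coefficient

amp_hat <- sqrt(b[2]^2 + b[3]^2)    # amplitude
phi_hat <- atan2(b[3], b[2])        # phase: cosine coefficient first, then sine

print(round(c(mean = b[1], amplitude = amp_hat, phase = phi_hat), 3))
```

The recovered amplitude and phase should land close to the simulated values, and the intercept estimates the mean level rather than the amplitude.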

whuber
  • Thanks a lot for your effort and time in producing a really comprehensive answer. Really appreciate it. – Eddie Nov 27 '13 at 21:12