
The Ljung-Box and Box-Pierce tests make use of the sample autocorrelation $$ r_k = \frac {\sum_{t=k+1}^n a_ta_{t-k}} {\sum_{t=1}^n a_t^2}$$ and the Ljung-Box test exploits the result that $$Var(r_k) = \frac {n-k}{n(n+2)}$$
Here, $k$ is the lag order, $n$ is the length of the series, and $a_t$ is the (true) error term, not the residual from some preliminary estimation.
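
For reference, here is a minimal R sketch of $r_k$ exactly as defined above (no demeaning, denominator over all $n$ observations; the function name r_k is just for illustration):

# sample autocorrelation at lag k, following the definition above
r_k <- function(a, k) {
  n <- length(a)
  sum(a[(k + 1):n] * a[1:(n - k)]) / sum(a^2)
}
r_k(rnorm(100), 3)  # one draw of r_3 for Gaussian white noise with n = 100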

In the original paper by Box and Pierce, we find

[Excerpt from the Box and Pierce paper stating that, under their assumptions, $\operatorname{Var}(r_k) = (n-k)/\{n(n+2)\}$ can "readily" be shown.]

My question: For those among us who cannot readily show this, how can we establish this result?

Box and Pierce work under the assumption of independent, jointly normal innovations $a_t$, so that exact variances are within reach. In particular, the distribution of the sample correlation coefficient should then provide a starting point, as discussed e.g. in "What is the distribution of sample correlation coefficients between two uncorrelated normal variables?" and here.

However, I have not been able to use these results to establish that of Box and Pierce, nor have I found any other proof.

FWIW, a little simulation suggests that the stated result is indeed quite a bit more accurate than the $1/n$ approximation, with the difference, as expected, shrinking with $n$:

[Plot of simulated variances of $r_k$ by lag $k$ for $n = 25$ (brown) and $n = 100$ (green), together with $(n-k)/\{n(n+2)\}$ (solid) and $1/n$ (dashed), produced by the code below.]

# simulate sample autocorrelations of Gaussian white noise for two sample sizes
n1 <- 25
n2 <- 100
max.lag <- 14
autocorrs.small <- replicate(20000, acf(rnorm(n1), lag.max = max.lag, plot = FALSE)$acf[-1])
autocorrs.large <- replicate(20000, acf(rnorm(n2), lag.max = max.lag, plot = FALSE)$acf[-1])

# empirical variances of r_k by lag
plot(1:max.lag, apply(autocorrs.small, 1, var), ylim = c(0, 0.04), col = "brown")
points(1:max.lag, apply(autocorrs.large, 1, var), col = "green")

# Box-Pierce finite-sample variance (n - k)/(n(n + 2)) vs. the 1/n approximation
lines(1:max.lag, (n1 - 1:max.lag) / (n1 * (n1 + 2)), lwd = 2, col = "brown")
segments(1, 1/n1, max.lag, 1/n1, lwd = 2, col = "brown", lty = 2)
lines(1:max.lag, (n2 - 1:max.lag) / (n2 * (n2 + 2)), lwd = 2, col = "green")
segments(1, 1/n2, max.lag, 1/n2, lwd = 2, col = "green", lty = 2)

One difference that comes to mind is that the sample autocorrelation coefficient can only make use of $n-k$ observations in the numerator due to the lags, unlike the standard correlation coefficient. E.g., the last provided link would suggest $Var(r_k)=1/(n-1)$ under independence.
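
For comparison, a quick simulation of the ordinary sample correlation coefficient between two independent normal samples illustrates that $1/(n-1)$ benchmark (a minimal sketch):

# variance of the ordinary sample correlation between two independent normal samples
n <- 25
cors <- replicate(20000, cor(rnorm(n), rnorm(n)))
var(cors)    # empirical variance ...
1 / (n - 1)  # ... close to 1/(n - 1)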

Can anyone see further steps from here?

Note: This question has been asked before, without an answer so far.

  • Have you considered using Bartlett's formula? I would play around with it to see what happens. – User1865345 Apr 01 '23 at 11:35
  • Thanks, that had crossed my mind, too. But Bartlett's formula gives the asymptotic variance, doesn't it? See e.g. https://stats.stackexchange.com/questions/610737/maq-show-sqrtn-hat-rhoq-l-oversetd-to-n0-1-2-sum-j-1/610745#610745 Box and Pierce seem to be able to give a finite sample result (under normality). (As B&P also state in their above approximation, as $n$ becomes large, their result collapses to what Bartlett's formula would give asymptotically.) – Christoph Hanck Apr 01 '23 at 13:22
  • Yes. You are right, Christoph. In situations like these, I consult the books by Kendall and Anderson. I checked Kendall; they dealt in large-sample theory and deduced Bartlett's formula. I will check Anderson. – User1865345 Apr 01 '23 at 13:29
  • BTW, Christoph, I have left the post's link in the chat; the veterans do peek into the room. In any case, if nothing happens, a bounty could be considered. On another note, I searched extensively in Anderson, but to no avail. – User1865345 Apr 03 '23 at 07:29
  • @User1865345, full references would be nice. Neither a book by Kendall nor one by Anderson rings a bell to me... – Richard Hardy Apr 03 '23 at 07:35
  • There is no particular reason to search in these books; it's just that they were the contemporary books of that period extensive enough to be taken as the authority (mainly, they are the only ones I have access to right now). @RichardHardy – User1865345 Apr 03 '23 at 07:43
  • @User1865345, thank you! Appreciate it. – Richard Hardy Apr 03 '23 at 07:45
  • @User1865345, I am now also trying a bounty. – Christoph Hanck Apr 03 '23 at 10:26
  • At this point, I would be able to sleep peacefully if we could decipher the "readily" done deduction which apparently we are failing to detect and which the books didn't bother to cover. @ChristophHanck – User1865345 Apr 03 '23 at 10:35

2 Answers


Edit (05/07/2023)

When answering this question, I realized the job can actually be done by summoning only Lemma 1 below (hence avoiding the much more difficult Lemma 2), which substantially reduces the machinery and calculations. The argument (largely exploiting the independence between $T(X)$ and $r$) may be how the authors discovered the variance expression.

With the notations introduced in my old answer, the goal is to prove \begin{align} & E[T_iT_j] = 0, & 1 \leq i \neq j \leq n, \tag{i} \\ & E[T_i^2T_j^2] = \frac{1}{n(n + 2)}, & 1 \leq i \neq j \leq n, \tag{ii} \\ & E[T_i^2T_jT_k] = 0, & i, j, k \text{ distinct}, \tag{iii} \\ & E[T_iT_jT_kT_l] = 0, & i, j, k, l \text{ distinct}. \tag{iv} \end{align}

We assume normality of $X$ from now on. To prove (i), write $X_iX_j = T_iT_j \times r^2$. Since $T_iT_j$ is independent of $r^2$ ($T_iT_j$ is a function of $T(X)$), $E[X_iX_j] = E[T_iT_jr^2] = E[T_iT_j]E[r^2]$. But then $E[X_iX_j] = E[X_i]E[X_j] = 0$ (the components of $X$ are independent standard normal) and $E[r^2] > 0$ together imply that $E[T_iT_j] = 0$. In the same manner, (iii) and (iv) hold.

Similarly, $X_i^2X_j^2 = T_i^2T_j^2 \times r^4$ and independence imply that $E[X_i^2X_j^2] = E[T_i^2T_j^2]E[r^4]$, whence \begin{align} E[T_i^2T_j^2] = \frac{E[X_i^2]E[X_j^2]}{E[r^4]} = \frac{1}{E[r^4]}. \end{align} So it suffices to determine $E[r^4]$, which is straightforward: \begin{align} E[r^4] &= E[(X_1^2 + \cdots + X_n^2)^2] = nE[X_1^4] + 2\binom{n}{2}E[X_1^2X_2^2] \\ &= 3n + n(n - 1) = n(n + 2). \end{align} This completes the proof of (ii).
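
A quick Monte Carlo sanity check of the two moments used here, $E[r^4] = n(n+2)$ and $E[T_i^2T_j^2] = 1/(n(n+2))$ (a sketch, with $i = 1$, $j = 2$ chosen arbitrarily):

n <- 10
X <- matrix(rnorm(2e5 * n), ncol = n)  # rows are draws of X ~ N_n(0, I)
r2 <- rowSums(X^2)                     # r^2 = X'X
Tm <- X / sqrt(r2)                     # T(X) = X / ||X||, uniform on S_n
mean(r2^2); n * (n + 2)                # E[r^4] vs n(n + 2)
mean(Tm[, 1]^2 * Tm[, 2]^2); 1 / (n * (n + 2))  # E[T_1^2 T_2^2] vs 1/(n(n + 2))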

Old Answer (04/05/2023)

One way to show this result is to use the following two lemmas (Lemma 1 and Lemma 2 are Theorem 1.5.6 and Exercise 1.32 in Aspects of Multivariate Statistical Theory by R. Muirhead respectively):

Lemma 1. If $X$ has an $m$-variate spherical distribution with $P(X = 0) = 0$ and $r = \|X\| = (X'X)^{1/2}, T(X) = \|X\|^{-1}X$, then $T(X)$ is uniformly distributed on $S_m$ and $T(X)$ and $r$ are independent.

Lemma 2. Let $T$ be uniformly distributed on $S_m$ and partition $T$ as $T' = (\mathbf{T_1}' | \mathbf{T_2}')$, where $\mathbf{T_1}$ is $k \times 1$ and $\mathbf{T_2}$ is $(m - k) \times 1$. Then $\mathbf{T_1}$ has density function \begin{align*} f_{\mathbf{T_1}}(u) = \frac{\Gamma(m/2)}{\pi^{k/2}\Gamma[(m - k)/2]}(1 - u'u)^{(m - k)/2 - 1}, \quad 0 < u'u < 1. \tag{1} \end{align*}

Here $S_m$ stands for the unit sphere in $\mathbb{R}^m$: $S_m = \{x \in \mathbb{R}^m: x'x = 1\}$. The proof of Lemma 1 can be found in the referenced text, and the proof of Lemma 2 can be found in this link (NOT EASY!).

With these preparations, let's now attack the problem. Write $X = (a_1, \ldots, a_n)' \sim N_n(0, I_{(n)})$; then, by Lemma 1$^\dagger$, $r_k$ can be rewritten as \begin{align*} r_k = T_1T_{k + 1} + T_2T_{k + 2} + \cdots + T_{n - k}T_n, \end{align*} where $T := (T_1, \ldots, T_n)' = X/\|X\|$ is uniformly distributed on $S_n$.
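
(As a quick numerical sanity check of this rewriting, a few illustrative lines of R, with Tx playing the role of $T$:)

n <- 50; k <- 2
X <- rnorm(n)
Tx <- X / sqrt(sum(X^2))
sum(X[(k + 1):n] * X[1:(n - k)]) / sum(X^2)  # r_k from the definition
sum(Tx[1:(n - k)] * Tx[(k + 1):n])           # T_1 T_{k+1} + ... + T_{n-k} T_n, identical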

By Lemma 2, for $1 \leq i \neq j \leq n$, we have \begin{align*} & E(T_iT_j) = \int_{0 < t_i^2 + t_j^2 < 1}t_it_jf_{(T_i, T_j)}(t_i, t_j)dt_idt_j, \tag{2} \\ & E(T_i^2T_j^2) = \int_{0 < t_i^2 + t_j^2 < 1}t_i^2t_j^2f_{(T_i, T_j)}(t_i, t_j)dt_idt_j, \tag{3} \end{align*} where $f_{(T_i, T_j)}(t_i, t_j)$ is given by $(1)$ with $\mathbf{T_1} = (T_i, T_j)$. To evaluate $(2)$ and $(3)$, apply the polar transformation $t_i = r\cos\theta, t_j = r\sin\theta$, $0 < r < 1, 0 \leq \theta < 2\pi$. It then follows that \begin{align} E(T_iT_j) = \frac{\frac{n}{2} - 1}{\pi}\int_0^1\int_0^{2\pi}r^2\sin\theta\cos\theta(1 - r^2)^{n/2 - 2}rdrd\theta = 0. \tag{4} \end{align} This is because $\int_0^{2\pi}\sin\theta\cos\theta d\theta = 0$.

In addition, it follows by \begin{align} & \int_0^{2\pi}\sin^2\theta\cos^2\theta d\theta = \frac{1}{4}\pi, \\ & \int_0^1 r^5(1 - r^2)^{n/2 - 2}dr = \frac{1}{2}B\left(3, \frac{n}{2} - 1\right) = \frac{1}{(\frac{n}{2} + 1) \times \frac{n}{2} \times (\frac{n}{2} - 1)} \end{align} that \begin{align} E(T_i^2T_j^2) = \frac{\frac{n}{2} - 1}{\pi}\int_0^1\int_0^{2\pi}r^4\sin^2\theta\cos^2\theta(1 - r^2)^{n/2 - 2}rdrd\theta = \frac{1}{n(n + 2)}. \tag{5} \end{align}
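
The angular and radial integrals in $(5)$ are easy to confirm numerically, e.g. with $n = 10$ (a sketch):

n <- 10
# angular part: integral of sin^2(theta) cos^2(theta) over [0, 2*pi) equals pi/4
integrate(function(th) sin(th)^2 * cos(th)^2, 0, 2 * pi)$value; pi / 4
# radial part: integral of r^5 (1 - r^2)^(n/2 - 2) over (0, 1) equals B(3, n/2 - 1)/2
integrate(function(r) r^5 * (1 - r^2)^(n / 2 - 2), 0, 1)$value; beta(3, n / 2 - 1) / 2
# combining the pieces as in (5) reproduces 1/(n(n + 2))
(n / 2 - 1) / pi * (pi / 4) * beta(3, n / 2 - 1) / 2; 1 / (n * (n + 2))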

To complete the evaluation of cross-product terms from $\operatorname{Var}(r_k)$, it remains to show $E[T_a^2T_bT_c] = 0$ for distinct $a, b, c \in \{1, \ldots, n\}$ and $E[T_aT_bT_cT_d] = 0$ for distinct $a, b, c, d \in \{1, \ldots, n\}$. These calculations are shown as follows.

To calculate $E[T_a^2T_bT_c]$, applying lemma 2 with $\mathbf{T_1} = (T_a, T_b, T_c)$ yields \begin{align*} E(T_a^2T_bT_c) = \int_{0 < t_a^2 + t_b^2 + t_c^2 < 1}t_a^2t_bt_cf_{(T_a, T_b, T_c)}(t_a, t_b, t_c)dt_adt_bdt_c. \tag{6} \end{align*} Under the spherical transformation \begin{align*} & t_a = r\cos(\theta_1), \\ & t_b = r\sin(\theta_1)\cos(\theta_2), \\ & t_c = r\sin(\theta_1)\sin(\theta_2), \end{align*} where $0 < r < 1$, $0 \leq \theta_1 < \pi$, $0 \leq \theta_2 < 2\pi$, the integrand in $(6)$ that includes $\theta_1, \theta_2$ is (after multiplying the Jacobian determinant) $\cos^2(\theta_1)\sin^3(\theta_1)\cos(\theta_2)\sin(\theta_2)$, which integrates to $0$ over $[0, \pi) \times [0, 2\pi)$. Hence $E[T_a^2T_bT_c] = 0$.

To calculate $E[T_aT_bT_cT_d]$, applying lemma 2 with $\mathbf{T_1} = (T_a, T_b, T_c, T_d)$ yields \begin{align*} E(T_aT_bT_cT_d) = \int_{0 < t_a^2 + t_b^2 + t_c^2 + t_d^2 < 1}t_at_bt_ct_df_{(T_a, T_b, T_c, T_d)}(t_a, t_b, t_c, t_d)dt_adt_bdt_cdt_d. \tag{7} \end{align*} Under the spherical transformation \begin{align*} & t_a = r\cos(\theta_1), \\ & t_b = r\sin(\theta_1)\cos(\theta_2), \\ & t_c = r\sin(\theta_1)\sin(\theta_2)\cos(\theta_3), \\ & t_d = r\sin(\theta_1)\sin(\theta_2)\sin(\theta_3), \\ \end{align*} where $0 < r < 1$, $0 \leq \theta_1, \theta_2 < \pi$, $0 \leq \theta_3 < 2\pi$, the integrand in $(7)$ that includes $\theta_1, \theta_2, \theta_3$ is (after multiplying the Jacobian determinant) $\cos(\theta_1)\sin^5(\theta_1)\cos(\theta_2)\sin^3(\theta_2)\cos(\theta_3) \sin(\theta_3)$, which integrates to $0$ over $[0, \pi) \times [0, \pi) \times [0, 2\pi)$. Hence $E[T_aT_bT_cT_d] = 0$.

To summarize all these pieces, we conclude that $E[r_k] = 0$ and \begin{align} & \operatorname{Var}(r_k) = E[r_k^2] \\ =& E(T_1^2T_{k + 1}^2) + \cdots + E(T_{n - k}^2T_n^2) + \sum E[T_a^2T_bT_c] + \sum E[T_aT_bT_cT_d] \\ =& (n - k) \times \frac{1}{n(n + 2)} = \frac{n - k}{n(n + 2)}. \end{align}

This completes the proof. As a by-product, the fact that both $(6)$ and $(7)$ are identically $0$ also readily (now it is truly "readily") implies that $r_k$ and $r_l$ are uncorrelated when $k \neq l$, which is another proposition claimed in the original paper.
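
Both conclusions are easy to confirm by simulating $T$ uniformly on $S_n$ (a rough Monte Carlo sketch, here with $n = 25$, $k = 1$, $l = 3$):

n <- 25; k <- 1; l <- 3
sim <- replicate(50000, {
  X <- rnorm(n)
  Tx <- X / sqrt(sum(X^2))               # uniform on the unit sphere S_n
  c(sum(Tx[1:(n - k)] * Tx[(k + 1):n]),  # r_k
    sum(Tx[1:(n - l)] * Tx[(l + 1):n]))  # r_l
})
var(sim[1, ]); (n - k) / (n * (n + 2))   # Var(r_k) vs (n - k)/(n(n + 2))
cor(sim[1, ], sim[2, ])                  # close to 0: r_k and r_l uncorrelated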


$^\dagger$: The condition of Lemma 1 implies that the main result still holds when the distributional assumption on the innovations $(a_1, \ldots, a_n)$ is relaxed from Gaussian to spherical.

Zhanxiong
  • I cannot say for sure what the "readily" done approach of Box was, but this surely is ingenious. +1. – User1865345 Apr 05 '23 at 03:35
  • @User1865345 Yes, it's definitely a non-trivial result (as you can see, my answer still omitted some calculations). To me, expressing $r_k$ in terms of $T_1, \ldots, T_n$ is quite natural; the difficulty lies in connecting the problem to Lemma 2. By the way, the proof of Lemma 2 itself is formidable (as you may see from the mathoverflow link). – Zhanxiong Apr 05 '23 at 03:44
  • Lemma 2 is itself a powerful result. I am reading the post which I can comprehend. – User1865345 Apr 05 '23 at 03:46
  • @User1865345 I have filled in the previously omitted calculations and added more comments (now this answer can also be used to show that the $r_k$ are uncorrelated). So at least for this question, you may "sleep peacefully" now. – Zhanxiong Apr 05 '23 at 12:54
  • I have finished reading it slowly, and it couldn't be clearer. Thanks. – User1865345 Apr 05 '23 at 16:40
  • Great answer, thanks - the other one is somewhat more accessible to me, while yours seems more general. Hence I hope splitting the bounty and the accepted answer is a useful compromise. – Christoph Hanck Apr 09 '23 at 13:08
  • @ChristophHanck During the course of solving a similar problem, I found that there is a much shorter, more accessible proof. Please check my edit. – Zhanxiong May 08 '23 at 03:46

This might not be the most elegant proof, but I believe it answers the question:

We assume $a_1, a_2, \ldots, a_n$ are i.i.d. standard normal (we can allow, as in Ljung-Box, a constant variance $\sigma^2$, but then dividing both the numerator and the denominator in the expression for $r_k$ by $\sigma^2$ brings us back to the standardized variables, so there is no loss of generality).

Consider, for arbitrary $k \ne m$ $$ r = \frac{a_ka_m}{\sum_{t=1}^n a_t^2} = \frac{a_ka_m}{a_k^2 + a_m^2 + Z^2}$$

where $Z^2 \sim \chi^2_{n-2}$ and $Z^2, a_k, a_m$ are all independent. It follows from symmetry$^1$ that $r$ has zero mean, $E[r]=0$. Next introduce the rotated variables: $$ u = (a_k+a_m)/\sqrt{2} , \quad v = (a_k-a_m)/\sqrt{2} $$

which are also independent and standard normal, and observe that $a_k^2 + a_m^2 = u^2+v^2$, $a_ka_m = (u^2-v^2)/2$, so we have

$$r = \frac{1}{2}\left( \frac{u^2}{u^2+v^2+Z^2} - \frac{v^2}{u^2+v^2+Z^2} \right) \equiv \frac{1}{2}(U - V).$$

Now notice that $U$ and $V$ are each the ratio of a $\chi^2_1$ variate to the sum of itself and an independent $\chi^2_{n-1}$ variate, and therefore have a known Beta distribution:

$$U,V \sim \text{Beta}\left(\frac{1}{2},\frac{n-1}{2}\right)$$

with $E[U]=\frac{1}{n}$ and $E[U^2]= \frac{3}{n(n+2)}$ following from the properties of the Beta distribution. Furthermore, notice that $UV = \left( \frac{uv}{u^2+v^2+Z^2} \right)^2$ has exactly the same distribution as $r^2$ (since $(u,v)$ has the same joint distribution as $(a_k,a_m)$ and both are independent of $Z$), so $E[UV]=E[r^2]=Var(r)$.
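
These Beta facts can be sanity-checked by simulation, e.g. with $n = 20$ (a minimal sketch):

n <- 20
u <- rnorm(50000); v <- rnorm(50000)
Z2 <- rchisq(50000, df = n - 2)
U <- u^2 / (u^2 + v^2 + Z2)
mean(U); 1 / n                       # E[U] = 1/n
mean(U^2); 3 / (n * (n + 2))         # E[U^2] = 3/(n(n + 2))
ks.test(U, "pbeta", 1/2, (n - 1)/2)  # consistent with U ~ Beta(1/2, (n - 1)/2)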

From $r=(U-V)/2$ we also have

$$\begin{align*} Var(r) &= \frac{1}{4}( Var(U) + Var(V) - 2Cov(U,V) ) \\ &= \frac{1}{2}( Var(U) - Cov(U,V) ) \\ &= \frac{1}{2}( Var(U) - E[UV] + E[U]^2 ) \\ &= \frac{1}{2}( E[U^2] - Var(r) ) \end{align*}$$

So finally,

$$ Var(r) = \frac{1}{3}E[U^2] = \frac{1}{n(n+2)} .$$
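
A direct simulation of a single cross term matches this (a sketch with $n = 20$ and $k = 1$, $m = 2$):

n <- 20
r <- replicate(50000, {a <- rnorm(n); a[1] * a[2] / sum(a^2)})
var(r); 1 / (n * (n + 2))  # Var(r) vs 1/(n(n + 2))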

To complete the proof, we observe that $r_k$ is a sum of $(n-k)$ terms which all have the same distribution as $r$, namely $r_k = \sum_{t=k+1}^n r_{t,t-k}$, and that these terms are uncorrelated, since for any distinct pairs of indices $\{k,m\} \ne \{s,t\}$, $Cov(r_{k,m},r_{s,t}) = E[r_{k,m} \cdot r_{s,t}] = 0$, following again from symmetry. Therefore,

$$Var(r_k) = (n-k)Var(r) = \frac{n-k}{n(n+2)}.$$


$^1$ Since the joint distribution of $\{a_t\}$ is symmetric with respect to interchanging $a_t \leftrightarrow -a_t$, the expectation of any function that is odd with respect to any of its arguments is zero. For example, define $\tilde a_k = -a_k$, so that $ r = -\frac{\tilde a_ka_m}{\sum_{t=1}^n a_t^2} \equiv -\tilde r$. Since $r$ and $\tilde r$ have the same distribution, $E[r] = E[\tilde r]=-E[r]$, implying that $E[r]=0$. The same argument holds for any other odd function, such as $\frac{a_ka_ma_sa_t}{(\sum_{t=1}^n a_t^2)^2}$, provided that at least one of the $a_i$'s in the numerator has an odd power. (This is essentially the same argument as given, e.g., here.)

J. Delaney
  • Both answers/proofs are incredible (now picture loud applause from the crowd). Thanks to both of you. – mlofton Apr 05 '23 at 13:06
  • Excellent! Could you provide a little more detail as to why "$UV = \left( \frac{uv}{u^2+v^2+Z^2} \right)^2 $ has exactly the same distribution as $r^2$"? I keep looking at the two expressions without success. – Christoph Hanck Apr 05 '23 at 19:59
  • My question is: how to get $E[r_{k, m} \cdot r_{s, t}] = 0$ "from symmetry"? – Zhanxiong Apr 06 '23 at 01:08
  • @ChristophHanck $UV$ is the same as $r^2 = \left(\frac{a_ka_m}{a_k^2 + a_m^2 + Z^2}\right)^2$, just with $u,v$ replacing $a_k,a_m$ (which have the same distributions). Is it not clear? – J. Delaney Apr 06 '23 at 08:50
  • @Zhanxiong Since the joint distribution of $\{a_t\}$ is symmetric with respect to interchanging $a_t \leftrightarrow -a_t$, the expectation of any odd function is zero. I will add a clarification to the answer. – J. Delaney Apr 06 '23 at 08:54
  • @J.Delaney, oh, right, I think I was for some reason jumping over the "distribution" part, as all are N(0,1), and trying to establish algebraic equivalence. – Christoph Hanck Apr 06 '23 at 13:45
  • @ChristophHanck It's a nice little "miracle", since otherwise just expressing $r$ as a difference of Beta variates would not be enough to find its variance! – J. Delaney Apr 06 '23 at 15:11