24 Autocorrelation tests
24.1 Durbin-Watson test
The aim of the Durbin-Watson test is to verify whether a time series presents first-order autocorrelation. Specifically, let’s consider a time series \(\mathbf{X}_n = (X_1, \dots, X_t, \dots, X_n)\) modeled with an AR(1) model, i.e. \[ X_t = \phi_1 X_{t-1} + u_t \text{,} \tag{24.1}\] then, the null hypothesis \(\mathcal{H}_0\) of the test is the absence of autocorrelation, i.e. \[ \mathcal{H}_0: \phi_1 = 0 \text{,}\quad \mathcal{H}_1: \phi_1 \neq 0 \text{.} \] The Durbin-Watson statistic, denoted as \(\text{DW}\), is computed as: \[ \text{DW}(\mathbf{x}_n) = \frac{\sum_{i=2}^{n} (x_{i} - x_{i-1})^{2} }{\sum_{i=2}^{n} x_{i-1}^2} \approx 2(1 - \hat{\phi}_1) \text{,} \] where \(\hat{\phi}_1\) is the estimated AR(1) coefficient, and under \(\mathcal{H}_0\) the DW statistic is approximately \[ \text{DW}(\mathbf{X}_n) \underset{\mathcal{H}_0}{\approx} 2 \text{.} \] The statistic always lies between 0 and 4: values close to 0 indicate positive autocorrelation, values close to 4 negative autocorrelation. However, the null distribution of the statistic has no standard known form, so to decide whether to reject \(\mathcal{H}_0\) when the value is far from 2 one has to compare it with tabulated critical values.
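As a quick numerical check, the statistic can be computed directly with NumPy. The following is a minimal sketch (the function `durbin_watson` and the simulated series are illustrative, not part of the text): for a white noise series the statistic is close to 2 and nearly coincides with \(2(1-\hat\phi_1)\), while a random walk pushes it toward 0.

```python
import numpy as np

def durbin_watson(x):
    """DW statistic as defined above: sum of squared first differences
    over the sum of squared lagged observations."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.diff(x) ** 2) / np.sum(x[:-1] ** 2)

rng = np.random.default_rng(0)
u = rng.standard_normal(10_000)       # no autocorrelation: DW close to 2
phi1_hat = np.sum(u[1:] * u[:-1]) / np.sum(u[:-1] ** 2)  # AR(1) estimate
rw = np.cumsum(rng.standard_normal(1_000))  # strong positive autocorrelation
```

For applied work, a ready-made implementation is available as `durbin_watson` in `statsmodels.stats.stattools`.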
24.2 Breusch-Godfrey test
The Breusch-Godfrey test is a generalization of the Durbin-Watson test allowing multiple lags in the regression. The null hypothesis \(\mathcal{H}_0\) of the test is the absence of autocorrelation up to order \(p\), i.e. \[ \begin{aligned} {} & \mathcal{H}_0: \phi_1 = \dots = \phi_p = 0 \;\iff \text{absence of autocorrelation} \\ & \mathcal{H}_1: \phi_j \neq 0 \text{ for at least one } j \iff \text{autocorrelation} \end{aligned} \] To evaluate \(\mathcal{H}_0\), one usually fits an AR(p) model on the time series \(\mathbf{X}_n = (X_1, \dots, X_t, \dots, X_n)\), i.e. \[ X_t = \phi_1 X_{t-1} + \dots + \phi_p X_{t-p} + u_t \text{,} \tag{24.2}\] and then looks at the F-test (Equation 15.18), that under \(\mathcal{H}_0\) is distributed as a Fisher–Snedecor (Equation 32.3) with \(p\) and \(n-p-1\) degrees of freedom. Alternatively, it is possible to use the \(\text{LM}\) statistic, i.e. \[ \text{LM} = nR^2 \sim \chi^2(p) \text{,} \] where \(R^2\) is the R-squared (Equation 15.15) of the AR(p) regression defined in Equation 24.2.
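The auxiliary regression and the LM statistic can be sketched in a few lines of NumPy/SciPy. The helper `breusch_godfrey_lm` and the simulated AR(1) series are illustrative assumptions; an intercept is added to the regression, a common implementation choice not shown in Equation 24.2:

```python
import numpy as np
from scipy import stats

def breusch_godfrey_lm(x, p):
    """LM = n * R^2 from the AR(p) regression of Equation 24.2,
    referred to a chi^2(p) distribution."""
    x = np.asarray(x, dtype=float)
    y = x[p:]                                                    # response X_t
    X = np.column_stack([x[p - k:-k] for k in range(1, p + 1)])  # lags 1..p
    X = np.column_stack([np.ones(len(y)), X])                    # intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return len(y) * r2, stats.chi2.sf(len(y) * r2, df=p)

rng = np.random.default_rng(0)
ar1 = np.zeros(2_000)
e = rng.standard_normal(2_000)
for t in range(1, 2_000):
    ar1[t] = 0.8 * ar1[t - 1] + e[t]      # strongly autocorrelated series
lm_ar, p_ar = breusch_godfrey_lm(ar1, p=2)
lm_wn, p_wn = breusch_godfrey_lm(rng.standard_normal(2_000), p=2)
```

For residual-based testing after a regression, statsmodels provides `acorr_breusch_godfrey` in `statsmodels.stats.diagnostic`.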
24.3 Box–Pierce test
Let’s consider a sequence of \(n\) IID observations with mean zero and finite variance, i.e. \(X_t \sim \text{IID}(0, \sigma^2)\) and \(0 < \sigma^2 < \infty\). Then, the autocorrelation at a generic lag \(k\) can be estimated as: \[ \hat{\rho}_k(\mathbf{x}_n) = \frac{\sum_{t=k+1}^{n} x_{t} x_{t-k}}{\sum_{t=1}^{n} x_{t}^2} \text{.} \] One can note that the numerator is a sum of uncorrelated random variables, each with variance \(\sigma^4\). By the CLT, as \(n \to \infty\), \[ V_1 = \frac{1}{\sqrt{n}} \sum_{t=k+1}^{n} X_{t} X_{t-k} \overset{\text{d}}{\underset{n \to \infty}{\longrightarrow}} \mathcal{N}(0,\sigma^4) \text{.} \] The denominator instead, by the law of large numbers, satisfies \[ V_2 = \frac{1}{n} \sum_{t=1}^{n} X_{t}^2 \overset{\text{p}}{\underset{n \to \infty}{\longrightarrow}} \sigma^2 \text{.} \] Taking the ratio of \(V_1\) and \(V_2\), by Slutsky's theorem, \[ \sqrt{n} \, \hat{\rho}_k(\mathbf{X}_n) = \frac{V_1}{V_2} = \frac{\frac{1}{\sqrt{n}} \sum_{t=k+1}^{n} X_{t} X_{t-k}}{\frac{1}{n} \sum_{t=1}^{n} X_{t}^2} \overset{\text{d}}{\underset{n \to \infty}{\longrightarrow}} \frac{\mathcal{N}(0,\sigma^4)}{\sigma^2} = \mathcal{N}(0, \sigma^2) \cdot \frac{1}{\sigma^2} \cdot \sigma^2 = \mathcal{N}(0,1) \text{.} \] Therefore, as \(n \to \infty\), \[ \hat{\rho}_k(\mathbf{X}_n) \overset{\text{d}}{\underset{n \to \infty}{\sim}} \mathcal{N}\left(0,\frac{1}{n}\right) \implies \sqrt{n} \, \hat{\rho}_k(\mathbf{X}_n) \overset{\text{d}}{\underset{n \to \infty}{\sim}} \mathcal{N}(0,1) \text{.} \] A classic result from Bartlett (1946) provides the covariance matrix of the sample autocorrelations of a stationary process for large \(n\), i.e. \[ \mathbb{V}\{\hat\rho_k(\mathbf{X}_n)\} \approx \frac{1}{n}\left(1+2\sum_{j=1}^{k-1}\rho_j^2\right) \text{,}\quad \mathbb{C}\text{ov}\{\hat\rho_j(\mathbf{X}_n),\hat\rho_k(\mathbf{X}_n)\}\approx 0 \text{,} \] with \(j\neq k\).
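The estimator \(\hat\rho_k\) and its \(1/\sqrt{n}\)-scale fluctuations under IID data are easy to verify numerically. A small sketch (function and variable names are illustrative): for an IID sample \(|\hat\rho_k|\) is of order \(1/\sqrt{n}\), while for a persistent AR(1) series it stays close to the true autocorrelation.

```python
import numpy as np

def rho_hat(x, k):
    """Sample autocorrelation at lag k, as defined above:
    sum_{t=k+1..n} x_t x_{t-k} / sum_{t=1..n} x_t^2."""
    x = np.asarray(x, dtype=float)
    return np.sum(x[k:] * x[:-k]) / np.sum(x ** 2)

rng = np.random.default_rng(42)
n = 5_000
x = rng.standard_normal(n)            # IID(0, 1): rho_hat ~ N(0, 1/n)

ar = np.zeros(n)                      # AR(1) with phi = 0.9, for contrast
eps = rng.standard_normal(n)
for t in range(1, n):
    ar[t] = 0.9 * ar[t - 1] + eps[t]
```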
Under the white noise assumption (\(\rho_j=0\) for \(j>0\)), this simplifies to \[ \mathbb{V}\{\hat\rho_k(\mathbf{X}_n)\} \approx \frac{1}{n} \text{.} \] Hence, recalling that under normality \(n \hat{\rho}_k^2(\mathbf{X}_n) \sim \chi^{2}(1)\), one can generalize the result to the first \(p\) autocorrelations. More precisely, let’s define a vector containing the first \(p\) standardized autocorrelations. Due to the previous result, it converges in distribution to a multivariate standard normal, i.e. \[ \sqrt{n} \begin{bmatrix} \hat{\rho}_1(\mathbf{X}_n) \\ \vdots \\ \hat{\rho}_k(\mathbf{X}_n) \\ \vdots \\ \hat{\rho}_p(\mathbf{X}_n) \end{bmatrix} \underset{n \to \infty }{\overset{\text{d}}{\longrightarrow}} \text{MVN}_p(\boldsymbol{0}, \mathbf{I}_p) \text{.} \] To test the following set of hypotheses, \[ \begin{aligned} {} & \mathcal{H}_0: \rho_1 = \dots = \rho_p = 0 \\ & \mathcal{H}_1: \rho_k \neq 0 \text{ for at least one } k \end{aligned} \] it is possible to recall that the sum of the squares of \(p\) independent standard normal random variables is distributed as a \(\chi^2(p)\). Then, Box and Pierce (1970) proved that the statistic defined as \[ \text{Q}^{\tiny \text{BP}}_p(\mathbf{X}_n) = n \sum_{k = 1}^{p} \hat{\rho}^2_{k}(\mathbf{X}_n) \underset{\mathcal{H}_0}{\overset{\text{d}}{\longrightarrow}} \chi^2(p) \text{,} \] is asymptotically distributed under \(\mathcal{H}_0\) as a \(\chi^2\) with \(p\) degrees of freedom. In general, depending on the value of the statistic on a sample \(\mathbf{x}_n\), one obtains that \[ \begin{cases} \text{Q}^{\tiny \text{BP}}_p(\mathbf{x}_n) > q_{1-\alpha} \quad \mathcal{H}_0 \text{ rejected} \\ \text{Q}^{\tiny \text{BP}}_p(\mathbf{x}_n) < q_{1-\alpha} \quad \mathcal{H}_0 \text{ not rejected} \end{cases} \] where \(q_{1-\alpha}\) is the quantile with probability \(1-\alpha\) of a \(\chi^2\) random variable with \(p\) degrees of freedom.
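Putting the pieces together, the Box–Pierce statistic and its \(\chi^2(p)\) decision rule can be sketched as follows (the helper `box_pierce` and the simulated series are illustrative):

```python
import numpy as np
from scipy import stats

def box_pierce(x, p):
    """Q^BP = n * sum_{k=1..p} rho_hat_k^2; chi^2(p) under H0."""
    x = np.asarray(x, dtype=float)
    n, ss = len(x), np.sum(x ** 2)
    q = n * sum((np.sum(x[k:] * x[:-k]) / ss) ** 2 for k in range(1, p + 1))
    return q, stats.chi2.sf(q, df=p)          # statistic, p-value

rng = np.random.default_rng(7)
wn = rng.standard_normal(1_000)               # white noise: H0 holds
q_wn, p_wn = box_pierce(wn, p=5)
reject = q_wn > stats.chi2.ppf(0.95, df=5)    # decision rule, alpha = 5%

ar = np.zeros(1_000)                          # AR(1): H0 violated
e = rng.standard_normal(1_000)
for t in range(1, 1_000):
    ar[t] = 0.8 * ar[t - 1] + e[t]
q_ar, p_ar = box_pierce(ar, p=5)
```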
Note that such a test, also known as a Portmanteau test, provides an asymptotic result valid only for large samples. Moreover, the assumption of the test is that the observations are IID; hence the test does not apply in presence of heteroskedasticity.
24.3.1 Ljung-Box test
In general, the Box–Pierce test is an asymptotic test that holds as \(n\to\infty\). For finite \(n\), the variance of \(\hat{\rho}_k\) is smaller than the \(1/n\) assumed by \(\text{Q}_p^{\tiny \text{BP}}\), so the statistic tends to be too small relative to its \(\chi^2(p)\) reference distribution. Ljung and Box (1978) sharpened the analysis by deriving the finite-sample second moments of the residual autocorrelations when the model is correct. They show (for \(k\ge1\)): \[ \mathbb{V}\{\hat{\rho}_k(\mathbf{X}_n)\} = \frac{n-k}{n(n+2)} \text{,} \] where \(n-k\) appears because the numerator of \(\hat\rho_k\) uses only \(n-k\) usable pairs, while the factor \(n+2\) reflects exact finite-sample algebra for sums of products and the random denominator (sum of squares). Hence, instead of treating each \(\hat{\rho}_k\) as if it had variance \(1/n\) (the Box–Pierce assumption), the Ljung–Box statistic rescales by \((n+2)/(n-k)\) to better match the finite-sample distribution.
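Concretely, dividing each squared autocorrelation by this finite-sample variance and summing over the first \(p\) lags yields the rescaled sum: \[ \sum_{k=1}^{p} \frac{\hat{\rho}_k^2(\mathbf{X}_n)}{\mathbb{V}\{\hat{\rho}_k(\mathbf{X}_n)\}} = \sum_{k=1}^{p} \frac{n(n+2)}{n-k}\, \hat{\rho}_k^2(\mathbf{X}_n) = n(n+2) \sum_{k=1}^{p} \frac{\hat{\rho}_k^2(\mathbf{X}_n)}{n-k} \text{.} \]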
Standardizing each \(\hat{\rho}_k\) and taking the square, one obtains the Ljung–Box statistic, i.e. \[ \text{Q}^{\tiny \text{LB}}_p(\mathbf{X}_n) = n(n+2)\sum_{k = 1}^{p} \frac{\hat{\rho}_{k}^2(\mathbf{X}_n)}{n-k} \underset{\mathcal{H}_0}{\overset{\text{d}}{\longrightarrow}} \chi^2(p) \text{.} \] In general, depending on the value of the statistic on a sample \(\mathbf{x}_n\), one obtains that \[ \begin{cases} \text{Q}^{\tiny \text{LB}}_p(\mathbf{x}_n) > q_{1-\alpha} \quad \mathcal{H}_0 \text{ rejected} \\ \text{Q}^{\tiny \text{LB}}_p(\mathbf{x}_n) < q_{1-\alpha} \quad \mathcal{H}_0 \text{ not rejected} \end{cases} \] where \(q_{1-\alpha}\) is the quantile with probability \(1-\alpha\) of a \(\chi^2\) random variable with \(p\) degrees of freedom. If we reject \(\mathcal{H}_0\), the time series presents autocorrelation; otherwise, if \(\mathcal{H}_0\) is not rejected, there is no evidence of autocorrelation.
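A minimal numerical sketch of the statistic (helper names are illustrative): since \((n+2)/(n-k) > 1\) for every lag \(k\ge 1\), \(\text{Q}^{\tiny \text{LB}}_p\) always exceeds the corresponding \(\text{Q}^{\tiny \text{BP}}_p\).

```python
import numpy as np
from scipy import stats

def ljung_box(x, p):
    """Q^LB = n(n+2) * sum_{k=1..p} rho_hat_k^2 / (n-k); chi^2(p) under H0."""
    x = np.asarray(x, dtype=float)
    n, ss = len(x), np.sum(x ** 2)
    q = n * (n + 2) * sum(
        (np.sum(x[k:] * x[:-k]) / ss) ** 2 / (n - k) for k in range(1, p + 1)
    )
    return q, stats.chi2.sf(q, df=p)

rng = np.random.default_rng(3)
wn = rng.standard_normal(500)
q_lb, p_lb = ljung_box(wn, p=5)

# Box-Pierce version of the same sum, for comparison: each Ljung-Box
# term is inflated by (n+2)/(n-k), so Q^LB always exceeds Q^BP.
n, ss = len(wn), np.sum(wn ** 2)
q_bp = n * sum((np.sum(wn[k:] * wn[:-k]) / ss) ** 2 for k in range(1, 6))
```

In practice, statsmodels implements both tests in `acorr_ljungbox` from `statsmodels.stats.diagnostic` (with a `boxpierce` option); note that implementations typically demean the series before computing \(\hat\rho_k\), which the mean-zero formula above omits.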