16  Generalized least squares

References: Gardini A., Chapter 3.

16.1 Working hypothesis

The assumptions of the generalized least squares estimator are:

  1. The linear model approximates the conditional expectation, i.e. $E\{Y_i \mid x_i\} = x_i' b$.
  2. The conditional variance of the response variable $Y$ depends on the observation $i$, i.e. $V\{Y_i \mid x_i\} = \sigma_i^2$ with $0 < \sigma_i^2 < \infty$ for all $i = 1, \dots, n$.
  3. The response variables $Y$ are correlated, i.e. $\mathrm{Cov}\{Y_i, Y_j \mid x_i, x_j\} = \sigma_{ij}$ for all $i \neq j$ and $i, j = 1, \dots, n$.

Equivalently, the assumptions can be formulated in terms of the stochastic component $u$:

  1. The residuals have mean zero, i.e. $E\{u_i \mid x_i\} = 0$ for all $i = 1, \dots, n$.
  2. The conditional variance of the residuals depends on the observation $i$, i.e. $V\{u_i \mid x_i\} = \sigma_i^2$ with $0 < \sigma_i^2 < \infty$.
  3. The residuals are correlated, i.e. $\mathrm{Cov}\{u_i, u_j \mid x_i, x_j\} = \sigma_{ij}$ for all $i \neq j$ and $i, j = 1, \dots, n$.

In this case the variance-covariance matrix $\Sigma$ contains the variances of and the covariances between the observations.
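Explicitly, under assumptions 2. and 3. the matrix $\Sigma$ collects the observation-specific variances $\sigma_i^2$ on the diagonal and the covariances $\sigma_{ij}$ off the diagonal:

$$
\Sigma =
\begin{pmatrix}
\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\
\sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2
\end{pmatrix}.
$$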

16.2 Generalized least squares estimator

Proposition 16.1 (Generalized Least Squares (GLS))
The generalized least squares (GLS) estimator minimizes the weighted sum of the squared residuals, returning an estimate of the true parameter $b$. The criterion is
(16.1) $Q_{GLS}(b) = \hat{u}(b)' \, \Sigma^{-1} \hat{u}(b)$.
Formally, the GLS estimator is the solution of the following minimization problem:
(16.2) $b_{GLS} = \arg\min_{b \in \Theta_b} \{ Q_{GLS}(b) \}$.
Notably, if $X$ has full column rank and $\Sigma$ is non-singular, one obtains the analytic expression
(16.3) $b_{GLS} = (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} y$.

Singularity of X or Σ

The closed-form solution (16.3) is available if and only if $\Sigma$ and $X' \Sigma^{-1} X$ are invertible. In practice the conditions are:

  1. $\mathrm{rank}(\Sigma) = n$ for the inversion of $\Sigma$.
  2. $\mathrm{rank}(X) = k$, together with condition 1., for the inversion of $X' \Sigma^{-1} X$.
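As a minimal sketch (not from the text; the function name and the use of NumPy are illustrative assumptions), the two rank conditions can be checked numerically before attempting the inversions in (16.3):

```python
import numpy as np

def gls_conditions_hold(X: np.ndarray, Sigma: np.ndarray) -> bool:
    """Check rank(Sigma) = n and rank(X) = k, the conditions needed for (16.3)."""
    n, k = X.shape
    return (np.linalg.matrix_rank(Sigma) == n) and (np.linalg.matrix_rank(X) == k)
```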

Proof. Let's prove the optimal solution (16.3). Developing the objective function (16.1):
$$
\begin{aligned}
Q_{GLS}(b) &= \hat{u}(b)' \, \Sigma^{-1} \hat{u}(b) \\
&= (y - Xb)' \, \Sigma^{-1} (y - Xb) \\
&= y' \Sigma^{-1} y - 2 \, b' X' \Sigma^{-1} y + b' X' \Sigma^{-1} X b.
\end{aligned}
$$
In order to minimize the above expression, let's compute the first derivative of $Q_{GLS}(b)$ with respect to $b$:
$$
\frac{d Q_{GLS}(b)}{d b} = -2 \, X' \Sigma^{-1} y + 2 \, X' \Sigma^{-1} X b.
$$
Then, setting the above expression equal to zero and solving for $b = b_{GLS}$ gives the solution, i.e. $b_{GLS} = (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} y$.
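A minimal numerical sketch of (16.3) follows (not from the text; the simulated data and variable names are illustrative assumptions). It uses NumPy, with linear solves instead of explicit inverses for numerical stability:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: n observations, k regressors (first column is the intercept).
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
b_true = np.array([1.0, 2.0, -0.5])

# Assumed known variance-covariance matrix Sigma: observation-specific
# standard deviations combined with an AR(1)-like correlation pattern.
sd = np.linspace(0.5, 2.0, n)
corr = 0.6 ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
Sigma = np.outer(sd, sd) * corr

# One sample from the model y = X b + u with u ~ N(0, Sigma).
u = rng.multivariate_normal(np.zeros(n), Sigma)
y = X @ b_true + u

# GLS estimator b_GLS = (X' Sigma^{-1} X)^{-1} X' Sigma^{-1} y  (eq. 16.3).
Sigma_inv_X = np.linalg.solve(Sigma, X)   # Sigma^{-1} X
Sigma_inv_y = np.linalg.solve(Sigma, y)   # Sigma^{-1} y
b_gls = np.linalg.solve(X.T @ Sigma_inv_X, X.T @ Sigma_inv_y)
print(b_gls)
```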

Proposition 16.2 (Two-stage derivation of GLS estimator)
The GLS estimator in (16.3) can be equivalently recovered as $b_{GLS} = (X' T' T X)^{-1} X' T' T y$, where $T = \Lambda^{-1/2} e'$ with $\Sigma = e \Lambda e'$ and

  • $\Lambda$ is the diagonal matrix containing the eigenvalues of $\Sigma$.
  • $e$ is the matrix with the eigenvectors of $\Sigma$, which satisfies $e' e = e e' = I_n$.

Moreover, the matrix $T = \Lambda^{-1/2} e'$ satisfies the product:
(16.4) $T' T = e \Lambda^{-1/2} \Lambda^{-1/2} e' = e \Lambda^{-1} e' = \Sigma^{-1}$.

Proof. Let's consider a linear model of the form $y = Xb + u$, and let's apply some (unknown) transformation matrix $T_{n \times n}$ by multiplying both sides:
$$
T y = T X b + T u \quad \Longleftrightarrow \quad \tilde{y} = \tilde{X} b + \tilde{u}.
$$
In this context, the conditional expectation of $\tilde{y}$ reads $E\{\tilde{y} \mid \tilde{X}\} = \tilde{X} b$, while its conditional variance is $V\{\tilde{y} \mid \tilde{X}\} = V\{\tilde{u} \mid \tilde{X}\} = T \Sigma T'$. The next step is to identify a suitable transformation matrix $T$ such that the conditional variance becomes equal to the identity matrix, i.e. $V\{\tilde{u} \mid \tilde{X}\} = I_n$. In this way it is possible to work under the Gauss-Markov assumptions, obtaining an estimator with minimum variance.

A possible way to identify $T$ is to decompose the variance-covariance matrix $\Sigma$ as follows:
$$
\Sigma = e \Lambda e' \quad \Longleftrightarrow \quad \Sigma^{-1} = e \Lambda^{-1} e',
$$
where $\Lambda$ is the diagonal matrix containing the eigenvalues and $e$ is the matrix with the eigenvectors, which satisfies $e' e = e e' = I_n$.

Thus, for the particular choice $T = \Lambda^{-1/2} e'$, one obtains a conditional variance equal to the identity matrix, i.e. unit variance and zero covariance for all the observations:
$$
\begin{aligned}
V\{\tilde{u} \mid \tilde{X}\} &= T \Sigma T' \\
&= (\Lambda^{-1/2} e') \, e \Lambda e' \, (e \Lambda^{-1/2}) \\
&= \Lambda^{-1/2} \Lambda \Lambda^{-1/2} = I_n.
\end{aligned}
$$
Finally, substituting $\tilde{X} = T X$ and $\tilde{y} = T y$ in the OLS formula and using the result in (16.4), one obtains exactly the GLS estimator in (16.3):
$$
\begin{aligned}
b_{GLS} &= (\tilde{X}' \tilde{X})^{-1} \tilde{X}' \tilde{y} \\
&= (X' T' T X)^{-1} X' T' T y \\
&= (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} y.
\end{aligned}
$$
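The two-stage construction can also be checked numerically. The sketch below (illustrative, not from the text) continues from the previous snippet, reusing X, y, Sigma and b_gls: it builds $T = \Lambda^{-1/2} e'$ from the eigendecomposition of $\Sigma$, verifies (16.4), and confirms that OLS on the transformed data reproduces the GLS estimate.

```python
# Eigendecomposition Sigma = e Lambda e' (eigh, since Sigma is symmetric).
eigvals, e = np.linalg.eigh(Sigma)
T = np.diag(eigvals ** -0.5) @ e.T            # T = Lambda^{-1/2} e'

# Check T'T = Sigma^{-1}  (eq. 16.4).
assert np.allclose(T.T @ T, np.linalg.inv(Sigma))

# Transformed model: y_tilde = X_tilde b + u_tilde with V{u_tilde | X_tilde} = I_n.
X_tilde, y_tilde = T @ X, T @ y

# OLS on the transformed data coincides with the GLS estimator.
b_two_stage, *_ = np.linalg.lstsq(X_tilde, y_tilde, rcond=None)
assert np.allclose(b_two_stage, b_gls)
```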

16.3 Properties of the GLS estimator

Theorem 16.1 (Aitken theorem)
Under the following working hypotheses, also called the Aitken hypotheses:

  1. $y = Xb + u$.
  2. $E\{u\} = 0$.
  3. $E\{u u'\} = \Sigma$, i.e. heteroskedastic and correlated errors.
  4. $X$ is non-stochastic and independent of the errors $u$ for all $n$.

the Generalized Least Squares (GLS) estimator is BLUE (Best Linear Unbiased Estimator), where "best" stands for the estimator with minimum variance in the class of linear unbiased estimators of $b$.

Proposition 16.3 (Properties of the GLS estimator)
  1. Unbiased: $b_{GLS}$ is correct and its conditional expectation is equal to the true parameter in the population, i.e. (16.5) $E\{b_{GLS} \mid X\} = b$.
  2. Linear: it can be written as a linear function of $y$, i.e. $b_{GLS} = A_X y$, where $A_X$ depends on $X$ and $\Sigma$ but not on $y$: (16.6) $b_{GLS} = A_X y$, with $A_X = (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1}$.
  3. Efficient: under the Aitken hypotheses (Theorem 16.1) it has minimum variance in the class of linear unbiased estimators, and its conditional variance reads (16.7) $V\{b_{GLS} \mid X\} = (X' \Sigma^{-1} X)^{-1}$.

Proof. The GLS estimator is correct: taking the expected value of (16.3) and substituting $y = Xb + u$, its conditional expectation is equal to the true parameter in the population, i.e.
(16.8)
$$
\begin{aligned}
E\{b_{GLS} \mid X\} &= E\{(X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} y \mid X\} \\
&= E\{(X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} (X b + u) \mid X\} \\
&= (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} X \, b + (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} E\{u \mid X\} \\
&= b.
\end{aligned}
$$

Under the assumption of heteroskedastic and correlated observations, the conditional variance of $b_{GLS}$ follows as in the OLS case, but with $V\{u \mid X\} = \Sigma$ and $A_X = (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1}$, i.e.
(16.9)
$$
\begin{aligned}
V\{b_{GLS} \mid X\} &= (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} \, V\{u \mid X\} \, \Sigma^{-1} X (X' \Sigma^{-1} X)^{-1} \\
&= (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} \Sigma \Sigma^{-1} X (X' \Sigma^{-1} X)^{-1} \\
&= (X' \Sigma^{-1} X)^{-1} X' \Sigma^{-1} X (X' \Sigma^{-1} X)^{-1} \\
&= (X' \Sigma^{-1} X)^{-1},
\end{aligned}
$$
where the OLS variance $\sigma_u^2 (X' X)^{-1}$ is recovered as a special case of (16.9) when $\Sigma = \sigma_u^2 I_n$.
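These two properties can also be checked by simulation. The Monte Carlo sketch below (illustrative, not from the text) continues from the earlier snippets, reusing rng, n, X, Sigma, Sigma_inv_X and b_true: the average of the estimates approaches $b$ (unbiasedness), and their sampling covariance approaches $(X' \Sigma^{-1} X)^{-1}$ from (16.9).

```python
# Theoretical conditional variance (X' Sigma^{-1} X)^{-1}  (eq. 16.9).
V_theory = np.linalg.inv(X.T @ Sigma_inv_X)

# Simulate many samples y = X b + u and re-estimate b_GLS each time.
reps = 5000
estimates = np.empty((reps, X.shape[1]))
for r in range(reps):
    u_r = rng.multivariate_normal(np.zeros(n), Sigma)
    y_r = X @ b_true + u_r
    estimates[r] = np.linalg.solve(X.T @ Sigma_inv_X, X.T @ np.linalg.solve(Sigma, y_r))

print(estimates.mean(axis=0))           # close to b_true   (eq. 16.5 / 16.8)
print(np.cov(estimates, rowvar=False))  # close to V_theory (eq. 16.7 / 16.9)
print(V_theory)
```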