16  Generalized least squares

References: Gardini A., Chapter 3.

16.1 Working hypothesis

The assumptions of the generalized least squares estimator are:

  1. $E\{y_i \mid x_1, \dots, x_n\} = E\{y_i \mid X\} = x_i'b$.
  2. $V\{y_i \mid x_1, \dots, x_n\} = V\{y_i \mid X\} = \sigma_i^2$ with $0 < \sigma_i^2 < \infty$.
  3. $Cv\{y_i, y_j \mid x_1, \dots, x_n\} = Cv\{y_i, y_j \mid X\} = \sigma_{ij}$.

Equivalently, the assumptions can be formulated in terms of the stochastic component $e$:

  1. $y_i = x_i'b + e_i$ for $i = 1, \dots, n$.
  2. $E\{e_i \mid x_1, \dots, x_n\} = E\{e_i \mid X\} = 0$.
  3. $V\{e_i \mid x_1, \dots, x_n\} = V\{e_i \mid X\} = \sigma_i^2$ with $0 < \sigma_i^2 < \infty$.
  4. $Cv\{e_i, e_j \mid x_1, \dots, x_n\} = Cv\{e_i, e_j \mid X\} = \sigma_{ij}$.

In this case the variance-covariance matrix $\Sigma$ collects the variances of the observations on its diagonal and the covariances between the observations off the diagonal.
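Written out, $\Sigma$ is therefore the following symmetric matrix (this display simply restates the assumptions above in matrix form):
$$\Sigma_{n \times n} = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix}, \qquad \sigma_{ij} = \sigma_{ji}.$$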

16.2 Generalized least squares estimator

Proposition 16.1 (Generalized Least Squares (GLS))
The generalized least squares (GLS) estimator minimizes the weighted sum of squared residuals $Q_{GLS}$ and returns an estimate of the true parameter $b$, where
$$Q_{GLS}(b) = e(b)'\,\Sigma^{-1}\,e(b). \tag{16.1}$$
Formally, the GLS estimator is the solution of the following minimization problem:
$$b_{GLS} = \arg\min_{b \in \Theta_b} \{Q_{GLS}(b)\}. \tag{16.2}$$
Notably, if $X$ and $\Sigma$ are of full rank one obtains an analytic expression, i.e.
$$b_{GLS} = (X'\Sigma^{-1}X)^{-1} X'\Sigma^{-1} y. \tag{16.3}$$

Singularity of X or Σ

The solution is available if and only if $X$ has full column rank and $\Sigma$ is non-singular. In practice the conditions are:

  1. $\operatorname{rank}(\Sigma) = n$ (full rank) for the inversion of $\Sigma$.
  2. $\operatorname{rank}(X) = k$ (full column rank), together with condition 1, for the inversion of $X'\Sigma^{-1}X$.

Proof. Let's prove the optimal solution in (16.3). Developing the optimization problem in (16.2):
$$Q_{GLS}(b) = e(b)'\Sigma^{-1}e(b) = (y - Xb)'\Sigma^{-1}(y - Xb) = y'\Sigma^{-1}y - 2b'X'\Sigma^{-1}y + b'X'\Sigma^{-1}Xb.$$
In order to minimize the above expression, let's compute the first derivative of $Q_{GLS}(b)$ with respect to $b$:
$$\frac{dQ_{GLS}(b)}{db} = -2X'\Sigma^{-1}y + 2X'\Sigma^{-1}Xb.$$
Then, setting the above expression equal to zero and solving for $b = b_{GLS}$ gives the solution, i.e. $b_{GLS} = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y$.
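As a minimal numerical sketch (not part of the original notes; it assumes NumPy and uses illustrative simulated data), the closed-form solution (16.3) can be computed directly once the rank conditions above hold:

```python
import numpy as np

def gls_estimator(X, y, Sigma):
    """Closed-form GLS estimate b = (X' Sigma^-1 X)^-1 X' Sigma^-1 y."""
    n, k = X.shape
    # Rank conditions from the singularity note: Sigma must have rank n,
    # X must have rank k, otherwise the inverses below do not exist.
    if np.linalg.matrix_rank(Sigma) < n or np.linalg.matrix_rank(X) < k:
        raise ValueError("Sigma or X is rank deficient: GLS solution unavailable")
    Sigma_inv = np.linalg.inv(Sigma)
    return np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv @ y)

# Small illustration with simulated data (arbitrary choices).
rng = np.random.default_rng(0)
n, k = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Sigma = np.diag(rng.uniform(0.5, 2.0, size=n))        # heteroskedastic errors
e = rng.multivariate_normal(np.zeros(n), Sigma)
y = X @ np.array([1.0, 2.0]) + e
print(gls_estimator(X, y, Sigma))                      # close to [1, 2]
```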

16.3 Properties of the GLS estimator

Theorem 16.1 (Aitken theorem)
Under the following working hypothesis, also called the Aitken hypothesis, i.e.

  1. $y = Xb + e$.
  2. $E\{e\} = 0$.
  3. $E\{ee'\} = \Sigma$, i.e. heteroskedastic and correlated errors.
  4. $X$ is non-stochastic and independent of the errors $e$ for all $n$.

the Generalized Least Squares (GLS) estimator is BLUE (Best Linear Unbiased Estimator), where “best” stands for the estimator with minimum variance in the class of linear unbiased estimators of $b$.

Proposition 16.2 (Properties of the GLS estimator)

  1. Unbiased: $b_{GLS}$ is correct and its conditional expectation equals the true parameter in the population, i.e.
     $$E\{b_{GLS} \mid X\} = b. \tag{16.4}$$

  2. Linear, in the sense that it can be written as a linear combination of the elements of $y$ with coefficients depending only on $X$, i.e. $b_{GLS} = A_x y$, where $A_x$ does not depend on $y$:
     $$b_{GLS} = A_x y, \qquad A_x = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}. \tag{16.5}$$

  3. Under the Aitken hypothesis (Theorem 16.1) it has minimum variance in the class of linear unbiased estimators, and its variance reads:
     $$V\{b_{GLS} \mid X\} = (X'\Sigma^{-1}X)^{-1}. \tag{16.6}$$

Proof. The GLS estimator is correct: its expected value, computed from (16.3) after substituting $y = Xb + e$, equals the true parameter in the population, i.e.
$$\begin{aligned} E\{b_{GLS} \mid X\} &= E\{(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y \mid X\} \\ &= E\{(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}(Xb + e) \mid X\} \\ &= (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}Xb + (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}E\{e \mid X\} \\ &= b. \end{aligned} \tag{16.7}$$

Under the assumption of heteroskedastic and correlated observations, the conditional variance of $b_{GLS}$ follows as in the OLS case but with $V\{e \mid X\} = \Sigma$, i.e.
$$\begin{aligned} V\{b_{GLS} \mid X\} &= (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\, V\{e \mid X\}\, \Sigma^{-1}X(X'\Sigma^{-1}X)^{-1} \\ &= (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}\Sigma\Sigma^{-1}X(X'\Sigma^{-1}X)^{-1} \\ &= (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}X(X'\Sigma^{-1}X)^{-1} \\ &= (X'\Sigma^{-1}X)^{-1}, \end{aligned} \tag{16.8}$$
and the OLS variance is recovered as the special case of (16.8) in which $\Sigma = \sigma_e^2 I_n$.
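As a quick numerical check of the algebra above (a sketch, assuming NumPy and an arbitrary positive-definite $\Sigma$), the sandwich form in (16.8) indeed collapses to $(X'\Sigma^{-1}X)^{-1}$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 20, 3
X = rng.normal(size=(n, k))
A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)          # arbitrary positive-definite Sigma
Si = np.linalg.inv(Sigma)

bread = np.linalg.inv(X.T @ Si @ X)
sandwich = bread @ X.T @ Si @ Sigma @ Si @ X @ bread   # V{b_GLS | X} before simplifying
print(np.allclose(sandwich, bread))                    # True: reduces to (X' Sigma^-1 X)^-1
```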

16.4 Alternative derivation

Let's consider a linear model of the form $y = Xb + \varepsilon$ and a transformation matrix $T$ of dimension $n \times n$. Multiplying both sides by $T$, the model can be rewritten as
$$Ty = TXb + T\varepsilon \quad \Longleftrightarrow \quad \tilde{y} = \tilde{X}b + \tilde{\varepsilon}.$$
The conditional mean of the transformed model reads $E\{\tilde{y} \mid \tilde{X}\} = \tilde{X}b$, while its conditional variance is $V\{\tilde{y} \mid \tilde{X}\} = V\{\tilde{\varepsilon} \mid \tilde{X}\} = T\Sigma T'$. The idea is to identify a transformation matrix $T$ such that the conditional variance becomes the identity matrix, i.e. $V\{\tilde{\varepsilon} \mid \tilde{X}\} = I_n$. In this way it is possible to work under the Gauss-Markov assumptions, obtaining an estimator with minimum variance. Let's decompose the variance-covariance matrix as $\Sigma = e\Lambda e'$, where

  • $\Lambda$ is the diagonal matrix containing the eigenvalues of $\Sigma$.
  • $e$ is the matrix of eigenvectors, which satisfies $e'e = ee' = I_n$.

Setting the transformation matrix to $T = \Lambda^{-1/2}e'$ makes the conditional variance equal to one for all the observations, i.e.
$$V\{\tilde{\varepsilon} \mid \tilde{X}\} = T\Sigma T' = (\Lambda^{-1/2}e')\, e\Lambda e'\, (e\Lambda^{-1/2}) = \Lambda^{-1/2}\Lambda\Lambda^{-1/2} = I_n.$$
Moreover, the matrix $T = \Lambda^{-1/2}e'$ satisfies
$$T'T = e\Lambda^{-1/2}\Lambda^{-1/2}e' = e\Lambda^{-1}e' = \Sigma^{-1}. \tag{16.9}$$
Finally, substituting $\tilde{X} = TX$ and $\tilde{y} = Ty$ into the OLS formula and using the result in (16.9), one obtains exactly the GLS estimator in (16.3), i.e.
$$\tilde{b} = (\tilde{X}'\tilde{X})^{-1}\tilde{X}'\tilde{y} = (X'T'TX)^{-1}X'T'Ty = (X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y.$$
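This whitening argument can also be verified numerically. The sketch below (not from the original notes; it assumes NumPy and uses illustrative simulated data) builds $T = \Lambda^{-1/2}e'$ from the eigendecomposition of $\Sigma$ and checks that OLS on the transformed data reproduces the GLS estimate:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 60, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
A = rng.normal(size=(n, n))
Sigma = A @ A.T + n * np.eye(n)                  # arbitrary positive-definite Sigma
y = X @ np.array([0.5, -1.0]) + rng.multivariate_normal(np.zeros(n), Sigma)

# Eigendecomposition Sigma = e Lambda e' and transformation T = Lambda^(-1/2) e'
lam, evec = np.linalg.eigh(Sigma)
T = np.diag(lam ** -0.5) @ evec.T
print(np.allclose(T.T @ T, np.linalg.inv(Sigma)))      # T'T = Sigma^-1, as in (16.9)

X_t, y_t = T @ X, T @ y                          # transformed (whitened) model
b_ols_transformed = np.linalg.solve(X_t.T @ X_t, X_t.T @ y_t)

Si = np.linalg.inv(Sigma)
b_gls = np.linalg.solve(X.T @ Si @ X, X.T @ Si @ y)
print(np.allclose(b_ols_transformed, b_gls))     # True: the two estimators coincide
```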

16.5 Models with heteroskedasticity

16.5.1 Working hypothesis

The assumptions of the linear model with heteroskedastic errors are:

  1. $E\{y_i \mid x_1, \dots, x_n\} = E\{y_i \mid X\} = x_i'b$.
  2. $V\{y_i \mid x_1, \dots, x_n\} = V\{y_i \mid X\} = \sigma_i^2$ with $0 < \sigma_i^2 < \infty$.
  3. $Cv\{y_i, y_j \mid x_1, \dots, x_n\} = Cv\{y_i, y_j \mid X\} = 0$.

Equivalently, the formulation in terms of the stochastic component is:

  1. $y_i = x_i'b + e_i$ for $i = 1, \dots, n$.
  2. $E\{e_i \mid x_1, \dots, x_n\} = E\{e_i \mid X\} = 0$.
  3. $V\{e_i \mid x_1, \dots, x_n\} = V\{e_i \mid X\} = \sigma_i^2$ with $0 < \sigma_i^2 < \infty$.
  4. $Cv\{e_i, e_j \mid x_1, \dots, x_n\} = Cv\{e_i, e_j \mid X\} = 0$.

For a heteroskedastic linear model the variance-covariance matrix of the errors in matrix notation is written as:
$$\Sigma_{n \times n} = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{pmatrix} = \operatorname{diag}(\sigma_1^2,\ \sigma_2^2,\ \dots,\ \sigma_n^2).$$
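With a diagonal $\Sigma$, GLS reduces to weighted least squares: each observation is weighted by the inverse of its variance $1/\sigma_i^2$. A minimal sketch of this special case (assuming NumPy; the variances are treated as known here, which in practice they rarely are):

```python
import numpy as np

def wls_estimator(X, y, sigma2):
    """GLS with diagonal Sigma = diag(sigma2), i.e. weighted least squares."""
    w = 1.0 / sigma2                              # weights are the inverse variances
    Xw = X * w[:, None]                           # equivalent to Sigma^-1 X
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

# Illustration with simulated heteroskedastic data (arbitrary choices).
rng = np.random.default_rng(3)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
sigma2 = np.exp(X[:, 1])                          # variance depends on the regressor
y = X @ np.array([2.0, 1.0]) + rng.normal(scale=np.sqrt(sigma2))
print(wls_estimator(X, y, sigma2))                # close to [2, 1]
```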