17  Restricted least squares

References: Gardini A., Chapters 3.7, 3.8 and 4.2.

Let’s consider a generic univariate linear model with $k$ regressors, namely
$$ y = b_1 X_1 + \dots + b_j X_j + \dots + b_k X_k + u = Xb + u, $$
and suppose that we are interested in testing whether the coefficient $b_j$ is statistically different from a certain value $r$ known a priori. In this case the null hypothesis can be equivalently represented using a more flexible matrix notation, i.e.
$$ H_0 : b_j = r \quad \Longleftrightarrow \quad H_0 : Rb - r = 0, \tag{17.1} $$
where $R_{1 \times k} = (0 \;\cdots\; 1 \;\cdots\; 0)$, with the $1$ in the $j$-th position. Hence, the linear restriction in (17.1) can be written in matrix form as
$$ \underbrace{R}_{1 \times k}\, \underbrace{b}_{k \times 1} - \underbrace{r}_{1 \times 1} = \underbrace{0}_{1 \times 1} \quad \Longleftrightarrow \quad \underbrace{(0 \;\cdots\; 1 \;\cdots\; 0)}_{1 \text{ in } j\text{-th position}} \begin{pmatrix} b_1 \\ \vdots \\ b_j \\ \vdots \\ b_k \end{pmatrix} - (r) = (0). $$
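As a minimal numerical sketch of this construction (the dimensions and values below are invented for illustration, not taken from the text), the selection vector $R$ can be built explicitly:

```python
import numpy as np

# Hypothetical example: k = 5 regressors, restriction H0: b_3 = 1 (j = 3).
k, j, r = 5, 3, 1.0

R = np.zeros((1, k))
R[0, j - 1] = 1.0                          # a single 1 in the j-th position

b = np.array([0.2, -0.4, 1.0, 0.7, 0.1])   # a candidate parameter vector

# R @ b - r is the 1x1 left-hand side of (17.1): zero iff b_j = r.
print(R @ b - r)                           # [0.], so this b satisfies H0
```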

17.1 Multiple restrictions

Let’s consider multiple restrictions, i.e.
$$ H_0 : \begin{cases} (1) \;\; b_1 - b_2 = 0 & (b_1 \text{ and } b_2 \text{ have the same effect}) \\ (2) \;\; b_3 + b_4 = 1 & (b_3 \text{ and } b_4 \text{ sum to one}) \end{cases} $$
Let’s construct one row of $R$ (and one entry of $r$) for each restriction: the first row encodes (1) and the second row encodes (2), i.e.
$$ \underbrace{R}_{2 \times 4}\, \underbrace{b}_{4 \times 1} - \underbrace{r}_{2 \times 1} = \underbrace{0}_{2 \times 1} \quad \Longleftrightarrow \quad \underbrace{\begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}}_{R} \underbrace{\begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}}_{b} - \underbrace{\begin{pmatrix} 0 \\ 1 \end{pmatrix}}_{r} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. $$
A numerical sketch of this two-restriction system is given below.
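Here is a hedged numpy sketch of the system above (parameter values invented for illustration):

```python
import numpy as np

# Restriction (1): b1 - b2 = 0  -> row (1, -1, 0, 0), r entry 0
# Restriction (2): b3 + b4 = 1  -> row (0,  0, 1, 1), r entry 1
R = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0,  0.0, 1.0, 1.0]])
r = np.array([0.0, 1.0])

b = np.array([0.5, 0.5, 0.3, 0.7])   # satisfies both restrictions
print(R @ b - r)                     # [0. 0.] -> Rb - r = 0 holds
```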

17.2 Restricted least squares

Proposition 17.1 (Restricted Least Squares (RLS) estimator)
Let’s consider a linear model under the OLS assumptions, together with a set of $m$ linear hypotheses on the parameters of the model taking the form
$$ H_0 : \underbrace{R}_{m \times k}\, \underbrace{b}_{k \times 1} - \underbrace{r}_{m \times 1} = \underbrace{0}_{m \times 1}. $$
The optimization problem then becomes restricted to the space of parameters that satisfy the constraints. More precisely, the space $\tilde{\Theta}_b$, the subset of the parameter space ($\tilde{\Theta}_b \subseteq \Theta_b$) on which the linear constraint holds true, is defined as
$$ \tilde{\Theta}_b = \{ b \in \mathbb{R}^k : Rb - r = 0 \}. $$
Hence, the OLS optimization problem is restricted to only the parameters that satisfy the constraint.

Formally, the RLS estimator is the solution of the following minimization problem:
$$ b_{RLS} = \arg\min_{b \in \tilde{\Theta}_b} \{ Q_{OLS}(b) \}, \tag{17.2} $$
where $Q_{OLS}$ is the same objective function as in the OLS case. Notably, the analytic solution for $b_{RLS}$ reads
$$ b_{RLS} = b_{OLS} - (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}(Rb_{OLS} - r). \tag{17.3} $$

Proof. In order to solve the minimization problem in (17.2), let’s construct the Lagrangian as
$$ Q_{RLS}(b, \lambda) = Q_{OLS}(b) - 2\lambda'(Rb - r). $$
Then, one obtains the following system of equations:
$$ \begin{cases} \dfrac{\partial}{\partial b} Q_{RLS}(b, \lambda) = -2X'y + 2X'Xb - 2R'\lambda = 0 & (A) \\[4pt] \dfrac{\partial}{\partial \lambda} Q_{RLS}(b, \lambda) = -2(Rb - r) = 0 & (B) \end{cases} $$
Let’s first solve (A) explicitly for $b = b_{RLS}$, i.e.
$$ b_{RLS} = (X'X)^{-1}X'y + (X'X)^{-1}R'\lambda = b_{OLS} + (X'X)^{-1}R'\lambda, \tag{17.4} $$
and substitute the result in (B):
$$ \begin{aligned} Rb_{RLS} - r &= 0 \\ R\big[b_{OLS} + (X'X)^{-1}R'\lambda\big] - r &= 0 \\ Rb_{OLS} + R(X'X)^{-1}R'\lambda - r &= 0 \\ Rb_{OLS} - r &= -\big[R(X'X)^{-1}R'\big]\lambda \end{aligned} $$
Hence, it is possible to solve explicitly for the Lagrange multipliers $\lambda$:
$$ \lambda = -\big[R(X'X)^{-1}R'\big]^{-1}(Rb_{OLS} - r). \tag{17.5} $$
Finally, substituting $\lambda$ from (17.5) into (17.4) gives the optimal solution:
$$ b_{RLS} = b_{OLS} - (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}(Rb_{OLS} - r). $$
Note that if the OLS estimate already satisfies the constraints, i.e. $Rb_{OLS} - r = 0$, the correction term vanishes and the RLS and OLS estimates coincide: $b_{RLS} = b_{OLS}$.
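A minimal numerical sketch of (17.3) on simulated data (all values invented for illustration; this is not a production implementation):

```python
import numpy as np

def rls(X, y, R, r):
    """Restricted least squares estimator, eq. (17.3)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b_ols = XtX_inv @ X.T @ y
    # Correction term: (X'X)^{-1} R' [R (X'X)^{-1} R']^{-1} (R b_ols - r)
    lam_term = np.linalg.solve(R @ XtX_inv @ R.T, R @ b_ols - r)
    return b_ols - XtX_inv @ R.T @ lam_term

# Simulated data (arbitrary choices)
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
b_true = np.array([0.5, 0.5, 0.3, 0.7])   # satisfies both restrictions below
y = X @ b_true + rng.normal(size=n)

R = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0,  0.0, 1.0, 1.0]])
r = np.array([0.0, 1.0])

b_rls = rls(X, y, R, r)
print(R @ b_rls - r)   # ~[0, 0]: the RLS estimate satisfies the constraints
```

By construction $Rb_{RLS} - r = 0$ up to floating-point error, whatever the data, which mirrors the closing remark of the proof.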

Proposition 17.2 (Expectation RLS estimator)
The RLS estimator (17.3) is unbiased for the true population parameter $b$ if and only if the restrictions imposed by $H_0$ are true in the population. Its conditional expectation is
$$ E\{b_{RLS} \mid X\} = b - (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}(Rb - r), \tag{17.6} $$
and $E\{b_{RLS} \mid X\} = b$ only if the second term is zero, which happens only when $H_0$ holds true and hence $Rb - r = 0$.

Proof. Let’s apply the expectation to (17.3), remembering that $X$, $R$ and $r$ are non-stochastic and that $b_{OLS}$ is unbiased. Developing the computations gives:
$$ \begin{aligned} E\{b_{RLS} \mid X\} &= E\{b_{OLS} \mid X\} - E\big\{(X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}(Rb_{OLS} - r) \,\big|\, X\big\} \\ &= b - (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}\big(R\,E\{b_{OLS} \mid X\} - r\big) \\ &= b - (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}(Rb - r) \end{aligned} $$
Hence $b_{RLS}$ is unbiased if and only if the restriction holds true in the population, i.e. $E\{b_{RLS} \mid X\} = b \iff Rb - r = 0$.
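Proposition 17.2 can be checked by simulation. The following hedged sketch (invented design and parameters, with the restriction true in the population) averages the RLS estimate over many replications:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 100, 5000
X = rng.normal(size=(n, 4))                  # fixed design across replications
R = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0,  0.0, 1.0, 1.0]])
r = np.array([0.0, 1.0])
b_true = np.array([0.5, 0.5, 0.3, 0.7])      # H0 true: R b_true - r = 0

XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ R.T @ np.linalg.inv(R @ XtX_inv @ R.T)

total = np.zeros(4)
for _ in range(reps):
    y = X @ b_true + rng.normal(size=n)
    b_ols = XtX_inv @ X.T @ y
    total += b_ols - A @ (R @ b_ols - r)     # eq. (17.3)

print(total / reps)   # close to b_true, as eq. (17.6) predicts under H0
```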

Proposition 17.3 (Variance RLS estimator)
The variance of the RLS estimator (17.3) is
$$ V\{b_{RLS}\} = V\{b_{OLS}\} - \sigma_u^2 (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}R(X'X)^{-1}. $$
It is interesting to note that the variance of the RLS estimator is always lower than or equal to the variance of the OLS estimator: the subtracted matrix is positive semi-definite, hence $V\{b_{RLS}\} \le V\{b_{OLS}\}$.

Proof. In order to compute the variance of the RLS estimator, note that by (17.3) $b_{RLS}$ is a linear transformation of $b_{OLS}$ plus a non-stochastic term. Denoting by $R_x$ the matrix
$$ R_x = (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}R, $$
we can write $b_{RLS} = (I - R_x)\,b_{OLS} + (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}r$, so that applying the variance operator gives
$$ V\{b_{RLS}\} = (I - R_x)\, V\{b_{OLS}\}\, (I - R_x)'. $$
Substituting the expression of the variance of $b_{OLS}$, namely $V\{b_{OLS}\} = \sigma_u^2 (X'X)^{-1}$, and developing the matrix multiplications, the three products $R_x (X'X)^{-1}$, $(X'X)^{-1} R_x'$ and $R_x (X'X)^{-1} R_x'$ all reduce to the same matrix, since
$$ \begin{aligned} R_x (X'X)^{-1} R_x' &= (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1} \underbrace{R(X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}}_{=\,I} R(X'X)^{-1} \\ &= (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}R(X'X)^{-1} \end{aligned} $$
Hence the cross terms collapse and
$$ V\{b_{RLS}\} = V\{b_{OLS}\} - \sigma_u^2 (X'X)^{-1}R'\big[R(X'X)^{-1}R'\big]^{-1}R(X'X)^{-1}. $$
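As a numerical illustration of Proposition 17.3 (again with an invented design and $\sigma_u^2$ set to 1), the variance-reduction matrix can be computed directly and verified to be positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma2 = 100, 1.0
X = rng.normal(size=(n, 4))
R = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0,  0.0, 1.0, 1.0]])

XtX_inv = np.linalg.inv(X.T @ X)
V_ols = sigma2 * XtX_inv
reduction = sigma2 * XtX_inv @ R.T @ np.linalg.inv(R @ XtX_inv @ R.T) @ R @ XtX_inv
V_rls = V_ols - reduction

print(np.diag(V_ols) - np.diag(V_rls))      # element-wise variance reduction, all >= 0
print(np.linalg.eigvalsh(reduction).min())  # smallest eigenvalue >= 0 (up to rounding)
```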

17.3 A test for linear restrictions

Under the assumption of normality of the error terms, it is possible to derive a statistic to test the significance of the linear restrictions imposed by $Rb - r = 0$. Let’s test the validity of the null hypothesis $H_0$ against its alternative hypothesis $H_1$, i.e.
$$ H_0 : Rb - r = 0, \qquad H_1 : Rb - r \neq 0. $$
Under normality, the OLS estimates are multivariate normal; thus, applying the scaling property, the restriction evaluated at $b_{OLS}$ is itself normal:
$$ Rb_{OLS} - r \sim N\big(Rb - r, \; \sigma_u^2\, R(X'X)^{-1}R'\big). \tag{17.7} $$
Thus, we can write the statistic
$$ W_m = \frac{1}{\sigma_u^2}\,(Rb_{OLS} - r)'\big(R(X'X)^{-1}R'\big)^{-1}(Rb_{OLS} - r). \tag{17.8} $$

If we work under $H_0$, then the mean in (17.7) is zero, i.e.
$$ Rb_{OLS} - r \overset{H_0}{\sim} N\big(0, \; \sigma_u^2\, R(X'X)^{-1}R'\big). $$
Recalling the relation between the distribution of the quadratic form of a multivariate normal and the $\chi^2$ distribution, the test statistic
$$ W_m \overset{H_0}{\sim} \chi^2(m) \tag{17.9} $$
follows a $\chi^2(m)$ distribution, with $m$ the number of restrictions.

Instead, under $H_1$ the distribution of the linear restriction is exactly the one in (17.7), with non-zero mean. Thus, applying the corresponding property of quadratic forms of non-central multivariate normals, the test statistic is distributed as a non-central $\chi^2(m, \delta)$, i.e.
$$ W_m \overset{H_1}{\sim} \chi^2(m, \delta), \tag{17.10} $$
where the non-centrality parameter $\delta$ reads
$$ \delta = \frac{1}{\sigma_u^2}\,(Rb - r)'\big[R(X'X)^{-1}R'\big]^{-1}(Rb - r) > 0. $$
As a general decision rule, $H_0$ is rejected if the statistic in (17.8) is greater than the quantile at level $\alpha$ of a $\chi^2(m)$ random variable. Such a critical value, denoted $q_\alpha$, represents the value for which the probability that a $\chi^2(m)$ exceeds $q_\alpha$ is exactly equal to $\alpha$, i.e. $P(W_m > q_\alpha) = \alpha$ under $H_0$.
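Putting the pieces together, here is a hedged end-to-end sketch of the test (simulated data; $\sigma_u^2$ treated as known for simplicity, whereas in practice it would be replaced by an estimate):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, sigma2 = 200, 1.0
X = rng.normal(size=(n, 4))
b_true = np.array([0.5, 0.5, 0.3, 0.7])      # restrictions hold in the population
y = X @ b_true + np.sqrt(sigma2) * rng.normal(size=n)

R = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0,  0.0, 1.0, 1.0]])
r = np.array([0.0, 1.0])
m = R.shape[0]                               # number of restrictions

XtX_inv = np.linalg.inv(X.T @ X)
b_ols = XtX_inv @ X.T @ y
resid = R @ b_ols - r

# Eq. (17.8): quadratic form of the restriction residual
W = resid @ np.linalg.solve(R @ XtX_inv @ R.T, resid) / sigma2

alpha = 0.05
q_alpha = stats.chi2.ppf(1 - alpha, df=m)    # critical value of chi2(m)
print(W, q_alpha, W > q_alpha)               # reject H0 when W > q_alpha
```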