15  Classic least squares

References: Angela Montanari, Chapter 3; Gardini A.

15.1 Assumptions

Let’s start from the classic assumptions for a linear model. The working hypotheses are:

  1. The linear model approximates the conditional expectation, i.e. $E\{Y_i \mid x_i\} = x_i' b$.
  2. The conditional variance of the response variable $y$ is constant, i.e. $V\{Y_i \mid x_i\} = \sigma_u^2$ with $0 < \sigma_u^2 < \infty$.
  3. The response variables $Y$ are uncorrelated, i.e. $Cv\{Y_i, Y_j \mid x_i, x_j\} = 0$ for $i \neq j$ and $i, j \in \{1, \dots, n\}$.

Equivalently the formulation in terms of the stochastic component reads

  1. The residuals have mean zero, i.e. $E\{u_i \mid x_i\} = 0$ for all $i = 1, \dots, n$.
  2. The conditional variance of the residuals is constant, i.e. $V\{u_i \mid x_i\} = \sigma_u^2$ with $0 < \sigma_u^2 < \infty$.
  3. The residuals are mutually uncorrelated, i.e. $Cv\{u_i, u_j \mid x_i, x_j\} = 0$ for all $i \neq j$ and $i, j = 1, \dots, n$.

Hence, in this setup the error terms are assumed to be uncorrelated and identically distributed with mean zero and equal variances $\sigma_i^2 = \sigma_u^2$ for all $i$. Thus, the general expression of the covariance matrix of the errors reduces to $\Sigma = \sigma_u^2 I_n$.

Generate the matrix of data
# True variance - covariance matrix 
sigma <- diag(1, 4)
sigma[2,2] <- 1.3
sigma[3,3] <- 0.7
sigma[4,4] <- 1.7
sigma[1,2] <- sigma[2,1] <- 0.3
sigma[1,3] <- sigma[3,1] <- 0.4
sigma[1,4] <- sigma[4,1] <- -0.8
# Simulate data: the first column of W is the response, the others are the regressors
set.seed(1)
W <- mvtnorm::rmvnorm(10, sigma = sigma) 
y <- W[,1]
X <- cbind(1, W[,-1]) # design matrix with a column of ones for the intercept
n <- nrow(X)
k <- ncol(X)          # note: k counts the intercept column as well

15.2 Estimator of b

Proposition 15.1 (Ordinary Least Squares (OLS) estimator)
The ordinary least squares (OLS) estimator is obtained by minimizing the sum of the squared residuals, expressed by the function $$Q_{OLS}(b) = \hat{u}(b)'\hat{u}(b), \tag{15.1}$$ which returns an estimate of the parameter $b$. In this context, the fitted residuals $\hat{u}(b) = y - Xb$ are seen as a function of the unknown $b$.

Formally, the OLS estimator is the solution of the following minimization problem, i.e. $$b^{OLS} = \arg\min_{b \in \Theta_b}\{Q_{OLS}(b)\}, \tag{15.2}$$ where $\Theta_b \subseteq \mathbb{R}^k$ is the parameter space. Notably, if $X'X$ is non-singular one obtains the analytic expression $$\underset{k \times 1}{b^{OLS}} = (X'X)^{-1}X'y. \tag{15.3}$$ Equivalently, it is possible to express it in terms of the covariance between $X$ and $y$ and the variance of $X$, i.e. $$b^{OLS} = V\{X\}^{-1}\,Cv\{X, Y\}. \tag{15.4}$$

Proof. Developing the product of the residuals in (15.1):
$$Q_{OLS}(b) = \hat{u}(b)'\hat{u}(b) = (y - Xb)'(y - Xb) = y'y - b'X'y - y'Xb + b'X'Xb = y'y - 2b'X'y + b'X'Xb. \tag{15.5}$$
To find the minimum, let's compute the derivative of $Q_{OLS}$ with respect to $b$, set it equal to zero and solve for $b = b^{OLS}$, i.e.
$$\frac{\partial}{\partial b} Q_{OLS}(b) = -2X'y + 2X'Xb = 0 \;\Longrightarrow\; X'y = X'Xb \;\Longrightarrow\; b^{OLS} = (X'X)^{-1}X'y.$$
To establish whether the above solution is also a global minimum, one must check the sign of the second derivative, i.e.
$$\frac{\partial^2}{\partial b\,\partial b'} Q_{OLS}(b) = 2X'X > 0,$$
which is always positive definite and therefore denotes a global minimum. An alternative derivation of this estimator, as in (15.4), is obtained by estimating the variance-covariance matrix $Cv\{X, Y\}$ and the variance matrix $V\{X\}$ with their sample counterparts.

OLS estimator of beta
b <- solve(t(X) %*% X) %*% t(X) %*% y
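
As a quick check of the covariance formulation (15.4), the slope coefficients (excluding the intercept) can also be recovered from the sample variances and covariances of the simulated data. A minimal sketch using the `W` matrix generated above (`b_cov` is an illustrative name):

Check of the covariance formulation
# Slopes from (15.4): V{X}^{-1} Cv{X, y}; the 1/(n-1) factors cancel
b_cov <- solve(cov(W[, -1])) %*% cov(W[, -1], W[, 1])
# cbind(b[-1], b_cov) # the two sets of slope estimates coincide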
Singularity of $X'X$

Note that the solution (15.3) is available if and only if $X'X$ is non-singular. Hence, the columns of $X$ must not be linearly dependent: if one of the $k$ variables can be written as a linear combination of the others, then the determinant of the matrix $X'X$ is zero and the inversion is not possible. Moreover, to have $\mathrm{rank}(X'X) = k$ it is necessary that the number of observations be greater than or equal to the number of regressors, i.e. $n \geq k$.
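
To see the problem numerically, here is a minimal sketch in which a redundant column (an exact linear combination of two regressors) is appended to the design matrix; `X_bad` is an illustrative name.

Singular design matrix
# Append a column that is a linear combination of two existing columns
X_bad <- cbind(X, X[, 2] + X[, 3])
# det(t(X_bad) %*% X_bad)   # numerically zero
# solve(t(X_bad) %*% X_bad) # fails: the matrix cannot be inverted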

Intercept estimate

If the data matrix $X$ includes a column of ones, then the intercept parameter is obtained directly from (15.3). If it is not included, it is computed as $\alpha^{OLS} = E\{Y - X b^{OLS}\}$, i.e. the mean of the residuals.

OLS estimator of the intercept
# Vector of ones 
J_1n <- matrix(1, nrow = 1, ncol = n)
# Vector of means 
x_bar <- t((J_1n %*% X)/n)
# OLS estimator (X already contains a column of ones, so the
# residuals have zero mean and `a` is numerically zero)
a <- mean(y - X %*% b)

15.2.1 Projection matrices

Substituting the OLS solution (15.3) into the fitted values $\hat{y} = Xb^{OLS}$ we obtain the matrix $P_x$, which projects the vector $y$ onto the subspace of $\mathbb{R}^n$ generated by the columns of the regressor matrix $X$, i.e. $$P_x = X(X'X)^{-1}X'. \tag{15.6}$$

Projection matrix Px
P_x <- X %*% solve(t(X) %*% X) %*% t(X) 

Proposition 15.2 (Properties Matrix Px)
The projection matrix Px satisfies the following three properties, i.e. 

  1. Px is an n×n symmetric matrix.
  2. Px Px=Px is idempotent.
  3. Px X=X.
Properties of matrix Px
# Property 2. 
# sum((P_x %*% P_x - P_x)^2) # close to zero
# Property 3.
# sum((P_x %*% X - X)^2) # close to zero

Proof. Let’s consider property 2. of $P_x$, i.e. $$P_x P_x = \left(X(X'X)^{-1}X'\right)\left(X(X'X)^{-1}X'\right) = X(X'X)^{-1}(X'X)(X'X)^{-1}X' = X(X'X)^{-1}X' = P_x.$$ Let’s consider property 3. of $P_x$, i.e. $$P_x X = \left[X(X'X)^{-1}X'\right]X = X(X'X)^{-1}(X'X) = X.$$

Substituting the OLS solution (15.3) into the residuals $\hat{u} = y - Xb^{OLS}$ we obtain another projection matrix $M_x$, which projects the vector $y$ onto the subspace orthogonal to the one generated by the columns of $X$, i.e. $$M_x = I_n - P_x, \tag{15.7}$$ where $I_n$ is the identity matrix.

Projection matrix Mx
M_x <- diag(1, n, n) - P_x

Proposition 15.3 (Properties Matrix Mx)
The projection matrix Mx satisfies the following 3 properties, i.e. 

  1. Mx is n×n and symmetric.
  2. Mx Mx=Mx is idempotent.
  3. Mx X=0.
Properties of matrix Mx
# Property 2. 
# sum((M_x %*% M_x - M_x)^2) # close to zero
# Property 3.
# sum((M_x %*% X)^2) # close to zero

Proof. Let’s consider property 2. of $M_x$, i.e. $$M_x M_x = (I_n - P_x)(I_n - P_x) = I_n - 2P_x + P_x P_x = I_n - P_x = M_x.$$ Let’s consider property 3. of $M_x$, i.e. $$M_x X = (I_n - P_x)X = \left(I_n - X(X'X)^{-1}X'\right)X = X - X = 0.$$

Remark 15.1. By definition $M_x$ and $P_x$ are orthogonal, i.e. $P_x M_x = 0$. Hence, the fitted values $\hat{y} = P_x y$ are the projection of the observed values onto the subspace generated by $X$. Symmetrically, the fitted residuals $\hat{u} = M_x y$ are the projection of the observed values onto the subspace orthogonal to the one generated by $X$.

Proof. Let’s prove the orthogonality between $M_x$ and $P_x$, i.e. $$P_x M_x = P_x(I_n - P_x) = P_x - P_x P_x = P_x - P_x = 0.$$

Orthogonality of matrix Px and Mx
# sum(M_x %*% P_x) # close to zero
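
As a quick illustration of Remark 15.1, the fitted values and the residuals can be obtained directly from the two projection matrices computed above; `y_hat_P` and `u_hat_M` are illustrative names.

Fitted values and residuals via projections
# Projection of y onto the column space of X and onto its orthogonal complement
y_hat_P <- P_x %*% y
u_hat_M <- M_x %*% y
# sum((y_hat_P + u_hat_M - y)^2) # close to zero: y decomposes exactly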

15.2.2 Properties OLS

Theorem 15.1 (Gauss-Markov theorem)
Under the Gauss-Markov hypotheses the Ordinary Least Squares (OLS) estimator is BLUE (Best Linear Unbiased Estimator), where “best” stands for the estimator with minimum variance in the class of linear unbiased estimators of the unknown true population parameter $b$. More precisely, the Gauss-Markov hypotheses are:

  1. $y = Xb + u$.
  2. $E\{u\} = 0$.
  3. $E\{uu'\} = \sigma_u^2 I_n$, i.e. homoskedastic and uncorrelated errors.
  4. $X$ is non-stochastic and independent of the errors for all $n$.

Proposition 15.4 (Properties OLS estimator)
1. Unbiased: $b^{OLS}$ is correct and its conditional expectation is equal to the true parameter in the population, i.e. $$E\{b^{OLS} \mid X\} = b. \tag{15.8}$$ 2. Linear, in the sense that it can be written as a linear combination of $y$, i.e. $b^{OLS} = A_x y$, where $A_x$ does not depend on $y$, i.e.
$$b^{OLS} = A_x y, \qquad A_x = (X'X)^{-1}X'. \tag{15.9}$$ 3. Under the Gauss-Markov hypotheses (Theorem 15.1), $b^{OLS}$ is the estimator with minimum variance in the class of linear unbiased estimators of $b$, and its variance reads $$V\{b^{OLS} \mid X\} = \sigma_u^2 (X'X)^{-1}. \tag{15.10}$$ Denoting with $c_{jj}$ the $j$-th element on the diagonal of $(X'X)^{-1}$, the variance of the $j$-th coefficient reads $$V\{b_j^{OLS} \mid X\} = \sigma_u^2 (X'X)^{-1}_{[j,j]} = \sigma_u^2 c_{jj}, \tag{15.11}$$ where $(X'X)^{-1}_{[j,j]}$ denotes the diagonal element in position $[j,j]$.

Matrix $(X'X)^{-1}$
tXX <- solve(t(X) %*% X)

Proof.

  1. The OLS estimator is unbiased: its expected value, computed from (15.3) by substituting $y = Xb + u$, is equal to the true parameter in the population, i.e.
    $$E\{b^{OLS} \mid X\} = E\{(X'X)^{-1}X'y \mid X\} = E\{(X'X)^{-1}X'(Xb + u) \mid X\} = (X'X)^{-1}X'Xb + (X'X)^{-1}X'E\{u \mid X\} = b.$$

  2. In general, applying the properties of the variance operator, the variance of $b^{OLS}$ is computed as $$V\{b^{OLS} \mid X\} = V\{(X'X)^{-1}X'y \mid X\} = V\{(X'X)^{-1}X'(Xb + u) \mid X\} = V\{b + (X'X)^{-1}X'u \mid X\} = V\{(X'X)^{-1}X'u \mid X\}.$$ Then, since $X$ is non-stochastic, one can bring it outside the variance, thus obtaining $$V\{b^{OLS} \mid X\} = (X'X)^{-1}X'\,V\{u \mid X\}\,X(X'X)^{-1} = (X'X)^{-1}X'\,E\{uu' \mid X\}\,X(X'X)^{-1}. \tag{15.12}$$ Under the Gauss-Markov hypotheses (Theorem 15.1) the conditional variance is $V\{u \mid X\} = \sigma_u^2 I_n$ and therefore (15.12) reduces to $$V\{b^{OLS} \mid X\} = \sigma_u^2 (X'X)^{-1}X'X(X'X)^{-1} = \sigma_u^2 (X'X)^{-1}.$$
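
To make the unbiasedness property concrete, here is a small Monte Carlo sketch. Since `W` is drawn from a multivariate normal with covariance matrix `sigma`, the implied population slopes are $\Sigma_{xx}^{-1}\Sigma_{xy}$; averaging the OLS estimates over many simulated samples should reproduce them. The sample size of 50 and the 2000 replications are arbitrary choices.

Monte Carlo check of unbiasedness
# Implied true slopes of the conditional expectation E{Y | x}
b_true <- solve(sigma[-1, -1]) %*% sigma[-1, 1]
# Re-estimate the OLS coefficients on many simulated samples
b_mc <- replicate(2000, {
  Ws <- mvtnorm::rmvnorm(50, sigma = sigma)
  Xs <- cbind(1, Ws[, -1])
  drop(solve(t(Xs) %*% Xs) %*% t(Xs) %*% Ws[, 1])
})
# rowMeans(b_mc) # intercept close to 0, slopes close to b_true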

15.3 Variance decomposition

In a linear model, the deviance (or total variance) of the dependent variable y can be decomposed into the sum of the regression variance and the dispersion variance. This decomposition helps us understand how much of the total variability in the data is explained by the model and how much is due to unexplained variability (residuals).

  • Total Deviance ($Dev\{y\}$): represents the total variability of the dependent variable $y$. It is calculated as the sum of the squared differences of the $y_i$ from their mean $\bar{y}$.

  • Regression Deviance ($Dev_{Reg}\{y\}$): represents the portion of variability that is explained by the regression model. It is computed as the sum of the squared differences between the fitted values $\hat{y}_i$ and $\bar{y}$.

  • Dispersion Deviance ($Dev_{Disp}\{y\}$): represents the portion of variability that is not explained by the model. It is computed as the sum of the squared differences between the observed values $y_i$ and the fitted values $\hat{y}_i$.

Hence, the total deviance of $y$ can be decomposed as follows:
$$\begin{aligned}
Dev\{y\} &= Dev_{Reg}\{y\} + Dev_{Disp}\{y\} \\
\sum_{i=1}^n (y_i - \bar{y})^2 &= \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^n (y_i - \hat{y}_i)^2 \\
y'y - n\bar{y}^2 &= b'X'Xb - n\bar{y}^2 + \hat{u}'\hat{u},
\end{aligned} \tag{15.13}$$
where $\bar{y}$ is the sample mean.

Variance decomposition
y_bar <- mean(y)
# Fitted values 
y_hat <- a + X %*% b
# Residuals 
u <- y - y_hat
# Deviance of y
dev_y <- sum(y^2) - n*y_bar^2 # 6.642875 
# Deviance of regression
dev_reg <- sum((y_hat - y_bar)^2) # 6.162379
# Deviance of dispersion
dev_disp <- sum(u^2) # 0.4804956
# equal to zero 
# dev_y - (dev_reg + dev_disp) 

Proof. Let’s prove the expression for the regression deviance $Dev_{Reg}\{y\}$, i.e.
$$\begin{aligned}
Dev_{Reg}\{y\} &= Dev\{y\} - Dev_{Disp}\{y\} \\
&= y'y - n\bar{y}^2 - \hat{u}'\hat{u} \\
&= y'y - n\bar{y}^2 - (y - Xb)'(y - Xb) \\
&= y'y - n\bar{y}^2 - y'y + y'Xb + b'X'y - b'X'Xb \\
&= -n\bar{y}^2 + 2b'X'y - b'X'Xb \\
&= b'X'Xb - n\bar{y}^2,
\end{aligned}$$
where the last equality follows from the normal equations $X'y = X'Xb$.

The decomposition of the deviance of $y$ also holds with respect to the corresponding degrees of freedom.

| Deviance | Degrees of freedom | Variance |
|----------|--------------------|----------|
| $Dev\{y\} = \sum_{i=1}^n (y_i - \bar{y})^2$ | $n-1$ | $\hat{s}_y^2 = \dfrac{Dev\{y\}}{n-1}$ |
| $Dev_{Reg}\{y\} = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2$ | $k$ | $\hat{s}_r^2 = \dfrac{Dev_{Reg}\{y\}}{k}$ |
| $Dev_{Disp}\{y\} = \sum_{i=1}^n (y_i - \hat{y}_i)^2$ | $n-k-1$ | $\hat{s}_u^2 = \dfrac{Dev_{Disp}\{y\}}{n-k-1}$ |

Table 15.1: Deviance and variance decomposition in a multivariate linear model ($k$ counts the regressors, excluding the intercept, so that $n-1 = k + (n-k-1)$)
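
Following Table 15.1, the three sample variances can be computed from the deviances obtained above. A minimal sketch; recall that in the code `k = ncol(X)` already counts the intercept column, so `k - 1` and `n - k` play the roles of $k$ and $n-k-1$ in the table.

Sample variances from the decomposition
s2_y   <- dev_y    / (n - 1) # variance of y
s2_reg <- dev_reg  / (k - 1) # regression variance
s2_dis <- dev_disp / (n - k) # dispersion variance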

15.3.1 Estimator of σu2

The OLS estimator does not depend on the variance of the residuals $\sigma_u^2$, and it is not possible to obtain both estimators in a single step.

Proposition 15.5 (Unbiased estimator of σu2)
Let’s define an unbiased estimator of the population variance $\sigma_u^2$ as $$\hat{s}_u^2 = \frac{\hat{u}(b^{OLS})'\hat{u}(b^{OLS})}{n-k-1},$$ where the denominator $n-k-1$ applies when an intercept is included together with the $k$ regressors (it becomes $n-k$ without the intercept). In general the regression variance overestimates the true variance $\sigma_u^2$, i.e. $$E\{\hat{s}_r^2\} = \sigma_u^2 + g(b, X), \qquad g(b, X) \geq 0.$$ Only in the special case where $b_1 = b_2 = \dots = b_k = 0$ in the population is $g(b, X) = 0$, and then the regression variance also produces an unbiased estimate of $\sigma_u^2$.

Estimator of σu2
# note: here k = ncol(X) counts the intercept column, so n - k equals n - 3 - 1
s2_u <- dev_disp / (n - k) # 0.0800826

Proof. By definition, the residuals can be computed by pre-multiplying $y$ by the matrix $M_x$ (15.7), i.e. $$\hat{u}(b^{OLS}) = y - \hat{y}(b^{OLS}) = y - Xb^{OLS} = y - X(X'X)^{-1}X'y = (I_n - P_x)y = M_x y.$$ Substituting the true relation in the population, i.e. $y = Xb + u$, one obtains $$\hat{u}(b^{OLS}) = M_x(Xb + u) = M_x X b + M_x u = M_x u,$$ since $M_x X = 0$. Being the matrix $M_x$ symmetric and idempotent (Proposition 15.3): $$\hat{u}(b^{OLS})'\hat{u}(b^{OLS}) = (M_x u)'(M_x u) = u'M_x'M_x u = u'M_x u.$$ Thus, since $u'M_x u$ is a scalar, the expected value of the deviance of dispersion reads $$E\{\hat{u}(b^{OLS})'\hat{u}(b^{OLS})\} = E\{u'M_x u\} = E\{\mathrm{trace}(u'M_x u)\} = E\{\mathrm{trace}(M_x uu')\} = \mathrm{trace}(M_x E\{uu'\}) = \sigma_u^2\,\mathrm{trace}(M_x).$$ The trace of the matrix $M_x$ reads $$\mathrm{trace}(M_x) = \mathrm{trace}(I_n - P_x) = \mathrm{trace}(I_n) - \mathrm{trace}(X(X'X)^{-1}X') = \mathrm{trace}(I_n) - \mathrm{trace}((X'X)^{-1}X'X) = n - \mathrm{trace}(I_{k+1}) = n - k - 1,$$ where implicitly we consider a column of ones for the intercept, so that $X'X$ has dimension $(k+1) \times (k+1)$. Hence, $$E\{\hat{u}(b^{OLS})'\hat{u}(b^{OLS})\} = \sigma_u^2(n - k - 1).$$ Equivalently, the expectation of the deviance of dispersion is equal to $$E\{Dev_{Disp}\{y\}\} = E\{\hat{u}'\hat{u}\} = \sigma_u^2(n - k - 1).$$
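
The trace argument in the proof can be checked numerically on the matrices already computed; a quick sketch.

Trace of Mx
# The trace of M_x equals the residual degrees of freedom
# sum(diag(M_x)) # equal to n - k (= 6 here, with k counting the intercept column)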

15.3.2 R2

The R2 statistic, also known as the coefficient of determination, is a measure used to assess the goodness of fit of a regression model. In a multivariate context, it evaluates how well the independent variables explain the variability of the dependent variable.

Definition 15.1 (Multivariate R2)
The $R^2$ represents the proportion of the variation in the dependent variable that is explained or predicted by the independent variables. Formally, it is defined as the ratio of the deviance explained by the model ($Dev_{Reg}\{y\}$) to the total deviance ($Dev\{y\}$). It can also be expressed as one minus the ratio of the residual deviance ($Dev_{Disp}\{y\}$) to the total deviance, i.e. $$R^2 = \frac{Dev_{Reg}\{y\}}{Dev\{y\}} = 1 - \frac{Dev_{Disp}\{y\}}{Dev\{y\}}. \tag{15.14}$$ Using the variance decomposition (15.13), it is possible to write a multivariate version of the $R^2$ as $$R^2 = \frac{(b^{OLS})'X'Xb^{OLS} - n\bar{y}^2}{y'y - n\bar{y}^2} = 1 - \frac{\hat{u}(b^{OLS})'\hat{u}(b^{OLS})}{y'y - n\bar{y}^2}. \tag{15.15}$$

R2
R2 <- 1 - dev_disp  / dev_y # 0.9276675

The numerator of the first expression represents the variance explained by the regression model, while the denominator is the total variance of the dependent variable. The numerator of the second expression represents the variance of the residuals, i.e. the variance not explained by the model. A value of $R^2$ close to 1 indicates that a large proportion of the variability of the dependent variable is explained by the regression model, while a value close to 0 indicates that the model explains very little of the variability.

Variance Inflation Factor (VIF)

An alternative expression for the variance of the $j$-th coefficient (15.11) reads $$V\{b_j^{OLS}\} = \frac{\sigma_u^2}{Dev\{X_j\}} \cdot \underbrace{\frac{1}{1 - R^2_{j0}}}_{VIF_j},$$ where $Dev\{X_j\}$ is the deviance of the regressor $X_j$ and $R^2_{j0}$ is the multivariate coefficient of determination of the regression of $X_j$ on the other regressors.
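
A sketch of the VIF computation on the simulated data: each regressor (columns 2 to 4 of `W`) is regressed on the remaining ones and $R^2_{j0}$ is taken from the auxiliary fit. The use of `lm()` here is just a convenient shortcut and `vif_j` is an illustrative name.

Variance inflation factors
vif_j <- sapply(2:4, function(j) {
  # Auxiliary regression of X_j on the other regressors
  r2_j0 <- summary(lm(W[, j] ~ W[, -c(1, j)]))$r.squared
  1 / (1 - r2_j0)
})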

Proposition 15.6 (Adjusted R2)
A more robust indicator, which does not always increase with the addition of a new regressor, is the adjusted $R^2$, computed as $$\bar{R}^2 = 1 - \frac{n-1}{n-k-1}\,\frac{Dev_{Disp}\{y\}}{Dev\{y\}} = 1 - \frac{\hat{s}_u^2}{\hat{s}_y^2}.$$ The $\bar{R}^2$ can be negative, and its value is always less than or equal to that of $R^2$. Unlike $R^2$, the adjusted version increases only when the new explanatory variable improves the model more than would be expected simply by adding another variable.

Adjusted R2
# R2_bar <- 1 - (n - 1) / (n - k) * (dev_disp / dev_y) # 0.8915013

Proof. To arrive at the formulation of the adjusted $R^2$, consider that under the null hypothesis $H_0: b_1 = b_2 = \dots = b_k = 0$ the regression variance $\hat{s}_r^2$ is an unbiased estimate of the residual variance $\sigma_u^2$. Hence, under $H_0$:
$$\frac{n-1}{k}\,E\!\left\{\frac{Dev_{Reg}\{y\}}{Dev\{y\}}\right\} = 1.$$
This implies that the expectation of the $R^2$ is not zero (as it should be under $H_0$) but
$$E\{R^2\} = \frac{k}{n-1}.$$
Let's rescale the $R^2$ so that it equals zero when $H_0$ holds true, i.e.
$$R_c^2 = R^2 - \frac{k}{n-1}.$$
However, this specification implies that when $R^2 = 1$ (perfect linear relation between $X$ and $y$) the value of $R_c^2$ is smaller than 1, i.e. $R_c^2 = \frac{n-k-1}{n-1} < 1$. Hence, let's correct the indicator again so that it takes values in $[0, 1]$, i.e.
$$\bar{R}^2 = \left(R^2 - \frac{k}{n-1}\right)\frac{n-1}{n-k-1} = \frac{R^2(n-1) - k}{n-k-1} = \frac{n-1}{n-k-1}R^2 - \frac{k}{n-k-1}.$$
Remembering that $R^2$ can be rewritten as in (15.14), one obtains
$$\begin{aligned}
\bar{R}^2 &= \frac{n-1}{n-k-1}\left(1 - \frac{Dev_{Disp}\{y\}}{Dev\{y\}}\right) - \frac{k}{n-k-1} \\
&= \frac{n-1}{n-k-1} - \frac{k}{n-k-1} - \frac{n-1}{n-k-1}\,\frac{Dev_{Disp}\{y\}}{Dev\{y\}} \\
&= 1 - \frac{n-1}{n-k-1}\,\frac{Dev_{Disp}\{y\}}{Dev\{y\}} \\
&= 1 - \frac{\hat{s}_u^2}{\hat{s}_y^2}.
\end{aligned}$$

Limitations of R2

The R2 statistic has some limitations. Firstly, it can be close to 1 even if the relationship between the variables is not linear. Additionally, R2 increases whenever a new regressor is added to the model, making it unsuitable for comparing models with different numbers of regressors.

15.4 Diagnostic

Let’s consider a linear model where the residuals $u$ are IID normally distributed random variables. Hence, the working hypotheses of the Gauss-Markov theorem hold true.

15.4.1 t-test for bj

A t-test evaluates whether a parameter in a regression is statistically different from zero, given the effect of the other $k-1$ regressors. The test is built under the null hypothesis of linear independence in the population between $y$ and $X_j$, i.e. $$H_0: b_j = 0 \qquad H_1: b_j \neq 0.$$ If the residuals are normally distributed, then the vector of parameters $\hat{b}$ is distributed as a multivariate normal, and thus the marginal distribution of each $\hat{b}_j$ is also normal.

Using the expectation (15.8) and variance (15.11) of $b_j^{OLS}$, we can standardize the estimated parameter $b_j^{OLS}$ to obtain a Student-t distributed statistic, i.e. $$t_j = \frac{b_j^{OLS} - E\{b_j^{OLS}\}}{\sqrt{V\{b_j^{OLS}\}}} = \frac{b_j^{OLS} - b_j}{\sqrt{\hat{s}_u^2\,c_{jj}}} \sim t_{n-k-1}, \tag{15.16}$$ where the unknown $\sigma_u^2$ is replaced with its unbiased estimator $\hat{s}_u^2$. Under $H_0: b_j = 0$ one obtains $$t_j \overset{H_0}{=} \frac{b_j^{OLS}}{\sqrt{\hat{s}_u^2\,c_{jj}}} \sim t_{n-k-1}. \tag{15.17}$$ Given a significance level $\alpha$, the null hypothesis is rejected if the test statistic falls in the rejection region, i.e. $$H_0 \text{ is rejected} \iff t_j < q_{\alpha/2} \quad \text{or} \quad t_j > q_{1-\alpha/2},$$ where $q$ is the quantile function of a Student-t with $\nu = n-k-1$ degrees of freedom.

t-tests
# variance of the b_ols
v_b_ols <- diag(tXX) * s2_u
# t-statistic
t_stat <- b / sqrt(v_b_ols)
# p-values 
p.val <- (1-pt(abs(t_stat), n-k))*2
# [t1]  2.480090  --> p.value 0.0477  (4.77 %)
# [t2]  2.867638  --> p.value 0.0285  (2.85 %)
# [t3] -7.696738  --> p.value 0.00025 (0.025 %)

15.4.2 Confidence intervals for b

Under the assumption of normality, the statistic $t_j$ (15.16) can be used to build a confidence interval for $b_j$, i.e. $$b_j \in \left[\, b_j^{OLS} + q_{\alpha/2}\sqrt{V\{b_j^{OLS}\}}, \;\; b_j^{OLS} + q_{1-\alpha/2}\sqrt{V\{b_j^{OLS}\}} \,\right],$$ where $\alpha$ is the significance level, $q_\alpha$ is the quantile at level $\alpha$ of a Student-t distribution with $n-k-1$ degrees of freedom and $V\{b_j^{OLS}\}$ is given in (15.11).

Confidence intervals
conf.int <- cbind(b + qt(0.05, n-k) * sqrt(v_b_ols),
                  b + qt(0.95, n-k) * sqrt(v_b_ols))
# 90% confidence intervals for the coefficients
# [b1]  0.05049223  < b <  0.4159749
# [b2]  0.11779346  < b <  0.6129894
# [b3] -0.62083599  < b < -0.3705442

15.4.3 F-test for the regression

The F-test evaluates the significance of the entire regression model by testing the null hypothesis of linear independence between $y$ and $X$, i.e. $$H_0: b_1 = b_2 = \dots = b_k = 0 \qquad H_1: \text{at least one } b_j \neq 0,$$ where under $H_0$ the only coefficient possibly different from zero is the intercept. In this case, the test statistic reads $$F_{test} = \frac{\hat{s}_r^2}{\hat{s}_u^2} = \frac{Dev_{Reg}\{y\}/k}{Dev_{Disp}\{y\}/(n-k-1)} \sim F_{k,\,n-k-1}, \tag{15.18}$$ which is distributed as a Fisher $F$ with $\nu_1 = k$ and $\nu_2 = n-k-1$ degrees of freedom, where $\hat{s}_r^2$ is the regression variance and $\hat{s}_u^2$ is the dispersion variance. Fixing a significance level $\alpha$, the null hypothesis $H_0$ is rejected if $F_{test}$ exceeds the $(1-\alpha)$ quantile of the $F_{k,\,n-k-1}$ distribution. Remembering the relation between the deviances and the $R^2$, i.e. $Dev_{Reg}\{y\} = R^2\,Dev\{y\}$ and $Dev_{Disp}\{y\} = (1 - R^2)\,Dev\{y\}$, it is possible to express the F-test in terms of the multivariate $R^2$ as $$F_{test} = \frac{R^2}{1 - R^2}\,\frac{n-k-1}{k} \sim F_{k,\,n-k-1}.$$

F-test
F_test <- R2/(1-R2) * (n-k)/(k-1)            # 25.65
F_test <- dev_reg / dev_disp * ((n-k)/(k-1)) # 25.65
# Critical value 
# qf(0.9, k-1, n-k) # about 3.29
# P-value
# 1 - pf(F_test, k-1, n-k) # 0.0008050532 (0.08 %)
Interpretation F-test

If the null hypothesis H0 is rejected then:

  • The variability of Y explained by the model is significantly greater than the residual variability.
  • At least one of the $k$ regressors has a coefficient $b_j$ that is significantly different from zero in the population.

On the contrary, if $H_0$ is not rejected, then the model is not adequate and there is no evidence of a linear relation between $y$ and $X$.
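
All the diagnostics above can be cross-checked against R's built-in `lm()`: the coefficients, the t statistics and the overall F statistic reported by `summary()` should match the manual computations. A sketch on the simulated data of this chapter:

Cross-check with lm()
fit <- lm(y ~ W[, -1]) # intercept plus the three regressors
# summary(fit) # compare coefficients, t-tests and F statistic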

15.5 Multi-equations OLS

Proposition 15.7 (Multi-equations OLS estimator)
Let’s consider a multivariate linear model, i.e. with $p > 1$ response variables; in matrix notation the model reads $$\underset{n \times p}{Y} = \underset{n \times 1}{J_{n,1}}\,\underset{1 \times p}{a'} + \underset{n \times k}{X}\,\underset{k \times p}{b'} + \underset{n \times p}{U},$$ where $a$ is the $p \times 1$ vector of intercepts and $b$ the $p \times k$ matrix of slopes. The OLS estimate of $b$ is then obtained, as in (15.4), from $$b^{OLS} = Cv(Y, X)\,Cv(X)^{-1},$$ and similarly for the intercepts $$a^{OLS} = E\{Y\} - b^{OLS}E\{X\}.$$ The variance-covariance matrix of the residuals is computed as $$\Sigma = Cv(U) = Cv(Y) - b^{OLS}\,Cv(X, Y).$$

Example 15.1 Let’s simulate $n$ observations for the regressors $X$ from a multivariate normal distribution with parameters $$\mu_X = \begin{pmatrix} 0.5 \\ 0.5 \\ 0.5 \end{pmatrix}, \qquad \Sigma_X = \begin{pmatrix} 0.5 & 0.2 & 0.1 \\ 0.2 & 1.2 & 0.1 \\ 0.1 & 0.1 & 0.3 \end{pmatrix}.$$ Then, to construct two dependent variables we simulate the $p \times k$ matrix of parameters $b$ (with $p \cdot k = 6$ entries) from a standard normal, i.e. $b_{i,j} \sim N(0,1)$, and the intercept parameters $a$ from a uniform distribution on $[0,1]$, i.e. $$\underset{p \times k}{b} = \begin{pmatrix} b_{1,1} & b_{1,2} & b_{1,3} \\ b_{2,1} & b_{2,2} & b_{2,3} \end{pmatrix}, \qquad \underset{p \times 1}{a} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}.$$ Thus, for $i = 1, \dots, n$, one obtains a multi-equation model of the form $$\begin{cases} Y_{i,1} = a_1 + b_{1,1}X_{i,1} + b_{1,2}X_{i,2} + b_{1,3}X_{i,3} + u_{i,1} \\ Y_{i,2} = a_2 + b_{2,1}X_{i,1} + b_{2,2}X_{i,2} + b_{2,3}X_{i,3} + u_{i,2} \end{cases}$$ where $u_{i,1}$ and $u_{i,2}$ are simulated from a multivariate normal with true covariance matrix $$Cv\{u\} = \begin{pmatrix} 0.55 & 0.3 \\ 0.3 & 0.70 \end{pmatrix}.$$

Setup
library(dplyr)
######################## Setup ########################
set.seed(1) # random seed 
n <- 500    # number of observations
p <- 2      # number of dependent variables 
k <- 3      # number of regressors 
# True regressor's mean 
true_e_x <- matrix(rep(0.5, k), ncol = 1)
# True regressor's covariance matrix 
true_cv_x <-  matrix(c(v_z1 = 0.5, cv_12 = 0.2, cv_13 = 0.1, 
                       cv_21 = 0.2, v_z2 = 1.2, cv_23 = 0.1, 
                       cv_31 = 0.1, cv_32 = 0.1, v_z3 = 0.3), 
                     nrow = k, byrow = FALSE)
# True covariance of the residuals 
true_cv_e <- matrix(c(0.55, 0.3, 0.3, 0.70), nrow = p)
##########################################################
# Generate a synthetic data set 
## Regressors  
X <- mvtnorm::rmvnorm(n, true_e_x, true_cv_x) 
## Slope (Beta)
true_beta <- rnorm(p*k)
true_beta <- matrix(true_beta, ncol = k, byrow = TRUE) 
## Intercept (Alpha)
true_alpha <- runif(p, min = 0, max = 1)
true_alpha <- matrix(true_alpha, ncol = 1) 
## Matrix of 1 for matrix multiplication  
ones <- matrix(rep(1, n), ncol = 1)
## Systematic (noise-free) part of the response 
Y <- ones %*% t(true_alpha) + X %*% t(true_beta)
## Simulated error 
eps <- mvtnorm::rmvnorm(n, sigma = true_cv_e)
## Perturbed response variable 
Y_tilde <- Y + eps
Parameters fit
# True Beta 
df_beta_true <- dplyr::as_tibble(true_beta)
colnames(df_beta_true) <- paste0("$\\mathbf{b}_", 1:ncol(df_beta_true), "$")
# Perturbed Beta (fitted)
fit_beta <- cov(Y_tilde, X) %*% solve(cov(X))
df_beta_pert <- dplyr::as_tibble(fit_beta)
colnames(df_beta_pert) <- paste0("$\\mathbf{b}_", 1:ncol(df_beta_pert), "$")
# True Alpha 
df_alpha_true <- dplyr::as_tibble(t(true_alpha))
colnames(df_alpha_true) <- paste0("$\\mathbf{a}_", 1:ncol(df_alpha_true), "$")
# Perturbed Alpha (fitted)
## Perturbed mean  
e_y <- matrix(apply(Y_tilde, 2, mean), ncol = 1)
e_x <- matrix(apply(X, 2, mean), ncol = 1)
## Estimated Alpha (on perturbed data)
fit_alpha <- e_y - cov(Y_tilde, X) %*% solve(cov(X)) %*% e_x
df_alpha_pert <- dplyr::as_tibble(t(fit_alpha))
colnames(df_alpha_pert) <- paste0("$\\mathbf{a}_", 1:ncol(df_alpha_pert), "$")
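
Proposition 15.7 also provides the covariance matrix of the residuals; a sketch of its estimate on the simulated data, to be compared with `true_cv_e` (`fit_cv_e` is an illustrative name).

Estimated residual covariance
# Sigma = Cv(Y) - b_ols Cv(X, Y), cf. Proposition 15.7
fit_cv_e <- cov(Y_tilde) - fit_beta %*% cov(X, Y_tilde)
# fit_cv_e # compare with true_cv_e
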
| Parameter | $\mathbf{b}_1$ | $\mathbf{b}_2$ | $\mathbf{b}_3$ |
|---|---|---|---|
| True (eq. 1) | 0.8500 | -0.9253 | 0.8936 |
| True (eq. 2) | -0.9410 | 0.5390 | -0.1820 |
| Fitted (eq. 1) | 0.8457 | -0.8699 | 0.9396 |
| Fitted (eq. 2) | -0.9532 | 0.5518 | -0.1804 |

Table 15.2: True and fitted slope parameters
| Parameter | $\mathbf{a}_1$ | $\mathbf{a}_2$ |
|---|---|---|
| True | 0.8137 | 0.8068 |
| Fitted | 0.7942 | 0.7423 |

Table 15.3: True and fitted intercept parameters