13  Multivariate data

Let’s consider a matrix \(\mathbf{X}_{n \times k}\) with \(n\) observations and \(k\) variables, and let’s define some useful operations that can be performed on it. \[ \underset{n \times k }{\mathbf{X}} = \begin{pmatrix} x_{1,1} & \dots & x_{1,j} & \dots & x_{1,k} \\ \vdots & & \vdots & & \vdots \\ x_{i,1} & \dots & x_{i,j} & \dots & x_{i,k} \\ \vdots & & \vdots & & \vdots \\ x_{n,1} & \dots & x_{n,j} & \dots & x_{n,k} \\ \end{pmatrix} = \begin{pmatrix} \underset{n \times 1}{\mathbf{x}_{1}} & \dots & \underset{n \times 1}{\mathbf{x}_{j}} & \dots & \underset{n \times 1}{\mathbf{x}_{k}} \\ \end{pmatrix} \text{.} \tag{13.1}\] where \(\mathbf{x}_{j} = \begin{pmatrix} x_{1,j} & \dots & x_{i,j} & \dots & x_{n,j} \end{pmatrix}^\top\) collects the \(n\) observations of the \(j\)-th variable. As a working example, let’s simulate such a matrix:

Generate the matrix of data
# True variance - covariance matrix 
sigma <- diag(1, 4)
sigma[2,2] <- 1.3
sigma[3,3] <- 0.7
sigma[4,4] <- 1.7
sigma[1,2] <- sigma[2,1] <- 0.3
sigma[1,3] <- sigma[3,1] <- 0.4
sigma[1,4] <- sigma[4,1] <- -0.8
# Simulate data 
set.seed(1)
W <- mvtnorm::rmvnorm(10, sigma = sigma) 
y <- W[,1]
X <- W[,-1]
n <- nrow(X)
k <- ncol(X)

\[ \underset{10 \times 3 }{\mathbf{X}} = \begin{array}{c|rrr} i & \mathbf{x}_1 & \mathbf{x}_2 & \mathbf{x}_3 \\ \hline 1 & 0.167 & -0.750 & 2.191 \\ 2 & -0.865 & 0.524 & 0.793 \\ 3 & -0.276 & 1.374 & 0.327 \\ 4 & -2.623 & 0.783 & 0.179 \\ 5 & 1.061 & 0.657 & 0.809 \\ 6 & 0.9743 & 0.179 & -2.806 \\ 7 & -0.000647 & -0.0367 & -2.078 \\ 8 & 0.364 & 0.946 & 0.132 \\ 9 & 0.0189 & -1.0173 & -0.733 \\ 10 & -0.132 & 0.8137 & 1.153 \\ \end{array} \]
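Since rmvnorm() requires a valid covariance matrix, a quick base-R sanity check (a standalone sketch that rebuilds the same sigma as above) can confirm that the chosen matrix is symmetric and positive definite:

```r
# Rebuild the true variance-covariance matrix used above
sigma <- diag(c(1, 1.3, 0.7, 1.7))
sigma[1, 2] <- sigma[2, 1] <- 0.3
sigma[1, 3] <- sigma[3, 1] <- 0.4
sigma[1, 4] <- sigma[4, 1] <- -0.8
# A valid covariance matrix is symmetric with positive eigenvalues
stopifnot(isTRUE(all.equal(sigma, t(sigma))))
stopifnot(all(eigen(sigma, symmetric = TRUE)$values > 0))
```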

13.1 Vector of means

Let’s consider the matrix \(\mathbf{X}\) (Equation 13.1); the vector of means of its columns is computed as: \[ \underset{k \times 1}{\bar{\mathbf{x}}} = \begin{pmatrix} \bar{x}_1 \\ \vdots \\ \bar{x}_k \\ \end{pmatrix} = \left( \frac{1}{n} \mathbf{J}_{1,n} \mathbf{X} \right)^{\top} = \frac{1}{n} \mathbf{X}^{\top} \mathbf{J}_{n,1} \text{.} \] where \(\mathbf{J}_{n,1}\) is the vector of ones (Equation 31.2).

Vector of means
# Vector of ones 
J_1n <- matrix(1, nrow = 1, ncol = n)
# Vector of means 
x_bar <- t((J_1n %*% X)/n)

# Standard computation
# matrix(apply(X, 2, mean), nrow = k, ncol = 1)

\[ \underset{3 \times 1}{\bar{\mathbf{x}}} = \begin{pmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \bar{x}_3 \\ \end{pmatrix} = \begin{pmatrix} - 0.131 \\ 0.347 \\ -0.00296 \\ \end{pmatrix} \]
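As a standalone check (with freshly simulated data; the names X0 and J0 are only illustrative, not the chapter’s objects), the matrix formula for the vector of means agrees with colMeans():

```r
set.seed(42)
n0 <- 8; k0 <- 3
X0 <- matrix(rnorm(n0 * k0), n0, k0)
J0 <- matrix(1, nrow = 1, ncol = n0)   # row vector of ones
x_bar0 <- t((J0 %*% X0) / n0)          # (1/n) J_{1,n} X, transposed
stopifnot(isTRUE(all.equal(as.vector(x_bar0), colMeans(X0))))
```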

13.2 Deviation matrix

Let’s compute the matrix of centered observations, where each element is computed as \(\tilde{x}_{i,j} = x_{i,j} - \bar{x}_{j}\), i.e. \[ \underset{n \times k }{\widetilde{\mathbf{X}}} = \begin{pmatrix} \tilde{x}_{1,1} & \dots & \tilde{x}_{1,j} & \dots & \tilde{x}_{1,k} \\ \vdots & & \vdots & & \vdots \\ \tilde{x}_{i,1} & \dots & \tilde{x}_{i,j} & \dots & \tilde{x}_{i,k} \\ \vdots & & \vdots & & \vdots \\ \tilde{x}_{n,1} & \dots & \tilde{x}_{n,j} & \dots & \tilde{x}_{n,k} \end{pmatrix} \text{.} \] In matrix notation, \(\widetilde{\mathbf{X}}\) can be computed as: \[ \begin{aligned} \widetilde{\mathbf{X}} & {} = \mathbf{X} - \mathbf{J}_{n,1} \bar{\mathbf{x}}^{\top} = \\ & = \mathbf{X} - \mathbf{J}_{n,1} \left( \frac{1}{n} \mathbf{X}^{\top} \mathbf{J}_{n,1} \right)^{\top} = \\ & = \mathbf{X} - \frac{1}{n} \mathbf{J}_{n,1} \mathbf{J}_{1,n} \mathbf{X} = \\ & = \left(\mathbf{I}_{n} - \frac{1}{n} \mathbf{J}_{n} \right) \mathbf{X} = \\ & = \mathbf{A} \mathbf{X} \end{aligned} \] where \(\mathbf{I}_{n}\) is the identity matrix (Equation 31.3) and \(\mathbf{J}_{n}\) is the \(n \times n\) matrix of ones (Equation 31.2). \(\mathbf{A}\) is called the centering matrix and is formally defined as: \[ \underset{n \times n }{\mathbf{A}} = \mathbf{I}_{n} - \frac{1}{n} \mathbf{J}_{n} \text{.} \tag{13.2}\] Note that \(\mathbf{A}\) is symmetric and idempotent, i.e. \(\mathbf{A}^{\top} = \mathbf{A}\) and \(\mathbf{A}\mathbf{A} = \mathbf{A}\).

Deviation matrix
# Identity matrix
I <- diag(1, n, n)
# Centering matrix
A <- I - (t(J_1n) %*% J_1n)/n
# Deviation matrix
X_tilde <- A %*% X

\[ \underset{10 \times 3 }{\widetilde{\mathbf{X}}} = \begin{array}{c|rrr} i & \tilde{\mathbf{x}}_1 & \tilde{\mathbf{x}}_2 & \tilde{\mathbf{x}}_3 \\ \hline 1 & 0.299 & -1.098 & 2.194 \\ 2 & -0.7343 & 0.1767 & 0.7968 \\ 3 & -0.145 & 1.026 & 0.3308 \\ 4 & -2.492 & 0.4359 & 0.1829 \\ 5 & 1.192 & 0.3104 & 0.8122 \\ 6 & 1.105 & -0.1681 & -2.803 \\ 7 & 0.1305 & -0.3842 & -2.075 \\ 8 & 0.4959 & 0.5991 & 0.1355 \\ 9 & 0.15 & -1.364 & -0.7304 \\ 10 & -0.001575 & 0.4662 & 1.156 \\ \end{array} \]
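The construction above relies on two properties of the centering matrix \(\mathbf{A}\): it is symmetric and idempotent, and it removes the column means of any matrix it multiplies. A standalone base-R check (the suffix-0 names are illustrative, not the chapter’s objects):

```r
set.seed(42)
n0 <- 6; k0 <- 2
X0 <- matrix(rnorm(n0 * k0), n0, k0)
A0 <- diag(n0) - matrix(1, n0, n0) / n0   # centering matrix
stopifnot(isTRUE(all.equal(A0, t(A0))))       # symmetric
stopifnot(isTRUE(all.equal(A0 %*% A0, A0)))   # idempotent: A A = A
X0_tilde <- A0 %*% X0
stopifnot(isTRUE(all.equal(colMeans(X0_tilde), rep(0, k0))))  # centered columns
```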

13.3 Variance-covariance matrix

The covariance matrix of \(\mathbf{X}\) is also a \(k \times k\) matrix, of the form: \[ \underset{k \times k }{\boldsymbol{\Sigma}} = \begin{pmatrix} \mathbb{V}\{\mathbf{X}_1\} & \dots & \mathbb{C}v\{\mathbf{X}_1, \mathbf{X}_j\} & \dots & \mathbb{C}v\{\mathbf{X}_1, \mathbf{X}_k\} \\ \vdots & & \vdots & & \vdots \\ \mathbb{C}v\{\mathbf{X}_j, \mathbf{X}_1\} & \dots & \mathbb{V}\{\mathbf{X}_j\} & \dots & \mathbb{C}v\{\mathbf{X}_j, \mathbf{X}_k\} \\ \vdots & & \vdots & & \vdots \\ \mathbb{C}v\{\mathbf{X}_k, \mathbf{X}_1\} & \dots & \mathbb{C}v\{\mathbf{X}_k, \mathbf{X}_j\} & \dots & \mathbb{V}\{\mathbf{X}_k\} \\ \end{pmatrix} \tag{13.3}\] Given a matrix \(\mathbf{X}\), the variance-covariance matrix in matrix notation can be estimated as \[ \begin{aligned} \boldsymbol{\Sigma} & {} = \frac{1}{n}\widetilde{\mathbf{X}}^{\top} \widetilde{\mathbf{X}} = \\ & = \frac{1}{n} \left( \textbf{A} \mathbf{X} \right)^{\top} \textbf{A} \mathbf{X} = \\ & = \frac{1}{n} \mathbf{X}^{\top} \textbf{A}^{\top} \textbf{A} \mathbf{X} = \\ & = \frac{1}{n} \mathbf{X}^{\top} \textbf{A} \mathbf{X} \end{aligned} \tag{13.4}\] where \(\mathbf{A}\) is the centering matrix (Equation 13.2); the last step uses the fact that \(\mathbf{A}\) is symmetric and idempotent, so \(\mathbf{A}^{\top} \textbf{A} = \mathbf{A}\). Equation 13.4 divides by \(n\); the code below instead uses the unbiased sample estimator, which divides by \(n - 1\) (as the R function cov() does). The variance-covariance matrix is:

  1. Square, i.e. \(k \times k\), and symmetric.
  2. Positive semi-definite.
  3. Has the variances on the diagonal; hence its trace (Equation 31.8) is \(\text{trace}(\boldsymbol{\Sigma}) = \sum_{j = 1}^{k} \mathbb{V}\{\mathbf{X}_j\}\).
Covariance matrix
# covariance matrix
cv_x <- (t(A %*% X) %*% A %*% X) / (n-1)
# In terms of deviation matrix 
# t(X_tilde) %*% X_tilde / (n-1)
# Standard computation
# cov(X)

\[ \underset{3 \times 3 }{\boldsymbol{\Sigma}} = \begin{pmatrix} 1.087 & -0.163 & -0.319 \\ -0.163 & 0.576 & 0.143 \\ -0.319 & 0.143 & 2.25 \\ \end{pmatrix} \]
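The three properties listed above can be verified numerically. A standalone sketch with freshly simulated data (the suffix-0 names are illustrative):

```r
set.seed(42)
n0 <- 20; k0 <- 3
X0 <- matrix(rnorm(n0 * k0), n0, k0)
A0 <- diag(n0) - matrix(1, n0, n0) / n0
S0 <- t(A0 %*% X0) %*% (A0 %*% X0) / (n0 - 1)   # sample covariance matrix
stopifnot(isTRUE(all.equal(S0, unname(cov(X0)))))             # matches cov()
stopifnot(isTRUE(all.equal(S0, t(S0))))                       # symmetric
stopifnot(all(eigen(S0, symmetric = TRUE)$values >= -1e-12))  # positive semi-definite
stopifnot(isTRUE(all.equal(sum(diag(S0)), sum(apply(X0, 2, var)))))  # trace = sum of variances
```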

13.4 Standardized variables

In order to remove the effect of the unit of measure of the different variables, it is possible to work with the matrix of standardized variables, i.e.  \[ \underset{n \times k}{\mathbf{Z}} = \begin{pmatrix} z_{1,1} & \dots & z_{1,j} & \dots & z_{1,k} \\ \vdots & & \vdots & & \vdots \\ z_{i,1} & \dots & z_{i,j} & \dots & z_{i,k} \\ \vdots & & \vdots & & \vdots \\ z_{n,1} & \dots & z_{n,j} & \dots & z_{n,k} \\ \end{pmatrix} \text{.} \] where each element \(z_{i,j}\) is defined as: \[ z_{i,j} = \frac{\tilde{x}_{i,j}}{\sqrt{\mathbb{V}\{\mathbf{X}_j\}}} = \frac{x_{i,j} - \bar{x}_j}{\sqrt{\mathbb{V}\{\mathbf{X}_j\}}} \text{.} \] In matrix notation, \(\mathbf{Z}\) can be rewritten as: \[ \mathbf{Z} = \widetilde{\mathbf{X}} \cdot \mathbf{D}^{-\frac{1}{2}} \text{.} \] where the matrix \(\mathbf{D}\) keeps the variances on the diagonal and has zeros elsewhere, i.e. the Hadamard (element-wise) product of \(\boldsymbol{\Sigma}\) and the identity matrix \(\mathbf{I}_{k}\) (Equation 31.3): \[ \underset{k \times k}{\mathbf{D}} = \boldsymbol{\Sigma} \circ \mathbf{I}_{k} = \begin{pmatrix} \mathbb{V}\{\mathbf{X}_1\} & \dots & 0 & \dots & 0 \\ \vdots & & \vdots & & \vdots \\ 0 & \dots & \mathbb{V}\{\mathbf{X}_j\} & \dots & 0 \\ \vdots & & \vdots & & \vdots \\ 0 & \dots & 0 & \dots & \mathbb{V}\{\mathbf{X}_k\} \\ \end{pmatrix} \text{.} \tag{13.5}\] The standardized variables have mean equal to zero and unit variance; moreover, the resulting numbers no longer depend on the unit of measure of the variables.

Matrix D
# Variances 
v_x <- c(mean((X[,1] - x_bar[1])^2),
         mean((X[,2] - x_bar[2])^2), 
         mean((X[,3] - x_bar[3])^2)) * n/(n-1)
# Note: this stores D^(-1/2), the inverse square roots of the variances
D <- diag(1/sqrt(v_x))

\[ \underset{3 \times 3 }{\mathbf{D}} = \begin{pmatrix} 1.087 & 0 & 0 \\ 0 & 0.576 & 0 \\ 0 & 0 & 2.25 \\ \end{pmatrix} \]

Standardized variables
# Standardized variables: Z = X_tilde %*% D^(-1/2) (the variable D stores D^(-1/2))
Z <- X_tilde %*% D
# Standard computation
# scale(X)

\[ \underset{10 \times 3}{\mathbf{Z}} = \begin{array}{c|rrr} i & \mathbf{z}_1 & \mathbf{z}_2 & \mathbf{z}_3 \\ \hline 1 & 0.2868 & -1.446 & 1.461 \\ 2 & -0.7042 & 0.2327 & 0.5305 \\ 3 & -0.1391 & 1.352 & 0.2203 \\ 4 & -2.39 & 0.574 & 0.1218 \\ 5 & 1.143 & 0.4087 & 0.5408 \\ 6 & 1.06 & -0.2213 & -1.866 \\ 7 & 0.1251 & -0.506 & -1.382 \\ 8 & 0.4755 & 0.7889 & 0.0902 \\ 9 & 0.1439 & -1.796 & -0.4863 \\ 10 & -0.00151 & 0.6139 & 0.7696 \\ \end{array} \]
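A standalone check that standardization behaves as claimed: zero column means, unit variances, and agreement with R’s scale() (the suffix-0 names are illustrative):

```r
set.seed(42)
n0 <- 15; k0 <- 3
X0 <- matrix(rnorm(n0 * k0), n0, k0)
X0_tilde <- scale(X0, center = TRUE, scale = FALSE)   # deviation matrix
v0 <- apply(X0, 2, var)                               # sample variances
Z0 <- X0_tilde %*% diag(1 / sqrt(v0))                 # X_tilde %*% D^(-1/2)
stopifnot(isTRUE(all.equal(colMeans(Z0), rep(0, k0))))      # zero means
stopifnot(isTRUE(all.equal(apply(Z0, 2, var), rep(1, k0)))) # unit variances
stopifnot(max(abs(Z0 - scale(X0))) < 1e-12)                 # matches scale()
```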

13.5 Correlations matrix

The correlation is a statistic that measures the linear dependence between two variables. The elements of the correlation matrix are: \[ \underset{k \times k}{\boldsymbol{\rho}} = \begin{pmatrix} 1 & \dots & \mathbb{C}r\{\mathbf{X}_1, \mathbf{X}_j\} & \dots & \mathbb{C}r\{\mathbf{X}_1, \mathbf{X}_k\} \\ \vdots & & \vdots & & \vdots \\ \mathbb{C}r\{\mathbf{X}_j, \mathbf{X}_1\} & \dots & 1 & \dots & \mathbb{C}r\{\mathbf{X}_j, \mathbf{X}_k\} \\ \vdots & & \vdots & & \vdots \\ \mathbb{C}r\{\mathbf{X}_k, \mathbf{X}_1\} & \dots & \mathbb{C}r\{\mathbf{X}_k, \mathbf{X}_j\} & \dots & 1 \\ \end{pmatrix} \text{.} \] where each element is the correlation (Equation 5.11) between the random variables \(\mathbf{X}_i\) and \(\mathbf{X}_j\).

In matrix notation, the correlation matrix of \(\mathbf{X}\) can be estimated from the deviation matrix or from the standardized one, i.e. \[ \begin{aligned} \boldsymbol{\rho} & {} = \mathbf{D}^{-\frac{1}{2}} \boldsymbol{\Sigma} \, \mathbf{D}^{-\frac{1}{2}} = \\ & = \frac{1}{n} \mathbf{D}^{-\frac{1}{2}} \widetilde{\mathbf{X}}^{\top} \widetilde{\mathbf{X}} \, \mathbf{D}^{-\frac{1}{2}} = \\ & = \frac{1}{n} \mathbf{Z}^{\top} \mathbf{Z} \end{aligned} \] where \(\mathbf{D}\) is defined as in Equation 13.5. As with the covariance matrix, the code below uses the unbiased version with divisor \(n - 1\). The correlation matrix is:

  1. Square, i.e. \(k \times k\), and symmetric.
  2. Positive semi-definite.
  3. Has ones on the diagonal; hence its trace (Equation 31.8) is \(\text{trace}(\boldsymbol{\rho}) = \sum_{j = 1}^{k} 1 = k\).
Correlation matrix
# Correlation matrix 
cr_x <- t(Z) %*% Z / (n-1)

# In terms of the covariance matrix (recall that D stores D^(-1/2))
# t(D) %*% cv_x %*% D 

# In terms of deviation matrix 
# t(X_tilde %*% D) %*% X_tilde %*% D / (n-1)

# Standard computation
# cor(X)

\[ \underset{3 \times 3}{\boldsymbol{\rho}} = \begin{pmatrix} 1.0 & -0.205 & -0.204 \\ -0.205 & 1.0 & 0.125 \\ -0.204 & 0.125 & 1.0 \\ \end{pmatrix} \]
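Finally, a standalone check that \(\mathbf{D}^{-\frac{1}{2}} \boldsymbol{\Sigma} \, \mathbf{D}^{-\frac{1}{2}}\) reproduces cor(), with unit diagonal and trace \(k\) (the suffix-0 names are illustrative):

```r
set.seed(42)
n0 <- 12; k0 <- 3
X0 <- matrix(rnorm(n0 * k0), n0, k0)
S0 <- cov(X0)                            # sample covariance matrix
D0_inv_sqrt <- diag(1 / sqrt(diag(S0)))  # D^(-1/2)
rho0 <- D0_inv_sqrt %*% S0 %*% D0_inv_sqrt
stopifnot(isTRUE(all.equal(rho0, unname(cor(X0)))))   # matches cor()
stopifnot(isTRUE(all.equal(diag(rho0), rep(1, k0))))  # unit diagonal
stopifnot(isTRUE(all.equal(sum(diag(rho0)), k0)))     # trace equals k
```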