8  Convergence concepts

Last modified

June 4, 2026

Reference: Chapter 6. Resnick (2005).

Let’s consider a sequence of real numbers, say \(a_n\). Stating that the associated series converges, formally \(\sum_{k=1}^\infty a_k < \infty\), implies that the tail sums converge to zero, i.e. \[ \sum_{k=1}^\infty a_k < \infty \implies \lim_{n \to \infty} \sum_{k=n}^{\infty} a_k = 0 \text{.} \]
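The vanishing of the tail sums can be checked numerically. The sketch below is only an illustration under an assumed choice of sequence, \(a_k = 2^{-k}\), which is not taken from the text; the helper name `tail_sum` and the truncation at 200 terms are likewise assumptions made for the example.

```python
def tail_sum(n, terms=200):
    """Approximate the tail sum of a_k = 2**(-k) starting at k = n.

    The infinite tail is truncated after `terms` terms (an assumption of
    this sketch); the neglected remainder is far below double precision.
    """
    return sum(2.0 ** (-k) for k in range(n, n + terms))


for n in (1, 5, 10, 20):
    # For this geometric series the tail has the closed form 2**(-(n - 1)).
    print(n, tail_sum(n), 2.0 ** (-(n - 1)))
```

For this particular sequence the tail starting at \(n\) equals \(2^{-(n-1)}\), so the printed columns should agree and shrink towards zero as \(n\) grows.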

8.1 Types of convergence

Definition 8.1 (Pointwise) A sequence of random variables \(\{X_n\}_{n\ge1}\) is said to converge pointwise to a limit \(X\) if for all \(\omega \in \Omega\): \[ X_n(\omega) \underset{n \to \infty}{\longrightarrow} X(\omega) \text{,} \quad \text{i.e.} \quad \lim_{n \to \infty} X_n(\omega) = X(\omega) \text{.} \] This definition requires convergence for every \(\omega \in \Omega\), without exception.

Example 8.1 Let \(\Omega = \{0,1\}\) and define the sequence of random variables: \[ X_n(\omega) = \frac{\omega}{n} \text{.} \] Then \(X_n(\omega)\) converges pointwise to 0 for every \(\omega\); in fact, \[ \lim_{n\to \infty} X_n(\omega) = \lim_{n\to \infty} \frac{\omega}{n} = 0 \quad \forall \omega \in \Omega \text{.} \]

Definition 8.2 (Almost Surely) A sequence of random variables \(\{X_n\}_{n \ge 1}\) is said to be convergent almost surely to a limit \(X\) if: \[ \mathbb{P}\{\omega \in \Omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\} = 1 \text{.} \] This kind of convergence is usually denoted as: \[ X_n(\omega) \overset{\text{a.s.}}{\underset{n \to \infty}{\longrightarrow}} X(\omega) \text{.} \] In other terms, almost sure convergence requires the relation to hold for all \(\omega \in \Omega\), except possibly on a set of \(\omega\)’s whose probability is zero.

Example 8.2 Let \(\Omega = [0,1]\) with \(\omega \sim \text{Uniform}[0,1]\). Define the sequence of random variables \[ X_n(\omega) = \mathbf{1}_{[\omega \leq \frac{1}{n}]}(\omega) \text{.} \] Then:

  • If \(\omega > 0\), then for sufficiently large \(n\) we have \(\omega > 1/n\), hence \(X_n(\omega)=0\) eventually and \(X_n(\omega)\to 0\).
  • If \(\omega = 0\), then \(X_n(0)=1\) for all \(n\), so \(X_n(0) \to 1\).

Thus the pointwise limit is \(0\) for all \(\omega > 0\) and \(1\) for \(\omega=0\). Since the exceptional set \(\{\omega=0\}\) has probability zero under the uniform law, the limit is \(0\) almost surely: \[ X_n \overset{\text{a.s.}}{\underset{n \to \infty}{\longrightarrow}} 0 \text{.} \]
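Example 8.2 can be explored path by path. The following is a minimal sketch, not part of the original text: the function names `x_n` and `last_nonzero_index`, the seed and the cut-off `n_max` are all assumptions chosen for illustration.

```python
import random


def x_n(omega, n):
    """Indicator X_n(omega): 1 if omega <= 1/n, else 0."""
    return 1 if omega <= 1.0 / n else 0


def last_nonzero_index(omega, n_max=10_000):
    """Largest n <= n_max with X_n(omega) = 1, or 0 if there is none."""
    last = 0
    for n in range(1, n_max + 1):
        if x_n(omega, n) == 1:
            last = n
    return last


rng = random.Random(0)  # fixed seed, only for reproducibility of the sketch
for omega in (rng.random() for _ in range(5)):
    # For omega > 0 the path is zero for every n beyond this index.
    print(f"omega = {omega:.4f}: last n with X_n = 1 is {last_nonzero_index(omega)}")

# The exceptional point omega = 0 never leaves the value 1.
print("omega = 0:", [x_n(0.0, n) for n in (1, 10, 100, 1000)])
```

Every sampled \(\omega > 0\) yields a path that is eventually zero, while the single point \(\omega = 0\), which carries no probability under the uniform law, stays at one.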

Definition 8.3 (In Probability) A sequence of random variables \(\{X_n\}_{n\ge1}\) is said to be convergent in probability to a limit \(X\) if, for every \(\epsilon > 0\): \[ \lim_{n \to \infty}\mathbb{P}\{\omega \in \Omega : |X_n(\omega) - X(\omega)|> \epsilon\} = 0 \text{.} \] This kind of convergence is usually denoted as: \[ X_n(\omega) \overset{\text{p}}{\underset{n \to \infty}{\longrightarrow}} X(\omega) \text{.} \]

Example 8.3 Let \(X_n \sim \text{Bernoulli}(1/n)\), independent across \(n\); then \(X_n\) converges in probability to zero. In fact, for any fixed \(\epsilon \in (0,1)\), \[ \mathbb{P}(|X_n-0| > \epsilon) = \mathbb{P}(X_n=1) = \frac{1}{n} \underset{n \to \infty}{\longrightarrow} 0 \text{,} \] while for \(\epsilon \ge 1\) the probability is identically zero.
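The probability in Example 8.3 can be estimated by simulation. This is a rough Monte Carlo sketch under assumed settings (the name `exceed_frequency`, the threshold `eps = 0.5`, the number of trials and the seed are illustrative choices, not from the text).

```python
import random


def exceed_frequency(n, eps=0.5, trials=100_000, seed=0):
    """Monte Carlo estimate of P(|X_n - 0| > eps) for X_n ~ Bernoulli(1/n)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One Bernoulli(1/n) draw: equals 1 with probability 1/n.
        x = 1 if rng.random() < 1.0 / n else 0
        if abs(x - 0) > eps:
            hits += 1
    return hits / trials


for n in (1, 10, 100, 1000):
    # The estimate should sit close to the exact value 1/n.
    print(n, exceed_frequency(n), 1.0 / n)
```

The estimated frequencies track \(1/n\) up to Monte Carlo noise, which is the sequence of real numbers whose limit defines convergence in probability.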

Definition 8.4 (In L_p) A sequence of random variables \(\{X_n\}_{n\ge1}\) such that: \[ \mathbb{E}\{|X_n|^p\} < \infty, \quad \mathbb{E}\{|X|^p\} < \infty \text{,} \] is said to be convergent in \(L_p\), with \(p > 0\), to a random variable \(X\) if \[ \lim_{n \to \infty}\mathbb{E}\{|X_n - X|^p\} = 0 \text{.} \] This kind of convergence is usually denoted as: \[ X_n \underset{n\to\infty}{\overset{L_p}{\longrightarrow}} X \text{.} \]

Note that, in general, neither almost sure convergence nor \(L_p\) convergence implies the other. However, convergence in \(L_q\) implies convergence in \(L_p\) for every lower order \(p < q\), i.e. \[ X_n \underset{n\to\infty}{\overset{L_q}{\longrightarrow}} X \implies X_n \underset{n\to\infty}{\overset{L_p}{\longrightarrow}} X, \quad 0 < p < q \text{.} \]
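The implication above can be justified in one line. The sketch below assumes Jensen’s inequality, which is not stated in this chapter, applied to the convex function \(y \mapsto y^{q/p}\) on \([0, \infty)\) with \(0 < p < q\): \[ \left(\mathbb{E}\{|X_n - X|^p\}\right)^{q/p} \le \mathbb{E}\left\{\left(|X_n - X|^p\right)^{q/p}\right\} = \mathbb{E}\{|X_n - X|^q\} \text{.} \] Raising both sides to the power \(p/q\) gives \[ \mathbb{E}\{|X_n - X|^p\} \le \left(\mathbb{E}\{|X_n - X|^q\}\right)^{p/q} \underset{n \to \infty}{\longrightarrow} 0 \text{,} \] whenever \(\mathbb{E}\{|X_n - X|^q\} \to 0\). The argument relies on \(\mathbb{P}\) being a probability measure, i.e. having total mass one.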

Example 8.4 Let \(X_n = 1/n\) almost surely, then \(X_n \to 0\) in \(L_p\) for any \(p \ge 1\) since \[ \mathbb{E}\{|X_n - 0|^p\} = \left(\frac{1}{n}\right)^p \underset{n \to \infty}{\longrightarrow} 0 \text{.} \]

Definition 8.5 (In Distribution) A sequence of random variables \(X_n\) is said to be convergent in distribution to a random variable \(X\) if the distribution function \(F_{X_n}\) converges to \(F_{X}\) at every continuity point of \(F_X\), i.e. \[ \lim_{n \to \infty} F_{X_n}(x) = F_{X}(x) \text{,} \] for every \(x\) at which \(F_X\) is continuous. This kind of convergence is usually denoted as: \[ X_n \overset{\text{d}}{\underset{n \to \infty}{\longrightarrow}} X \text{.} \]

In other terms, we have convergence in distribution if the distribution of \(X_n\), namely \(F_{X_n}\), converges as \(n \to \infty\) to the distribution of \(X\), namely \(F_{X}\). Note that convergence in distribution is not related to the probability space but involves only the distribution functions.

Example 8.5 Let \(X_n\) be a sequence of normal random variables, i.e. \[ X_n \sim \mathcal{N}\left(\frac{\mu}{n}, \sigma^2 + \frac{1}{n}\right) \text{.} \] As \(n \to \infty\), the distribution of \(X_n\) converges to a normal with mean zero and variance \(\sigma^2\), i.e. \[ X_n \overset{\text{d}}{\underset{n \to \infty}{\longrightarrow}} \mathcal{N}\left(0, \sigma^2\right) \text{.} \]
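The convergence of the distribution functions in Example 8.5 can be inspected numerically. The sketch below is illustrative only: the values \(\mu = 2\) and \(\sigma^2 = 1.5\), the grid on \([-10, 10]\) and the function names are assumptions, and the normal distribution function is written through `math.erf`.

```python
import math


def normal_cdf(x, mean, var):
    """Distribution function of N(mean, var) at x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def max_cdf_gap(n, mu=2.0, sigma2=1.5):
    """Largest gap on a grid between the CDF of X_n and the limiting CDF."""
    grid = [-10.0 + 0.01 * i for i in range(2001)]
    return max(
        abs(normal_cdf(x, mu / n, sigma2 + 1.0 / n) - normal_cdf(x, 0.0, sigma2))
        for x in grid
    )


for n in (1, 10, 100, 1000):
    # The gap should shrink as the mean mu/n and the extra variance 1/n vanish.
    print(n, max_cdf_gap(n))
```

Since the limiting normal distribution function is continuous everywhere, every grid point is a continuity point and the shrinking gap is consistent with Definition 8.5.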

8.2 Laws of Large Numbers

There are many versions of laws of large numbers (LLN). In general, a sequence \(\{X_n\}_{n\ge 1}\) is said to satisfy a LLN if: \[ \frac{S_n}{n} = \frac{1}{n}\sum_{i = 1}^{n} X_i \longrightarrow X \text{.} \]

Strong vs weak laws of large numbers

In general, if convergence happens almost surely (Definition 8.2) we speak of strong laws of large numbers (SLLN). If instead convergence happens in probability (Definition 8.3) we speak of weak laws of large numbers (WLLN). A crucial difference is that with almost sure convergence we are dealing with a limit of a sequence of sets (the limit is inside \(\mathbb{P}\)), whereas with convergence in probability we are dealing with a limit of a sequence of real numbers in \([0,1]\) (the limit is outside \(\mathbb{P}\)).
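A short worked case shows that the two positions of the limit really differ. It uses the independent \(\text{Bernoulli}(1/n)\) variables of Example 8.3 and assumes the second Borel-Cantelli lemma, a standard result that is not stated in this chapter. Since the events \(\{X_n = 1\}\) are independent, \[ \sum_{n \ge 1} \mathbb{P}(X_n = 1) = \sum_{n \ge 1} \frac{1}{n} = \infty \implies \mathbb{P}(X_n = 1 \text{ infinitely often}) = 1 \text{.} \] Hence, with the limit inside \(\mathbb{P}\), \[ \mathbb{P}\{\omega \in \Omega : \lim_{n \to \infty} X_n(\omega) = 0\} = 0 \text{,} \] while, with the limit outside \(\mathbb{P}\) and any \(\epsilon \in (0,1)\), \[ \lim_{n \to \infty} \mathbb{P}(|X_n| > \epsilon) = \lim_{n \to \infty} \frac{1}{n} = 0 \text{.} \] The sequence therefore converges to zero in probability but not almost surely.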

8.2.1 Strong Laws of Large Numbers

Proposition 8.1 (Kolmogorov SLLN) Let’s consider a sequence of IID random variables \(\{X_n\}_{n \ge 1}\). Then, there exists a constant \(c \in \mathbb{R}\) such that: \[ \frac{1}{n} \sum_{i = 1}^{n} X_i = \frac{S_n}{n} \overset{\text{a.s.}}{\underset{n\to \infty}{\longrightarrow}} c \text{,} \] if and only if \(\mathbb{E}\{|X_1|\}< \infty\), in which case \(c = \mathbb{E}\{X_1\}\).
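The almost sure convergence in Proposition 8.1 concerns single paths, so it can be illustrated by following one simulated path of running means. This is only a sketch under assumed settings: IID \(\text{Uniform}(0,1)\) draws (with mean \(1/2\)), a fixed seed, and the illustrative name `running_means`.

```python
import random


def running_means(n_max, seed=0):
    """Running sample means S_n / n along one path of IID Uniform(0, 1) draws."""
    rng = random.Random(seed)
    total = 0.0
    means = []
    for n in range(1, n_max + 1):
        total += rng.random()     # S_n = S_{n-1} + X_n
        means.append(total / n)   # S_n / n
    return means


means = running_means(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    # Along this single path, S_n / n should settle near E{X_1} = 0.5.
    print(n, means[n - 1])
```

A simulation can only show finitely many terms of one path, so it illustrates the statement rather than verifies it.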

Proposition 8.2 (SLLN without independence) Let’s consider a sequence of identically distributed random variables \(\{X_n\}_{n \ge 1}\), so that in particular \(\mathbb{E}\{X_n\} = \mathbb{E}\{X_1\}\) for all \(n\), such that:

  1. \(\mathbb{E}\{X_n^2\} < c\) for all \(n\), where \(c > 0\) is a constant that does not depend on \(n\).
  2. \(\mathbb{C}v\{X_i, X_j\} = 0 \quad \forall i \neq j\).

Then: \[ \frac{1}{n} \sum_{i = 1}^{n} X_i = \frac{S_n}{n} \overset{\text{a.s.}}{\underset{n\to \infty}{\longrightarrow}} \mathbb{E}\{X_1\} \text{.} \]

Note that a finite first moment, i.e. \(\mathbb{E}\{|X_1|\} < \infty\), implies that the characteristic function of the random variable is differentiable at zero, i.e. \(\exists \phi_{X_1}^{\prime}(0)\). On the other hand, the existence of the derivative of the characteristic function at zero does not ensure that the first moment is finite.

8.2.2 Weak Laws of Large Numbers

Let’s repeat a random experiment many times, always ensuring the same conditions, in such a way that the sequence of experiments is IID. Then, each random variable \(X_i\) comes from the same population with an unknown mean \(\mathbb{E}\{X\}\) and variance \(\mathbb{V}\{X\}\). Thanks to the WLLN, by repeating the experiment many times the sample mean of the experiment converges in probability to the true population mean. Convergence in probability means that, for every \(\epsilon > 0\): \[ \lim_{n \to \infty}\mathbb{P}\left\{\omega \in \Omega : \left|\frac{1}{n}\sum_{i = 1}^{n}X_i(\omega) - \mathbb{E}\{X\} \right|> \epsilon\right\} = 0 \text{.} \]

Proposition 8.3 (WLLN with variances) Given a sequence of independent and identically distributed random variables \(\{X_n\}_{n \ge 1}\) such that:

  1. \(\mathbb{E}\{X_1\} = \mu\).
  2. \(\mathbb{E}\{X_1^2\} < \infty\).

Then: \[ \bar{X}_n = \frac{1}{n} \sum_{i = 1}^{n} X_i = \frac{S_n}{n} \overset{\text{p}}{\underset{n\to \infty}{\longrightarrow}} \mathbb{E}\{X_1\} = \mu \text{.} \]

Proof. Let’s consider the random variable \(\bar{X}_n = \frac{1}{n} \sum_{i = 1}^{n} X_i\) and denote \(\sigma^2 = \mathbb{V}\{X_1\}\), which is finite by assumption. Since the mean and variance are finite, let’s apply Chebyshev’s inequality (Equation 5.20) for any fixed \(\lambda > 0\), i.e. \[ \mathbb{P}(|\bar{X}_n - \mu| \ge \lambda) \le \frac{1}{\lambda^2}\mathbb{V}\{\bar{X}_n - \mu\} \text{.} \] Using the well known properties of the variance, the right-hand side simplifies as: \[ \begin{aligned} \mathbb{V}\{\bar{X}_n - \mu\} & = \mathbb{V}\left\{\frac{1}{n} \sum_{i = 1}^{n} X_i - \mu\right\} = && (\text{Constant})\\ & = \mathbb{V}\left\{\frac{1}{n} \sum_{i = 1}^{n} X_i\right\} = && (\text{Scaling})\\ & = \frac{1}{n^2} \mathbb{V}\left\{\sum_{i = 1}^{n} X_i\right\} = && (\text{Independence})\\ & = \frac{1}{n^2} \sum_{i = 1}^{n} \mathbb{V}\{X_i\} = && (\text{Identical distribution}) \\ & = \frac{n \sigma^2}{n^2} = \frac{\sigma^2}{n} \end{aligned} \] Therefore, Chebyshev’s inequality becomes \[ \mathbb{P}(\mid \bar{X}_n - \mu \mid \ge \lambda) \le \frac{\sigma^2}{n\lambda^2} \text{.} \] Taking the limit as \(n\to\infty\) proves the convergence in probability, i.e. \[ \lim_{n\to\infty}\mathbb{P}(\mid\bar{X}_n - \mu \mid \ge \lambda) \le \lim_{n\to\infty} \frac{\sigma^2}{n\lambda^2} = 0 \text{.} \]
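The Chebyshev bound used in the proof can be compared with simulated deviation frequencies. The sketch below is a rough check under assumed settings, none of which come from the text: IID \(\text{Uniform}(0,1)\) draws (so \(\mu = 1/2\) and \(\sigma^2 = 1/12\)), \(\lambda = 0.1\), 2000 trials per sample size, and the illustrative names `deviation_frequency` and `chebyshev_bound`.

```python
import random


def deviation_frequency(n, lam=0.1, trials=2_000, seed=0):
    """Monte Carlo estimate of P(|mean of n Uniform(0,1) draws - 0.5| >= lam)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.random() for _ in range(n)) / n
        if abs(sample_mean - 0.5) >= lam:
            hits += 1
    return hits / trials


def chebyshev_bound(n, lam=0.1, sigma2=1.0 / 12.0):
    """Chebyshev upper bound sigma^2 / (n * lam^2), capped at 1."""
    return min(1.0, sigma2 / (n * lam ** 2))


for n in (10, 50, 200, 1000):
    # The simulated frequency should stay below the bound, and both shrink with n.
    print(n, deviation_frequency(n), chebyshev_bound(n))
```

The bound is typically far from tight; the proof only needs it to vanish as \(n \to \infty\).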

Proposition 8.4 (Khintchin’s WLLN under first moment hypothesis) Given a sequence of independent and identically distributed random variables \(\{X_n\}_{n \ge 1}\) such that:

  1. \(\mathbb{E}\{|X_1|\} < \infty\).
  2. \(\mathbb{E}\{X_n\} = \mu\).

Then: \[ \bar{X}_n = \frac{1}{n} \sum_{i = 1}^{n} X_i = \frac{S_n}{n} \overset{\text{p}}{\underset{n\to \infty}{\longrightarrow}} \mathbb{E}\{X_1\} = \mu \text{.} \]

Proposition 8.5 (Feller’s WLLN without first moment) Given a sequence of independent and identically distributed random variables \(\{X_n\}_{n \ge 1}\) such that: \[ \lim_{x\to\infty}x\mathbb{P}\{|X_1| > x\} = 0 \text{,} \] then \[ \bar{X}_n - \mathbb{E}\{X_1 \mathbf{1}_{[|X_1| \le n]}\} = \frac{S_n}{n} - \mathbb{E}\{X_1 \mathbf{1}_{[|X_1| \le n]}\} \overset{\text{p}}{\underset{n\to \infty}{\longrightarrow}} 0 \text{.} \] Note that the centering term is a truncated mean that depends on \(n\), and that this result makes no assumptions about a finite first moment.

Proof (weak law under the assumptions of Proposition 8.2). Let’s verify that the assumptions of the SLLN without independence (Proposition 8.2) always yield convergence in probability, i.e. \[ \bar{X}_n = \frac{1}{n} \sum_{i = 1}^{n} X_i \overset{\text{p}}{\underset{n\to \infty}{\longrightarrow}} \mathbb{E}\{X_1\} \text{.} \]

Using Chebyshev’s inequality (Equation 5.20), for any fixed \(\varepsilon > 0\) we have: \[ \mathbb{P}(|\bar{X}_n - \mathbb{E}\{X_1\}| > \varepsilon) \le \frac{\mathbb{V}\{\bar{X}_n\}}{\varepsilon^2} \text{.} \] Writing out the computation explicitly: \[ \begin{aligned} \frac{\mathbb{V}\{\bar{X}_n\}}{\varepsilon^2} & = \frac{1}{n^2 \varepsilon^2} \mathbb{V}\left\{\sum_{i=1}^{n} X_i\right\} = \\ & = \frac{1}{n^2 \varepsilon^2} \left[\sum_{i=1}^{n} \mathbb{V}\{X_i\} + \sum_{i = 1}^{n} \sum_{j\neq i}^{n} \mathbb{C}v\{X_i, X_j\} \right] \end{aligned} \] By assumption the covariances are zero, i.e. \(\mathbb{C}v\{X_i, X_j\} = 0 \;\; \forall i \neq j\). Moreover, since \(\mathbb{V}\{X_i\} = \mathbb{E}\{X_i^2\} - \mathbb{E}\{X_i\}^2\), it is possible to upper bound the variance with the second moment, namely \(\mathbb{V}\{X_i\} \le \mathbb{E}\{X_i^2\}\), i.e. \[ \frac{1}{n^2 \varepsilon^2} \sum_{i=1}^{n} \mathbb{V}\{X_i\} \le \frac{1}{n^2 \varepsilon^2} \sum_{i=1}^{n} \mathbb{E}\{X_i^2\} \text{.} \] Since by the assumption of Proposition 8.2 we have \(\mathbb{E}\{X_i^2\} < c\), where \(c > 0\) is a constant that does not depend on \(n\), we can further upper bound the probability by: \[ \frac{1}{n^2 \varepsilon^2} \sum_{i=1}^{n} \mathbb{E}\{X_i^2\} \le \frac{1}{n^2 \varepsilon^2} \sum_{i=1}^{n} c = \frac{nc}{n^2 \varepsilon^2} = \frac{c}{n \varepsilon^2} \text{.} \] Finally, taking the limit as \(n\to\infty\) gives zero, which implies convergence in probability: \[ 0 \le \lim_{n\to\infty}\mathbb{P}(|\bar{X}_n - \mathbb{E}\{X_1\}| > \varepsilon) \le \lim_{n\to\infty} \frac{c}{n \varepsilon^2} = 0 \text{.} \]

8.3 Central Limit Theorem

Theorem 8.1 (Central Limit Theorem (CLT) - IID case) Let’s consider a sequence of \(n\) random variables, \((X_1, \dots, X_n)\), where the \(X_i\) are independent and identically distributed (IID) with finite mean \(\mu\) and finite variance \(\sigma^2 > 0\), i.e. \[ \begin{aligned} X_i \sim \text{IID}(\mu, \sigma^2) {} & \implies \mathbb{E}\{X_i\} = \mathbb{E}\{X_1\} = \mu \\ & \implies \mathbb{V}\{X_i\} = \mathbb{V}\{X_1\} = \sigma^2 \end{aligned} \] Then, the CLT states that, when the sample is large, the random variable \(S_n\) \[ S_n = \sum_{i=1}^{n} X_i \text{,} \] defined by the sum of all the \(X_i\), is approximately normally distributed, i.e. \[ S_n \overset{\text{d}}{\underset{n\to\infty}{\sim}} \mathcal{N}(\mathbb{E}\{S_n\}, \mathbb{V}\{S_n\}) \text{.} \] Since the \(X_i\) are IID, the moments of \(S_n\) read explicitly \[ \mathbb{E}\{S_n\} = n \mathbb{E}\{X_1\} = n \mu, \quad \mathbb{V}\{S_n\} = n \mathbb{V}\{X_1\} = n \sigma^2 \text{.} \] More precisely, the CLT is a statement about the standardized random variable \(Z_n\) \[ Z_n = \frac{S_n - \mathbb{E}\{S_n\}}{\sqrt{\mathbb{V}\{S_n\}}} = \frac{\sum_{i=1}^{n} X_i - n \mu }{\sqrt{n} \, \sigma} \overset{\text{d}}{\underset{n\to\infty}{\longrightarrow}} \mathcal{N}(0, 1) \text{,} \] which converges in distribution (Definition 8.5) to a standard normal.
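The statement about \(Z_n\) can be illustrated by simulation. The sketch below assumes IID \(\text{Exponential}(1)\) summands, for which \(\mu = \sigma = 1\); this choice, the number of trials, the seed and the function names are assumptions made for the example and do not come from the text.

```python
import math
import random


def standardized_sum(n, rng):
    """Z_n = (S_n - n*mu) / (sqrt(n)*sigma) for IID Exponential(1), mu = sigma = 1."""
    s_n = sum(rng.expovariate(1.0) for _ in range(n))
    return (s_n - n * 1.0) / (math.sqrt(n) * 1.0)


def empirical_cdf_at(x, n, trials=5_000, seed=0):
    """Fraction of simulated Z_n values that are <= x."""
    rng = random.Random(seed)
    return sum(standardized_sum(n, rng) <= x for _ in range(trials)) / trials


def standard_normal_cdf(x):
    """Distribution function of N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


for n in (1, 5, 30, 200):
    # As n grows, the empirical value should approach the normal CDF at 0.
    print(n, empirical_cdf_at(0.0, n), standard_normal_cdf(0.0))
```

For \(n = 1\) the summand is strongly skewed, so the empirical value at zero is visibly away from \(1/2\); the gap should narrow as \(n\) increases, up to Monte Carlo noise.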