2  Probability measure

Reference: Chapter 2. Resnick (2005).

A probability space is a triple \((\Omega, \mathcal{B}, \mathbb{P})\) where

  1. \(\Omega\) is the sample space.
  2. \(\mathcal{B}\) is a \(\sigma\)-field of subsets of \(\Omega\), each element of which is called an event.
  3. \(\mathbb{P}\) is a probability measure.

Definition 2.1 (\(\color{magenta}{\textbf{Probability measure}}\))
A probability measure \(\mathbb{P}\) is any function \(\mathbb{P}: \mathcal{B} \rightarrow [0,1]\) such that

  1. \(\mathbb{P}(A) \ge 0\) for all sets \(A \in \mathcal{B}\).
  2. \(\mathbb{P}(\Omega) = 1\).
  3. \(\mathbb{P}\) is \(\sigma\)-additive: if \(\{A_n\}_{n \ge 1}\) is a sequence of disjoint events in \(\mathcal{B}\), then: \[ \mathbb{P}\left(\overset{\infty}{\underset{n = 1}{{\color{red}{\bigsqcup}}}} A_n\right) = \sum_{n= 1}^{\infty}\mathbb{P}(A_n) \text{.} \tag{2.1}\]

In other words, a probability measure \(\mathbb{P}\) is always defined on a \(\sigma\)-field of subsets of \(\Omega\), not on \(\Omega\) itself, and takes values in \([0,1]\).
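As a concrete illustration of Definition 2.1, the following minimal sketch builds a finite probability space and checks the three axioms by brute force. The fair six-sided die, the names `OMEGA`, `power_set` and `prob`, and the choice of the power set as \(\sigma\)-field are assumptions made for this example only. On a finite space, \(\sigma\)-additivity reduces to additivity over pairs of disjoint events, which is what the last check verifies.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical example: a fair six-sided die.
OMEGA = frozenset(range(1, 7))


def power_set(omega):
    """Return all subsets of omega (the largest sigma-field on a finite set)."""
    items = sorted(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def prob(event):
    """Uniform probability measure: P(A) = |A| / |Omega|."""
    return Fraction(len(event), len(OMEGA))


SIGMA_FIELD = power_set(OMEGA)

# Axiom 1: P(A) >= 0 for every event A.
nonneg = all(prob(a) >= 0 for a in SIGMA_FIELD)

# Axiom 2: P(Omega) = 1.
total = prob(OMEGA)

# Axiom 3 (finite case): P(A ⊔ B) = P(A) + P(B) for disjoint A and B.
additive = all(
    prob(a | b) == prob(a) + prob(b)
    for a in SIGMA_FIELD
    for b in SIGMA_FIELD
    if not (a & b)
)

print(nonneg, total, additive)
```

Exact rational arithmetic (`Fraction`) is used so that the equalities are checked without floating-point rounding.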

2.1 Consequences of the axioms

Proposition 2.1 (\(\color{magenta}{\textbf{Probability of the complement}}\))
The probability of the complement of a set \(A\) reads \[ \mathbb{P}(A^{\mathsf{c}}) = 1 - \mathbb{P}(A) \text{.} \tag{2.2}\]

Proof. Since \(\Omega = A {\color{red}{\cup}} A^{\mathsf{c}}\) can be written as a union of disjoint sets, we can apply \(\sigma\)-additivity (Equation 2.1) to obtain: \[ \begin{aligned} \Omega = A {\color{red}{\sqcup}} A^{\mathsf{c}} & {} \overset{\mathbb{P}}{\longrightarrow} \mathbb{P}(\Omega) = \mathbb{P}(A) + \mathbb{P}(A^{\mathsf{c}}) \\ & \implies 1 = \mathbb{P}(A) + \mathbb{P}(A^{\mathsf{c}}) \\ & \implies \mathbb{P}(A^{\mathsf{c}}) = 1 -\mathbb{P}(A)\\ \end{aligned} \]

Proposition 2.2 (\(\color{magenta}{\textbf{Probability of the empty set}}\))
The probability of the empty set \(\emptyset\) is zero, i.e. \(\mathbb{P}(\emptyset) = 0\).

Proof. Since \(\emptyset^{\mathsf{c}} = \Omega\) and \(\mathbb{P}(\Omega) = 1\) by the second axiom, applying Equation 2.2 gives: \[ \mathbb{P}(\emptyset) = 1 - \mathbb{P}(\emptyset^{\mathsf{c}}) = 1 - \mathbb{P}(\Omega) = 0 \text{.} \]
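Propositions 2.1 and 2.2 can be checked numerically on a small example. The sketch below assumes a fair six-sided die with the uniform measure; the event `A` and the helper `prob` are hypothetical names chosen for illustration.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
OMEGA = frozenset(range(1, 7))


def prob(event):
    """Uniform probability measure: P(A) = |A| / |Omega|."""
    return Fraction(len(event), len(OMEGA))


A = frozenset({1, 2})
A_c = OMEGA - A                 # complement of A in Omega

p_a = prob(A)                   # P(A)
p_ac = prob(A_c)                # P(A^c)
p_empty = prob(frozenset())     # P(empty set)

# Equation 2.2: P(A^c) = 1 - P(A); Proposition 2.2: P(empty set) = 0.
print(p_ac == 1 - p_a, p_empty == 0)
```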

Proposition 2.3 (\(\color{magenta}{\textbf{Probability of the union}}\))
The probability of the union of two events \(A\) and \(B\) is \[ \mathbb{P}(A {\color{red}{\cup}} B) = \mathbb{P}(A) + \mathbb{P}(B) - \mathbb{P}(A {\color{blue}{\cap}} B) \text{.} \]

Proof. Write each of the sets \(A\) and \(B\) as a union of disjoint events (Equation 1.9), then apply \(\mathbb{P}\) to both sides and use \(\sigma\)-additivity (Equation 2.1): \[ \begin{aligned} & {} \mathbb{P}(A) = \mathbb{P}(A {\color{blue}{\cap}} B) + \mathbb{P}(A {\color{blue}{\cap}} B^{\mathsf{c}}) \implies \mathbb{P}(A {\color{blue}{\cap}} B^{\mathsf{c}}) = \mathbb{P}(A) - \mathbb{P}(A {\color{blue}{\cap}} B) \\ & \mathbb{P}(B) = \mathbb{P}(A {\color{blue}{\cap}} B) + \mathbb{P}(A^{\mathsf{c}} {\color{blue}{\cap}} B) \implies \mathbb{P}(A^{\mathsf{c}} {\color{blue}{\cap}} B) = \mathbb{P}(B) - \mathbb{P}(A {\color{blue}{\cap}} B) \end{aligned} \tag{2.3}\] Next, decompose \(A {\color{red}{\cup}} B\) into the disjoint union of three events (Equation 1.8) and, again, apply \(\mathbb{P}\) to both sides and use \(\sigma\)-additivity: \[ \mathbb{P}(A {\color{red}{\cup}} B) = \mathbb{P}(A {\color{blue}{\cap}} B) + \mathbb{P}(A {\color{blue}{\cap}} B^{\mathsf{c}}) + \mathbb{P}(A^{\mathsf{c}} {\color{blue}{\cap}} B) \text{.} \] Substituting \(\mathbb{P}(A {\color{blue}{\cap}} B^{\mathsf{c}})\) and \(\mathbb{P}(A^{\mathsf{c}} {\color{blue}{\cap}} B)\) from Equation 2.3 gives the result: \[ \begin{aligned} \mathbb{P}(A {\color{red}{\cup}} B) & {} = \mathbb{P}(A {\color{blue}{\cap}} B) + \mathbb{P}(A) - \mathbb{P}(A {\color{blue}{\cap}} B) + \mathbb{P}(B) - \mathbb{P}(A {\color{blue}{\cap}} B) \\ & = \mathbb{P}(A) + \mathbb{P}(B) - \mathbb{P}(A {\color{blue}{\cap}} B) \text{.} \end{aligned} \]
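The two steps of the proof of Proposition 2.3 (splitting the union into three disjoint pieces, then substituting) can be traced on a small example. This is a sketch under the assumption of a fair six-sided die; the events `A` and `B` are arbitrary illustrative choices.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
OMEGA = frozenset(range(1, 7))


def prob(event):
    """Uniform probability measure on OMEGA (assumption of this sketch)."""
    return Fraction(len(event), len(OMEGA))


A = frozenset({2, 4, 6})    # the outcome is even
B = frozenset({4, 5, 6})    # the outcome is at least 4

# The three disjoint pieces whose union is A ∪ B.
both = A & B                # A ∩ B
only_a = A - B              # A ∩ B^c
only_b = B - A              # A^c ∩ B

p_union = prob(A | B)
p_pieces = prob(both) + prob(only_a) + prob(only_b)
p_formula = prob(A) + prob(B) - prob(A & B)

print(p_union, p_pieces, p_formula)
```

All three quantities agree: the direct probability of the union, the sum over the disjoint pieces, and the right-hand side of the proposition.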

Proposition 2.4 (\(\color{magenta}{\textbf{Monotonicity of probability measure}}\))
The probability measure \(\mathbb{P}\) is non-decreasing, in the sense that, for any two events \(A\) and \(B\), \[ A \subset B \implies \mathbb{P}(A) \le \mathbb{P}(B) \text{.} \]

Proof. Since \(A \subset B\), the set \(B\) can be written as the disjoint union \(B = A {\color{red}{\sqcup}} (B {\color{blue}{\cap}} A^{\mathsf{c}})\) (Equation 1.9). Applying \(\mathbb{P}\) to both sides and using \(\sigma\)-additivity, one obtains: \[ \mathbb{P}(B) = \mathbb{P}(A) + \mathbb{P}(B {\color{blue}{\cap}} A^{\mathsf{c}}) \ge \mathbb{P}(A) \text{,} \] where the inequality holds because \(\mathbb{P}(B {\color{blue}{\cap}} A^{\mathsf{c}}) \ge 0\).
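Monotonicity does not depend on the measure being uniform. The sketch below assumes a loaded die (the weights in `WEIGHTS` are invented for illustration) and checks Proposition 2.4 over every nested pair of events, together with the decomposition used in the proof.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical loaded die: face 6 has probability 1/2, the others 1/10 each.
WEIGHTS = {face: Fraction(1, 10) for face in range(1, 6)}
WEIGHTS[6] = Fraction(1, 2)


def prob(event):
    """P(A) = sum of the weights of the outcomes in A."""
    return sum((WEIGHTS[w] for w in event), Fraction(0))


outcomes = sorted(WEIGHTS)
events = [
    frozenset(c)
    for r in range(len(outcomes) + 1)
    for c in combinations(outcomes, r)
]

# A ⊂ B implies P(A) <= P(B), checked over all pairs of events.
monotone = all(prob(a) <= prob(b) for a in events for b in events if a <= b)

# The decomposition from the proof: P(B) = P(A) + P(B ∩ A^c).
A = frozenset({1, 2})
B = frozenset({1, 2, 6})
gap = prob(B) - prob(A)     # equals P(B ∩ A^c)

print(monotone, gap == prob(B - A))
```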

Further properties are:

  1. Subadditivity: the measure \(\mathbb{P}\) is \(\sigma\)-subadditive. For any sequence of events \(\{A_n\}_{n\ge 1}\) in \(\mathcal{B}\), not necessarily disjoint: \[ \mathbb{P}\left(\overset{\infty}{\underset{n = 1}{{\color{red}{\bigcup}}}} A_n\right) \le \sum_{n= 1}^{\infty}\mathbb{P}(A_n) \text{.} \tag{2.4}\]

  2. Continuity: the measure \(\mathbb{P}\) is continuous for a monotone sequence of sets \(A_n \in \mathcal{B}\), i.e.  \[ A_n \uparrow A \implies \mathbb{P}(A_n) \uparrow \mathbb{P}(A) \text{,}\quad A_n \downarrow A \implies \mathbb{P}(A_n) \downarrow \mathbb{P}(A) \text{.} \tag{2.5}\]

  3. Fatou’s lemma: consider a sequence of events \(\{A_n\}_{n\ge 1}\) in \(\mathcal{B}\), then we have the following result: \[ \mathbb{P}(\underset{n\to\infty}{\lim \inf} A_n) \le \underset{n\to\infty}{\lim \inf} \; \mathbb{P}(A_n) \le \underset{n\to\infty}{\lim \sup} \; \mathbb{P}(A_n) \le \mathbb{P}(\underset{n\to\infty}{\lim \sup} A_n) \text{.} \tag{2.6}\]
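The three properties above can be illustrated with small numerical checks. The sketch below is not a proof and rests on several assumptions made only for this example: a fair six-sided die for subadditivity and Fatou's lemma, and a geometric distribution \(\mathbb{P}(\{k\}) = 2^{-k}\) on the positive integers for continuity. For Fatou's lemma it uses the alternating sequence \(A, B, A, B, \dots\), for which \(\liminf A_n = A \cap B\) and \(\limsup A_n = A \cup B\).

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die.
OMEGA = frozenset(range(1, 7))


def prob(event):
    """Uniform probability measure: P(A) = |A| / |Omega|."""
    return Fraction(len(event), len(OMEGA))


# 1. Subadditivity (Equation 2.4) with overlapping events.
overlapping = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]
p_union = prob(frozenset().union(*overlapping))
p_sum = sum(prob(a) for a in overlapping)


# 2. Continuity from below (Equation 2.5) for A_n = {1, ..., n} under
#    the assumed geometric measure P({k}) = 2**-k on the positive integers.
def geometric_prob_first_n(n):
    """Return P(A_n) for A_n = {1, ..., n}."""
    return sum(Fraction(1, 2**k) for k in range(1, n + 1))


p_increasing = [geometric_prob_first_n(n) for n in range(1, 11)]
gaps = [1 - p for p in p_increasing]    # distance from P(Omega) = 1

# 3. Fatou's lemma (Equation 2.6) for the alternating sequence A, B, A, B, ...
A = frozenset({1, 2, 3})
B = frozenset({3, 4})
fatou_chain = [
    prob(A & B),                # P(liminf A_n)
    min(prob(A), prob(B)),      # liminf P(A_n)
    max(prob(A), prob(B)),      # limsup P(A_n)
    prob(A | B),                # P(limsup A_n)
]

print(p_union <= p_sum, gaps[-1], fatou_chain == sorted(fatou_chain))
```

In the continuity check, \(\mathbb{P}(A_n) = 1 - 2^{-n}\), so the probabilities increase towards \(\mathbb{P}(\Omega) = 1\) as \(A_n \uparrow \Omega\).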