7 Moment generating functions
Reference: Pietro Rigo (2023), Chapter 9. Resnick (2005).
Definition 7.1 (\(\color{magenta}{\textbf{Moment Generating Function}}\))
Consider a one-dimensional random variable \(X\); then the moment generating function is defined as: \[
\psi_{X}(t) = \mathbb{E}\{e^{tX}\} \quad \text{for all } t \in \mathbb{R} \text{ for which the expectation is finite.}
\]
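As a quick numerical sketch (not from the text), we can compare a Monte Carlo estimate of \(\mathbb{E}\{e^{tX}\}\) for a standard normal \(X\) with the known closed form \(\psi_X(t) = e^{t^2/2}\); the sample size and seed below are illustrative choices.

```python
import numpy as np

# Monte Carlo estimate of the MGF of a standard normal versus the
# closed form psi(t) = exp(t**2 / 2).  Seed and sample size are
# illustrative, not prescribed by the text.
rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)

for t in (-1.0, 0.5, 1.0):
    mc = np.mean(np.exp(t * x))    # sample analogue of E{e^{tX}}
    exact = np.exp(t**2 / 2)       # known normal MGF
    print(f"t={t:+.1f}  MC={mc:.4f}  exact={exact:.4f}")
```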
Proposition 7.1 (\(\color{magenta}{\textbf{Moment generating function and sequence of moments}}\))
Consider a random variable \(X\) whose moment generating function exists and is finite in a neighbourhood of zero, i.e. \[
\psi_{X}(t) = \mathbb{E}\{e^{tX}\} < \infty \quad \text{for some } \epsilon > 0 \text{ and all } t \in (-\epsilon, \epsilon) \text{.}
\]
The assumption in Proposition 7.1 implies that all moments are finite, \(\mathbb{E}\{|X|^n\} < \infty\) for all \(n\), and that the sequence of moments uniquely determines the distribution of \(X\). Consequently, if another random variable \(Y\) satisfies \(\mathbb{E}\{X^n\} = \mathbb{E}\{Y^n\}\) for all \(n\), then the distributions of \(X\) and \(Y\) are equal.
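Moments can be recovered by differentiating the MGF at zero, \(\mathbb{E}\{X^n\} = \psi_X^{(n)}(0)\). A symbolic sketch (the Exponential distribution with rate \(\lambda = 2\) is an illustrative choice, not from the text):

```python
import sympy as sp

# Recover moments of an Exponential(lam) variable by differentiating
# its MGF psi(t) = lam / (lam - t), finite for t < lam.
# The known n-th moment is n! / lam**n.
t = sp.symbols('t')
lam = 2
psi = lam / (lam - t)

for n in range(1, 5):
    moment = sp.diff(psi, t, n).subs(t, 0)   # E{X^n} = psi^{(n)}(0)
    print(n, moment, sp.factorial(n) / lam**n)
```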
7.1 Characteristic functions
Definition 7.2 (\(\color{magenta}{\textbf{Characteristic function}}\))
Consider an \(n\)-dimensional random vector \(\mathbf{X}\), then the characteristic function is defined \(\forall \mathbf{t} \in \mathbb{R}^{n}\) as: \[
\phi_{\mathbf{X}}(\mathbf{t}) = \mathbb{E}\{ e^{i \mathbf{t}^T \mathbf{X}}\} \quad \text{where} \quad \mathbf{t}^T \mathbf{X} = \begin{pmatrix}t_1, t_2, \dots, t_n\end{pmatrix} \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix}
\]
The characteristic function always exists when treated as a function of a real-valued argument, unlike the moment generating function. The characteristic function uniquely determines the probability distribution of the corresponding random vector \(\mathbf{X}\). More precisely, two random variables have the same distribution if and only if their characteristic functions are equal. It follows that we can always work with characteristic functions to prove that two random vectors have the same distribution, or that a sequence of distributions converges to a limiting distribution. Formally, \[ F_{X} = F_{Y} \iff \phi_{X}(t) = \phi_{Y}(t) \text{,} \quad \forall t \in \mathbb{R} \text{.} \]
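A numerical sketch of the definition (not from the text): the empirical characteristic function of standard normal draws should track the known \(\phi_X(t) = e^{-t^2/2}\); the sample size and seed are illustrative.

```python
import numpy as np

# Empirical characteristic function E{e^{itX}} of simulated standard
# normals versus the exact phi(t) = exp(-t**2 / 2).
rng = np.random.default_rng(1)
x = rng.standard_normal(100_000)

t_grid = np.array([-2.0, -0.5, 0.5, 2.0])
emp = np.array([np.mean(np.exp(1j * t * x)) for t in t_grid])
exact = np.exp(-t_grid**2 / 2)
print(np.abs(emp - exact))   # small absolute errors for each t
```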
Here, we list some properties considering the random variable case, i.e. \(n = 1\), with \(t \in \mathbb{R}\).
Independence: The characteristic function of the sum of \(n\) independent random variables, namely \(S_n = \sum_{i = 1}^{n} X_i\), equals the product of the individual characteristic functions, i.e. \[ X_1, \dots, X_n \text{ independent} \implies \phi_{S_n}(t) = \prod_{i = 1 }^{n}\phi_{X_i}(t) \text{,} \tag{7.1}\] for all \(t \in \mathbb{R}\). Note that the converse does not hold in general: the product identity alone does not imply independence.
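A simulation sketch of the product identity (standard normals are an illustrative choice, not from the text): for independent \(X_1, X_2\), \(\phi_{X_1+X_2}(t)\) should match \(\phi_{X_1}(t)\phi_{X_2}(t)\), and both should match the cf of \(N(0,2)\), namely \(e^{-t^2}\).

```python
import numpy as np

# Check phi_{X1+X2}(t) = phi_{X1}(t) * phi_{X2}(t) for independent
# standard normals, via empirical characteristic functions.
rng = np.random.default_rng(2)
x1 = rng.standard_normal(100_000)
x2 = rng.standard_normal(100_000)
s = x1 + x2

t = 0.7
phi_s = np.mean(np.exp(1j * t * s))                   # cf of the sum
prod = np.mean(np.exp(1j * t * x1)) * np.mean(np.exp(1j * t * x2))
print(abs(phi_s - prod), abs(phi_s - np.exp(-t**2)))  # both near zero
```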
Existence of the \(j\)-th moment: If the \(j\)-th moment of the random variable is finite, then the characteristic function is \(j\)-times continuously differentiable, i.e. \(\phi_{X} \in \mathcal{C}^{(j)}\). Formally, \[ \mathbb{E}\{|X|^j\} < \infty \implies \phi_{X} \in \mathcal{C}^{(j)} \quad \text{and} \quad \phi_{X}^{(r)} (t) = \mathbb{E}\{(i X)^{r} e^{itX}\} \quad r = 1,2,\dots, j \text{.} \] Note that, if \(j\) is even, the implication becomes an equivalence: \(\phi_{X}^{(j)}(0)\) exists if and only if \(\mathbb{E}\{|X|^j\} < \infty\).
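A symbolic check of \(\phi_X^{(r)}(0) = i^r \, \mathbb{E}\{X^r\}\) (the standard normal, with cf \(e^{-t^2/2}\), is an illustrative choice):

```python
import sympy as sp

# For a standard normal, phi(t) = exp(-t**2/2); the derivatives at 0
# should give i^r * E{X^r}: phi'(0) = 0, phi''(0) = i^2 * 1 = -1.
t = sp.symbols('t', real=True)
phi = sp.exp(-t**2 / 2)

d1 = sp.diff(phi, t, 1).subs(t, 0)   # i * E{X}    = 0
d2 = sp.diff(phi, t, 2).subs(t, 0)   # i^2 * E{X^2} = -1
print(d1, d2)
```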
Inversion theorem: The characteristic function \(\phi_X\) uniquely determines the probability distribution \(F_X\) of a random variable \(X\): for all continuity points \(a < b\), \[ F_X(b) - F_X(a) = \frac{1}{2 \pi} \lim_{c \to \infty} \int_{-c}^{c} \frac{e^{-ita} - e^{-itb}}{it} \phi_X(t) \, dt \text{,} \] and, when \(\phi_X\) is integrable, the density function is obtained as: \[ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \phi_{X}(t) \, dt \text{.} \]
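A numerical sketch of the density formula (not from the text): inverting the standard normal cf \(\phi(t) = e^{-t^2/2}\) should recover the normal density; the grid and its truncation at \(|t| = 15\) are illustrative choices.

```python
import numpy as np

# Numerical inversion f(x) = (1/2pi) * int e^{-itx} phi(t) dt for the
# standard normal cf, approximated by a Riemann sum on [-15, 15].
t = np.linspace(-15, 15, 10_001)
dt = t[1] - t[0]
phi = np.exp(-t**2 / 2)

def density(x):
    return ((np.exp(-1j * t * x) * phi).sum() * dt).real / (2 * np.pi)

for x in (0.0, 1.0):
    print(x, density(x), np.exp(-x**2 / 2) / np.sqrt(2 * np.pi))
```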
Convergence in distribution: A sequence of random variables \(X_n\) converges in distribution to a random variable \(X\) if and only if the characteristic functions of \(X_n\) converge pointwise to the characteristic function of \(X\), i.e. \[ X_n \overset{\text{d}}{\underset{n \to \infty}{\longrightarrow}} X \iff \phi_{X}(t) = \lim_{n \to \infty} \phi_{X_n}(t) \quad \forall t \in \mathbb{R} \text{.} \]
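A sketch of this criterion in action (the Rademacher example is an illustrative choice, not from the text): for variables taking values \(\pm 1\) with probability \(1/2\), \(\phi_X(t) = \cos t\), so the cf of the normalized sum is \(\cos(t/\sqrt{n})^n\), which converges pointwise to the standard normal cf \(e^{-t^2/2}\) (the central limit theorem via characteristic functions).

```python
import numpy as np

# cf of the normalized sum of n Rademacher variables, cos(t/sqrt(n))**n,
# converging pointwise to the standard normal cf exp(-t**2/2).
t = 1.3
for n in (10, 100, 10_000):
    print(n, np.cos(t / np.sqrt(n))**n, np.exp(-t**2 / 2))
```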
Scaling and centering: Given \(Y = {\color{red}{a}} + {\color{blue}{b}} X\), the effect of centering and scaling on the characteristic function is such that: \[ \phi_{Y}(t) = \mathbb{E}\{e^{itY}\} = \mathbb{E}\{e^{it({\color{red}{a}} + {\color{blue}{b}} X)} \} = e^{it{\color{red}{a}}} \mathbb{E}\{e^{i(t{\color{blue}{b}}) X}\} = e^{it{\color{red}{a}}} \phi_{X}({\color{blue}{b}}t) \quad \forall {\color{red}{a}}, {\color{blue}{b}}, t \in \mathbb{R} \text{.} \]
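A simulation sketch of the identity \(\phi_Y(t) = e^{ita}\phi_X(bt)\) (the standard normal \(X\) and the particular \(a\), \(b\), \(t\) values are illustrative choices, not from the text):

```python
import numpy as np

# For Y = a + b*X with X standard normal, the empirical cf of Y should
# match e^{ita} * phi_X(bt), where phi_X(u) = exp(-u**2/2) is known.
rng = np.random.default_rng(3)
x = rng.standard_normal(100_000)
a, b, t = 2.0, -1.5, 0.8
y = a + b * x

phi_y = np.mean(np.exp(1j * t * y))
rhs = np.exp(1j * t * a) * np.exp(-(b * t)**2 / 2)   # e^{ita} phi_X(bt)
print(abs(phi_y - rhs))   # small Monte Carlo error
```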
Weak Law of Large Numbers: Consider a sequence of independent and identically distributed random variables such that the first derivative of the characteristic function at zero exists, namely \(\exists \phi_{X}^{\prime}(0)\), and the first moment of \(X_1\) is finite, i.e. \(\mathbb{E}\{|X_1|\} < \infty\) with \(\mathbb{E}\{X_1\} = \mu\). Then the sample mean converges in probability to the constant \(\mu\), i.e. \[ \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n}X_i \underset{n\to\infty}{\overset{\text{p}}{\longrightarrow}} \mu \text{.} \]
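A simulation sketch of the law (the Exponential(1) distribution, with \(\mu = 1\), and the sample sizes below are illustrative choices, not from the text):

```python
import numpy as np

# Sample means of iid Exponential(1) draws (mu = 1) concentrating
# around mu as n grows.
rng = np.random.default_rng(4)
mu = 1.0
for n in (10, 1_000, 100_000):
    xbar = rng.exponential(scale=1.0, size=n).mean()
    print(n, xbar)
```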