8  Convergence concepts

Let’s consider a sequence of real numbers, say $\{a_k\}_{k \ge 1}$. Stating that the associated series converges, formally $\sum_{k=1}^{\infty} a_k < \infty$, implies that the tail of the series vanishes, i.e. $$\sum_{k=1}^{\infty} a_k < \infty \;\Longrightarrow\; \lim_{n \to \infty} \sum_{k=n}^{\infty} a_k = 0,$$ which in particular forces $a_k \to 0$ as $k \to \infty$.
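As a quick numerical check, here is a minimal Python sketch (the example series $\sum_k 1/k^2$ and the truncation index are choices made here purely for illustration) showing the tails of a convergent series shrinking to zero:

```python
# Tails of the convergent series sum_{k>=1} 1/k^2 (= pi^2/6).
# If the series converges, the tail sum_{k>=n} a_k must vanish as n grows.

def tail(n, K=10**6):
    """Approximate the tail sum_{k=n}^{infinity} 1/k^2, truncated at k = K."""
    return sum(1.0 / (k * k) for k in range(n, K + 1))

for n in (1, 10, 100, 1000):
    print(f"n = {n:4d}   tail ~ {tail(n):.6f}")
# The printed tails shrink roughly like 1/n, as expected for this series.
```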

8.1 Types of convergence

Definition 8.1 (Pointwise)
A sequence of random variables $\{X_n\}_{n \ge 1}$ is said to converge pointwise to a limit $X$ iff for all $\omega \in \Omega$: $$X_n(\omega) \xrightarrow[n \to \infty]{} X(\omega) \iff \lim_{n \to \infty} X_n(\omega) = X(\omega).$$ This kind of definition requires that convergence happens for every $\omega \in \Omega$.

Definition 8.2 (Almost Surely)
A sequence of random variables $\{X_n\}_{n \ge 1}$ is said to be convergent almost surely to a limit $X$ iff: $$P\{\omega \in \Omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\} = 1.$$ Usually, such kind of convergence is denoted as: $$X_n(\omega) \xrightarrow[n \to \infty]{a.s.} X(\omega).$$

In other terms, almost sure convergence requires the relation to hold for all $\omega \in \Omega$ with the exception of some $\omega$’s that are in $\Omega$ but whose probability of occurrence is zero. For example, take $\Omega = [0,1]$ with the uniform probability and $X_n(\omega) = \omega^n$: convergence to $0$ fails only at $\omega = 1$, and since $P\{1\} = 0$ we have $X_n \xrightarrow{a.s.} 0$.

Definition 8.3 (In Probability)
A sequence of random variables $\{X_n\}_{n \ge 1}$ is said to be convergent in probability to a limit $X$ if, for every fixed $\epsilon > 0$: $$\lim_{n \to \infty} P\{\omega \in \Omega : |X_n(\omega) - X(\omega)| > \epsilon\} = 0.$$ Usually, such kind of convergence is denoted as: $$X_n(\omega) \xrightarrow[n \to \infty]{p} X(\omega).$$
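To make the definition concrete, here is a minimal Monte Carlo sketch; the choice $X_n \sim \text{Bernoulli}(1/n)$ with limit $X = 0$ is ours, purely for illustration. Since $P\{|X_n - 0| > \epsilon\} = 1/n \to 0$ for any $\epsilon \in (0,1)$, the sequence converges in probability to zero, and the empirical frequencies below should track $1/n$:

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.5           # any epsilon in (0, 1) identifies the same event here
n_rep = 100_000     # Monte Carlo replications for each n

for n in (10, 100, 1000):
    # X_n ~ Bernoulli(1/n), X = 0: the event |X_n - X| > eps is {X_n = 1}.
    exceed = rng.random(n_rep) < 1.0 / n
    print(f"n = {n:5d}   P(|X_n - X| > eps) ~ {exceed.mean():.4f}   (exact: {1.0/n:.4f})")
```

Note, in passing, that an independent sequence $X_n \sim \text{Bernoulli}(1/n)$ converges to zero in probability but not almost surely: since $\sum_n 1/n = \infty$, the second Borel–Cantelli lemma implies $X_n = 1$ infinitely often with probability one.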

Definition 8.4 (Lp)
A sequence of random variables $\{X_n\}_{n \ge 1}$ such that $E\{|X_n|^p\} < \infty$ and $E\{|X|^p\} < \infty$ is said to be convergent in $L^p$, with $p > 0$, to a random variable $X$ iff: $$X_n \xrightarrow[n \to \infty]{L^p} X \iff \lim_{n \to \infty} E\{|X_n - X|^p\} = 0.$$ Usually, such kind of convergence is denoted as: $X_n \xrightarrow[n \to \infty]{L^p} X$.

Note that it can be proved that there is no general relation between almost sure convergence and $L^p$ convergence, i.e. one does not imply the other and vice versa. However, convergence in a bigger space, say $L^q$ with $q > p$, implies convergence in the smaller space, i.e. $$X_n \xrightarrow[n \to \infty]{L^q} X \;\Longrightarrow\; X_n \xrightarrow[n \to \infty]{L^p} X, \qquad 0 < p < q.$$ This follows from Lyapunov’s inequality, $E\{|Y|^p\}^{1/p} \le E\{|Y|^q\}^{1/q}$ for $0 < p < q$, applied to $Y = X_n - X$.
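As an illustration of the $L^q \Rightarrow L^p$ ordering, here is a quick Monte Carlo sketch (the setup, sample means of Exp(1) variables converging to the true mean, is our own choice): both moments shrink with $n$, with $E\{|\cdot|\} \le (E\{|\cdot|^2\})^{1/2}$ on every row, as Lyapunov’s inequality predicts.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, n_rep = 1.0, 10_000    # Exp(1) samples: true mean is 1

for n in (10, 100, 1000):
    xbar = rng.exponential(mu, size=(n_rep, n)).mean(axis=1)
    l1 = np.abs(xbar - mu).mean()         # Monte Carlo E{|Xbar_n - mu|}
    l2 = ((xbar - mu) ** 2).mean()        # Monte Carlo E{|Xbar_n - mu|^2}
    print(f"n = {n:5d}   L1 error ~ {l1:.4f}   sqrt(L2 error) ~ {np.sqrt(l2):.4f}")
# Both columns decrease toward zero, and L1 <= sqrt(L2) on every row,
# matching the Lyapunov ordering between the L1 and L2 norms.
```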

Definition 8.5 (In Distribution)

A sequence of random variables $\{X_n\}_{n \ge 1}$ is said to be convergent in distribution to a random variable $X$ if the distribution function $F_{X_n}$ converges to $F_X$, i.e. $$\lim_{n \to \infty} F_{X_n}(x) = F_X(x) \quad \forall x \text{ continuity point of } F_X.$$ Usually, such kind of convergence is denoted as: $$X_n \xrightarrow[n \to \infty]{d} X.$$

In other terms, we have convergence in distribution if the distribution of $X_n$, namely $F_{X_n}$, converges as $n \to \infty$ to the distribution of $X$, namely $F_X$, at every continuity point of the latter. Note that convergence in distribution is not related to the underlying probability space but involves only the distribution functions; for instance, if $X \sim N(0,1)$ and $X_n = -X$ for every $n$, then $X_n \xrightarrow{d} X$ by symmetry, even though $|X_n - X| = 2|X|$ never gets small.
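As a classical numerical illustration (the Binomial-to-Poisson limit $\text{Bin}(n, \lambda/n) \xrightarrow{d} \text{Poisson}(\lambda)$ is a standard example and our choice here; the sketch relies on scipy for the two CDFs), the distribution functions should agree ever more closely as $n$ grows:

```python
import numpy as np
from scipy import stats

lam = 3.0
xs = np.arange(15) + 0.5   # half-integers: continuity points of both CDFs

for n in (10, 100, 1000):
    gap = np.max(np.abs(stats.binom.cdf(xs, n, lam / n)
                        - stats.poisson.cdf(xs, lam)))
    print(f"n = {n:5d}   max |F_Xn(x) - F_X(x)| ~ {gap:.5f}")
# The gap between the two distribution functions shrinks as n grows.
```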

8.2 Laws of Large Numbers

There are many versions of laws of large numbers (LLN). In general, a sequence $\{X_n\}_{n \ge 1}$ is said to satisfy a LLN iff: $$\bar X_n = \frac{S_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i \longrightarrow X.$$
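Before distinguishing the strong and weak forms, here is a minimal simulation sketch (Exp(1) samples, so the true mean is $1$; both the distribution and the checkpoints are choices made here) showing the running sample mean settling toward the population mean along a single path:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(1.0, size=100_000)                  # IID Exp(1), E{X_1} = 1
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)  # Xbar_n for every n

for n in (10, 100, 1000, 10_000, 100_000):
    print(f"n = {n:6d}   Xbar_n = {running_mean[n - 1]:.4f}")
# The sample mean settles around E{X_1} = 1 as n grows.
```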

Strong vs weak laws of large numbers

In general, if convergence happens almost surely (Definition 8.2) we speak about strong laws of large numbers (SLLN). Otherwise, if convergence happens in probability (Definition 8.3) we speak about weak laws of large numbers (WLLN). A crucial difference to be noted is that when convergence happens almost surely we are dealing with the limit of a sequence of sets (the limit is inside $P$), whereas if convergence happens in probability we are dealing with the limit of a sequence of real numbers in $[0,1]$ (the limit is outside $P$).

8.2.1 Strong Laws of Large Numbers

Definition 8.6 (Kolmogorov SLLN)
Let’s consider a sequence of IID random variables $\{X_n\}_{n \ge 1}$. Then there exists a constant $c \in \mathbb{R}$ such that: $$\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{S_n}{n} \xrightarrow[n \to \infty]{a.s.} c$$ if and only if $E\{|X_1|\} < \infty$, in which case $c = E\{X_1\}$.

Definition 8.7 (SLLN without independence)
Let’s consider a sequence of identically distributed random variables $\{X_n\}_{n \ge 1}$, i.e. $E\{X_n\} = E\{X_1\}$ for all $n$, such that:

  1. $E\{X_n^2\} < c$, where $c > 0$ is a constant independent of $n$.
  2. $Cv\{X_i, X_j\} = 0 \quad \forall i \ne j$.

Then: $$\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{S_n}{n} \xrightarrow[n \to \infty]{a.s.} E\{X_1\}.$$

Note that the existence of a finite first moment, i.e. $E\{|X_1|\} < \infty$, implies that the characteristic function of the random variable is differentiable at zero, with $\phi'_{X_1}(0) = i\,E\{X_1\}$. On the other hand, the existence of the derivative of the characteristic function at zero does not ensure that the first moment is finite.

8.2.2 Weak Laws of Large Numbers

Let’s repeat a random experiment many times, each time under the same conditions, in such a way that the resulting observations are IID. Then each random variable $X_i$ comes from the same population with an unknown mean $E\{X\}$ and variance $V\{X\}$. Thanks to the WLLN, by repeating the experiment many times the sample mean of the experiments converges in probability to the true mean in the population. Convergence in probability means that: $$\lim_{n \to \infty} P\left\{\omega \in \Omega : \left|\frac{1}{n}\sum_{i=1}^{n} X_i(\omega) - E\{X\}\right| > \epsilon\right\} = 0.$$

Definition 8.8 (WLLN with variances)
Given a sequence of independent and identically distributed random variables $\{X_n\}_{n \ge 1}$ such that:

  1. $E\{X_1\} = \mu$.
  2. $E\{X_1^2\} < \infty$.

Then: $$\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{S_n}{n} \xrightarrow[n \to \infty]{p} E\{X_1\} = \mu.$$

Proof. Let’s consider the random variable $\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i$; since by assumption the mean and variance are finite, let’s apply the Chebychev inequality, i.e. $$P(|\bar X_n - \mu| \ge \lambda) \le \frac{1}{\lambda^2} V\{\bar X_n - \mu\}.$$ Using well known properties of the variance, let’s simplify it as: $$V\{\bar X_n - \mu\} = V\left\{\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right\} \overset{\text{(constant)}}{=} V\left\{\frac{1}{n}\sum_{i=1}^{n} X_i\right\} \overset{\text{(scaling)}}{=} \frac{1}{n^2} V\left\{\sum_{i=1}^{n} X_i\right\} \overset{\text{(independence)}}{=} \frac{1}{n^2}\sum_{i=1}^{n} V\{X_i\} \overset{\text{(identical distribution)}}{=} \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}.$$ Therefore the Chebychev inequality becomes $$P(|\bar X_n - \mu| \ge \lambda) \le \frac{\sigma^2}{n\lambda^2}.$$ Taking the limit as $n \to \infty$ proves the convergence in probability, i.e. $$\lim_{n \to \infty} P(|\bar X_n - \mu| \ge \lambda) \le \lim_{n \to \infty} \frac{\sigma^2}{n\lambda^2} = 0.$$
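To see the bound in action, here is a minimal Monte Carlo sketch (Exp(1) samples with $\mu = \sigma^2 = 1$ and $\lambda = 0.2$, all choices made here for illustration) comparing the empirical probability $P(|\bar X_n - \mu| \ge \lambda)$ with the Chebychev bound $\sigma^2/(n\lambda^2)$:

```python
import numpy as np

rng = np.random.default_rng(3)
mu = sigma2 = 1.0          # Exp(1): mean 1, variance 1
lam, n_rep = 0.2, 10_000

for n in (50, 200, 1000):
    xbar = rng.exponential(1.0, size=(n_rep, n)).mean(axis=1)
    p_emp = (np.abs(xbar - mu) >= lam).mean()   # empirical P(|Xbar - mu| >= lam)
    bound = sigma2 / (n * lam ** 2)             # Chebychev upper bound
    print(f"n = {n:5d}   empirical ~ {p_emp:.4f}   bound = {bound:.4f}")
# The empirical probability vanishes with n and never exceeds the bound.
```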

Definition 8.9 (Khintchin’s WLLN under first moment hypothesis)
Given a sequence of independent and identically distributed random variables $\{X_n\}_{n \ge 1}$ such that:

  1. $E\{|X_1|\} < \infty$.
  2. $E\{X_n\} = \mu$.

Then: $$\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{S_n}{n} \xrightarrow[n \to \infty]{p} E\{X_1\} = \mu.$$

Definition 8.10 (Feller’s WLLN without first moment)
Given a sequence of independent and identically distributed random variables $\{X_n\}_{n \ge 1}$ such that: $$\lim_{x \to \infty} x\, P\{|X_1| > x\} = 0,$$ then $$\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{S_n}{n} \xrightarrow[n \to \infty]{p} E\{X_1 \mathbf{1}_{[|X_1| \le n]}\},$$ in the sense that $\bar X_n - E\{X_1 \mathbf{1}_{[|X_1| \le n]}\} \xrightarrow{p} 0$. Note that this result makes no assumption of a finite first moment. For instance, for a standard Cauchy distribution $x\, P\{|X_1| > x\} \to 2/\pi \ne 0$, so the condition fails; indeed, the sample mean of IID standard Cauchy variables is itself standard Cauchy for every $n$ and never concentrates.

SLLN (without independence) implies WLLN

Let’s verify that under the assumptions of the SLLN without independence (Definition 8.7) we always have convergence in probability, i.e. $$\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i \xrightarrow[n \to \infty]{p} E\{X_1\}.$$

Proof. Using the Chebychev inequality, fix an $\varepsilon > 0$ so that: $$P(|\bar X_n - E\{X_1\}| > \varepsilon) \le \frac{V\{\bar X_n\}}{\varepsilon^2}.$$ Let’s make the computations explicit, i.e. $$\frac{V\{\bar X_n\}}{\varepsilon^2} = \frac{1}{n^2\varepsilon^2} V\left\{\sum_{i=1}^{n} X_i\right\} = \frac{1}{n^2\varepsilon^2}\left[\sum_{i=1}^{n} V\{X_i\} + \sum_{i=1}^{n}\sum_{j \ne i} Cv\{X_i, X_j\}\right].$$ By assumption the covariances are zero, $Cv\{X_i, X_j\} = 0$ for all $i \ne j$. Moreover, since $V\{X_i\} = E\{X_i^2\} - E\{X_i\}^2$ it is possible to upper bound the variance with the second moment, namely $V\{X_i\} \le E\{X_i^2\}$, i.e. $$\frac{1}{n^2\varepsilon^2}\sum_{i=1}^{n} V\{X_i\} \le \frac{1}{n^2\varepsilon^2}\sum_{i=1}^{n} E\{X_i^2\}.$$ Since by the assumption of the SLLN we have $E\{X_n^2\} < c$, where $c > 0$ is a constant independent of $n$, we can further upper bound the probability by: $$\frac{1}{n^2\varepsilon^2}\sum_{i=1}^{n} E\{X_i^2\} \le \frac{1}{n^2\varepsilon^2}\sum_{i=1}^{n} c = \frac{nc}{n^2\varepsilon^2} = \frac{c}{n\varepsilon^2}.$$ Finally, taking the limit for $n \to \infty$ gives zero, implying convergence in probability: $$0 \le \lim_{n \to \infty} P(|\bar X_n - E\{X_1\}| > \varepsilon) \le \lim_{n \to \infty} \frac{c}{n\varepsilon^2} = 0.$$

8.3 Central Limit Theorem

Theorem 8.1 (Central Limit Theorem (CLT) - IID case)
Let’s consider a sequence of $n$ random variables, $\mathbf{X}_n = (X_1, \dots, X_n)$, where the $X_i$ are independent and identically distributed (IID), i.e. $$X_i \sim \text{IID}(\mu, \sigma^2) \iff E\{X_i\} = E\{X_1\} = \mu, \quad V\{X_i\} = V\{X_1\} = \sigma^2 < \infty.$$ Then, let’s define a random variable, namely $S_n$, given by the sum of all the $X_i$, i.e. $$S_n = \sum_{i=1}^{n} X_i.$$ It is easy to see that, since the random variables are IID, the moments of $S_n$ are: $$E\{S_n\} = nE\{X_1\} = n\mu, \qquad V\{S_n\} = nV\{X_1\} = n\sigma^2.$$ Hence, the standardized variable $Z_n$ is, in large samples, approximately standard normally distributed, i.e. $$Z_n = \frac{S_n - E\{S_n\}}{\sqrt{V\{S_n\}}} = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sqrt{n}\,\sigma} \xrightarrow[n \to \infty]{d} N(0, 1).$$
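As a closing sanity check, here is a minimal simulation sketch (standardized sums of Exp(1) variables, with the sample size, number of replications, and comparison points all chosen arbitrarily here) showing $Z_n$ approaching the standard normal:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
mu = sigma = 1.0            # Exp(1): mean 1, standard deviation 1
n, n_rep = 500, 20_000

s_n = rng.exponential(1.0, size=(n_rep, n)).sum(axis=1)   # replications of S_n
z_n = (s_n - n * mu) / (np.sqrt(n) * sigma)               # standardized sums

# Compare P(Z_n <= z) against the standard normal CDF at a few points.
for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"z = {z:+.1f}   P(Z_n <= z) ~ {(z_n <= z).mean():.4f}"
          f"   Phi(z) = {stats.norm.cdf(z):.4f}")
```

Even though the underlying Exp(1) distribution is strongly skewed, the empirical probabilities sit close to the normal CDF already at this sample size.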