3  Measurable maps

Reference: Chapter 3 of Resnick, A Probability Path.

A measurable space is a pair $(\Omega, \mathcal{B})$ consisting of a sample space $\Omega$ and a σ-field $\mathcal{B}$ of subsets of $\Omega$.

3.1 Maps and inverse maps

Let’s be very general and consider two measurable spaces $(\Omega, \mathcal{B})$ and $(\Omega', \mathcal{B}')$, and a map (function) $X(\omega)$ that associates to every $\omega \in \Omega$ an outcome $\omega' \in \Omega'$, i.e. $X : (\Omega, \mathcal{B}) \to (\Omega', \mathcal{B}')$. Then $X$ determines a function $X^{-1}$, called the inverse map,
$$X^{-1} : \mathcal{P}(\Omega') \to \mathcal{P}(\Omega),$$
where $\mathcal{P}$ denotes the power set, i.e. the set of all subsets and the largest σ-field available. The inverse map $X^{-1}$ is defined such that for every set $B' \subseteq \Omega'$
$$X^{-1}(B') = \{\omega \in \Omega : X(\omega) \in B'\}. \tag{3.1}$$

Exercise 3.1 Let’s consider a deck of poker cards with 52 cards in total. We have 4 suits of 13 distinct ranks, where the Jack (J) counts as 11, the Queen (Q) as 12, the King (K) as 13 and the Ace (A) as 14. Then, let’s consider a function of the form
$$X(\omega) = \begin{cases} +1 & \text{if } \omega \in \{2,3,4,5,6\} \\ 0 & \text{if } \omega \in \{7,8,9\} \\ -1 & \text{if } \omega \in \{10,11,12,13,14\} \end{cases}$$
In this case the sample space $\Omega$ is composed of 13 unique ranks (52 elements in total, i.e. all the cards), while $\Omega' = \{-1, 0, +1\}$ represents the possible outcomes. Suppose one observes a value $X(\omega) = +1 \in \Omega'$; determine $X^{-1}(\{+1\})$ according to (3.1).

Solution 3.1. Suppose one observes the value $X(\omega) = +1 \in \Omega'$. The inverse map $X^{-1}$ then identifies the set of $\omega$ such that $X(\omega) = +1$, i.e.
$$X^{-1}(\{+1\}) = \{\omega \in \Omega : X(\omega) \in \{+1\}\} = \{2,3,4,5,6\}.$$
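To make the inverse map concrete, here is a minimal Python sketch of Exercise 3.1; the names `X`, `preimage` and `Omega` are illustrative choices, not notation from the text.

```python
def X(omega):
    """The map of Exercise 3.1: card rank -> {+1, 0, -1}."""
    if omega in {2, 3, 4, 5, 6}:
        return +1
    elif omega in {7, 8, 9}:
        return 0
    else:  # {10, 11, 12, 13, 14}
        return -1

def preimage(f, B, domain):
    """X^{-1}(B) = {omega in domain : f(omega) in B}, as in (3.1)."""
    return {omega for omega in domain if f(omega) in B}

Omega = set(range(2, 15))  # the 13 distinct ranks
print(preimage(X, {+1}, Omega))  # {2, 3, 4, 5, 6}
```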

The inverse map $X^{-1}$ has many useful properties. Among them, it preserves complementation, unions and intersections.

  1. $X^{-1}(\Omega') = \Omega$.
  2. $X^{-1}(\emptyset) = \emptyset$.
  3. $X^{-1}(A^c) = (X^{-1}(A))^c$.
  4. $X^{-1}(\Omega' \setminus A) = \Omega \setminus X^{-1}(A)$.
  5. $X^{-1}\left(\bigcup_n A_n\right) = \bigcup_n X^{-1}(A_n)$ for all $A_n \in \mathcal{B}'$.

Let’s consider two sets $A$ and $B$, both subsets of $\Omega'$. Then, by definition:
$$\begin{aligned} X^{-1}(A \cup B) &= \{\omega \in \Omega : X(\omega) \in A \cup B\} \\ &= \{\omega \in \Omega : X(\omega) \in A \ \text{or}\ X(\omega) \in B\} \\ &= \{\omega \in \Omega : X(\omega) \in A\} \cup \{\omega \in \Omega : X(\omega) \in B\} \\ &= X^{-1}(A) \cup X^{-1}(B). \end{aligned}$$
Similarly for the intersection, i.e.
$$\begin{aligned} X^{-1}(A \cap B) &= \{\omega \in \Omega : X(\omega) \in A \cap B\} \\ &= \{\omega \in \Omega : X(\omega) \in A \ \text{and}\ X(\omega) \in B\} \\ &= \{\omega \in \Omega : X(\omega) \in A\} \cap \{\omega \in \Omega : X(\omega) \in B\} \\ &= X^{-1}(A) \cap X^{-1}(B). \end{aligned}$$
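On a finite sample space these identities can be checked by brute force. A small sanity check (of course not a proof), reusing the `X`, `preimage` and `Omega` defined in the snippet above:

```python
A, B = {+1}, {0, +1}
Omega_prime = {-1, 0, +1}

# X^{-1}(A u B) = X^{-1}(A) u X^{-1}(B)
assert preimage(X, A | B, Omega) == preimage(X, A, Omega) | preimage(X, B, Omega)

# X^{-1}(A n B) = X^{-1}(A) n X^{-1}(B)
assert preimage(X, A & B, Omega) == preimage(X, A, Omega) & preimage(X, B, Omega)

# X^{-1}(A^c) = (X^{-1}(A))^c, complements taken in Omega' and Omega respectively
assert preimage(X, Omega_prime - A, Omega) == Omega - preimage(X, A, Omega)
```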

Proposition 3.1 If $\mathcal{B}'$ is a σ-field of subsets of $\Omega'$, then $X^{-1}(\mathcal{B}')$ is a σ-field of subsets of $\Omega$. Moreover, if $\mathcal{C}'$ is a class of subsets of $\Omega'$, then $$X^{-1}(\sigma(\mathcal{C}')) = \sigma(X^{-1}(\mathcal{C}')),$$ that is, the inverse image of the σ-field generated by the class $\mathcal{C}'$ of subsets of $\Omega'$ is the same as the σ-field generated in $\Omega$ by the inverse images $X^{-1}(\mathcal{C}')$. In practice, the preimage and the generator operations commute. It can be difficult to describe every set in the σ-field $\mathcal{B}'$; however, if we know a class of subsets that generates it, namely $\mathcal{C}'$, we are able to recreate the σ-field. (References: Propositions 3.1.1 and 3.1.2 of Resnick, A Probability Path.)

3.1.1 Measurable maps

Definition 3.1 (Measurable map)
Let’s consider the function $X : (\Omega, \mathcal{B}) \to (\Omega', \mathcal{B}')$. Then $X$ is $\mathcal{B}$-measurable, written $X \in \mathcal{B}/\mathcal{B}'$, if and only if $$X \in \mathcal{B}/\mathcal{B}' \iff X^{-1}(B') \in \mathcal{B} \quad \forall\, B' \in \mathcal{B}'.$$

Note that measurability is a crucial concept, since only if $X$ is measurable is it possible to make probability statements about $X$. In fact, a probability is nothing more than a particular type of measure: only if $X^{-1}(B') \in \mathcal{B}$ for all $B' \in \mathcal{B}'$ is it possible to assign a measure (probability) to the events contained in $\mathcal{B}'$.

Definition 3.2 (Test for measurability)
Consider a map $X : (\Omega, \mathcal{B}) \to (\Omega', \mathcal{B}')$ and a class $\mathcal{C}'$ that generates the σ-field $\mathcal{B}'$, i.e. $\mathcal{B}' = \sigma(\mathcal{C}')$. Then $X$ is $\mathcal{B}$-measurable if and only if $$X \in \mathcal{B}/\mathcal{B}' \iff X^{-1}(\mathcal{C}') \subseteq \mathcal{B}.$$
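On a finite space, the test can be run exhaustively. Below is a hypothetical sketch: the σ-field on $\Omega$ is generated by the partition $\{2,\dots,6\}$, $\{7,8,9\}$, $\{10,\dots,14\}$; the card map of Exercise 3.1 is measurable with respect to it, while the parity indicator is not.

```python
from itertools import combinations

Omega = frozenset(range(2, 15))
blocks = [frozenset({2, 3, 4, 5, 6}), frozenset({7, 8, 9}),
          frozenset(range(10, 15))]

# sigma-field generated by a finite partition: all unions of blocks
sigma = {frozenset().union(*combo)
         for r in range(len(blocks) + 1)
         for combo in combinations(blocks, r)}

def measurable(f, values):
    """On a finite Omega', checking the singletons {v} suffices,
    since they form a class generating the power set of Omega'."""
    return all(frozenset(w for w in Omega if f(w) == v) in sigma
               for v in values)

X = lambda w: +1 if w <= 6 else (0 if w <= 9 else -1)
parity = lambda w: w % 2

print(measurable(X, {-1, 0, +1}))  # True: preimages are exactly the blocks
print(measurable(parity, {0, 1}))  # False: {even ranks} is not a union of blocks
```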

3.2 Random variables

Definition 3.3 (Random variable)
Let’s consider a probability space $(\Omega, \mathcal{B}, P)$. A random variable is a measurable map (function) where $(\Omega', \mathcal{B}') = (\mathbb{R}, \mathcal{B}(\mathbb{R}))$. It therefore takes values on the real line, i.e. $X : (\Omega, \mathcal{B}) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$, and for every set $B$ in the Borel σ-field $\mathcal{B}(\mathbb{R})$ generated by the real line, the preimage of $B$, namely $X^{-1}(B)$, is in the σ-field $\mathcal{B}$ of subsets of $\Omega$. Formally, $$\forall\, B \in \mathcal{B}(\mathbb{R}) \quad X^{-1}(B) = \{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{B}.$$ More precisely, when $X$ is a random variable the test for measurability (Definition 3.2) becomes: $$X^{-1}((-\infty, y]) = [X(\omega) \le y] \in \mathcal{B} \quad \forall\, y \in \mathbb{R}.$$ In practice, $B = (-\infty, y]$ depends on the real number $y$, while the event $[X(\omega) \le y] = X^{-1}(B)$ is exactly the preimage of $B$.

3.2.1 σ-field generated by a map

Let $X : (\Omega, \mathcal{B}) \to (\Omega', \mathcal{B}')$ be a measurable map. The σ-field generated by $X$ is defined as $$\sigma(X) = X^{-1}(\mathcal{B}') = \{X^{-1}(B') : B' \in \mathcal{B}'\}.$$ When $X$ is a random variable, $(\Omega', \mathcal{B}') = (\mathbb{R}, \mathcal{B}(\mathbb{R}))$ and the σ-field generated by $X$ is $$\sigma(X) = X^{-1}(\mathcal{B}(\mathbb{R})) = \{X^{-1}(B) : B \in \mathcal{B}(\mathbb{R})\}.$$
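For the card example, $\sigma(X)$ can be listed exhaustively: since $\Omega' = \{-1, 0, +1\}$ is finite, $\sigma(X)$ consists of the preimages of the $2^3 = 8$ subsets of $\Omega'$. A minimal sketch:

```python
from itertools import combinations

Omega = range(2, 15)
X = lambda w: +1 if w <= 6 else (0 if w <= 9 else -1)  # map of Exercise 3.1

Omega_prime = [-1, 0, +1]
sigma_X = {frozenset(w for w in Omega if X(w) in B)
           for r in range(len(Omega_prime) + 1)
           for B in combinations(Omega_prime, r)}

print(len(sigma_X))  # 8: sigma(X) is much coarser than the power set of Omega
for event in sorted(sigma_X, key=len):
    print(sorted(event))
```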

Definition 3.4 Let $X$ be a random variable with set of possible outcomes $\Omega'$. Then $X$ is called

  • a discrete random variable if $\Omega'$ is either a finite set or a countably infinite set;
  • a continuous random variable if $\Omega'$ is an uncountably infinite set.

Definition 3.5 A random vector is a map $$\boldsymbol{X}_n : (\Omega, \mathcal{B}) \to (\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n)).$$ Its preimage $\boldsymbol{X}_n^{-1}$ is defined as $$\forall\, B \in \mathcal{B}(\mathbb{R}^n) \quad \boldsymbol{X}_n^{-1}(B) = \{\omega \in \Omega : \boldsymbol{X}_n(\omega) \in B\} \in \mathcal{B}.$$ Note that, for every $\omega$, the random vector has $n$ components, i.e. $$\boldsymbol{X}_n(\omega) = (X_1(\omega), X_2(\omega), \dots, X_n(\omega))'.$$

Proposition 3.2 The map $\boldsymbol{X}_n : (\Omega, \mathcal{B}) \to (\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ is a random vector if and only if each component $X_1, X_2, \dots, X_n$ is a random variable.

Proof. Let’s prove that if $X_1, X_2, \dots, X_n$ are random variables, then $\boldsymbol{X}_n$ is a random vector. Since we are assuming that each $i$-th component is a random variable, we have that for all $i = 1, 2, \dots, n$ $$\forall\, y_i \in \mathbb{R} \quad X_i^{-1}((-\infty, y_i]) \in \mathcal{B}.$$ Therefore, to prove that $\boldsymbol{X}_n$ is a random vector, we have to prove that the preimage of the rectangle $(-\infty, y_1] \times (-\infty, y_2] \times \dots \times (-\infty, y_n]$, i.e. $$\{\omega \in \Omega : X_1(\omega) \le y_1, X_2(\omega) \le y_2, \dots, X_n(\omega) \le y_n\},$$ is in $\mathcal{B}$. This set is exactly the intersection $$\bigcap_{i=1}^{n} \{\omega \in \Omega : X_i(\omega) \le y_i\} = \bigcap_{i=1}^{n} X_i^{-1}((-\infty, y_i]) \in \mathcal{B}.$$ Since each preimage $X_i^{-1}((-\infty, y_i])$ is in $\mathcal{B}$ and $\mathcal{B}$ is a σ-field, hence closed under countable intersection, the intersection of all the preimages is also in $\mathcal{B}$. Since the rectangles generate $\mathcal{B}(\mathbb{R}^n)$, the test for measurability (Definition 3.2) concludes the proof.
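The key step of the proof, that the event $\{X_1 \le y_1, \dots, X_n \le y_n\}$ is the intersection of the componentwise preimages, can be illustrated with a simulated two-dimensional random vector (the distributions below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
X1 = rng.normal(size=n)       # first component
X2 = rng.exponential(size=n)  # second component

y1, y2 = 0.5, 1.0
# the vector lands in the rectangle (-inf, y1] x (-inf, y2] ...
in_rectangle = (np.stack([X1, X2], axis=1) <= np.array([y1, y2])).all(axis=1)
# ... exactly when both componentwise events occur
componentwise = (X1 <= y1) & (X2 <= y2)

assert np.array_equal(in_rectangle, componentwise)
print(in_rectangle.mean())  # empirical P(X1 <= y1, X2 <= y2)
```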

3.3 Induced distribution function

Consider a probability space $(\Omega, \mathcal{B}, P)$ and a measurable map $X : (\Omega, \mathcal{B}) \to (\Omega', \mathcal{B}')$. Then the composition $P \circ X^{-1}$ is again a map; in this way a probability is attached to each event in $\mathcal{B}'$. In fact, the composition is a map such that $$P \circ X^{-1} : (\Omega', \mathcal{B}') \to [0,1], \qquad (\Omega', \mathcal{B}') \xrightarrow{\ X^{-1}\ } (\Omega, \mathcal{B}) \xrightarrow{\ P\ } [0,1].$$ In general, the probability of a set $A \in \mathcal{B}'$ is denoted equivalently as $$P \circ X^{-1}(A) = P(X^{-1}(A)) = P(X(\omega) \in A).$$

Exercise 3.2 Let’s continue from Exercise 3.1 and compute probabilities of the form $P \circ X^{-1}$. Consider one random extraction from the 52 cards, so that for each distinct rank there are 4 copies. Compute $P(X^{-1}(\{+1\}))$ and $P(X^{-1}(\{0\}))$.

Solution 3.2. The probability is computed as: $$P(X^{-1}(\{+1\})) = P(\{\omega \in \Omega : X(\omega) \in \{+1\}\}) = P(\{2,3,4,5,6\}) = \frac{5 \cdot 4}{52} = \frac{5}{13} \approx 38.46\%$$ Let’s consider the probability of observing either $+1$ or $-1$; then $$P(X^{-1}(\{-1,+1\})) = P(\{\omega \in \Omega : X(\omega) \in \{-1,+1\}\}) = P(\{2,3,4,5,6,10,11,12,13,14\}) = \frac{10 \cdot 4}{52} = \frac{10}{13} \approx 76.92\%$$ Finally, by the properties of the probability measure, $$P(X(\omega) \in \{0\}) = 1 - P(X(\omega) \in \{-1,+1\}) \approx 23.08\%.$$
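The same probabilities can be double-checked by enumerating the deck; a minimal sketch, encoding the deck simply as its 52 rank values:

```python
from fractions import Fraction

deck = [rank for rank in range(2, 15) for _ in range(4)]  # 4 copies per rank
X = lambda w: +1 if w <= 6 else (0 if w <= 9 else -1)     # map of Exercise 3.1

def prob(B):
    """P(X^{-1}(B)) under one uniform random extraction."""
    return Fraction(sum(1 for w in deck if X(w) in B), len(deck))

print(prob({+1}))          # 5/13  ~ 38.46%
print(prob({-1, +1}))      # 10/13 ~ 76.92%
print(1 - prob({-1, +1}))  # 3/13  ~ 23.08% = P(X = 0)
```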

3.3.1 Distribution function on $\mathbb{R}$

When $X$ is a random variable, the composition $P(X^{-1}(\cdot))$ is a probability measure induced on $\mathbb{R}$ by the composition $$P(X^{-1}((-\infty, y])) = P(X \le y) = F_X(y), \quad \text{for all } y \in \mathbb{R}.$$ Hence the distribution function $F_X$ of a random variable $X$ is a function $F_X : (\mathbb{R}, \mathcal{B}(\mathbb{R})) \to [0,1]$ and represents a probability measure on the real line $\mathbb{R}$, i.e. $$F_X(y) = P(X(\omega) \in (-\infty, y]) = P(X(\omega) \le y),$$ or $P(X \le y)$ for short.

Warning

In general, the probability distribution is a function $F_X : (\Omega', \mathcal{B}') \to [0,1]$; however, when $X$ is a random variable, $(\Omega', \mathcal{B}') = (\mathbb{R}, \mathcal{B}(\mathbb{R}))$.
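As an illustration of $F_X$ as an induced probability measure, the sketch below compares the empirical frequency of $\{X \le y\}$ in simulated draws with the closed-form distribution function $F_X(y) = 1 - e^{-y}$ of an Exp(1) variable (the rate 1 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.exponential(scale=1.0, size=100_000)  # draws from X ~ Exp(1)

for y in [0.5, 1.0, 2.0]:
    empirical = (sample <= y).mean()  # relative frequency of {X <= y}
    exact = 1 - np.exp(-y)            # F_X(y) for Exp(1)
    print(f"y={y}: empirical={empirical:.4f}, exact={exact:.4f}")
```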

Figure 3.1: Different distribution functions in $\mathbb{R}$. (a) Distribution functions; (b) quantile functions.

Proposition 3.3 (Density of a random variable)
If a random variable $X$ has a continuous and differentiable distribution function, then its probability density function (pdf) is defined as the first derivative of the distribution function with respect to $y$, i.e. $$f_X(y) = \frac{d}{dy} F_X(y) \quad \Longleftrightarrow \quad dF_X(y) = f_X(y)\, dy. \tag{3.2}$$ For a discrete random variable we instead call it the probability mass function (pmf), i.e. $$f_X(y) = P(X = y).$$ Considering a generic domain $D \subseteq \mathbb{R}$ for the random variable $X$, the function $f_X$ satisfies two fundamental properties (verified numerically in the sketch after the list):

  1. Positivity: $f_X(y) \ge 0$ for all $y \in D$.
  2. Normalization: $\int_D f_X(y)\, dy = 1$ or $\sum_{y \in D} P(X = y) = 1$.
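A minimal numerical check of the two properties for the standard normal pdf (the truncation of the domain to $[-10, 10]$ is an arbitrary numerical choice):

```python
import numpy as np

y = np.linspace(-10, 10, 200_001)
f = np.exp(-y**2 / 2) / np.sqrt(2 * np.pi)  # standard normal pdf

print(bool((f >= 0).all()))   # property 1: positivity
dy = y[1] - y[0]
print(float((f * dy).sum()))  # property 2: Riemann sum of the integral, ~ 1.0
```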
Warning

In general, any function $f_X$ that satisfies properties 1. and 2. in Proposition 3.3 is the density function of some (unknown) random variable.

3.3.2 Survival function

For any random variable with distribution $F_X$, the survival function is defined as $$\bar{F}_X(y) = P(X > y) = 1 - F_X(y).$$

Definition 3.6 (Criterion I for heavy tails)
A distribution function $F_X$ is said to be:

  1. Light tailed if for some $\lambda > 0$ $$\lim_{y \to \infty} \frac{\bar{F}_X(y)}{e^{-\lambda y}} = \begin{cases} 0 & \text{faster than the exponential} \\ l,\ 0 < l < \infty & \text{same speed as the exponential} \end{cases}$$

  2. Heavy tailed if for all $\lambda > 0$ $$\lim_{y \to \infty} \frac{\bar{F}_X(y)}{e^{-\lambda y}} = \infty \quad \text{(slower than the exponential)}.$$
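Criterion I can be explored numerically. Below, the survival function of an Exp(1) variable (light tailed) and of a Pareto variable with $\bar{F}(y) = y^{-2}$ for $y \ge 1$ (heavy tailed) are divided by $e^{-\lambda y}$ with $\lambda = 0.5$; the parameter values are arbitrary illustrations:

```python
import numpy as np

lam = 0.5
y = np.array([5.0, 10.0, 20.0, 40.0])

surv_exp = np.exp(-y)    # Exp(1): survival e^{-y}
surv_pareto = y ** -2.0  # Pareto(alpha = 2): survival y^{-2}, y >= 1

print(surv_exp / np.exp(-lam * y))     # -> 0:   light tailed
print(surv_pareto / np.exp(-lam * y))  # -> inf: heavy tailed
```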