4. Convergence Almost-Surely and in Probability

4.1 Motivation

Recall the notion of convergence for real-valued sequences.

Recap (Convergent sequence): A sequence $\sequence{x}$ converges to $\ell$ iff \[ \forall \epsilon > 0 \sep \exists N \in \N \sep \forall n \geq N : \abs{x_n - \ell} \leq \epsilon \] and we write $\liminfty x_n = \ell$.

How does the notion of convergence extend to random variables? Let $\sequence{X}$ be a sequence of random variables. Intuitively, $X_n$ is a random point in $\R$ and we want to express that “$X_n$ is close to $X$”, where $X$ is possibly random as well. Formally, $X_n : \Omega \to \R$ is a function and there are several ways to define convergence towards another function $X : \Omega \to \R$. We list some possible definitions:

  1. Pointwise convergence: $\forall \omega \in \Omega : \liminfty X_n(\omega) = X(\omega)$
  2. Uniform convergence: $\liminfty \sup_{\omega \in \Omega} \abs{X_n(\omega) - X(\omega)} = 0$
  3. Almost-sure convergence: $\probP{\set{\omega \in \Omega \mid \liminfty X_n(\omega) = X(\omega)}} = 1$
  4. Convergence in probability: $\forall \epsilon > 0 : \liminfty \probP{\abs{X_n - X} > \epsilon} = 0$
  5. Convergence in $\funcspace{p}$: $\liminfty \E{\abs{X_n - X}^p} = 0$

We note that 1. and 2. are not suitable for probability, because random variables are only defined almost surely. In this chapter, we study 3. and 4., while 5. will be the topic of a later chapter. In general, the study of convergence of random variables is related to functional analysis. To give sense to $X_n \to X$, we choose a function space $\setS$ where $X_n$ and $X$ “live” and equip this space with a topology.

4.2 Almost Sure Convergence

Let $\sequence{X}$ and $X$ be random variables. If we fix $\omega \in \Omega$, then $\pabig{X_n(\omega)}_{n \in \N}$ is simply a sequence of real numbers and we know how to give sense to $\liminfty X_n(\omega) = X(\omega)$. This equation means that the sequence converges in $\R$ and its limit is $X(\omega)$. The existence and the value of the limit generally depend on the underlying $\omega$. Now, we may consider the set of all $\omega$ for which this holds: \[ \set{ \liminfty X_n = X } = \set{ \omega \in \Omega \mid \liminfty X_n(\omega) = X(\omega) } \] When this event occurs almost surely, we say that the sequence converges almost surely.

Definition 4.1 (Almost-sure convergence): Let $\sequence{X}$ and $X$ be random variables. We say that $X_n$ converges to $X$ almost surely if \[ \probP{\set{\liminfty X_n = X}} = 1 \]
Note:
  • We also write $\liminfty X_n \aseq X$ or $X_n \asto X$.
  • Note that $\liminfty X_n \aseq X$ iff $\liminfty \abs{X_n - X} \aseq 0$.
Example (Deterministic sequence): Let $\sequence{a}$ be a sequence of real numbers, let $\ell \in \R$, and consider a sequence $\sequence{X}$ of deterministic random variables satisfying $X_n \aseq a_n$. Then $\liminfty X_n \aseq \ell$ iff $\liminfty a_n = \ell$.
Example (Dyadic approximation): Let $X$ be a non-negative real random variable. We define $X_n = \min\set{n, 2^{-n} \floor{2^n X}}$ for every $n \in \N$. Then $\liminfty X_n \aseq X$: as soon as $n \geq X(\omega)$, we have $\abs{X_n(\omega) - X(\omega)} \leq 2^{-n}$.
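A minimal numerical sketch of this approximation (assuming NumPy; the exponential target law and seed are arbitrary illustrative choices), showing that the worst-case error over a few samples drops below $2^{-n}$ once $n$ dominates the sample values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target: a non-negative random variable X; the exponential
# law is an arbitrary choice for this demo.
X = rng.exponential(size=5)

for n in range(1, 11):
    # Dyadic approximation: X_n = min(n, 2^{-n} * floor(2^n * X)).
    X_n = np.minimum(n, np.floor(2.0**n * X) / 2.0**n)
    # Once n >= X(omega), the pointwise error is at most 2^{-n}.
    print(n, np.max(np.abs(X_n - X)))
```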
Example (Functional point of view): Let $\Omega = [0,1]$ and $\P = \lambda$. Define $X(\omega) = \omega$ and $X_n(\omega) = \omega \cdot \ind{\abs{\omega - \frac 1 2} \geq \frac{1}{2n}}$. For all $\omega \in [0,1] \setminus \set{\frac 1 2}$ we have $\liminfty X_n(\omega) = X(\omega)$. Since $\probP{[0,1] \setminus \set{\frac 1 2}} = 1$, it follows that $X_n \asto X$. In other words, “we do not care what happens at $\omega = \frac 1 2$, because it does not happen”.
Proposition 4.2 (Criterion for a.s. convergence): Let $\sequence{X}$ and $X$ be random variables. If \[ \forall \epsilon > 0 : \sumN \probP{\set{\abs{X_n - X} \geq \epsilon}} < \infty \] then $X_n \asto X$.
Example (Minimum of uniform rvs): Let $\sequence{U} \simiid \lawU([0,1])$. For every $n \in \N$ define $X_n = \min\set{U_i \mid i \leq n}$. For $\epsilon \geq 1$ we have $\probP{\set{\abs{X_n} \geq \epsilon}} = 0$, and for $\epsilon \in (0,1)$ we have $\probP{\set{\abs{X_n} \geq \epsilon}} = \probP{\set{U_1 \geq \epsilon}}^n = (1-\epsilon)^n$. Since $\sumN (1 - \epsilon)^n < \infty$, the criterion applies and $X_n \asto 0$.
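A quick simulation sketch of this example (assuming NumPy; seed and path length are arbitrary), tracking the running minimum along a single sample path:

```python
import numpy as np

rng = np.random.default_rng(1)

# One sample path: X_n = min(U_1, ..., U_n) is the running minimum of
# i.i.d. uniforms on [0, 1].
U = rng.uniform(size=10_000)
X = np.minimum.accumulate(U)

for n in (10, 100, 1_000, 10_000):
    print(n, X[n - 1])  # decreases towards 0 along the path
```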

4.3 Convergence in Probability

Definition 4.3 (Convergence in probability): Let $\sequence{X}$ and $X$ be random variables. We say that $X_n$ converges to $X$ in probability if \[ \forall \epsilon > 0 : \liminfty \probP{\set{\abs{X_n - X} \geq \epsilon}} = 0 \]
Note: We also write $\liminfty X_n \Peq X$ or $X_n \Pto X$.
Proposition 4.4 (A.s. implies in probability): If $X_n \asto X$, then $X_n \Pto X$.

We now give two examples illustrating that convergence in probability does not generally imply almost-sure convergence. The first example considers a sequence of independent Bernoulli random variables with decreasing success probabilities.

Example (Bernoulli with decreasing success): Let $X_n \sim \lawBer\of{\frac 1 n}$ be independent. Trivially, $\forall \epsilon > 1 : \probP{\set{\abs{X_n} > \epsilon}} = 0$, and for $\epsilon \in (0, 1]$ we have \[ \probP{\set{\abs{X_n} \geq \epsilon}} = \probP{\set{X_n = 1}} = \frac{1}{n} \convginfty 0 \] Hence $X_n \Pto 0$. However, $\sum_{n \in \N} \probP{\set{X_n = 1}} = \infty$ and since the events $\set{X_n = 1}$ are independent, Borel–Cantelli II gives \begin{align*} & \limsupinfty X_n \aseq 1 \\ \implies & \pabig{X_n(\omega)}_{n \in \N} \text{ almost surely does not converge to } 0 \\ \implies & X_n \stackrel{\text{a.s.}}{\not\to} 0 \end{align*}
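The contrast between the two modes of convergence can be seen in a simulation sketch (assuming NumPy; seed and horizon are arbitrary): the success probability $\frac 1 n$ vanishes, yet each sample path keeps producing successes arbitrarily late, in line with Borel–Cantelli II.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100_000

# One sample path of independent X_n ~ Ber(1/n) for n = 1, ..., N.
n = np.arange(1, N + 1)
path = rng.uniform(size=N) < 1.0 / n

# P(X_n = 1) = 1/n -> 0, so X_n -> 0 in probability; yet successes
# keep appearing arbitrarily late, as Borel-Cantelli II predicts.
ones = np.flatnonzero(path) + 1
print(ones[-5:])                 # indices of the last few successes observed
print(bool(path[1_000:].any()))  # almost always True: a success past n = 1000
```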

The second example also involves a sequence of Bernoulli random variables, but adopts a functional viewpoint by constructing a sequence of indicator functions on an explicit probability space $\Omega = [0,1]$.

Example (Typesetter): Let $\Omega = [0,1]$, $\sigmaF = \borelB([0,1])$ and $\P = \lambda$. Define
  • $X_1 = \ind{[0,1]}$
  • $X_2 = \ind{[0,\frac 1 2]}$, $X_3 = \ind{[\frac 1 2,1]}$
  • $X_4 = \ind{[0,\frac 1 4]}$, $X_5 = \ind{[\frac 1 4,\frac 2 4]}$, $X_6 = \ind{[\frac 2 4,\frac 3 4]}$, $X_7 = \ind{[\frac 3 4,1]}$
  • $X_{2^i + j} = \ind{[\frac{j}{2^i},\frac{j+1}{2^i}]}$ for all $i,j \in \N$ with $j < 2^i$
For all $\epsilon \in (0,1]$, we have \[ \probP{\set{\abs{X_n} \geq \epsilon}} = \frac{1}{2^{\floor{\log_2\pa{n}}}} < \frac{1}{2^{\log_2\pa{n}-1}} = \frac{2}{n} \convginfty 0 \] but for every $\omega \in [0,1]$ we have $X_n(\omega) = 1$ for infinitely many $n$ and $X_n(\omega) = 0$ for infinitely many $n$, thus $\pabig{X_n(\omega)}_{n \in \N}$ does not converge for any $\omega$. Hence $X_n \Pto 0$ but $X_n \stackrel{\text{a.s.}}{\not\to} 0$.
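A small sketch of the typesetter sequence (plain Python; the sample point $\omega = 0.3$ and the range of $n$ are arbitrary choices), showing that any fixed $\omega$ is hit by exactly one indicator per dyadic level, hence infinitely often:

```python
import math

# Typesetter indicators: X_{2^i + j} is the indicator of [j/2^i, (j+1)/2^i].
def X(n: int, omega: float) -> float:
    i = int(math.floor(math.log2(n)))
    j = n - 2**i
    return 1.0 if j / 2**i <= omega <= (j + 1) / 2**i else 0.0

omega = 0.3  # any fixed sample point
hits = [n for n in range(1, 200) if X(n, omega) == 1.0]
print(hits)  # one hit per dyadic level: X_n(omega) = 1 infinitely often
```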
Note: Let $X$ be a random variable; we introduce the notation $X \wedge m = \min \set{X,m}$ for the random variable that caps $X$ at $m \in \R$.
Theorem 4.5 (Characterisation of convergence in probability): Let $\sequence{X}$ and $X$ be random variables. Then \[ X_n \Pto X \iff \liminfty \E{\abs{X_n - X} \wedge 1} = 0 \]
Note: $\abs{X_n - X} \wedge 1$ is used because its expectation always exists. Indeed, $\abs{X_n - X} \wedge 1$ takes values in $[0,1]$, hence $\E{\abs{X_n - X} \wedge 1}$ is well defined and lies in $[0,1]$.
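A Monte Carlo sketch of Theorem 4.5 on the Bernoulli example above (assuming NumPy; seed and sample size are arbitrary), where $X = 0$ and $\E{\abs{X_n} \wedge 1} = \frac 1 n \to 0$:

```python
import numpy as np

rng = np.random.default_rng(3)
M = 200_000  # Monte Carlo samples per value of n

# Bernoulli example with X = 0: min(|X_n - X|, 1) = X_n, so the
# expectation in Theorem 4.5 equals 1/n.
for n in (10, 100, 1_000):
    X_n = (rng.uniform(size=M) < 1.0 / n).astype(float)
    print(n, np.minimum(np.abs(X_n), 1.0).mean())  # ~ 1/n -> 0
```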

4.4 Converging Subsequence

Proposition 4.6 (Converging subsequence): Let $\sequence{X}$ and $X$ be random variables. If $X_n \Pto X$, then there exists a subsequence $\sequence*{X}{n(k)}{k}$ that converges almost surely to $X$.
Example (Bernoulli with decreasing success): Let $X_n \sim \lawBer\of{\frac 1 n}$ be independent. Then the subsequence $X_{k^2}$ converges to $0$ almost surely: $\sum_{k \in \N} \probP{\set{X_{k^2} = 1}} = \sum_{k \in \N} \frac{1}{k^2} < \infty$, so the criterion of Proposition 4.2 applies.
Example (Typesetter): For the typesetter example, the subsequence $X_{2^k} = \ind{[0, 2^{-k}]}$ converges to $0$ almost surely (indeed, for every $\omega > 0$ it is eventually $0$).
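A simulation sketch of the Bernoulli subsequence (assuming NumPy; seed and range are arbitrary): along $n(k) = k^2$ the success probabilities become summable, so only finitely many successes occur on a typical path.

```python
import numpy as np

rng = np.random.default_rng(4)
K = 1_000

# Bernoulli example along the subsequence n(k) = k^2:
# P(X_{k^2} = 1) = 1/k^2 is summable, so Proposition 4.2 gives X_{k^2} -> 0 a.s.
k = np.arange(1, K + 1)
sub = rng.uniform(size=K) < 1.0 / k**2
print(np.flatnonzero(sub) + 1)  # only a few small k with X_{k^2} = 1
```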

In summary, almost-sure convergence always implies convergence in probability, while convergence in probability only implies almost-sure convergence along a subsequence or if the convergence is “strong enough”.