Recall the notion of convergence for real-valued sequences.
Recap (Convergent sequence): A sequence $(x_n)_{n\in\mathbb{N}}$ of real numbers converges to $\ell \in \mathbb{R}$ iff
$$\forall \varepsilon > 0 \;\exists N \in \mathbb{N} \;\forall n \geq N : |x_n - \ell| \leq \varepsilon,$$
and we write $\lim_{n\to\infty} x_n = \ell$.
How does the notion of convergence extend to random variables? Let $(X_n)_{n\in\mathbb{N}}$ be a sequence of random variables. Intuitively, $X_n$ is a random point in $\mathbb{R}$ and we want to express that "$X_n$ is close to $X$", where $X$ is possibly random as well. Formally, $X_n : \Omega \to \mathbb{R}$ is a function and there are several ways to define convergence towards another function $X : \Omega \to \mathbb{R}$. We list some possible definitions:
1. Pointwise convergence: $\forall \omega \in \Omega : \lim_{n\to\infty} X_n(\omega) = X(\omega)$
2. Uniform convergence: $\lim_{n\to\infty} \sup_{\omega \in \Omega} |X_n(\omega) - X(\omega)| = 0$
3. Almost-sure convergence: $P\big(\{\lim_{n\to\infty} X_n = X\}\big) = 1$
4. Convergence in probability: $\forall \varepsilon > 0 : \lim_{n\to\infty} P(|X_n - X| > \varepsilon) = 0$
5. Convergence in $L^p$: $\lim_{n\to\infty} E[|X_n - X|^p] = 0$
We note that 1. and 2. are not suitable for probability, because random variables are only defined almost surely. In this chapter, we study 3. and 4., while 5. will be the topic of a later chapter. In general, the study of convergence of random variables is related to functional analysis. To give sense to $X_n \to X$, we choose a functional space $S$ where $X_n$ and $X$ "live" and equip this space with a topology.
4.2 Almost Sure Convergence
Let $(X_n)_{n\in\mathbb{N}}$ and $X$ be random variables. If we fix $\omega \in \Omega$, then $(X_n(\omega))_{n\in\mathbb{N}}$ is simply a sequence of real numbers and we know how to give sense to $\lim_{n\to\infty} X_n(\omega) = X(\omega)$. This equation means that the sequence converges in $\mathbb{R}$ and its limit is $X(\omega)$. The existence and the value of the limit generally depend on the underlying $\omega$. Now, we may consider the set of all $\omega$ for which this holds:
$$\Big\{\lim_{n\to\infty} X_n = X\Big\} = \Big\{\omega \in \Omega \;\Big|\; \lim_{n\to\infty} X_n(\omega) = X(\omega)\Big\}$$
When this event occurs almost-surely, we say that the sequence converges almost surely.
Definition 4.1 (Almost-sure convergence): Let $(X_n)_{n\in\mathbb{N}}$ and $X$ be random variables. We say that $X_n$ converges to $X$ almost surely if
$$P\Big(\Big\{\lim_{n\to\infty} X_n = X\Big\}\Big) = 1$$
Note:
We also write $\lim_{n\to\infty} X_n \overset{a.s.}{=} X$ or $X_n \xrightarrow{a.s.} X$.
Note that $\lim_{n\to\infty} X_n \overset{a.s.}{=} X$ iff $\lim_{n\to\infty} |X_n - X| \overset{a.s.}{=} 0$.
Example (Deterministic sequence): Let $(a_n)_{n\in\mathbb{N}}$ and $\ell$ be real numbers and consider a sequence $(X_n)_{n\in\mathbb{N}}$ of deterministic random variables satisfying $X_n \overset{a.s.}{=} a_n$. Then $\lim_{n\to\infty} X_n \overset{a.s.}{=} \ell$ iff $\lim_{n\to\infty} a_n = \ell$.
Example (Dyadic approximation): Let $X$ be a non-negative real random variable. We define $X_n = \min\{n, 2^{-n} \lfloor 2^n X \rfloor\}$ for every $n \in \mathbb{N}$. Then we have $\lim_{n\to\infty} X_n \overset{a.s.}{=} X$.
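This convergence is in fact pointwise for every fixed value of $X(\omega)$; a minimal numerical sketch (the sample value `x` below is an arbitrary choice):

```python
import math

def dyadic_approx(x: float, n: int) -> float:
    """n-th dyadic approximation: x rounded down to a multiple of 2^-n, capped at n."""
    return min(n, math.floor(2**n * x) / 2**n)

x = math.pi  # an arbitrary non-negative sample value X(omega)
for n in [1, 5, 10, 20]:
    print(n, dyadic_approx(x, n))
# Once n >= x the cap is inactive, and then 0 <= x - X_n <= 2^-n, so X_n -> x.
```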
Example (Functional point of view): Let $\Omega = [0,1]$ and $P = \lambda$. Define $X(\omega) = \omega$ and $X_n(\omega) = \omega \cdot \mathbb{1}\{|\omega - \frac{1}{2}| \geq \frac{1}{2n}\}$. For all $\omega \in [0,1] \setminus \{\frac{1}{2}\}$ we have $\lim_{n\to\infty} X_n(\omega) = X(\omega)$. Since $P([0,1] \setminus \{\frac{1}{2}\}) = 1$, it follows that $X_n \xrightarrow{a.s.} X$. In other words, "we do not care what happens when $\omega = \frac{1}{2}$, because it does not happen".
Proposition 4.2 (Criterion for a.s. convergence): Let $(X_n)_{n\in\mathbb{N}}$ and $X$ be random variables. If
$$\forall \varepsilon > 0 : \sum_{n\in\mathbb{N}} P(\{|X_n - X| \geq \varepsilon\}) < \infty,$$
then $X_n \xrightarrow{a.s.} X$.
Example (Minimum of uniform rvs): Let $(U_n)_{n\in\mathbb{N}} \overset{iid}{\sim} \mathcal{U}([0,1])$. For every $n \in \mathbb{N}$ define $X_n = \min\{U_i \mid i \leq n\}$. For $\varepsilon > 1$ we have $P(\{|X_n| \geq \varepsilon\}) = 0$, and for $\varepsilon \in (0,1]$ we have $P(\{|X_n| \geq \varepsilon\}) = P(\{U_1 \geq \varepsilon\})^n = (1-\varepsilon)^n$. Since $\sum_{n\in\mathbb{N}} (1-\varepsilon)^n < \infty$, the criterion applies and $X_n \xrightarrow{a.s.} 0$.
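A quick simulation of one sample path of this running minimum (the seed is an arbitrary choice, for reproducibility):

```python
import random

random.seed(0)  # arbitrary seed, for reproducibility

# One sample path omega: X_n(omega) = min{U_1(omega), ..., U_n(omega)}.
path = []
current_min = float("inf")
for n in range(1, 10_001):
    current_min = min(current_min, random.random())
    path.append(current_min)

print(path[9], path[99], path[9999])  # X_10, X_100, X_10000: decreasing towards 0
```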
4.3 Convergence in Probability
Definition 4.3 (Convergence in probability): Let $(X_n)_{n\in\mathbb{N}}$ and $X$ be random variables. We say that $X_n$ converges to $X$ in probability if
$$\forall \varepsilon > 0 : \lim_{n\to\infty} P(\{|X_n - X| \geq \varepsilon\}) = 0$$
Note: We also write $\lim_{n\to\infty} X_n \overset{P}{=} X$ or $X_n \xrightarrow{P} X$.
Proposition 4.4 (A.s. implies in probability): If $X_n \xrightarrow{a.s.} X$, then $X_n \xrightarrow{P} X$.
We now give two examples illustrating that convergence in probability does not generally imply almost-sure convergence. The first example considers a sequence of independent Bernoulli random variables with decreasing success probabilities.
Example (Bernoulli with decreasing success): Let $X_n \sim \mathrm{Ber}(\frac{1}{n})$ be independent. Trivially, for all $\varepsilon > 1$ we have $P(\{|X_n| \geq \varepsilon\}) = 0$, and for all $\varepsilon \in (0,1]$ we have
$$P(\{|X_n| \geq \varepsilon\}) = P(\{X_n = 1\}) = \frac{1}{n} \xrightarrow{n\to\infty} 0.$$
Hence $X_n \xrightarrow{P} 0$. However, $\sum_{n\in\mathbb{N}} P(\{X_n = 1\}) = \infty$ and since the events $\{X_n = 1\}$ are independent, by Borel-Cantelli II we have
$$\limsup_{n\to\infty} X_n \overset{a.s.}{=} 1 \;\implies\; (X_n(\omega))_{n\in\mathbb{N}} \text{ almost-surely does not converge to } 0 \;\implies\; X_n \not\xrightarrow{a.s.} 0.$$
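This dichotomy is easy to see in a simulation; a minimal sketch (seed and horizon are arbitrary choices):

```python
import random

random.seed(1)  # arbitrary seed, for reproducibility

N = 100_000
# Times n at which X_n = 1, for one path of independent X_n ~ Ber(1/n).
ones = [n for n in range(1, N + 1) if random.random() < 1 / n]

# P(X_n = 1) = 1/n -> 0 (convergence in probability), yet sum 1/n diverges,
# so by Borel-Cantelli II ones keep occurring along the path:
# on average about log(N) of them, scattered over all scales.
print(len(ones), ones[:5])
```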
The second example also involves a sequence of Bernoulli random variables, but adopts a functional viewpoint by constructing a sequence of indicator functions on an explicit probability space Ω=[0,1].
Example (Typesetter): Let $\Omega = [0,1]$, $\mathcal{F} = \mathcal{B}([0,1])$ and $P = \lambda$. For $n \in \mathbb{N}$, write $n = 2^m + k$ with $m = \lfloor \log_2(n) \rfloor$ and $0 \leq k < 2^m$, and define
$$X_n = \mathbb{1}_{[k 2^{-m},\, (k+1) 2^{-m}]}.$$
For all $\varepsilon \in (0,1]$ (for $\varepsilon > 1$ the probability is $0$), we have
$$P(\{|X_n| \geq \varepsilon\}) = \frac{1}{2^{\lfloor \log_2(n) \rfloor}} < \frac{1}{2^{\log_2(n) - 1}} = \frac{2}{n} \xrightarrow{n\to\infty} 0,$$
but for every $\omega \in [0,1]$ we have $X_n(\omega) = 1$ for infinitely many $n$ (at least once per "generation" $m$) and $X_n(\omega) = 0$ for infinitely many $n$, thus $(X_n(\omega))_{n\in\mathbb{N}}$ almost-surely does not converge. Hence $X_n \xrightarrow{P} 0$ but $X_n \not\xrightarrow{a.s.} 0$.
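The sequence can be sketched directly (the helper name `typewriter` and the sample point are my own choices):

```python
def typewriter(n: int, omega: float) -> int:
    """X_n(omega): indicator of the dyadic interval [k*2^-m, (k+1)*2^-m],
    where n = 2^m + k with 0 <= k < 2^m."""
    m = n.bit_length() - 1  # m = floor(log2(n))
    k = n - 2**m
    return int(k / 2**m <= omega <= (k + 1) / 2**m)

omega = 0.3  # an arbitrary sample point
hits = [n for n in range(1, 1025) if typewriter(n, omega)]
# The interval containing omega shrinks (P(X_n = 1) -> 0), but every
# generation m contributes at least one n with X_n(omega) = 1.
print(hits)
```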
Note: Let $X$ be a random variable, then we introduce the notation $X \wedge m = \min\{X, m\}$ for the random variable that caps $X$ at $m \in \mathbb{R}$.
Theorem 4.5 (Characterisation of convergence in probability): Let $(X_n)_{n\in\mathbb{N}}$ and $X$ be random variables. Then
$$X_n \xrightarrow{P} X \iff \lim_{n\to\infty} E[|X_n - X| \wedge 1] = 0$$
Note: $|X_n - X| \wedge 1$ is used because its expectation always exists. Indeed, $|X_n - X| \wedge 1$ takes values in $[0,1]$, hence $E[|X_n - X| \wedge 1]$ is well defined and lies in $[0,1]$.
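As an illustration, for the minimum-of-uniforms example above this expectation can be estimated by Monte Carlo; the exact value is $E[X_n \wedge 1] = E[X_n] = \frac{1}{n+1}$ (seed and trial count below are arbitrary choices):

```python
import random

random.seed(2)  # arbitrary seed, for reproducibility

def mc_capped_expectation(n: int, trials: int = 20_000) -> float:
    """Monte Carlo estimate of E[|X_n - X| ∧ 1] for X_n = min(U_1,...,U_n), X = 0."""
    total = 0.0
    for _ in range(trials):
        xn = min(random.random() for _ in range(n))
        total += min(xn, 1.0)  # the cap is inactive here since X_n <= 1 anyway
    return total / trials

for n in [1, 10, 100]:
    print(n, mc_capped_expectation(n))  # decreases towards 0, matching X_n ->P 0
```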
4.4 Converging Subsequence
Proposition 4.6 (Converging subsequence): Let $(X_n)_{n\in\mathbb{N}}$ and $X$ be random variables. If $X_n \xrightarrow{P} X$, then there exists a subsequence $(X_{n(k)})_{k\in\mathbb{N}}$ that converges almost-surely to $X$.
Example (Bernoulli with decreasing success): Let $X_n \sim \mathrm{Ber}(\frac{1}{n})$ be independent. Then the subsequence $(X_{k^2})_{k\in\mathbb{N}}$ converges to $0$ almost-surely. Indeed, $\sum_{k\in\mathbb{N}} P(\{X_{k^2} = 1\}) = \sum_{k\in\mathbb{N}} \frac{1}{k^2} < \infty$, so Proposition 4.2 applies.
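A quick simulation of this subsequence (seed arbitrary): since $\sum_k 1/k^2 < \infty$, only finitely many terms should equal $1$.

```python
import random

random.seed(3)  # arbitrary seed, for reproducibility

# Subsequence X_{k^2} ~ Ber(1/k^2): only finitely many ones (Borel-Cantelli I).
ones = [k for k in range(1, 100_001) if random.random() < 1 / k**2]
print(ones)  # a handful of small k; the path is eventually 0, so X_{k^2} -> 0 a.s.
```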
Example (Typesetter): For the typesetter example, the subsequence $(X_{2^k})_{k\in\mathbb{N}}$ converges to $0$ almost-surely. Indeed, $X_{2^k} = \mathbb{1}_{[0,\, 2^{-k}]}$, so for every $\omega \in (0,1]$ we have $X_{2^k}(\omega) = 0$ for all large $k$.
In summary, almost-sure convergence always implies convergence in probability, while convergence in probability only implies almost-sure convergence along a subsequence or if the convergence is “strong enough”.