1. Distributions and Supports

To keep things simple and the notation as light as possible, we only consider unconditional distributions of univariate responses from independent observations in this chapter.

In this text, we assume that an underlying probability space $(\Omega, \mathcal{F}, \mathbb{P})$ exists and only consider real random variables, e.g. $X : \Omega \to \mathbb{R}$.

Note: Unless stated otherwise, we will write “random variable” for real random variables.

$\mu_X$ denotes the push-forward measure $\mathbb{P} \circ X^{-1} = X \# \mathbb{P}$ of the random variable $X$.

Recap (Support of a random variable): The support of a random variable $X$, denoted as $\operatorname{supp}(X)$, is the smallest closed set $S \subset \mathbb{R}$ such that $\mathbb{P}(X \in S) = 1$.
Note: For discrete random variables, $\operatorname{supp}(X)$ is countable.

1.1 Distribution Functions

The response $Y$ we are interested in follows a distribution $\mu_Y$ and we write $Y \sim \mu_Y$. It is important to note that for non-finite sample spaces an observation is conceptually always an event, i.e. a set $A \in \mathcal{B}(\mathbb{R})$. We never observe the outcomes $y \in \mathbb{R}$ directly. For discrete sample spaces with finite cardinality, however, events $A = \{y\}$ might very well be observed.

Recap (Cumulative distribution function): The cumulative distribution function or cdf $F_Y : \mathbb{R} \to [0,1]$ is
$$F_Y(y) = \mu_Y\big((-\infty, y]\big)$$
Note: The cdf is a monotonically, but not necessarily strictly, increasing function with $\forall y_1 < y_2 : F_Y(y_1) \leq F_Y(y_2)$.
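The definition of the cdf and its weak monotonicity can be illustrated with a small discrete example. The following is only a sketch: the fair-die pmf and the names `pmf`, `cdf`, `grid` and `is_monotone` are assumptions made for illustration and do not come from the text.

```python
from fractions import Fraction

# Hypothetical pmf of a fair six-sided die (an illustrative assumption).
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def cdf(y, pmf=pmf):
    """F_Y(y) = mu_Y((-inf, y]): total mass of all atoms s with s <= y."""
    return sum((p for s, p in pmf.items() if s <= y), Fraction(0))

# Evaluate the cdf on a grid that also contains points outside the support.
grid = [0, 1, 1.5, 2, 3.9, 6, 7]
values = [cdf(y) for y in grid]

# Monotone, but not strictly: F(1) == F(1.5) because (1, 1.5] carries no mass.
is_monotone = all(a <= b for a, b in zip(values, values[1:]))
```

Exact fractions are used so that equalities such as `cdf(1) == cdf(1.5)` can be checked without floating-point tolerance.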

1.2 Supports and Measurement Scales

The choice of an appropriate support $S = \operatorname{supp}(Y)$ very much depends on the measurement scale of the response $Y$. Most situations can be classified into binary, ordered categorical, unordered categorical, count or absolutely continuous responses $Y$.

Definition 1.1 (Binary response): The support is $S = \{y_1, y_2\}$ with $y_1 \in \mathbb{R}$ and $y_2 \in \mathbb{R}$.
Note: We understand these two outcomes as being truly categorical and explicitly exclude dichotomisation, e.g. binary variables like “younger than 65 years” vs. “older than 65 years” should be modelled by a sample space appropriate for age as a numeric variable.
Example (Binary response): $y_1 = \text{failure}$ and $y_2 = \text{success}$.
Definition 1.2 (Ordered categorical response): The support is $S = \{y_1, \ldots, y_K\}$ with $K < \infty$ and $y_i \in \mathbb{R}$ for all $i \in \{1, \ldots, K\}$.
Example (Ordered categorical response): Happiness scores, e.g. $y_1 = \text{very unhappy}$, $y_2 = \text{not too happy}$, $y_3 = \text{somewhat happy}$ and $y_4 = \text{very happy}$ with $y_1 < y_2 < y_3 < y_4$.
Note: For unordered categorical data $\mathcal{S} = \{s_1, \ldots, s_K\}$ with $K < \infty$ one first needs to define an injection $f : \mathcal{S} \to \mathbb{R}$ that defines the support $S = f(\mathcal{S})$. The distribution $\mu_Y$ will thus depend on $f$.
Example (Unordered categorical response): The faculties at a university, e.g. $s_1 = \text{Medicine}$, $s_2 = \text{Natural Sciences}$, $s_3 = \text{Philosophy}$ and so on.
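The dependence of $\mu_Y$ on the chosen injection $f$ can be made concrete with a small sketch. The category probabilities, the two encodings `f_a` and `f_b`, and the helper names below are hypothetical and serve only as an illustration.

```python
# Hypothetical probabilities for three faculties (illustrative values only).
probs = {"Medicine": 0.5, "Natural Sciences": 0.3, "Philosophy": 0.2}

# Two different injections f: categories -> R (both are arbitrary choices).
f_a = {"Medicine": 1, "Natural Sciences": 2, "Philosophy": 3}
f_b = {"Medicine": 3, "Natural Sciences": 1, "Philosophy": 2}

def pushforward(probs, f):
    """Return the pmf of Y = f(category) as a dict {f(s): P(s)}."""
    if len(set(f.values())) != len(f):
        raise ValueError("f must be injective")
    return {f[s]: p for s, p in probs.items()}

def cdf(y, pmf):
    """F_Y(y): total mass of all encoded categories with value <= y."""
    return sum(p for s, p in pmf.items() if s <= y)

mu_a = pushforward(probs, f_a)
mu_b = pushforward(probs, f_b)

# Both encodings share the support {1, 2, 3}, but the distributions differ:
# the mass at 1 belongs to Medicine under f_a and to Natural Sciences under f_b.
```

Because the order induced by $f$ is arbitrary, quantities that rely on the ordering of the real line, such as the cdf, have no intrinsic meaning for unordered categories.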
Definition 1.3 (Count response): The support is $S = \mathbb{N}$.
Example (Count response): The number of wildlife-vehicle collisions counted per year on a specific road segment.
Definition 1.4 (Absolutely continuous response): The support $S$ is a contiguous subset of $\mathbb{R}$, i.e. an interval.
Example (Absolutely continuous response): The age $Y$ of a person has support $S = (0, \infty)$. The event “the person is 44 years old” is represented by the interval $[44, 45)$.
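Once a cdf is fixed, the probability of such an interval event follows from $\mu_Y([a, b)) = F_Y(b) - F_Y(a)$, which holds for absolutely continuous $Y$. The sketch below assumes a hypothetical Weibull distribution for age; the shape and scale values are made up for illustration and are not taken from the text.

```python
import math

def cdf(y, shape=2.0, scale=45.0):
    """Hypothetical Weibull cdf for age in years (parameters are assumptions)."""
    if y <= 0:
        return 0.0
    return 1.0 - math.exp(-((y / scale) ** shape))

def interval_probability(a, b):
    """mu_Y([a, b)) = F_Y(b) - F_Y(a) for an absolutely continuous Y."""
    return cdf(b) - cdf(a)

# Probability of the event "the person is 44 years old", i.e. Y in [44, 45).
p_44 = interval_probability(44.0, 45.0)

# The one-year intervals [k, k + 1) partition the support, so their
# probabilities sum to (almost) one over a sufficiently long range.
total = sum(interval_probability(k, k + 1) for k in range(200))
```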
Example (Mixed distribution): The amount of precipitation $Y$ at a meteorological station on a given day has support $S = [0, \infty)$. Note that $\mu_Y(\{0\}) > 0$ because the probability of no rain or snow at all is larger than zero. This is an example of a distribution with a discrete part at $0$ and a continuous part on $(0, \infty)$.
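A mixed distribution of this kind shows up as a jump of the cdf at $0$. The following sketch assumes, purely for illustration, a dry-day probability `P_DRY` and an exponential distribution with rate `RATE` for the amount on wet days; neither the values nor the exponential form come from the text.

```python
import math

P_DRY = 0.6   # assumed probability of no precipitation (hypothetical value)
RATE = 0.25   # assumed rate of the exponential wet-day part (hypothetical value)

def cdf(y):
    """cdf of a point mass at 0 mixed with an exponential part on (0, inf)."""
    if y < 0:
        return 0.0
    return P_DRY + (1.0 - P_DRY) * (1.0 - math.exp(-RATE * y))

def point_mass(y, eps=1e-9):
    """Approximate mu_Y({y}) by the jump F_Y(y) - F_Y(y - eps)."""
    return cdf(y) - cdf(y - eps)

mass_at_zero = point_mass(0.0)   # equals P_DRY: the discrete part
mass_at_five = point_mass(5.0)   # close to 0: no atom in the continuous part
```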

1.3 Other Distribution Characterizations

The cdf $F_Y$, the pdf and the pmf are only some of the available functions to fully characterize the distribution $\mu_Y$. Here, we list more such functions.

Definition 1.5 (Survivor function): The survivor function $S_Y : \mathbb{R} \to [0,1]$ is
$$S_Y(y) = 1 - F_Y(y) = \mu_Y\big((y, \infty)\big)$$
Definition 1.6 (Odds function): The odds function $O_Y : \mathbb{R} \to \mathbb{R}^+$ is, wherever $S_Y(y) > 0$,
$$O_Y(y) = \frac{F_Y(y)}{S_Y(y)} = \frac{\mu_Y\big((-\infty, y]\big)}{\mu_Y\big((y, \infty)\big)}$$
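The relation between cdf, survivor function and odds can be checked on a small discrete example, where the difference between mass strictly above $y$ and mass at or above $y$ is visible. This is a sketch only: the fair-die pmf and all function names are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical pmf of a fair six-sided die (an illustrative assumption).
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def cdf(y):
    """F_Y(y): mass of all atoms s with s <= y."""
    return sum((p for s, p in pmf.items() if s <= y), Fraction(0))

def survivor(y):
    """S_Y(y) = 1 - F_Y(y), the mass strictly above y."""
    return 1 - cdf(y)

def odds(y):
    """O_Y(y) = F_Y(y) / S_Y(y); undefined where S_Y(y) = 0."""
    s = survivor(y)
    if s == 0:
        raise ZeroDivisionError("odds are undefined where S_Y(y) = 0")
    return cdf(y) / s

def mass_at_or_above(y):
    """mu_Y([y, inf)), which includes a possible atom at y."""
    return sum((p for s, p in pmf.items() if s >= y), Fraction(0))
```

For the die, `survivor(2)` is 2/3 while `mass_at_or_above(2)` is 5/6; the two coincide only where the distribution has no atom.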
Definition 1.7 (Quantile function): The quantile function $Q_Y : [0,1] \to \mathbb{R}$ is the generalized inverse of the cdf $F_Y$, i.e.
$$Q_Y(p) = \inf \{y \in \mathbb{R} : F_Y(y) \geq p\}$$
Note: We have $Q_Y(p) = F_Y^{-1}(p)$ if $F_Y$ is invertible.
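For a discrete distribution the cdf is a step function and not invertible, so the generalized inverse is needed. The sketch below evaluates it for a hypothetical fair-die pmf and restricts itself to $0 < p \leq 1$; the names `pmf` and `quantile` are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical pmf of a fair six-sided die (an illustrative assumption).
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def quantile(p, pmf=pmf):
    """Q_Y(p) = inf{y : F_Y(y) >= p} for a discrete pmf and 0 < p <= 1.

    The cdf only jumps at the atoms, so the infimum is attained at the
    smallest atom whose cumulative mass reaches p.
    """
    if not 0 < p <= 1:
        raise ValueError("this sketch only handles 0 < p <= 1")
    total = Fraction(0)
    for s in sorted(pmf):
        total += pmf[s]
        if total >= p:
            return s
    raise ValueError("pmf does not sum to 1")

median = quantile(Fraction(1, 2))   # F_Y(3) = 1/2, so the result is 3
```

Any $p$ in $(1/2, 2/3]$ maps to the same value 4, which reflects the flat stretches and jumps of the underlying cdf.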
Definition 1.8 (Hazard function): The hazard function $h_Y : \mathbb{R} \to \mathbb{R}^+$ is
$$h_Y(y) = \frac{f_Y(y)}{\mu_Y\big([y, \infty)\big)}$$
where $f_Y$ denotes the pdf or pmf of $Y$.
Note: If $Y$ is absolutely continuous we have $h_Y(y) = \frac{f_Y(y)}{S_Y(y)}$ or equivalently $h_Y(y) = -\frac{\mathrm{d}}{\mathrm{d}y} \log S_Y(y)$.
Definition 1.9 (Cumulative hazard function): The cumulative hazard function $H_Y : \mathbb{R} \to \mathbb{R}^+$ is
$$H_Y(y) = \int_{-\infty}^y h_Y(t) \,\mathrm{d}t = \int_{-\infty}^y \frac{f_Y(t)}{\mu_Y\big([t, \infty)\big)} \,\mathrm{d}t$$
Note: If $Y$ is absolutely continuous we have $H_Y(y) = -\log S_Y(y)$.
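The identity $H_Y(y) = -\log S_Y(y)$ can be verified numerically for an absolutely continuous example. The sketch below assumes a Weibull distribution with made-up parameters `SHAPE` and `SCALE` and approximates the integral with a simple midpoint rule; all names are illustrative assumptions.

```python
import math

SHAPE, SCALE = 2.0, 1.5   # hypothetical Weibull parameters

def pdf(y):
    """Weibull density f_Y; zero outside the support (0, inf)."""
    if y <= 0:
        return 0.0
    z = y / SCALE
    return (SHAPE / SCALE) * z ** (SHAPE - 1) * math.exp(-(z ** SHAPE))

def survivor(y):
    """S_Y(y) = 1 - F_Y(y) for the Weibull distribution."""
    if y <= 0:
        return 1.0
    return math.exp(-((y / SCALE) ** SHAPE))

def hazard(y):
    """h_Y(y) = f_Y(y) / S_Y(y) in the absolutely continuous case."""
    return pdf(y) / survivor(y)

def cumulative_hazard(y, n=10_000):
    """Midpoint-rule approximation of the integral of h_Y up to y.

    The hazard is zero below 0, so the integral starts at 0.
    """
    if y <= 0:
        return 0.0
    step = y / n
    return sum(hazard((i + 0.5) * step) for i in range(n)) * step

# Difference between the numerical integral and -log S_Y at y = 2.
gap = abs(cumulative_hazard(2.0) + math.log(survivor(2.0)))
```

With shape 2 the hazard is linear in $y$, so the midpoint rule is accurate up to rounding and `gap` is close to zero.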