Likelihood and Regression 2025-10-08

3. Categorical Likelihoods

Although the likelihood applies to all measurement scales, we need to discuss aspects of categorical likelihoods in a little more detail.

3.1 Likelihood for Categorical Responses

Let $Y$ be ordered categorical with support $\evS = \set{y_1, \ldots, y_K}$. Then $Y$ follows the Categorical distribution $Y \sim \lawCateg(\gvec \pi)$, where $\gvec \pi = (\pi_1, \ldots, \pi_K)$ with $\sum_{k=1}^K \pi_k = 1$ are the probabilities for the $K$ categories.

Assume that we have $n$ independent observations, each consisting of a single outcome, i.e. $\evA_i = \set{y_{k_i}}$. The likelihood contributions are $l_i(\gvec\pi) = \probPwrt{\gvec\pi}{Y_i = y_{k_i}} = \pi_{k_i}$, and the joint likelihood function is $L(\gvec \pi) = \prod_{i=1}^n l_i(\gvec\pi) = \prod_{i=1}^n \pi_{k_i} = \prod_{k=1}^K \pi_{k}^{n_k}$ with $n_k = \sum_{i=1}^n \ind{y_{k_i} = y_k}$, the number of observations in category $k$. Up to the multinomial coefficient, which does not depend on $\gvec \pi$, this is the likelihood of a single observation of the count vector $(n_1, \ldots, n_K)$ from the multinomial distribution $\lawMult(\gvec \pi, n)$.
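The equality of the observation-wise product and the count-based product can be checked numerically. The following is a minimal sketch; the category labels, the data, and the candidate probabilities are made up for illustration.

```python
from collections import Counter
from math import isclose, log, prod

# Hypothetical data: K = 3 categories and n = 6 observed single outcomes.
categories = ["low", "mid", "high"]
observations = ["low", "mid", "mid", "high", "mid", "low"]
pi = {"low": 0.2, "mid": 0.5, "high": 0.3}  # candidate probabilities, sum to 1

# Observation-wise form: product over i of pi_{k_i}.
lik_by_observation = prod(pi[y] for y in observations)

# Count form: product over k of pi_k ** n_k, with n_k the category counts.
counts = Counter(observations)
lik_by_counts = prod(pi[c] ** counts[c] for c in categories)

# Both forms describe the same likelihood value.
assert isclose(lik_by_observation, lik_by_counts)

# In practice one usually works with the log-likelihood sum_k n_k * log(pi_k).
log_lik = sum(counts[c] * log(pi[c]) for c in categories)
```

Only the counts $n_k$ enter the second form, so the individual observations are not needed once the counts are known.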

Proposition 3.1 (Maximum likelihood for multinomial): The likelihood

$$L(\gvec \pi) = \prod_{k=1}^K \pi_{k}^{n_k}$$

of a random vector distributed via

$$\lawMult(\gvec \pi, n)$$

is maximised by

$$\hat{\gvec \pi} = \pa{\frac{n_1}{n}, \ldots, \frac{n_K}{n}}.$$
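A sketch of why this holds, assuming every $n_k > 0$: maximise the log-likelihood $\sum_{k=1}^K n_k \log \pi_k$ subject to the constraint $\sum_{k=1}^K \pi_k = 1$. The Lagrangian $\Lambda$ and the multiplier $\lambda$ are notation introduced here for the argument only.

$$
\Lambda(\gvec \pi, \lambda) = \sum_{k=1}^K n_k \log \pi_k - \lambda \left( \sum_{k=1}^K \pi_k - 1 \right),
\qquad
\frac{\partial \Lambda}{\partial \pi_k} = \frac{n_k}{\pi_k} - \lambda = 0
\;\Longrightarrow\;
\pi_k = \frac{n_k}{\lambda}.
$$

Summing $\pi_k = n_k / \lambda$ over $k$ and using the constraint gives $\lambda = \sum_{k=1}^K n_k = n$, hence $\hat \pi_k = n_k / n$. Since the log-likelihood is concave in $\gvec \pi$, this stationary point is a maximum.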

One may also observe events of the form $\evA_i = \set{y_{p_i},y_{q_i}}$, where it is only known that the outcome is one of two categories. The likelihood contribution then becomes $l_i(\gvec \pi) = \probPwrt{\gvec \pi}{Y_i \in \set{y_{p_i},y_{q_i}}} = \pi_{p_i} + \pi_{q_i}$.
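Such set-valued observations fit into the same likelihood computation: each contribution is the sum of the probabilities of the categories in the observed event. The sketch below assumes that categories are encoded as 0-based indices and events as Python sets; the function name `log_likelihood` and the data are hypothetical.

```python
from math import isclose, log

def log_likelihood(pi, events):
    """Return sum_i log P(Y_i in A_i), where P(Y_i in A_i) = sum of pi_k over k in A_i."""
    return sum(log(sum(pi[k] for k in event)) for event in events)

# Hypothetical probabilities for K = 3 categories (indices 0, 1, 2).
pi = [0.2, 0.5, 0.3]

# Two exactly observed outcomes and two events containing two categories each.
events = [{0}, {1}, {1, 2}, {0, 2}]

value = log_likelihood(pi, events)

# The contributions are 0.2, 0.5, 0.5 + 0.3 and 0.2 + 0.3.
assert isclose(value, log(0.2) + log(0.5) + log(0.8) + log(0.5))
```

For singleton events the function reduces to the sum of $\log \pi_{k_i}$, i.e. the log of the joint likelihood for single outcomes.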

3.2 The Nonparametric Likelihood

TODO