Although the likelihood applies to all measurement scales, we need to discuss aspects of categorical likelihoods in a little more detail.
3.1 Likelihood for Categorical Responses
Let $Y$ be ordered categorical with support $S = \{y_1, \dots, y_K\}$. Then $Y$ follows the categorical distribution $Y \sim \mathrm{Categ}(\pi)$, where $\pi = (\pi_1, \dots, \pi_K)$ with $\sum_{k=1}^K \pi_k = 1$ are the probabilities of the $K$ categories.
Assuming that we have $n$ observations consisting of single outcomes, i.e. $A_i = \{y_{k_i}\}$, the likelihood contributions are
$$l_i(\pi) = P_\pi(Y_i = y_{k_i}) = \pi_{k_i}$$
and the joint likelihood function is
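$$L(\pi) = \prod_{i=1}^n l_i(\pi) = \prod_{i=1}^n \pi_{k_i} = \prod_{k=1}^K \pi_k^{n_k}$$
with $n_k = \sum_{i=1}^n \mathbb{1}\{y_{k_i} = y_k\}$ the number of observations in category $k$. Up to the multinomial coefficient, which does not depend on $\pi$, this is the likelihood of a single observation from the multinomial distribution $\mathrm{Mult}(\pi, n)$.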
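As a small numerical illustration of this identity, the following sketch evaluates the log-likelihood once observation by observation and once through the category counts $n_k$. The function names, the toy data, and the use of zero-based category indices are assumptions made for this example only.

```python
import math
from collections import Counter

def loglik_per_observation(pi, obs):
    """Sum of log pi_{k_i} over observations (obs holds zero-based category indices)."""
    return sum(math.log(pi[k]) for k in obs)

def loglik_from_counts(pi, obs):
    """Sum of n_k * log pi_k over categories, with n_k the count of category k."""
    counts = Counter(obs)
    return sum(n_k * math.log(pi[k]) for k, n_k in counts.items())

# Hypothetical toy data: K = 3 categories, n = 6 observations.
pi = [0.2, 0.5, 0.3]
obs = [0, 1, 1, 2, 1, 0]

print(loglik_per_observation(pi, obs))
print(loglik_from_counts(pi, obs))
```

Both calls should agree up to floating-point error, since grouping the factors $\pi_{k_i}$ by category only reorders the product.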
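Proposition 3.1 (Maximum likelihood for multinomial): The likelihood $l(\pi) = \prod_{k=1}^K \pi_k^{n_k}$ of a random vector distributed via $\mathrm{Mult}(\pi, n)$ is maximised by
$$\hat\pi = \left(\frac{n_1}{n}, \dots, \frac{n_K}{n}\right).$$

One may also observe events of the form $A_i = \{y_{p_i}, y_{q_i}\}$, for which the likelihood contribution becomes
$$l_i(\pi) = P_\pi(Y_i \in \{y_{p_i}, y_{q_i}\}) = \pi_{p_i} + \pi_{q_i}$$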
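A minimal sketch of how such set-valued observations could enter the log-likelihood, together with the closed-form estimate $n_k / n$ for exactly observed data. Representing each observation as a set of zero-based category indices is an assumption of this example, as are the function names and the toy data.

```python
import math

def mle_exact(obs, K):
    """Closed-form estimate n_k / n from exactly observed categories."""
    n = len(obs)
    return [sum(1 for k in obs if k == j) / n for j in range(K)]

def loglik_sets(pi, events):
    """Log-likelihood where each event A_i is a set of category indices.

    The contribution of A_i is the sum of pi_k over k in A_i, so a
    singleton set reduces to pi_{k_i}.
    """
    return sum(math.log(sum(pi[k] for k in A)) for A in events)

# Hypothetical toy data with K = 3 categories, all exactly observed.
exact = [0, 1, 1, 2, 1, 0]
pi_hat = mle_exact(exact, K=3)  # n_k / n for each category
print(pi_hat)

# Hypothetical data in which two observations are only known up to a pair of categories.
events = [{0}, {1}, {1, 2}, {0, 1}, {2}]
print(loglik_sets(pi_hat, events))
```

Once set-valued observations are present, the maximiser is in general no longer given by the closed form $n_k / n$ and would typically have to be found numerically.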
3.2 The Nonparametric Likelihood
TODO