In this chapter, unless otherwise noted, we will always assume the WCLM and will specify if additional assumptions are made.
3.1 Moments under WCLM
Proposition 3.1 (Expectations)
: The expectations are:
- $E[\hat{\beta}] = \beta$
- $E[\hat{Y}] = E[Y] = X\beta$
- $E[\hat{\varepsilon}] = 0$
Proposition 3.2 (Covariance matrices)
: The covariance matrices are:
- $\operatorname{Cov}[\hat{\beta}, \hat{\beta}] = \sigma^2 (X^\top X)^{-1}$
- $\operatorname{Cov}[\hat{Y}, \hat{Y}] = \sigma^2 P$
- $\operatorname{Cov}[\hat{\varepsilon}, \hat{\varepsilon}] = \sigma^2 Q$
- $\operatorname{Cov}[\hat{\beta}, \hat{Y}] = \sigma^2 (X^\top X)^{-1} X^\top$
- $\operatorname{Cov}[\hat{\beta}, \hat{\varepsilon}] = 0$
- $\operatorname{Cov}[\hat{Y}, \hat{\varepsilon}] = 0$
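These identities can be checked numerically. The sketch below is an illustration, not part of the derivation; it assumes NumPy and an arbitrary simulated design matrix. Writing $\hat{\beta} = AY$ with $A = (X^\top X)^{-1} X^\top$ and using $\operatorname{Cov}[Y] = \sigma^2 I$, every covariance above follows from $\operatorname{Cov}[MY, NY] = \sigma^2 M N^\top$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
# simulated design matrix with intercept column (arbitrary illustration choice)
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])

A = np.linalg.inv(X.T @ X) @ X.T   # beta_hat = A @ Y
P = X @ A                          # hat matrix: Y_hat = P @ Y
Q = np.eye(n) - P                  # eps_hat = Q @ Y

# Cov[beta_hat] = A (sigma^2 I) A^T = sigma^2 (X^T X)^{-1}
assert np.allclose(A @ A.T, np.linalg.inv(X.T @ X))
# Cov[Y_hat] = sigma^2 P P^T = sigma^2 P   (P is symmetric and idempotent)
assert np.allclose(P @ P.T, P)
# Cov[eps_hat] = sigma^2 Q Q^T = sigma^2 Q
assert np.allclose(Q @ Q.T, Q)
# Cov[beta_hat, Y_hat] = sigma^2 A P^T = sigma^2 (X^T X)^{-1} X^T
assert np.allclose(A @ P.T, A)
# Cov[Y_hat, eps_hat] = sigma^2 P Q^T = 0
assert np.allclose(P @ Q.T, np.zeros((n, n)))
```

The checks reduce to the symmetry and idempotence of $P$ and $Q$, which is exactly why the $\sigma^2$ factors survive unchanged.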
Note: Recall that in general $\check{y}$ and $\check{\varepsilon}$ are empirically uncorrelated only if the model includes an intercept. Treated as random vectors, however, $\hat{Y}$ and $\hat{\varepsilon}$ are uncorrelated without any restrictions.
3.2 Estimator for Deviation Variance
We can now define an unbiased estimator for the variance of the deviations.
Definition 3.3 (Unbiased estimator for deviation variance)
: The estimator
$$\hat{\sigma}^2 = \frac{\hat{\varepsilon}^\top \hat{\varepsilon}}{n-p} = \frac{1}{n-p} \sum_{i=1}^n \hat{\varepsilon}_i^2$$
is an unbiased estimator for the variance $\sigma^2$ of the deviations.
Proof:
$$E[\hat{\sigma}^2] = \frac{1}{n-p} \sum_{i=1}^n \operatorname{Var}[\hat{\varepsilon}_i] = \frac{1}{n-p} \operatorname{tr}(\sigma^2 Q) = \sigma^2,$$
using $E[\hat{\varepsilon}_i] = 0$ and $\operatorname{tr}(Q) = \operatorname{tr}(I) - \operatorname{tr}(P) = n - p$.

We note that we can write iid samples of any random variable $Y$ with finite first and second moments via the location model $Y_i = \mu + \varepsilon_i$, where $\mu = E[Y]$ and $\varepsilon_i$ is the zero-mean stochastic component. As such, the empirical variance emerges as the estimator $\hat{\sigma}^2$ for the variance of the deviations in this location model.
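The trace identity $\operatorname{tr}(Q) = n - p$ and the resulting unbiasedness can be illustrated numerically. This is a hedged sketch assuming NumPy; the design matrix, $\beta$, and $\sigma^2 = 2$ are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma2 = 40, 4, 2.0
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
P = X @ np.linalg.inv(X.T @ X) @ X.T
Q = np.eye(n) - P

# the key trace identity: tr(Q) = tr(I) - tr(P) = n - p
assert np.isclose(np.trace(Q), n - p)

# Monte Carlo check that sigma_hat^2 = eps_hat' eps_hat / (n - p) is unbiased
beta = rng.standard_normal(p)
reps = 5000
est = np.empty(reps)
for r in range(reps):
    y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
    eps_hat = Q @ y
    est[r] = eps_hat @ eps_hat / (n - p)
assert abs(est.mean() - sigma2) < 0.1   # average is close to the true sigma^2
```

Dividing by $n$ instead of $n - p$ would shrink the average below $\sigma^2$, which is precisely the bias the divisor corrects.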
Recap (Empirical variance)
: The empirical variance of an iid sample $y_1, \ldots, y_n$ of $Y$ is
$$\operatorname{var}(y) = \frac{1}{n-1} \sum_{i=1}^n (y_i - \bar{y})^2$$

Proposition 3.4 (Empirical variance unbiased)
: The empirical variance is unbiased, i.e. $E[\operatorname{var}(Y)] = \operatorname{Var}[Y]$.

For iid samples of two random variables $Y$ and $Z$ with finite first and second moments, we have a similar result for the covariance.
Recap (Empirical covariance)
: The empirical covariance of an iid sample $y_1, \ldots, y_n$ of $Y$ and an iid sample $z_1, \ldots, z_n$ of $Z$ is
$$\operatorname{cov}(y, z) = \frac{1}{n-1} \sum_{i=1}^n (y_i - \bar{y})(z_i - \bar{z})$$

Proposition 3.5 (Empirical covariance unbiased)
: The empirical covariance is unbiased, i.e. $E[\operatorname{cov}(Y, Z)] = \operatorname{Cov}[Y, Z]$.

Proof: Let $Y_i = \mu_Y + \varepsilon_{Y,i}$ and $Z_i = \mu_Z + \varepsilon_{Z,i}$ be the iid samples of $Y$ and $Z$ with $E[Y] = \mu_Y$ and $E[Z] = \mu_Z$. For both location models we have $P = \frac{1}{n} \mathbf{1}\mathbf{1}^\top$, thus
$$Y = PY + QY = \tfrac{1}{n} \mathbf{1}\mathbf{1}^\top Y + Q(\mu_Y \mathbf{1} + \varepsilon_Y) = \bar{Y} \mathbf{1} + Q \varepsilon_Y$$
and similarly $Z = \bar{Z} \mathbf{1} + Q \varepsilon_Z$. Hence
$$\begin{aligned}
E[\operatorname{cov}(Y, Z)] &= \tfrac{1}{n-1} E[(Y - \bar{Y}\mathbf{1})^\top (Z - \bar{Z}\mathbf{1})] = \tfrac{1}{n-1} E[(Q\varepsilon_Y)^\top Q\varepsilon_Z] \\
&= \tfrac{1}{n-1} E[\varepsilon_Y^\top Q^\top Q \varepsilon_Z] = \tfrac{1}{n-1} E[\operatorname{tr}(\varepsilon_Y^\top Q \varepsilon_Z)] \\
&= \tfrac{1}{n-1} \operatorname{tr}\big(Q\, E[\varepsilon_Z \varepsilon_Y^\top]\big) = \tfrac{1}{n-1} \operatorname{Cov}[Y, Z]\, \operatorname{tr}(Q) = \operatorname{Cov}[Y, Z],
\end{aligned}$$
where the last line uses $E[\varepsilon_Z \varepsilon_Y^\top] = \operatorname{Cov}[Y, Z]\, I$ and $\operatorname{tr}(Q) = n - 1$.

3.3 Distributions under SCLM
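In the location model the projections take a particularly simple form. The following sketch (assuming NumPy; the sample values are arbitrary) checks that $Qy$ centers the sample and that $\operatorname{tr}(Q) = n - 1$, exactly the divisor appearing in the empirical variance and covariance:

```python
import numpy as np

n = 10
one = np.ones(n)
P = np.outer(one, one) / n        # hat matrix of the location model
Q = np.eye(n) - P

rng = np.random.default_rng(2)
y = 3.0 + rng.standard_normal(n)  # arbitrary sample for illustration
# Q centers the sample: Q y = y - ybar * 1
assert np.allclose(Q @ y, y - y.mean())
# tr(Q) = n - 1, the divisor in the empirical (co)variance
assert np.isclose(np.trace(Q), n - 1)
```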
We now assume the SCLM, thus $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$.
Recap: A linear transformation of a Gaussian random vector, and hence a sum of jointly Gaussian random variables, is again Gaussian.
Proposition 3.6 (Distributions)
: The distributions are:
- $\hat{\beta} \sim \mathcal{N}(\beta, \sigma^2 (X^\top X)^{-1})$
- $\hat{Y} \sim \mathcal{N}(X\beta, \sigma^2 P)$
- $\hat{\varepsilon} \sim \mathcal{N}(0, \sigma^2 Q)$
- $\frac{(n-p)\hat{\sigma}^2}{\sigma^2} = \frac{1}{\sigma^2} \sum_{i=1}^n \hat{\varepsilon}_i^2 \sim \chi^2_{n-p}$
Note:
- $\hat{Y}$ and $\hat{\varepsilon}$ are independent, as they are uncorrelated and jointly Gaussian.
- $\hat{\beta}$ and $\hat{\varepsilon}$ are independent, as they are uncorrelated and jointly Gaussian.
- Thus $\hat{\sigma}^2$ is independent of $\hat{\beta}$, as $\hat{\sigma}^2$ is a function of $\hat{\varepsilon}$.
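The $\chi^2_{n-p}$ claim can be checked by simulation. This is a rough Monte Carlo sketch assuming NumPy; $n$, $p$, $\beta$, and $\sigma^2$ are arbitrary illustration choices, and the check compares the first two moments of the simulated statistic with those of $\chi^2_{n-p}$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, sigma2 = 20, 3, 1.5
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
Q = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T
beta = np.array([1.0, -2.0, 0.5])

reps = 4000
stat = np.empty(reps)
for r in range(reps):
    y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
    eps_hat = Q @ y
    stat[r] = eps_hat @ eps_hat / sigma2   # (n - p) sigma_hat^2 / sigma^2

# chi^2_{n-p} has mean n - p and variance 2(n - p)
assert abs(stat.mean() - (n - p)) < 0.5
assert abs(stat.var() - 2 * (n - p)) < 5.0
```

Note that the simulated statistic does not depend on $\beta$ at all, since $QX = 0$; only the noise enters, which is why $n - p$ degrees of freedom remain.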
3.4 Asymptotic Normality
The SCLM assumptions are strong. Asymptotically, normality of $\hat{\beta}$ holds under the following conditions:
- The smallest eigenvalue of $X^\top X$ satisfies $\lambda_{\min} \to \infty$ as $n \to \infty$.
- The largest diagonal element of the hat matrix satisfies $\max\{P_{11}, \ldots, P_{nn}\} \to 0$ as $n \to \infty$.

Then Lindeberg's CLT applies and $\hat{\beta} \overset{a}{\sim} \mathcal{N}(\beta, \sigma^2 (X^\top X)^{-1})$. However, OLS may not be efficient in the presence of non-Gaussian $\varepsilon$, and the power of tests and the length of confidence intervals can be very wrong.
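The practical content of asymptotic normality can be illustrated by a Monte Carlo sketch (assuming NumPy; the centered exponential deviations and the sample size are arbitrary choices): even with clearly skewed, non-Gaussian errors, the normal-approximation confidence interval for a slope attains roughly its nominal coverage at moderate $n$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 200, 3000
z975 = 1.959964          # 97.5% standard normal quantile
beta = np.array([1.0, 0.5])
covered = 0
for r in range(reps):
    X = np.column_stack([np.ones(n), rng.standard_normal(n)])
    # centered exponential deviations: mean 0, variance 1, clearly skewed
    eps = rng.exponential(1.0, n) - 1.0
    y = X @ beta + eps
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    s2 = resid @ resid / (n - 2)
    se = np.sqrt(s2 * XtX_inv[1, 1])
    covered += abs(b[1] - beta[1]) < z975 * se

# coverage of the normal-approximation CI should be close to the nominal 95%
assert abs(covered / reps - 0.95) < 0.03
```

This illustrates the first caveat but not the second: coverage of a single coefficient is forgiving, while power calculations and more delicate inferences can still be badly off under non-Gaussian errors.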