We assume for this section that X and Y are two random variables with finite first and second moments. We recap the notion of correlation.
Recap (Correlation): The correlation between X and Y is defined as
$$\operatorname{Cor}[X,Y]=\frac{\operatorname{Cov}[X,Y]}{\sqrt{\operatorname{Var}[X]\operatorname{Var}[Y]}}$$
Note: We use the notation $\rho_{X,Y}=\operatorname{Cor}[X,Y]$.
Proposition 5.1 (Properties of correlation): Some properties of the correlation are:
Cor[X,Y]∈[−1,1]
$|\operatorname{Cor}[X,Y]|=1$ if and only if $Y=a+bX$ for some $a,b\in\mathbb{R}$, $b\neq 0$
If X and Y are independent, then Cor[X,Y]=0
Note: If $\rho_{X,Y}=0$ we call X and Y uncorrelated.
We find the correlation of two random variables X and Y by fitting the SLMI model $Y=a_1+a_2X+\varepsilon$ via
$$a^\star=\operatorname*{arg\,min}_{a_1,a_2\in\mathbb{R}}\; E\big[(Y-(a_1+a_2X))^2\big]$$
Then the slope is $a_2^\star=\frac{\operatorname{Cov}[X,Y]}{\operatorname{Var}[X]}=\rho_{X,Y}\frac{\sigma_Y}{\sigma_X}$. Note that if we fit $X=b_1+b_2Y+\varepsilon$ in the same fashion, then $|\rho_{X,Y}|=\sqrt{a_2^\star b_2^\star}$.
Note: If we normalize $\tilde{Y}=\frac{Y-\mu_Y}{\sigma_Y}$ and $\tilde{X}=\frac{X-\mu_X}{\sigma_X}$ and fit the SLM $\tilde{Y}=a\tilde{X}$, then $a^\star=\rho_{X,Y}$.
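The note above can be verified numerically. A minimal sketch, using synthetic Gaussian data with an assumed population correlation of 0.6: standardizing both variables and fitting the no-intercept model recovers the empirical correlation as the slope.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
y = 0.6 * x + 0.8 * rng.normal(size=n)  # population correlation is 0.6

# Standardize both variables to mean 0 and standard deviation 1.
x_t = (x - x.mean()) / x.std()
y_t = (y - y.mean()) / y.std()

# Least-squares slope of the no-intercept fit y_t ~ a * x_t.
a_star = (x_t @ y_t) / (x_t @ x_t)
rho_hat = np.corrcoef(x, y)[0, 1]
print(a_star, rho_hat)  # the two values coincide
```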
5.2 Empirical Correlation
Definition 5.2 (Empirical correlation): The empirical correlation of an iid sample $x_1,\dots,x_n$ of X and $y_1,\dots,y_n$ of Y is
$$\operatorname{cor}(x,y)=\frac{\operatorname{cov}(x,y)}{\sqrt{\operatorname{var}(x)\operatorname{var}(y)}}$$
Note: We use the notation $\hat\rho_{X,Y}=\operatorname{cor}(x,y)$, which is called Pearson's correlation coefficient.
Proposition 5.3 (Empirical correlation biased): The empirical correlation is biased, with $E[\operatorname{cor}(X,Y)]\approx\rho_{X,Y}-\frac{\rho_{X,Y}(1-\rho_{X,Y}^2)}{2n}$.
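The bias approximation can be checked by simulation. The sketch below (the choices $\rho = 0.5$, $n = 10$, and the replicate count are ours) compares the Monte Carlo average of Pearson's coefficient with the first-order formula.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, n, reps = 0.5, 10, 200_000

# Draw bivariate Gaussian samples with correlation rho and compute
# Pearson's correlation coefficient for each replicate.
cov = np.array([[1.0, rho], [rho, 1.0]])
L = np.linalg.cholesky(cov)
z = rng.normal(size=(reps, n, 2)) @ L.T
x, y = z[..., 0], z[..., 1]
xc = x - x.mean(axis=1, keepdims=True)
yc = y - y.mean(axis=1, keepdims=True)
r = (xc * yc).sum(axis=1) / np.sqrt((xc**2).sum(axis=1) * (yc**2).sum(axis=1))

approx = rho - rho * (1 - rho**2) / (2 * n)  # first-order bias formula
print(r.mean(), approx)  # both noticeably below rho = 0.5
```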
Proposition 5.4 (Properties of the empirical correlation): Some properties of the empirical correlation are:
cor(x,y)∈[−1,1]
$|\operatorname{cor}(x,y)|=1$ if and only if $y_i=a+bx_i$ for all i, for some $a,b\in\mathbb{R}$, $b\neq 0$
If X and Y are uncorrelated, then E[cor(X,Y)]=0
We find the correlation of two random variables X and Y by computing the OLS estimator of the SLMI model $Y_i=\beta_1+\beta_2 x_i+\varepsilon_i$, where $Y_i\overset{iid}{\sim}Y$ and the $x_i$ are realizations of $X_i\overset{iid}{\sim}X$. Then the slope is $\check\beta_2=\frac{\operatorname{cov}(x,y)}{\operatorname{var}(x)}=\hat\rho_{X,Y}\frac{\hat\sigma_Y}{\hat\sigma_X}$.
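The identity between the OLS slope and the empirical moments can be illustrated as follows; the regression data and parameter values below are arbitrary synthetic choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(-1, 1, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

# OLS fit with intercept via least squares on the design matrix [1, x].
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# The same slope expressed through empirical moments.
cov_xy = np.cov(x, y)[0, 1]
slope_from_cov = cov_xy / np.var(x, ddof=1)
rho_hat = np.corrcoef(x, y)[0, 1]
slope_from_rho = rho_hat * np.std(y, ddof=1) / np.std(x, ddof=1)
print(beta[1], slope_from_cov, slope_from_rho)  # all three agree
```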
Proposition 5.5 (Fisher Z transformation): Let X and Y be jointly Gaussian and let $Z=\tanh^{-1}(\hat\rho_{X,Y})=\frac{1}{2}\log\frac{1+\hat\rho_{X,Y}}{1-\hat\rho_{X,Y}}$. Then we can approximate the distribution of Z via
$$Z \overset{a}{\sim} \mathcal{N}\left(\tanh^{-1}(\rho_{X,Y}),\, \frac{1}{n-3}\right)$$
Note: The distribution for Z holds approximately true for n≥10.
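A common use of the proposition is an approximate confidence interval for $\rho_{X,Y}$: transform to the Z scale, build a Gaussian interval there, and map back with tanh. A minimal sketch (the helper name fisher_ci and the inputs are ours):

```python
import numpy as np

def fisher_ci(rho_hat, n, z_crit=1.96):
    """Approximate 95% confidence interval for rho via Fisher's Z."""
    z = np.arctanh(rho_hat)        # tanh^{-1}
    se = 1.0 / np.sqrt(n - 3)      # standard deviation on the Z scale
    return np.tanh(z - z_crit * se), np.tanh(z + z_crit * se)

lo, hi = fisher_ci(0.6, n=50)
print(lo, hi)  # roughly (0.39, 0.75)
```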
If we want to test for uncorrelatedness, i.e. $H_0:\rho_{X,Y}=0$, we thus have three tests available:
Looking at a confidence limit diagram for $\hat\rho_{X,Y}$
The t-test or F-test of H0:β2=0
Using the Fisher Z transformation and testing H0:Z=0
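The last two tests can be contrasted on simulated data. The sketch below uses scipy's pearsonr, whose p-value is equivalent to the t-test of $H_0:\beta_2=0$, alongside a Fisher-Z p-value; the data-generating process and sample size are our own assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
y = 0.4 * x + rng.normal(size=n)  # true correlation is nonzero

# Test via Pearson's r: p-value equivalent to the t-test of beta_2 = 0.
r, p_t = stats.pearsonr(x, y)

# Test via Fisher Z: under H0, sqrt(n - 3) * arctanh(r) is approx N(0, 1).
z = np.arctanh(r) * np.sqrt(n - 3)
p_z = 2 * stats.norm.sf(abs(z))
print(p_t, p_z)  # both reject H0 at any common level
```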
Note: As correlation only measures linear dependence, very different patterns may lead to the same value.
5.3 Partial Correlation
Definition 5.6 (Partial correlation): Let $X_1,X_2,Y$ be random variables. Then the partial correlation between Y and $X_2$ given $X_1$ is
$$\rho_{Y,X_2|X_1}=\frac{\rho_{Y,X_2}-\rho_{Y,X_1}\rho_{X_2,X_1}}{\sqrt{(1-\rho_{Y,X_1}^2)(1-\rho_{X_2,X_1}^2)}}$$
Note: The estimated partial correlation is defined analogously.
The partial correlation measures the linear dependence between Y and $X_2$ after accounting for the linear dependence of Y and $X_2$ on $X_1$. The empirical partial correlation $\hat\rho_{Y,X_2|X_1}$ can be computed in a Frisch–Waugh–Lovell fashion:
Regress Y on $X_1$ with intercept to obtain the residuals $\varepsilon_{\neg 2}$.
Regress $X_2$ on $X_1$ with intercept to obtain the residuals $r_2$.
The empirical correlation between $\varepsilon_{\neg 2}$ and $r_2$ is exactly the empirical partial correlation $\hat\rho_{Y,X_2|X_1}$.
Thus, in OLS regression with an intercept, one can relate partial correlations to the estimated parameters via $\hat\rho_{Y,X_j|X_{\neg j}} \propto \check\beta_j$.
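The residual-based recipe and the formula of Definition 5.6 can be checked against each other numerically. A sketch under assumed synthetic data (the helper name residuals is ours):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)      # x2 depends on x1
y = x1 + 0.5 * x2 + rng.normal(size=n)  # y depends on both

def residuals(target, regressor):
    """Residuals of an OLS fit of target on [1, regressor]."""
    X = np.column_stack([np.ones(len(regressor)), regressor])
    beta = np.linalg.lstsq(X, target, rcond=None)[0]
    return target - X @ beta

# Frisch-Waugh-Lovell route: correlate the two residual series.
r_fwl = np.corrcoef(residuals(y, x1), residuals(x2, x1))[0, 1]

# Direct formula from the pairwise empirical correlations.
R = np.corrcoef(np.column_stack([y, x2, x1]), rowvar=False)
r_formula = (R[0, 1] - R[0, 2] * R[1, 2]) / np.sqrt(
    (1 - R[0, 2] ** 2) * (1 - R[1, 2] ** 2)
)
print(r_fwl, r_formula)  # identical up to rounding
```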
5.4 Rank Correlations
Since Pearson's correlation is not robust to outliers, a rank correlation is often used instead. There are two types.
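The two standard rank correlations are Spearman's ρ and Kendall's τ; we assume these are the two types meant here. The sketch below illustrates the robustness motivation: a single gross outlier ruins Pearson's coefficient on otherwise near-perfectly correlated data, while Spearman's rank-based coefficient barely moves.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 100
x = rng.normal(size=n)
y = x + 0.1 * rng.normal(size=n)  # near-perfect linear relationship

# Append a single gross outlier.
x_out = np.append(x, 100.0)
y_out = np.append(y, -100.0)

pearson = stats.pearsonr(x_out, y_out)[0]
spearman = stats.spearmanr(x_out, y_out)[0]
print(pearson, spearman)  # Pearson collapses, Spearman stays near 1
```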