Notes for Machine Learning and Quantitative Finance
Sunday, July 29, 2018
Covariance, Correlation Coefficient and R-squared
Covariance
Covariance measures the degree to which two random variables X and Y change together.
$$
\begin{aligned}
\operatorname{Cov}(X,Y) &= E\big[(X - E(X))(Y - E(Y))\big] \\
&= E\big[XY - X E(Y) - Y E(X) + E(X)E(Y)\big] \\
&= E(XY) - 2E(X)E(Y) + E(X)E(Y) \\
&= E(XY) - E(X)E(Y)
\end{aligned}
$$
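As a quick numerical sketch (the data values below are made up purely for illustration), the last identity can be checked in NumPy:

```python
import numpy as np

# Made-up example data, just for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Covariance from the definition: E(XY) - E(X)E(Y)
cov_manual = np.mean(x * y) - np.mean(x) * np.mean(y)

# NumPy's version; bias=True uses the population formula, matching the definition
cov_numpy = np.cov(x, y, bias=True)[0, 1]

print(cov_manual, cov_numpy)  # both print the same value
```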
Correlation Coefficient ( R )
The correlation coefficient, or Pearson's R, standardises the covariance and constrains its value to between -1 and +1. As a rule of thumb, the two random variables X and Y have a strong negative correlation if R < -0.5, while they have a strong positive correlation if R > +0.5.
$$R(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}}$$
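The same quantity in a short NumPy sketch (reusing the made-up x and y from the covariance example; np.var defaults to the population variance, which keeps the two computations consistent):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Pearson's R from the definition
r_manual = ((np.mean(x * y) - np.mean(x) * np.mean(y))
            / np.sqrt(np.var(x) * np.var(y)))

# NumPy's correlation matrix; the off-diagonal entry is R(x, y)
r_numpy = np.corrcoef(x, y)[0, 1]

print(r_manual, r_numpy)  # both close to +1 for this data
```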
Generating Correlated Random Variables
Suppose X is a standard normal random variable. We can generate another standard normal random variable Z that has correlation coefficient ρ with X as follows:
$$Z = \rho X + \sqrt{1 - \rho^2}\, Y$$
where Y is a standard normal random variable independent of X.
To prove this, note that
$$E(X) = E(Y) = 0, \qquad \operatorname{Var}(X) = \operatorname{Var}(Y) = 1,$$
so from $\operatorname{Var}(X) = E(X^2) - E(X)^2 = 1$ we get
$$E(X^2) = E(Y^2) = 1,$$
and by independence
$$E(XY) = E(X)E(Y) = 0.$$
Then
$$\operatorname{Var}(Z) = \operatorname{Var}\!\left(\rho X + \sqrt{1-\rho^2}\,Y\right) = \rho^2\operatorname{Var}(X) + (1-\rho^2)\operatorname{Var}(Y) = 1,$$
and since $\operatorname{Var}(Z) = \operatorname{Var}(X) = 1$, the correlation reduces to the covariance:
$$
\begin{aligned}
R(Z,X) &= \operatorname{Cov}\!\left(\rho X + \sqrt{1-\rho^2}\,Y,\; X\right) \\
&= E\!\left[X\left(\rho X + \sqrt{1-\rho^2}\,Y\right)\right] - E(X)\,E\!\left(\rho X + \sqrt{1-\rho^2}\,Y\right) \\
&= \rho E(X^2) + \sqrt{1-\rho^2}\,E(XY) = \rho.
\end{aligned}
$$
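A small simulation sketch of this construction (the value of ρ, the sample size and the seed are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed for reproducibility
rho = 0.7                        # target correlation (example value)
n = 100_000

# Two independent standard normal samples
x = rng.standard_normal(n)
y = rng.standard_normal(n)

# Z = rho * X + sqrt(1 - rho^2) * Y
z = rho * x + np.sqrt(1 - rho**2) * y

print(np.corrcoef(x, z)[0, 1])  # should be close to 0.7
```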
For generating more than two correlated random variables, refer to Cholesky Decomposition.
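For completeness, here is a minimal sketch of that approach, assuming a made-up 3×3 target correlation matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Example target correlation matrix (made-up, must be positive definite)
corr = np.array([[1.0, 0.6, 0.3],
                 [0.6, 1.0, 0.5],
                 [0.3, 0.5, 1.0]])

# Lower-triangular Cholesky factor L with L @ L.T == corr
L = np.linalg.cholesky(corr)

# Independent standard normals, one column per variable
uncorrelated = rng.standard_normal((100_000, 3))

# Multiplying by L.T imposes the target correlation structure
correlated = uncorrelated @ L.T

print(np.corrcoef(correlated, rowvar=False).round(2))
```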
R-squared
R-squared, as the name implies, can be calculated by squaring the correlation coefficient R (this holds exactly in simple linear regression), and the result ranges from 0 to 1.
In regression, R-squared can also be calculated by:
$$R^2 = \frac{SST - SSE}{SST}$$
where
$$SSE = \sum_i (y_i - \hat{y}_i)^2, \qquad SST = \sum_i (y_i - \bar{y})^2.$$
It measures the reduction in residual error from using the regression line instead of the mean line, or simply how well the data points fit the regression line or curve.
It can also be interpreted as the proportion of variation in the dependent variable that is explained by the independent variable.
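A minimal sketch of this calculation, fitting a least-squares line with NumPy to the made-up x and y from earlier and checking that R-squared agrees with the squared Pearson R in the simple linear case:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Fit a straight line y ≈ a*x + b by least squares
a, b = np.polyfit(x, y, deg=1)
y_hat = a * x + b

sse = np.sum((y - y_hat) ** 2)       # residual sum of squares
sst = np.sum((y - np.mean(y)) ** 2)  # total sum of squares
r_squared = (sst - sse) / sst

r = np.corrcoef(x, y)[0, 1]
print(r_squared, r ** 2)  # the two values agree for simple linear regression
```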