Multicollinearity

Summary

Setup

  • The LRM with random sampling

\[y_i=\beta_1+\beta_2 x_{i,2}+\beta_3 x_{i,3}+\dots+\beta_k x_{i,k}+\varepsilon_i,\quad i=1,\dots,n\]
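
A minimal simulation sketch (not from the notes) of this setup: it draws one random sample from the LRM with \(k=3\) and computes the OLS estimates with numpy. The parameter values \(\beta=(1,2,-1)\), \(n=200\) and all variable names are illustrative assumptions.

```python
# Illustrative sketch: draw a random sample from the LRM with k = 3 and
# compute the OLS estimates. All parameter values are chosen for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 200
beta = np.array([1.0, 2.0, -1.0])          # beta_1 (intercept), beta_2, beta_3

x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x2, x3])  # column of ones for the intercept
eps = rng.normal(size=n)                   # error term
y = X @ beta + eps

b = np.linalg.solve(X.T @ X, X.T @ y)      # OLS estimates b_1, b_2, b_3
print(b)                                   # close to (1, 2, -1) in large samples
```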

Results

  • If at least one explanatory variable is an exact linear combination of the rest of the explanatory variables, then we say that the model suffers from perfect multicollinearity.
  • In a model with perfect multicollinearity, none of the OLS estimates can be computed (they become “0/0”); see the sketch after this list.
  • If one explanatory variable is close to being a linear combination of the rest of the explanatory variables, then we say that the model suffers from multicollinearity. While a model either suffers from perfect multicollinearity or it does not, multicollinearity is a matter of degree, not kind.
  • A sample correlation between two explanatory variables that is close to 1 or -1 will cause multicollinearity in the LRM.
  • A model with multicollinearity may have high variances of \(b_1,\dots,b_k\) and therefore low \(t\)-values. A model with multicollinearity may therefore display a good overall fit with a high \(R^2\) but with individually insignificant explanatory variables, as the sketch after this list illustrates.
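
A minimal numerical sketch (not from the notes) of the two cases above: with an exact linear combination, \(X'X\) is singular and the OLS estimates cannot be computed, while a near-exact linear combination leaves them computable but with high variances, low \(t\)-values and a high \(R^2\). The helper function ols_summary and all parameter values are illustrative assumptions.

```python
# Illustrative sketch of perfect vs. near multicollinearity.
# All numbers and the helper ols_summary are illustrative, not from the notes.
import numpy as np

rng = np.random.default_rng(0)
n = 200

def ols_summary(X, y):
    """OLS estimates, t-values and R^2 from the standard formulas."""
    n, k = X.shape
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b                                     # residuals
    s2 = e @ e / (n - k)                              # estimated error variance
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    r2 = 1 - (e @ e) / np.sum((y - y.mean()) ** 2)
    return b, b / se, r2

x2 = rng.normal(size=n)
eps = rng.normal(size=n)

# 1) Perfect multicollinearity: x3 is an exact linear combination of x2 and
#    the constant, so X'X is singular, (X'X)^{-1} does not exist and the OLS
#    estimates cannot be computed.
x3 = 2 * x2 + 1
X = np.column_stack([np.ones(n), x2, x3])
print("rank of X'X:", np.linalg.matrix_rank(X.T @ X), "while k =", X.shape[1])

# 2) Multicollinearity: x3 is almost, but not exactly, a linear combination of x2.
x3 = 2 * x2 + 0.02 * rng.normal(size=n)
y = 1 + 2 * x2 + 1 * x3 + eps
X = np.column_stack([np.ones(n), x2, x3])
b, t, r2 = ols_summary(X, y)
print("corr(x2, x3):", np.corrcoef(x2, x3)[0, 1])     # very close to 1
print("R^2:", r2)                                     # overall fit is good
print("t-values:", t)                                 # b_2, b_3 typically insignificant
```

Shrinking the noise term 0.02 in the construction of x3 pushes the sample correlation even closer to 1 and the \(t\)-values even closer to zero, illustrating that multicollinearity is a matter of degree rather than kind.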