Multicollinearity
Summary
Setup:
- The LRM with random sampling
\[y_i=\beta_1+\beta_2 x_{i,2}+\beta_3 x_{i,3}+\dots+\beta_k x_{i,k}+\varepsilon_i,\quad i=1,\dots,n\]
Results
- If at least one explanatory variable is an exact linear combination of the rest of the explanatory variables, then we say that the model suffers from perfect multicollinearity.
- In a model with perfect multicollinearity, none of the OLS estimates can be computed: the formulas break down into an indeterminate “0/0” expression.
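A minimal numerical sketch of this point (using NumPy; the construction of \(x_3\) as an exact linear function of \(x_2\) is an illustrative assumption): the cross-product matrix \(X'X\) is rank deficient, so its inverse, and hence the matrix OLS formula \(b=(X'X)^{-1}X'y\), does not exist.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x2 = rng.normal(size=n)
x3 = 2.0 * x2 + 1.0          # exact linear combination of x2 and the constant
X = np.column_stack([np.ones(n), x2, x3])

# X'X has rank 2 rather than 3, so it is singular:
# (X'X)^{-1} does not exist and the normal equations
# X'X b = X'y have no unique solution.
rank = np.linalg.matrix_rank(X.T @ X)
print(rank)  # 2
```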
- If one explanatory variable is close to being equal to a linear combination of the rest of the explanatory variables, then we say that the model suffers from multicollinearity. While a model either suffers from perfect multicollinearity or not, multicollinearity is a matter of degree, not kind.
- A sample correlation between two explanatory variables close to 1 or -1 will cause multicollinearity in the LRM.
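One way to see why (a standard textbook variance formula, stated here under homoskedasticity; \(SST_j\) denotes the total sample variation in \(x_j\) and \(R_j^2\) the \(R^2\) from regressing \(x_j\) on the other explanatory variables):

\[\operatorname{Var}(b_j)=\frac{\sigma^2}{SST_j\,(1-R_j^2)}\]

As \(x_j\) becomes closer to a linear combination of the other regressors, \(R_j^2\to 1\) and the variance of \(b_j\) grows without bound.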
- A model with multicollinearity may have high variances for the estimators \(b_1,…,b_k\) and therefore low \(t\)-values. A model with multicollinearity may therefore display a good overall fit with a high \(R^2\) but with individually insignificant explanatory variables.
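A small simulation sketch of this last point (NumPy; the design with \(x_3\approx x_2\) and both true slopes equal to 1 is an illustrative assumption, not from the notes): the regression fits well overall, yet the standard errors of the two nearly collinear slopes are heavily inflated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x2 = rng.normal(size=n)
x3 = x2 + 0.01 * rng.normal(size=n)      # corr(x2, x3) is very close to 1
y = 1.0 + x2 + x3 + rng.normal(size=n)   # both slopes are truly nonzero

X = np.column_stack([np.ones(n), x2, x3])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b
s2 = resid @ resid / (n - X.shape[1])    # unbiased estimate of the error variance
se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
t = b / se
r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

print(f"R^2 = {r2:.2f}")                              # high overall fit
print(f"se(b2) = {se[1]:.1f}, se(b3) = {se[2]:.1f}")  # inflated standard errors
```

Dropping either \(x_2\) or \(x_3\) from the regression would shrink the remaining slope's standard error dramatically, which is exactly the "degree, not kind" nature of the problem.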