The variance of the OLS estimator

Problem

Setup: a linear regression model that satisfies the Gauss-Markov assumptions, estimated on a random sample, with \(k=3\) parameters:

\[y_i=β_1+β_2x_{i,2}+β_3x_{i,3}+ε_i\]

One can show (you are not asked to do so) that

\[Var\left( b_2|x \right)= \frac{σ^2}{\sum_{i=1}^{n}{ {\left( x_{i,2}-{\bar{x}}_2 \right)}^2 }}\, \frac{1}{1-ρ_{2,3}^2}\]

where \(ρ_{2,3}\) is the correlation between \(x_{i,2}\) and \(x_{i,3}\) (the same for all \(i\) due to random sampling).
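For readers who have seen the general \(k\)-regressor variance formula: with a single additional regressor, the \(R^2\) from regressing \(x_{i,2}\) on \(x_{i,3}\) equals the squared correlation, so the expression above is the familiar form

\[Var\left( b_2|x \right)= \frac{σ^2}{\left( 1-R_2^2 \right)\sum_{i=1}^{n}{ {\left( x_{i,2}-{\bar{x}}_2 \right)}^2 }},\qquad R_2^2=ρ_{2,3}^2.\]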

  1. Find \(Var\left( b_2|x \right)\) when \(x_{i,2}\) and \(x_{i,3}\) are independent.
  2. Show that \(Var\left( b_2|x \right)\) increases as \(ρ_{2,3}^2\) increases.
  3. What happens to \(Var\left( b_2|x \right)\) when \(ρ_{2,3}^2=1\)?

Solution

  1. If \(x_{i,2}\) and \(x_{i,3}\) are independent, then they are uncorrelated and \(ρ_{2,3}^2=0\). We then have

\[Var\left( b_2|x \right)= \frac{σ^2}{\sum_{i=1}^{n}{ {\left( x_{i,2}-{\bar{x}}_2 \right)}^2 }}\]

  2. As \(ρ_{2,3}^2\) increases, \(1-ρ_{2,3}^2\) decreases and \(1/(1-ρ_{2,3}^2)\) increases; formally, writing \(t=ρ_{2,3}^2\), we have \(\frac{d}{dt}\frac{1}{1-t}=\frac{1}{{(1-t)}^2}>0\) for \(t<1\). The higher the correlation between the x-variables, the higher the variance of \(b_2\) (which is bad). The intuitive reason is that as the correlation increases, it becomes harder to distinguish the effect of one variable on \(y\) while holding the other constant.
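A quick way to see parts 1 and 2 numerically is a small simulation. The sketch below is a minimal illustration, not part of the exercise; it assumes numpy is available, and all names and parameter values (n, reps, the betas) are arbitrary choices. It draws the regressors once at several correlations, re-draws the errors many times, and compares the simulated variance of \(b_2\) with the formula:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, sigma = 200, 5000, 1.0
beta = np.array([1.0, 2.0, 3.0])  # β1 (intercept), β2, β3

for rho in (0.0, 0.5, 0.9):
    # Draw x2 and x3 once with correlation rho; the variance is conditional on x
    cov = np.array([[1.0, rho], [rho, 1.0]])
    x23 = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    X = np.column_stack([np.ones(n), x23])

    b2 = np.empty(reps)
    for r in range(reps):
        y = X @ beta + rng.normal(0.0, sigma, size=n)
        b2[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]  # OLS estimate of β2

    # The exact conditional variance uses the *sample* correlation of the drawn x's
    rho_hat = np.corrcoef(X[:, 1], X[:, 2])[0, 1]
    ssd = np.sum((X[:, 1] - X[:, 1].mean()) ** 2)
    theory = sigma**2 / ssd / (1.0 - rho_hat**2)
    print(f"rho={rho:.1f}  simulated Var(b2)={b2.var():.6f}  formula={theory:.6f}")
```

At \(ρ_{2,3}=0\) the simulated variance should match the reduced formula from part 1; as the correlation grows, both columns rise together by the factor \(1/(1-ρ_{2,3}^2)\).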

  3. As \(ρ_{2,3}^2\) gets close to 1, \(1-ρ_{2,3}^2\) gets close to 0 and \(1/(1-ρ_{2,3}^2)\) goes to infinity. If the x-variables are perfectly correlated, we have perfect multicollinearity: the normal equations no longer have a unique solution, and the OLS estimator cannot be calculated.
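To make part 3 concrete, here is a minimal sketch (again assuming numpy; the relation \(x_{i,3}=2x_{i,2}+1\) is just one arbitrary exact linear dependence) showing that under perfect correlation the matrix \(X'X\) loses rank, which is why no unique OLS solution exists:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x2 = rng.normal(size=n)
x3 = 2.0 * x2 + 1.0  # x3 is an exact linear function of x2, so rho = 1
X = np.column_stack([np.ones(n), x2, x3])

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))  # 2, not 3: X'X is rank-deficient
print(np.linalg.cond(XtX))         # enormous condition number (numerically singular)
```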