Testing for endogeneity

Summary

  • Setup: at least as many instruments as explanatory variables
    • Random sample \(\left( y_i,x_i,z_i \right)\) for \(i=1, \ldots ,n\) where \(x_i\) is \(k×1\) and \(z_i\) is \(r×1\) with \(r≥k\)
    • A linear regression model, \(y_i=x'_iβ+ε_i\)
    • Exogeneity of \(x_i\) may or may not hold
    • The \(z\)-variables are valid instruments: \(E\left( ε_i|z_i \right)=0\) and \(E\left( z_ix'_i \right)=Σ_{zx'}\) is \(r×k\) with rank \(k\).
    • Homoskedasticity: \(Var\left( ε_i|z_i \right)=σ^2\)
  • If exogeneity fails, then the (generalized) IV estimator is consistent while the OLS estimator is not.
  • If exogeneity holds, then both the (generalized) IV estimator and the OLS estimator are consistent, but the OLS estimator is more efficient.
  • This contrast gives a test for endogeneity, the Durbin-Wu-Hausman test: it checks whether the OLS and IV estimates differ by more than sampling error can explain.
  • The null hypothesis of this test is that exogeneity holds.
  • To perform the test in Stata, run “estat endogenous” after estimating the model with ivregress.
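
The logic of the test can be illustrated with a small simulation. The sketch below is not what Stata computes internally. It assumes the simplest possible case: one regressor, one instrument (\(r=k=1\)), no intercept, and homoskedastic errors. The names simulate and dwh_test and the data-generating process are hypothetical, chosen only for illustration. The statistic is the squared difference between the IV and OLS estimates divided by the difference of their estimated variances, using the OLS residual variance (one common choice, consistent under the null) and compared with a chi-squared distribution with one degree of freedom.

```python
import math
import random


def simulate(n, beta, endogenous, seed):
    """Draw (y, x, z) from a hypothetical design with one regressor and one instrument.

    z is independent of the error in both cases, so it is a valid instrument.
    When endogenous=True, x and the error share the component u.
    """
    rng = random.Random(seed)
    y, x, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ui = rng.gauss(0, 1)
        vi = rng.gauss(0, 1)
        ei = rng.gauss(0, 1)
        xi = zi + ui + vi                        # z is relevant: Cov(z, x) = 1
        eps = ei + (ui if endogenous else 0.0)   # Cov(x, eps) = 1 only if endogenous
        y.append(beta * xi + eps)
        x.append(xi)
        z.append(zi)
    return y, x, z


def dwh_test(y, x, z):
    """Hausman-type contrast of IV and OLS for scalar x and z (assumed setup)."""
    n = len(y)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    szz = sum(a * a for a in z)
    szx = sum(a * b for a, b in zip(z, x))
    szy = sum(a * b for a, b in zip(z, y))

    b_ols = sxy / sxx
    b_iv = szy / szx

    # Error variance estimated from OLS residuals (valid under the null).
    s2 = sum((yi - b_ols * xi) ** 2 for yi, xi in zip(y, x)) / n

    var_ols = s2 / sxx
    var_iv = s2 * szz / szx ** 2   # never below var_ols, by Cauchy-Schwarz

    h = (b_iv - b_ols) ** 2 / (var_iv - var_ols)
    p_value = math.erfc(math.sqrt(h / 2))   # upper tail of chi-squared(1)
    return b_ols, b_iv, h, p_value


if __name__ == "__main__":
    for endogenous in (True, False):
        y, x, z = simulate(n=5000, beta=2.0, endogenous=endogenous, seed=0)
        b_ols, b_iv, h, p = dwh_test(y, x, z)
        print(f"endogenous={endogenous}: OLS={b_ols:.3f} IV={b_iv:.3f} "
              f"H={h:.2f} p={p:.4f}")
```

In the endogenous design the OLS estimate drifts away from the true coefficient while the IV estimate does not, so the statistic is large and the null of exogeneity is rejected. In the exogenous design both estimates are close to the true value and the statistic behaves like a chi-squared draw, so it rejects only at roughly the nominal rate.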