Testing for endogeneity

Summary

Random sample \(\left( y_i,x_i,z_i \right)\) for \(i=1, \ldots ,n\) where \(x_i\) is \(k×1\) and \(z_i\) is \(r×1\) with \(r≥k\)
A linear regression model, \(y_i=x'_iβ+ε_i\)
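Exogeneity of the regressors may or may not hold, that is, \(x_i\) may be correlated with \(ε_i\)
The z-variables are valid instruments: \(E\left( ε_i|z_i \right)=0\) and \(E\left( z_ix'_i \right)=Σ_{zx'}\) is \(r×k\) with rank \(k\).
The errors are homoskedastic: \(Var\left( ε_i|z_i \right)=σ^2\)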
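
For reference, the two estimators that this setup leads to can be written in matrix form. The notation here is an assumption, not taken from the summary above: \(X\) is the \(n \times k\) matrix with rows \(x'_i\), \(Z\) is the \(n \times r\) matrix with rows \(z'_i\), \(y\) is the \(n \times 1\) vector of \(y_i\), and \(P_Z\) is the projection onto the columns of \(Z\).

\[
b_{OLS} = \left( X'X \right)^{-1} X'y, \qquad
b_{GIV} = \left( X'P_Z X \right)^{-1} X'P_Z y, \qquad
P_Z = Z\left( Z'Z \right)^{-1} Z'
\]

When \(r=k\), the generalized IV estimator reduces to the simple IV estimator \(\left( Z'X \right)^{-1} Z'y\).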

If exogeneity fails, then the (generalized) IV estimator is consistent while the OLS estimator is not
If exogeneity holds, then both the (generalized) IV estimator and the OLS estimator are consistent, but the OLS estimator is more efficient.
Comparing the two estimators therefore gives a test for endogeneity: a large difference between them is evidence against exogeneity. The test is called the Durbin-Wu-Hausman test.
The null hypothesis of this test is that exogeneity holds.
To perform the test in Stata, run “estat endogenous” after estimating the model with ivregress.
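
Outside Stata, the idea behind the test can be sketched in a few lines of NumPy. The following is a minimal illustration under stated assumptions, not a reproduction of what “estat endogenous” computes. It uses the classical Hausman contrast \(H = d'\left[ \widehat{Var}(b_{GIV}) - \widehat{Var}(b_{OLS}) \right]^{-1} d\) with \(d = b_{GIV} - b_{OLS}\), estimates \(σ^2\) from the OLS residuals (one common choice, valid under the null), and assumes that all \(k\) regressors are potentially endogenous, so that \(H\) is compared with a \(χ^2(k)\) distribution. The function names, the simulated data and the parameter values are all hypothetical.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def giv(X, Z, y):
    """Generalized IV (2SLS) coefficients (X'P_Z X)^{-1} X'P_Z y."""
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)  # P_Z X, first-stage fitted values
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)


def hausman(X, Z, y):
    """Hausman contrast between GIV and OLS under homoskedasticity.

    Assumes every column of X is potentially endogenous, so the
    variance difference is invertible and H has k degrees of freedom.
    """
    n = X.shape[0]
    b_ols = ols(X, y)
    b_iv = giv(X, Z, y)
    resid = y - X @ b_ols
    s2 = resid @ resid / n  # sigma^2 estimate, consistent under the null
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    V_diff = s2 * (np.linalg.inv(X_hat.T @ X_hat) - np.linalg.inv(X.T @ X))
    d = b_iv - b_ols
    H = float(d @ np.linalg.solve(V_diff, d))
    return H, b_ols, b_iv


# Hypothetical simulation: k = 1 regressor, r = 2 instruments, true beta = 2.
rng = np.random.default_rng(0)
n = 5000
beta = 2.0
Z = rng.normal(size=(n, 2))
v = rng.normal(size=n)
x = Z @ np.array([1.0, 1.0]) + v
X = x[:, None]

# Case 1: exogeneity fails because the error shares the component v with x.
y_endog = beta * x + 0.8 * v + rng.normal(size=n)
H_endog, b_ols_endog, b_iv_endog = hausman(X, Z, y_endog)

# Case 2: exogeneity holds because the error is independent of x.
y_exog = beta * x + rng.normal(size=n)
H_exog, b_ols_exog, b_iv_exog = hausman(X, Z, y_exog)

CRIT_5PCT = 3.841  # 5% critical value of the chi-squared(1) distribution
print("endogenous case:", H_endog, b_ols_endog, b_iv_endog)
print("exogenous case: ", H_exog, b_ols_exog, b_iv_exog)
print("reject exogeneity in case 1:", H_endog > CRIT_5PCT)
```

In the first case the OLS estimate drifts away from the true coefficient while the IV estimate does not, which is what drives the statistic far above the critical value. In the second case the two estimates are close and the statistic behaves like a draw from \(χ^2(1)\), so it exceeds the critical value only about 5% of the time.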