Measurement errors

Summary

  • True relation:

\[E\left( y_i \mid w_i \right)=β_1+β_2w_i\]

  • such that exogeneity would be satisfied in the model

\[y_i=β_1+β_2w_i+ν_i\]

  • with error term \(ν_i\).
  • However, \(w_i\) is unobserved. Instead, we observe an inaccurate measurement of \(w_i\), denoted by \(x_i\), and \(\left( y_i,x_i \right)\) for \(i=1, \ldots ,n\) is a random sample.
  • The difference \(x_i-w_i\) is called the measurement error, denoted by \(u_i\):

\[x_i=w_i+u_i\]

  • In the standard errors-in-variables model it is assumed that, for all \(i\):
    • \(E\left( u_i \right)=0\)
    • \(Var\left( u_i \right)=σ_u^2\)
    • \(u_i\) is independent of \(w_i\)
    • \(u_i\) is independent of \(ν_i\)
  • Suppose we do not realize that \(x_i\) is measured with error, so we believe that

\[E\left( y_i \mid x_i \right)=β_1+β_2x_i\]

  • and that \(x\) is exogenous in the model

\[y_i=β_1+β_2x_i+ε_i\]

  • But this is wrong. In fact, \(x_i\) is correlated with \(ε_i\) whenever \(β_2 \neq 0\):

\[Cov\left( x_i,ε_i \right)=Cov\left( w_i+u_i,ν_i-β_2u_i \right)=-β_2σ_u^2\]
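
  • The second argument of this covariance comes from substituting \(w_i=x_i-u_i\) into the true model, which identifies the error term as \(ε_i=ν_i-β_2u_i\):

\[y_i=β_1+β_2\left( x_i-u_i \right)+ν_i=β_1+β_2x_i+\left( ν_i-β_2u_i \right)\]

  • Expanding term by term, \(Cov\left( w_i,ν_i \right)=0\) by exogeneity in the true model and the other cross terms vanish by the independence assumptions on \(u_i\):

\[Cov\left( w_i+u_i,ν_i-β_2u_i \right)=Cov\left( w_i,ν_i \right)-β_2Cov\left( w_i,u_i \right)+Cov\left( u_i,ν_i \right)-β_2Var\left( u_i \right)=0-0+0-β_2σ_u^2\]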

  • Hence \(x\) is endogenous and the OLS estimator is biased and inconsistent.
  • Result:

\[\operatorname{plim} b_2=β_2\left( 1- \frac{σ_u^2}{σ_w^2+σ_u^2} \right)\]

  • where \(σ_w^2=Var\left( w_i \right)\)
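  • A sketch of why, using only the assumptions listed above: the OLS slope converges to \(Cov\left( x_i,y_i \right)/Var\left( x_i \right)\), and independence of \(u_i\) and \(w_i\) gives \(Var\left( x_i \right)=σ_w^2+σ_u^2\), so

\[\operatorname{plim} b_2= \frac{Cov\left( x_i,y_i \right)}{Var\left( x_i \right)}=β_2+ \frac{Cov\left( x_i,ε_i \right)}{Var\left( x_i \right)}=β_2- \frac{β_2σ_u^2}{σ_w^2+σ_u^2}\]

  • Equivalently, \(\operatorname{plim} b_2=β_2σ_w^2/\left( σ_w^2+σ_u^2 \right)\): the true slope scaled by the share of \(Var\left( x_i \right)\) that is due to \(w_i\).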
  • \(b_2\) is asymptotically biased towards zero (attenuation towards zero). This result does not hold in general when the model contains more than one explanatory variable.
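
The attenuation result can be checked with a small simulation. The sketch below is illustrative only and is not part of the original notes: the function name simulate_attenuation, the parameter values, the sample size and the normal distributions are all assumptions chosen for the example. It draws \(w_i\), \(u_i\) and \(ν_i\) independently, builds \(y_i\) from the true model and \(x_i=w_i+u_i\), and compares the OLS slope from regressing \(y\) on \(x\) with the theoretical probability limit.

```python
import random


def simulate_attenuation(n=100_000, beta1=1.0, beta2=2.0,
                         sigma_w=1.0, sigma_u=0.5, seed=0):
    """Return (OLS slope of y on x, theoretical plim of that slope).

    All parameter values are hypothetical; normality is assumed only
    for convenience and is not required by the result.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        w = rng.gauss(0.0, sigma_w)   # true, unobserved regressor
        u = rng.gauss(0.0, sigma_u)   # measurement error, independent of w
        nu = rng.gauss(0.0, 1.0)      # error term of the true model
        xs.append(w + u)              # observed, mismeasured regressor
        ys.append(beta1 + beta2 * w + nu)

    # OLS slope in a simple regression: sample Cov(x, y) / sample Var(x)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b2 = s_xy / s_xx

    # Probability limit from the errors-in-variables result
    plim_b2 = beta2 * (1.0 - sigma_u ** 2 / (sigma_w ** 2 + sigma_u ** 2))
    return b2, plim_b2


b2, plim_b2 = simulate_attenuation()
print(f"OLS slope: {b2:.3f}  theoretical plim: {plim_b2:.3f}  true beta2: 2.000")
```

With these assumed values the theoretical limit is \(2\left( 1-0.25/1.25 \right)=1.6\), so the estimated slope should settle near 1.6 rather than near the true value 2. Increasing sigma_u pushes the slope further towards zero, while setting sigma_u to 0 removes the attenuation.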