Error terms and the regression model

Summary

  • Definition of (additive) error terms for \(i=1,\ldots,n\):

\[ε_i = y_i - E(y_i \mid x_i)\]

  • Under the statistical model, \(E(y_i \mid x_i) = g(x_i,β)\) and

\[ε_i=y_i-g(x_i,β)\]

  • or

\[y_i=g(x_i,β)+ε_i\]

  • This is called a regression model (RM); a short simulation sketch is given below.
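
A minimal simulation sketch of a regression model (illustration only): the regression function \(g(x_i,β) = β_1 + β_2 x_i^2\), the parameter values, and the standard normal errors are all assumptions chosen here, not part of the notes. The sketch generates data from \(y_i = g(x_i,β) + ε_i\) and checks numerically that \(ε_i = y_i - g(x_i,β)\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed regression function, chosen only for illustration:
# g(x, beta) = beta_1 + beta_2 * x**2
def g(x, beta):
    return beta[0] + beta[1] * x**2

n = 1_000
beta = np.array([1.0, 0.5])            # assumed parameter values
x = rng.uniform(0.0, 2.0, size=n)      # explanatory variable
eps = rng.normal(0.0, 1.0, size=n)     # error terms with E(eps_i | x_i) = 0

# Regression model: y_i = g(x_i, beta) + eps_i
y = g(x, beta) + eps

# By definition, eps_i = y_i - E(y_i | x_i) = y_i - g(x_i, beta)
print(np.allclose(y - g(x, beta), eps))  # True
```
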
  • Under the linear statistical model, \(E(y_i \mid x_i) = x'_iβ\), for \(i=1,\ldots,n\):

\[ε_i=y_i-x'_iβ\]

  • or

\[y_i=x'_iβ+ε_i\]

  • This is called a linear regression model (LRM); see the sketch below.
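
A corresponding sketch for the LRM, observation by observation, with assumed values of \(β\) and the \(x_i\) (again not from the notes): each \(y_i\) is generated as \(y_i = x'_iβ + ε_i\), and the definition \(ε_i = y_i - x'_iβ\) is checked for every \(i\).

```python
import numpy as np

rng = np.random.default_rng(1)

n = 5
beta = np.array([2.0, -1.0])              # assumed coefficient vector (k = 2)
eps = rng.normal(0.0, 1.0, size=n)        # error terms

y = np.empty(n)
x_rows = []
for i in range(n):
    x_i = np.array([1.0, rng.normal()])   # x_i = (1, x_i2)', includes an intercept entry
    x_rows.append(x_i)
    # Linear regression model, observation by observation: y_i = x_i' beta + eps_i
    y[i] = x_i @ beta + eps[i]

# Definition of the error terms: eps_i = y_i - x_i' beta
for i in range(n):
    assert np.isclose(y[i] - x_rows[i] @ beta, eps[i])
print("eps_i = y_i - x_i' beta holds for every i")
```
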
  • Result: under the linear statistical model, for \(i=1,\ldots,n\),

\[E(ε_i \mid x_i) = E(y_i - x'_iβ \mid x_i) = E(y_i \mid x_i) - x'_iβ = 0\]

  • We say that the explanatory variables are exogenous with respect to the error terms if this holds.
  • Definition of the vector of error terms:

\[ε = \begin{bmatrix} ε_1 \\ \vdots \\ ε_n \end{bmatrix}\]

  • \(ε\) is \(n×1\) and we have

\[ε = y - E(y \mid X)\]

  • Result: under the linear statistical model, where \(E(y \mid X) = Xβ\), we have

\[y=Xβ+ε\]

  • This is called the linear regression model in vector form; a matrix sketch is given below.
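
The same sketch in vector form, with an assumed \(n×k\) design matrix \(X\) and assumed \(β\): the \(n\) observations are stacked so that \(y = Xβ + ε\), and the vector of error terms is recovered as \(ε = y - Xβ\).

```python
import numpy as np

rng = np.random.default_rng(2)

n, k = 500, 3
beta = np.array([2.0, -1.0, 0.5])         # assumed k x 1 coefficient vector
X = np.column_stack([np.ones(n),          # n x k design matrix (first column: intercept)
                     rng.normal(size=n),
                     rng.normal(size=n)])
eps = rng.normal(0.0, 1.0, size=n)        # n x 1 vector of error terms

# Linear regression model in vector form: y = X beta + eps
y = X @ beta + eps

# eps = y - E(y | X) = y - X beta
print(X.shape, eps.shape)                 # (500, 3) (500,)
print(np.allclose(y - X @ beta, eps))     # True
```
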
  • Under the linear statistical model,

\[E(ε \mid X) = 0\]

  • This is the exogeneity condition in matrix form.
  • Some notes:
    • If \(x_i\) is non-stochastic, then \(E(ε_i \mid x_i) = E(ε_i)\) for \(i=1,\ldots,n\).
    • If \(x_i\) is independent of \(ε_i\), then \(E(ε_i \mid x_i) = E(ε_i)\) for \(i=1,\ldots,n\).
    • In these cases, the explanatory variables are exogenous if \(E(ε_i) = 0\) for \(i=1,\ldots,n\), or, equivalently, \(E(ε) = 0\); a small Monte Carlo check is sketched below.
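
A small Monte Carlo check of these notes (the sample size, distributions, and bin grid are illustrative assumptions): when \(x_i\) is drawn independently of \(ε_i\) and \(E(ε_i) = 0\), averaging \(ε_i\) within bins of \(x_i\) approximates \(E(ε_i \mid x_i)\), and every bin average should be close to zero, consistent with exogeneity.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 200_000
x = rng.uniform(-1.0, 1.0, size=n)        # stochastic x_i, drawn independently of eps_i
eps = rng.normal(0.0, 1.0, size=n)        # error terms with E(eps_i) = 0

# Approximate E(eps_i | x_i) by averaging eps within bins of x;
# under independence and E(eps) = 0, each bin average should be near zero.
edges = np.linspace(-1.0, 1.0, 11)        # 10 bins over the support of x
bin_index = np.digitize(x, edges)
cond_means = [eps[bin_index == b].mean() for b in range(1, len(edges))]

print(np.round(cond_means, 3))            # all entries close to 0
print("unconditional mean:", round(eps.mean(), 4))
```
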