The conditional expectation of squared residuals

Problem

If \(X,Y\) are two random variables, then the following result always holds:

\[Var\left( Y \mid X \right)=E\left( Y^2 \mid X \right)-E{\left( Y \mid X \right)}^2\]

Consider \(y_i=β_1+β_2x_i+ε_i\), where \(x_i\) is exogenous but the errors are heteroscedastic: \(Var\left( ε_i \mid x_i \right)=σ^2x_i^2\). Show that

\[E\left( ε_i^2|x_i \right)=σ^2x_i^2\]

That is, the conditional expectation of squared error terms depends on \(x_i\) .

Solution

In the general formula, use \(Y=ε_i\) and \(X=x_i\) :

\[Var\left( ε_i \mid x_i \right)=E\left( ε_i^2 \mid x_i \right)-E{\left( ε_i \mid x_i \right)}^2\]

Because of exogeneity, \(E\left( ε_i \mid x_i \right)=0\), so \(E{\left( ε_i \mid x_i \right)}^2=0\) and

\[Var\left( ε_i \mid x_i \right)=E\left( ε_i^2 \mid x_i \right)\]

and the result follows since \(Var\left( ε_i|x_i \right)=σ^2x_i^2\) by assumption.
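The result can be checked numerically. The sketch below is only an illustration under assumptions that are not part of the problem: the error term is drawn from a normal distribution with mean zero and variance \(σ^2x_i^2\), the value \(σ^2=2\) is arbitrary, and the function name `mean_squared_error_term` is hypothetical. For a fixed \(x\), the sample mean of \(ε^2\) should be close to \(σ^2x^2\).

```python
import numpy as np


def mean_squared_error_term(x, sigma2, n_draws=200_000, seed=0):
    """Approximate E(eps^2 | x) by simulation for a fixed value of x.

    Assumption (not from the problem): eps | x is normal with mean 0
    and variance sigma2 * x**2.
    """
    rng = np.random.default_rng(seed)
    # Standard deviation of eps given x is sqrt(sigma2) * |x|.
    eps = rng.normal(loc=0.0, scale=np.sqrt(sigma2) * abs(x), size=n_draws)
    return np.mean(eps**2)


sigma2 = 2.0  # arbitrary illustrative value
for x in (1.0, 2.0, 3.0):
    simulated = mean_squared_error_term(x, sigma2)
    theoretical = sigma2 * x**2
    print(f"x = {x}: simulated {simulated:.3f}, sigma^2 x^2 = {theoretical:.3f}")
```

The simulated and theoretical values should agree up to sampling noise, and both grow with \(x^2\).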

Now, since \(E\left( ε_i^2 \mid x_i \right)=σ^2x_i^2\), if \(ε_i\) were known, we would regress \(ε_i^2\) on \(x_i^2\) to estimate \(σ^2\) and to test for heteroscedasticity (just as we regress \(y_i\) on \(x_i\) when \(E\left( y_i \mid x_i \right)=β_1+β_2x_i\)). Since \(ε_i\) is not known, we use the residuals \(e_i\) instead. This is the foundation for the auxiliary regression, in which we regress \(e_i^2\) on the variables we believe are responsible for the heteroscedasticity.
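The auxiliary regression can be sketched on simulated data. Everything specific here is an assumption made for illustration: the parameter values, the uniform distribution of \(x_i\), the normal errors, the inclusion of an intercept in the auxiliary regression, and the hypothetical function name `auxiliary_regression`. The sketch fits the main regression by OLS, forms the residuals \(e_i\), and regresses \(e_i^2\) on \(x_i^2\). The slope should then be roughly \(σ^2\).

```python
import numpy as np


def auxiliary_regression(n=20_000, beta1=1.0, beta2=0.5, sigma2=2.0, seed=0):
    """Simulate y = beta1 + beta2*x + eps with Var(eps | x) = sigma2 * x**2,
    then regress squared OLS residuals on x**2.

    Returns (intercept, slope) of the auxiliary regression.
    All parameter values and distributions are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(1.0, 5.0, size=n)
    # Heteroscedastic errors: standard deviation proportional to x.
    eps = rng.normal(0.0, np.sqrt(sigma2) * x)
    y = beta1 + beta2 * x + eps

    # Main regression of y on a constant and x (OLS via least squares).
    X = np.column_stack([np.ones(n), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b  # residuals stand in for the unknown eps

    # Auxiliary regression of e^2 on a constant and x^2.
    Z = np.column_stack([np.ones(n), x**2])
    g, *_ = np.linalg.lstsq(Z, e**2, rcond=None)
    return g[0], g[1]


intercept, slope = auxiliary_regression()
print(f"auxiliary intercept: {intercept:.3f}")
print(f"auxiliary slope (estimate of sigma^2): {slope:.3f}")
```

A slope clearly different from zero indicates that the variance of the error term depends on \(x_i^2\). A formal test would also need a standard error or a test statistic, which this sketch does not compute.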