Linear regression with several explanatory variables

Summary

Several explanatory variables

If we have several explanatory variables, the notation is extended as follows:

  • \(y\) is the dependent or explained variable (no change)
  • \(x_2\) is the first explanatory variable and \(x_{i,2}\) is the \(i\)th observation of this explanatory variable
  • \(x_3\) is the second explanatory variable and so on.
  • \(x_k\) is the final explanatory variable. In total, there are \(k-1\) explanatory variables.

The LRM assumption

  • Given a random sample with several explanatory variables, a linear regression model (LRM) can be formulated as follows

\[y_i=β_1+β_2x_{i,2}+β_3x_{i,3}+…+β_kx_{i,k}+\varepsilon_i\]

  • From the LRM assumption,

\[E\left(y_i | x_i \right)=β_1+β_2x_{i,2}+β_3x_{i,3}+…+β_kx_{i,k}+E\left(\varepsilon_i | x_i \right)\]

  • where \(E\left(y_i | x_i \right)\) is shorthand notation for \(E\left(y_i | x_{i,2},…,x_{i,k} \right)\), and similarly for \(E\left(\varepsilon_i | x_i \right)\).
  • We say that the \(x\) -variables are exogenous if

\[E\left(\varepsilon_i | x_i \right)=0\]

  • If the \(x\) -variables are exogenous then

\[E\left(y_i | x_i \right)=β_1+β_2x_{i,2}+β_3x_{i,3}+…+β_kx_{i,k}\]

  • and

\[y_i=E\left(y_i | x_i \right)+\varepsilon_i\]

  • or

\[\varepsilon_i=y_i-E\left(y_i | x_i \right)\]
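The exogeneity argument above can be illustrated by simulation. A minimal sketch, assuming hypothetical parameter values and \(k-1=2\) explanatory variables: the errors are drawn independently of the \(x\)-variables (so they are exogenous), and the sample mean of \(\varepsilon_i = y_i - E\left(y_i | x_i \right)\) is close to zero.

```python
import numpy as np

# Hypothetical LRM: y_i = b1 + b2*x_i2 + b3*x_i3 + eps_i
rng = np.random.default_rng(0)
n = 100_000
b1, b2, b3 = 1.0, 0.5, -2.0          # assumed "true" beta parameters

x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
eps = rng.normal(size=n)             # drawn independently of x2, x3 -> exogenous

Ey_given_x = b1 + b2 * x2 + b3 * x3  # E(y | x) under exogeneity
y = Ey_given_x + eps

# eps_i = y_i - E(y_i | x_i); its sample mean should be close to 0
print(round(float(np.mean(y - Ey_given_x)), 4))
```

With exogenous errors the recovered residuals average out to roughly zero; if `eps` were instead constructed to depend on `x2`, this would fail.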

An arbitrary observation

  • For an arbitrary observation we write

\[E\left(y| x \right)=β_1+β_2x_2+β_3x_3+…+β_kx_k \]

  • \(E\left(y| x \right)\) is shorthand notation for \(E\left(y| x_2,…,x_k \right)\).

Interpreting \(β\) -parameters

  • \(β_1\) is given by

\[β_1=E\left(y| x=0 \right)\]

  • where “ \(x=0\) ” is short for “all explanatory variables are zero” or \(x_2=0,…,x_k=0\) .
  • \(β_j\) for \(j=2,…,k\) is given by

\[β_j= \frac{∂E\left(y | x \right)}{∂x_j},\quad j=2,…,k\]
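Because \(E\left(y | x \right)\) is linear, the partial derivative has a simple reading: \(β_j\) is the change in \(E\left(y | x \right)\) when \(x_j\) increases by one unit, holding the other explanatory variables fixed. A small sketch with hypothetical parameter values:

```python
# Hypothetical beta parameters for a model with two explanatory variables
b1, b2, b3 = 1.0, 0.5, -2.0

def E_y(x2, x3):
    """E(y | x2, x3) = b1 + b2*x2 + b3*x3."""
    return b1 + b2 * x2 + b3 * x3

# beta_1: the expected y when all explanatory variables are zero
print(E_y(x2=0.0, x3=0.0))   # 1.0, equal to b1

# beta_2: raise x2 by one unit while holding x3 fixed
effect_x2 = E_y(x2=4.0, x3=7.0) - E_y(x2=3.0, x3=7.0)
print(effect_x2)             # 0.5, equal to b2
```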

LRM, main point:

  • Observe data on \(x\) -variables and a \(y\) -variable.
  • Make the LRM assumption.
  • Find estimates of \(β_1,…,β_k\) to discover the relationship between the \(x\) -variables and \(E\left(y | x \right)\).
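The three steps above can be sketched on simulated data. The source does not specify an estimation method; one standard choice is ordinary least squares, here via NumPy's least-squares solver, with hypothetical "true" parameter values:

```python
import numpy as np

# Step 1: "observe" data generated from an LRM with assumed true betas
rng = np.random.default_rng(1)
n = 10_000
b_true = np.array([1.0, 0.5, -2.0])        # beta_1 (intercept), beta_2, beta_3

x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x2, x3])  # column of ones -> intercept beta_1
y = X @ b_true + rng.normal(size=n)        # Step 2: LRM with exogenous errors

# Step 3: estimate beta_1,...,beta_k by ordinary least squares
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(b_hat, 2))
```

With a large sample the estimates land close to the assumed true values, which is the sense in which the fitted model "discovers" the relationship between the \(x\)-variables and \(E\left(y | x \right)\).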