Random sample, statistical model and linear statistical model

Summary

  • Setup: A sample \(\left( y_i,x_i \right)\) for \(i=1, \ldots ,n\) (or \(y, X\) in matrix form) (see Regressand and regressor data).
  • We say that our sample is a random sample if the random vectors \(\left( y_i,x_{i,1}, \ldots ,x_{i,k} \right)\) are independent and identically distributed (IID) for \(i=1, \ldots ,n\).
  • The conditional expectation \(E\left( y_i \mid x_i \right)\) must be a function of \(x_i\).
  • We postulate a parametric statistical model by assuming that

\[E\left( y_i \mid x_i \right)=g(x_i,β)\]

  • for some given function \(g\), where \(β\) is a vector of unknown parameters. We will refer to this simply as our “statistical model”. Note that our postulate may be correct or incorrect.
  • If

\[g\left( x_i,β \right)=β_1x_{i,1}+β_2x_{i,2}+ \ldots +β_kx_{i,k}=x'_iβ\]

  • where \(β={\left( β_1, \ldots ,β_k \right)}'\) is a \(k×1\) vector, then

\[E\left( y_i \mid x_i \right)=x'_iβ\]

  • and we say that we are postulating a “linear model” for the conditional expectation. We will refer to this simply as our “linear statistical model”.
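
As a rough illustration of the linear statistical model, the sketch below simulates a random sample in which the postulate happens to be correct and evaluates \(x'_iβ\) for every observation at once as \(Xβ\). Everything specific in it is an assumption made for the example and is not taken from the text: the helper name `g_linear`, the values of `n`, `k` and `beta`, the normal distributions for the regressors and the noise, and the seed.

```python
import numpy as np

def g_linear(X, beta):
    """Linear model for the conditional expectation: entry i is x_i' beta."""
    return X @ beta

rng = np.random.default_rng(0)        # arbitrary seed, for reproducibility only
n, k = 1000, 3                        # hypothetical sample size and number of regressors
beta = np.array([1.0, -2.0, 0.5])     # hypothetical "true" parameter vector (k x 1)

# Random sample: each row (y_i, x_i) is drawn independently from the same distribution.
X = rng.normal(size=(n, k))           # row i holds x_{i,1}, ..., x_{i,k}
noise = rng.normal(size=n)            # drawn independently of X, so E(noise_i | x_i) = 0
y = g_linear(X, beta) + noise         # by construction E(y_i | x_i) = x_i' beta

# Conditional means for the whole sample, in matrix form and row by row.
cond_mean = g_linear(X, beta)
row_by_row = np.array([X[i] @ beta for i in range(n)])

# Average deviation of y_i from its conditional mean; close to zero in this simulation.
mean_deviation = float(np.mean(y - cond_mean))
print(np.allclose(cond_mean, row_by_row), mean_deviation)
```

In real data \(β\) is unknown and the postulate may be incorrect, so the conditional means cannot be computed this way; the simulation only shows what the model asserts when it does hold.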