Random sample, statistical model and linear statistical model
Summary
- Setup: A sample \(\left( y_i,x_i \right)\) for \(i=1, \ldots ,n\) (or \(y, X\) in matrix form); see Regressand and regressor data.
- We say that our sample is a random sample if the random vectors \(\left( y_i,x_{i,1}, \ldots ,x_{i,k} \right)\) are independent and identically distributed (IID) for \(i=1, \ldots ,n\).
- The conditional expectation \(E\left( y_i \,\middle|\, x_i \right)\) is, by definition, a function of \(x_i\).
- We postulate a parametric statistical model by assuming that
\[E\left( y_i \,\middle|\, x_i \right)=g\left( x_i,β \right)\]
- for some given function \(g\), where \(β\) is a vector of unknown parameters. We will refer to this simply as our “statistical model”. Note that our postulate may be correct, or it may be incorrect.
- If
\[g\left( x_i,β \right)=β_1x_{i,1}+β_2x_{i,2}+ \ldots +β_kx_{i,k}=x'_iβ\]
- where \(β={\left( β_1, \ldots ,β_k \right)}'\) is \(k×1\), then
\[E\left( y_i \,\middle|\, x_i \right)=x'_iβ\]
- and we say that we are postulating a “linear model” for the conditional expectation. We will refer to this simply as our “linear statistical model”.
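The definitions above can be illustrated with a small simulation. The sketch below (an assumption for illustration, not part of the original text) draws an IID random sample whose conditional expectation really is linear, \(E(y_i \mid x_i)=x'_iβ\), and then checks that least squares approximately recovers the true \(β\); the values of `n`, `k`, and `beta` are arbitrary choices.

```python
import numpy as np

# Hypothetical simulation of a random sample from a linear statistical model.
rng = np.random.default_rng(0)

n, k = 10_000, 3
beta = np.array([1.0, 2.0, -0.5])   # "unknown" parameters, known here by construction

# Rows of X are the IID vectors (x_{i,1}, ..., x_{i,k}); the first column is
# a constant regressor so the model includes an intercept.
X = rng.normal(size=(n, k))
X[:, 0] = 1.0

# Disturbances with E(eps_i | x_i) = 0, so E(y_i | x_i) = x_i' beta holds.
eps = rng.normal(size=n)
y = X @ beta + eps

# Because the postulated linear model is correct here, the least-squares
# estimate should be close to beta for large n.
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta_hat)
```

Since the sample is IID and the linear model is correctly specified in this simulation, `beta_hat` converges to `beta` as \(n\) grows; if instead \(E(y_i \mid x_i)\) were nonlinear in \(x_i\), the postulated linear model would be incorrect and no such recovery is guaranteed.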