Regression towards the mean

Problem

Childcm is the height (in cm) of a child and parentcm is the average height of the parents.

  1. How would you interpret the estimated parameters (coefficients)?
  2. Test \(H_0:β_2>1\)
  3. Explain why it makes sense for \(β_2\) to be between 0 and 1. (See https://en.wikipedia.org/wiki/Regression_toward_the_mean#History for an interesting story)

Solution

  1. The slope \(β_2\) is the expected increase in the height of a child when the average height of the parents increases by 1 cm. The basic idea is that if the parents in one family are a centimeter taller than the parents in a second family, then the child in the first family will tend to grow to be about 0.65 cm taller than the child in the second family. Technically, the intercept \(β_1\) is to be interpreted as the expected height of a child whose parents are 0 cm tall: we would expect such a child to grow to be 60 cm tall. This is an extrapolation far outside the range of observed heights, so the intercept has no practical meaning on its own.
  2. The t-value is \((0.646 - 1)/0.041 \approx -8.6\), far below any conventional critical value, so we reject \(H_0\) at any reasonable level of significance and conclude that \(β_2 < 1\).
  3. Tall parents tend to have tall children. However, their children will, on average, not be as tall as they are. There is a tendency for heights to return to the mean, or to regress towards the mean. This is where the term "regression" originates.
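
The test in part 2 can be checked numerically. The following is a minimal sketch in Python, not part of the original exercise. It uses the rounded estimate (0.646) and standard error (0.041) quoted in the solution. Because the sample size is not stated here, the one-sided p-value is approximated with the standard normal distribution rather than the exact t distribution; that choice, and the helper names `t_statistic` and `normal_lower_tail`, are assumptions made for illustration.

```python
import math


def t_statistic(estimate, hypothesized, std_error):
    """Return (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


def normal_lower_tail(z):
    """Return P(Z <= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


# Rounded values quoted in the solution: slope estimate and its standard error.
t = t_statistic(estimate=0.646, hypothesized=1.0, std_error=0.041)

# Lower-tail p-value for the alternative beta_2 < 1.
# Assumption: normal approximation, since the degrees of freedom are not given.
p = normal_lower_tail(t)

print(f"t = {t:.2f}")
print(f"approximate one-sided p-value = {p:.1e}")
```

With the rounded inputs the statistic comes out at about -8.6, and the approximate p-value is far below any conventional significance level, which agrees with the conclusion in part 2.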