Binary choice models

Problem

Given:

  • a random sample \(\left( y_i,x_i \right)\), \(i=1, \ldots ,n\), where \(y_i\) can only take the values 0 or 1 and \(x_i\) is \(k×1\)
  • a statistical model: \(E\left( y_i|x_i \right)=P\left( y_i=1 \mid x_i \right)=F\left( x'_iβ \right)\) where \(β\) is a \(k×1\) vector of unknown parameters.
a. Show that the log-likelihood function is given by

\[l\left( β \right)=\sum_{i=1}^{n}{ \left( y_i\log F\left( x'_iβ \right)+\left( 1-y_i \right)\log \left( 1-F\left( x'_iβ \right) \right) \right) }\]

b. Estimate \(P\left( y_i=1 \mid x_i \right)\) for a logit model if we have one explanatory variable, \(x'_iβ=β_1+β_2x_i\) with estimates \(b_1=-0.2\) and \(b_2=0.1\), for an individual with \(x=10\).

c. Show that the marginal effect can be calculated using

\[ \frac{∂P\left( y=1 \mid x \right)}{∂x_j}=F'\left( x'β \right)β_j\]

where \(F'\) is the derivative of \(F\). Find expressions for the marginal effect for logit and probit models, respectively.

d. Estimate the marginal effect for a logit model if we have one explanatory variable, \(x'_iβ=β_1+β_2x_i\) with estimates \(b_1=-0.2\) and \(b_2=0.1\), for an individual with \(x=10\). Do the same for \(x=20\) and \(x=30\) to convince yourself that the marginal effect is decreasing in \(x\).

Solution

a. Denote \(P\left( y_i=1 \mid x_i \right)\) by \(p_i\). The probability mass function for \(y_i\) given \(x_i\) is

\[f\left( y_i \mid x_i \right)=p_i^{y_i}{\left( 1-p_i \right)}^{1-y_i}\]

This is also the likelihood contribution \(L_i\left( β \right)\) of observation \(i\). Taking logs, we have

\[l_i\left( β \right)=\log \left( p_i^{y_i}{\left( 1-p_i \right)}^{1-y_i} \right)=y_i\log p_i+\left( 1-y_i \right)\log \left( 1-p_i \right)\]

The result follows since \(l\left( β \right)\) is the sum of all the \(l_i\left( β \right)\) and \(p_i=P\left( y_i=1 \mid x_i \right)=F\left( x'_iβ \right)\).
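As a sanity check, here is a minimal Python sketch of this log-likelihood for the logit case (the function names and the use of NumPy are illustrative choices, not part of the exercise):

```python
import numpy as np

def logistic(w):
    # Logistic cdf: F(w) = 1 / (1 + exp(-w))
    return 1.0 / (1.0 + np.exp(-w))

def log_likelihood(beta, y, X):
    # l(beta) = sum_i [ y_i * log F(x_i'beta) + (1 - y_i) * log(1 - F(x_i'beta)) ]
    p = logistic(X @ beta)  # p_i = F(x_i'beta), one probability per observation
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
```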

b. We estimate \(P\left( y_i=1 \mid x_i \right)\) using \(F\left( x'_ib \right)\). In our case, \(x'_ib=b_1+b_2x_i=-0.2+0.1⋅10=0.8\). We have a logit model, so

\[F\left( x'_ib \right)=L\left( x'_ib \right)= \frac{1}{1+e^{-x'_ib}}= \frac{1}{1+e^{-0.8}}=0.69\]

We estimate the probability that \(y_i=1\) to be 69% (given \(x=10\)).
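The same calculation as a short Python sketch (the variable names are illustrative):

```python
import numpy as np

b1, b2, x = -0.2, 0.1, 10
index = b1 + b2 * x                  # x'b = 0.8
prob = 1.0 / (1.0 + np.exp(-index))  # logistic cdf
print(round(prob, 2))                # 0.69
```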

c. This is just the chain rule: \(F\left( x'β \right)=F\left( β_1x_1+ \ldots +β_kx_k \right)\), and the derivative of this function with respect to, say, \(x_1\) is \(F'\left( β_1x_1+ \ldots +β_kx_k \right)\) times the inner derivative \(β_1\).

For the logit model, \(F\left( w \right)=L\left( w \right)= \frac{1}{1+e^{-w}}\) with derivative

\[F'\left( w \right)= \frac{e^{-w}}{{\left( 1+e^{-w} \right)}^2}= \frac{e^w}{{\left( 1+e^w \right)}^2}\]

and

\[ \frac{∂P\left( y=1 \mid x \right)}{∂x_j}= \frac{e^{x'β}}{{\left( 1+e^{x'β} \right)}^2}β_j\]

For the probit model, \(F\left( w \right)=Φ\left( w \right)\). The derivative of the cdf is the pdf, which, for the standard normal distribution, is denoted \(ϕ\left( w \right)\), where

\[ϕ\left( w \right)= \frac{1}{\sqrt{2π}}\exp \left( -w^2/2 \right)\]

We have

\[ \frac{∂P\left( y=1 \mid x \right)}{∂x_j}=ϕ\left( x'β \right)β_j\]
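If one wants to evaluate these expressions numerically, a small sketch of both marginal-effect formulas could look as follows (using scipy.stats.norm for the standard normal pdf; the function names are illustrative):

```python
import numpy as np
from scipy.stats import norm

def logit_marginal_effect(index, beta_j):
    # dP(y=1|x)/dx_j = e^{x'b} / (1 + e^{x'b})^2 * beta_j for the logit model
    return np.exp(index) / (1.0 + np.exp(index)) ** 2 * beta_j

def probit_marginal_effect(index, beta_j):
    # dP(y=1|x)/dx_j = phi(x'b) * beta_j for the probit model
    return norm.pdf(index) * beta_j
```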

d. For \(x=10\) we estimate \(x'β\) to be 0.8. The estimated marginal effect is

\[ \frac{e^{0.8}}{{\left( 1+e^{0.8} \right)}^2}⋅0.1=0.0214\]

Interpretation: increasing \(x\) by one unit, from 10 to 11, will increase the probability that \(y=1\) by about 2 percentage points. For \(x=20\) the marginal effect is 0.0122 and for \(x=30\) it is 0.0054.
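To reproduce these numbers, a short sketch (again with illustrative variable names):

```python
import numpy as np

b1, b2 = -0.2, 0.1
for x in (10, 20, 30):
    index = b1 + b2 * x
    me = np.exp(index) / (1.0 + np.exp(index)) ** 2 * b2
    print(x, round(me, 4))  # 10 -> 0.0214, 20 -> 0.0122, 30 -> 0.0054
```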