Introduction to Econometrics
by Lund University
This course will cover material comparable to a typical first course in econometrics.
This chapter introduces the least squares method, which is used to fit a straight line through a scatter plot. The focus is on the algebra of least squares; there is no probability theory or statistics in this chapter. The chapter begins with sample moments and then derives the OLS formula. Important concepts introduced in this chapter: trendline, residuals, fitted values and R-squared. In addition to Excel, we will also introduce EViews in this chapter and look at how to find trendlines using both programs.
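To make the algebra concrete, here is a minimal sketch in Python of how a trendline, fitted values, residuals and R-squared could be computed from sample moments. The course itself uses Excel and EViews; the function name `ols_trendline` and the data are made up for illustration.

```python
from statistics import mean

def ols_trendline(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sums of cross products and squares around the sample means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = ols_trendline(x, y)
fitted = [a + b * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# R-squared: share of the variation in y captured by the trendline.
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((yi - mean(y)) ** 2 for yi in y)
r2 = 1 - ss_res / ss_tot

print(f"trendline: y = {a:.2f} + {b:.2f} x, R-squared = {r2:.2f}")
```

For this data the slope is 0.6, the intercept is 2.2 and R-squared is 0.6. The residuals sum to zero, which is a purely algebraic property of least squares with an intercept.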
In order to make more sense of the concepts introduced in chapter 1, we need some probability theory and statistics. We want to be able to explain observed deviations from the trendline, and we will do that with random variables called error terms. This chapter covers the absolute minimum from probability theory: random variables, distribution functions, expected value, variance and covariance. This chapter also introduces conditional moments, which will turn out to be of great importance in econometrics because the fundamental assumption on the error terms will be stated as a conditional expectation.
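As a small illustration of these moments, the sketch below computes an expected value, a variance, a covariance and a conditional expectation for two discrete random variables. The joint distribution is hypothetical and chosen only to keep the arithmetic simple.

```python
# Hypothetical joint distribution of (X, Y): P(X = x, Y = y).
joint = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}

def expect(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - e_x) ** 2)
cov_xy = expect(lambda x, y: (x - e_x) * (y - e_y))

def cond_expect_y(x_value):
    """E[Y | X = x_value]: average Y over outcomes where X = x_value."""
    p_x = sum(p for (x, _), p in joint.items() if x == x_value)
    return sum(p * y for (x, y), p in joint.items() if x == x_value) / p_x

print(e_x, e_y, var_x, cov_xy)
print(cond_expect_y(0), cond_expect_y(1))
```

Here the conditional expectation of Y changes with X (0.6 when X = 0, 0.8 when X = 1), which goes together with the positive covariance of 0.05.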
This chapter formalizes the most important model in econometrics, the linear regression model. The entire chapter is restricted to a special case, namely when you have only one explanatory variable. The key assumption of the linear regression model, exogeneity, is introduced. Then, the OLS formula from chapter 1 is reinterpreted as an estimator of unknown parameters in the linear regression model. This chapter also introduces the variance of the OLS estimator under an important set of assumptions, the Gauss-Markov assumptions. The chapter concludes with inference in the linear regression model, specifically hypothesis testing and confidence intervals.
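The inference step can be sketched as follows for a regression with one explanatory variable: estimate the error variance from the residuals, compute the standard error of the slope, and form a t-statistic and a 95% confidence interval. The data are made up, and the critical value 3.182 is an assumption taken from a standard t-table (97.5th percentile, 3 degrees of freedom) rather than computed.

```python
import math
from statistics import mean

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b = s_xy / s_xx              # OLS slope estimate
a = y_bar - b * x_bar        # OLS intercept estimate
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Estimated error variance, using n - 2 degrees of freedom.
sigma2_hat = sum(e ** 2 for e in residuals) / (n - 2)
se_b = math.sqrt(sigma2_hat / s_xx)

# t-statistic for the null hypothesis that the slope is zero.
t_stat = b / se_b

# Assumed critical value: t-table, 97.5th percentile, n - 2 = 3 df.
T_CRIT = 3.182
ci = (b - T_CRIT * se_b, b + T_CRIT * se_b)
reject = abs(t_stat) > T_CRIT

print(f"slope = {b:.3f}, se = {se_b:.3f}, t = {t_stat:.3f}, reject = {reject}")
```

With only five observations the interval is wide and contains zero, so the null hypothesis of a zero slope is not rejected at the 5% level even though the estimated slope is 0.6.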
This chapter is an extension of chapter 3, allowing for several explanatory variables. First, the linear regression model with several explanatory variables, the focus of this chapter, is thoroughly introduced and an extension of the OLS formula is discussed. Since we are not using matrix algebra in this course, we will not be able to present the general formulas such as the OLS formula. Instead, we rely on the fact that they have been correctly programmed into software such as Excel, EViews, Stata and more. We need to make small changes to the inference of this model and we will also introduce some new tests. A new problem that will appear in this model is that of multicollinearity. Next, we look at some nonlinear regression models, followed by dummy variables. The chapter concludes with an analysis of the data problem heteroscedasticity.
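Dummy variables can be illustrated without matrix algebra in the simplest case, a regression on a constant and a single dummy. In that case the OLS intercept equals the mean of the group with dummy 0, and the slope equals the difference between the two group means. The sketch below checks this on made-up data; the variable names are hypothetical.

```python
from statistics import mean

def ols(x, y):
    """Return (intercept, slope) from a regression of y on a constant and x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Made-up data: d is a dummy (0 or 1), y is the outcome.
d = [0, 0, 0, 1, 1, 1]
y = [10.0, 12.0, 14.0, 15.0, 17.0, 19.0]

intercept, dummy_coef = ols(d, y)

# Group means computed directly, for comparison.
mean_0 = mean(yi for di, yi in zip(d, y) if di == 0)
mean_1 = mean(yi for di, yi in zip(d, y) if di == 1)

print(intercept, dummy_coef)
print(mean_0, mean_1 - mean_0)
```

Both lines print the same pair of numbers (12 and 5, up to floating-point rounding), so the coefficient on the dummy can be read as the average difference between the two groups.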
This chapter is an introduction to econometrics with time series data. Chapters 1 to 4 have been restricted to cross-sectional data, that is, data for individuals, firms, countries and so on. Working with time series data will introduce new problems, the first and most important being that time series data may be nonstationary, which may lead to spurious (misleading) results. However, this chapter will only look at stationary time series data. Time series models may be static or dynamic, where the latter means that the dependent variable may depend on values from previous periods. We will look at some dynamic models, most importantly ADL (autoregressive distributed lag) models and AR (autoregressive) models. Another problem with time series data is that the error terms may be correlated over time (autocorrelation). The chapter concludes with a discussion of autocorrelation, how to test for autocorrelation and how to estimate models in the presence of autocorrelation.
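A stationary AR(1) model can be sketched as follows: simulate a series in which each value depends on the previous one, then estimate the model by regressing the series on its own first lag. The parameter values, the seed and the sample size are assumptions chosen for illustration. The first-order autocorrelation of the residuals is computed as an informal check only, not as one of the formal tests covered in the chapter.

```python
import random
from statistics import mean

def ols(x, y):
    """Return (intercept, slope) from a regression of y on a constant and x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Assumed data-generating process: y_t = 1 + 0.5 * y_{t-1} + e_t.
random.seed(0)
TRUE_CONST, TRUE_RHO = 1.0, 0.5
y = [TRUE_CONST / (1 - TRUE_RHO)]  # start at the long-run mean
for _ in range(2000):
    y.append(TRUE_CONST + TRUE_RHO * y[-1] + random.gauss(0, 1))

# Regress y_t on y_{t-1}.
lagged, current = y[:-1], y[1:]
const_hat, rho_hat = ols(lagged, current)

# Informal check: first-order sample autocorrelation of the residuals.
resid = [yt - const_hat - rho_hat * yl for yl, yt in zip(lagged, current)]
r1 = sum(e1 * e0 for e0, e1 in zip(resid[:-1], resid[1:])) / sum(e ** 2 for e in resid)

print(f"const = {const_hat:.3f}, rho = {rho_hat:.3f}, residual autocorrelation = {r1:.3f}")
```

Because the simulated errors are independent over time, the residual autocorrelation should be close to zero; a value far from zero in real data would suggest that the dynamics are not fully captured.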
Throughout the course so far, we have assumed that the explanatory variables are exogenous. This is the most critical assumption in econometrics. In this chapter we will look at cases when explanatory variables cannot be expected to be exogenous (we then say that they are endogenous). We will also look at the consequences of econometric analysis with endogenous variables. Specifically, we will look at misspecification of our model, errors in variables and the simultaneity problem. When we have endogenous variables, we can sometimes find instruments for them, that is, variables which are correlated with our endogenous variable but not with the error term. This opens up the possibility of consistently estimating the parameters in our model using the instrumental variable estimator and the generalized instrumental variable estimator.
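The idea behind the instrumental variable estimator can be sketched with one endogenous variable and one instrument, where the estimator is the ratio of two sample covariances. The simulation below is an assumption made for illustration: x is built to be correlated with the error term, while the instrument z is correlated with x but not with the error term.

```python
import random
from statistics import mean

def cov(u, v):
    """Sample covariance of two equally long lists."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / (len(u) - 1)

random.seed(1)
n = 5000
TRUE_BETA = 2.0  # assumed true slope in y = 1 + beta * x + error

z = [random.gauss(0, 1) for _ in range(n)]       # instrument
u = [random.gauss(0, 1) for _ in range(n)]       # source of endogeneity
v = [random.gauss(0, 1) for _ in range(n)]       # unrelated noise
x = [zi + ui for zi, ui in zip(z, u)]            # x depends on z and on u
error = [ui + vi for ui, vi in zip(u, v)]        # error term also contains u
y = [1.0 + TRUE_BETA * xi + ei for xi, ei in zip(x, error)]

beta_ols = cov(x, y) / cov(x, x)   # inconsistent here: x is endogenous
beta_iv = cov(z, y) / cov(z, x)    # simple IV estimator

print(f"OLS slope = {beta_ols:.3f}, IV slope = {beta_iv:.3f}")
```

In this design the OLS slope settles near 2.5 rather than the true value 2, while the IV slope is close to 2. The gap is the bias caused by the correlation between x and the error term.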
This chapter is an introduction to microeconometric models. We will look at the simplest of these models, the binary choice model, in which the dependent variable is a dummy variable. It turns out that we can use the same methods described in chapter 4; the model is then called the linear probability model. However, the linear probability model has some problems. For example, predicted probabilities may be less than 0% or larger than 100%. To rectify this problem, new models are presented (the probit and logit models) and a new technique for estimating these models is introduced (maximum likelihood).
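The problem with the linear probability model, and the way a logit model avoids it, can be sketched on a small made-up data set. The logit model is estimated here by maximum likelihood using Newton's method, which is one possible numerical approach and not necessarily what Excel, EViews or Stata do internally. The function names are hypothetical.

```python
import math
from statistics import mean

def ols(x, y):
    """Return (intercept, slope) from a regression of y on a constant and x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(x, y, iterations=25):
    """Maximum likelihood for P(y = 1) = logistic(a + b*x) via Newton's method."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [logistic(a + b * xi) for xi in x]
        # Gradient of the log-likelihood.
        g_a = sum(yi - pi for yi, pi in zip(y, p))
        g_b = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Negative Hessian (a 2x2 matrix), inverted by hand.
        w = [pi * (1 - pi) for pi in p]
        h_aa = sum(w)
        h_ab = sum(wi * xi for wi, xi in zip(w, x))
        h_bb = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h_aa * h_bb - h_ab ** 2
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b

# Made-up data: y is a dummy variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

lpm_a, lpm_b = ols(x, y)
lpm_pred = [lpm_a + lpm_b * xi for xi in x]

logit_a, logit_b = fit_logit(x, y)
logit_pred = [logistic(logit_a + logit_b * xi) for xi in x]

print("LPM:  ", [round(p, 3) for p in lpm_pred])
print("logit:", [round(p, 3) for p in logit_pred])
```

For this data the linear probability model predicts a negative probability at x = 1 and a probability above one at x = 8, while every logit prediction lies strictly between zero and one.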
In chapter 5, which covered time series data, stationarity was a critical assumption. In this chapter we investigate data that is not stationary, the consequences of using nonstationary data and how to determine whether your data is stationary.
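One common way to examine stationarity, which the summary above does not name and which is used here only as an assumption, is a Dickey-Fuller type regression: regress the change in the series on a constant and the lagged level, and look at the t-statistic on the lagged level. The sketch compares a simulated random walk with a simulated stationary AR(1) series. The 5% critical value of roughly -2.86 is taken from standard large-sample Dickey-Fuller tables for the case with a constant; the ordinary t-table does not apply.

```python
import math
import random
from statistics import mean

def df_t_stat(y):
    """t-statistic on y_{t-1} when (y_t - y_{t-1}) is regressed on a constant and y_{t-1}."""
    lag = y[:-1]
    diff = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(lag)
    lag_bar, diff_bar = mean(lag), mean(diff)
    s_xx = sum((l - lag_bar) ** 2 for l in lag)
    s_xy = sum((l - lag_bar) * (d - diff_bar) for l, d in zip(lag, diff))
    slope = s_xy / s_xx
    intercept = diff_bar - slope * lag_bar
    ss_res = sum((d - intercept - slope * l) ** 2 for l, d in zip(lag, diff))
    se = math.sqrt(ss_res / (n - 2) / s_xx)
    return slope / se

random.seed(2)
shocks = [random.gauss(0, 1) for _ in range(500)]

# Random walk: y_t = y_{t-1} + e_t (nonstationary).
random_walk = [0.0]
for e in shocks:
    random_walk.append(random_walk[-1] + e)

# Stationary AR(1): y_t = 0.5 * y_{t-1} + e_t.
stationary = [0.0]
for e in shocks:
    stationary.append(0.5 * stationary[-1] + e)

DF_CRIT_5PCT = -2.86  # assumed value from standard Dickey-Fuller tables

t_rw = df_t_stat(random_walk)
t_st = df_t_stat(stationary)
print(f"random walk: t = {t_rw:.2f}, stationary series: t = {t_st:.2f}")
```

A t-statistic well below the critical value, as for the stationary series, speaks against a unit root. For a true random walk the statistic will usually, though not always, stay above the critical value.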
Panel data is data that varies across cross-sectional units as well as over time. This chapter is only an introduction to models using panel data. The focus of this chapter is on the error component model, where we look at the fixed effects estimator as well as the random effects model. The chapter concludes with a discussion of how to choose between the fixed effects estimator and the random effects estimator (including the Hausman test).
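The fixed effects estimator can be sketched as a "within" regression: subtract each unit's own mean from its observations, then run OLS on the demeaned data. The tiny panel below is made up and has no noise, so that the result is easy to verify by hand. Each unit has its own intercept (0 for A, 10 for B) and the true slope is 2.

```python
from statistics import mean

def slope(x, y):
    """OLS slope from a regression of y on a constant and x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx

def demean(values):
    """Subtract the unit's own mean from each observation."""
    m = mean(values)
    return [v - m for v in values]

# Made-up panel: y = unit effect + 2 * x, three periods per unit.
panel = {
    "A": {"x": [1.0, 2.0, 3.0], "y": [2.0, 4.0, 6.0]},      # unit effect 0
    "B": {"x": [4.0, 5.0, 6.0], "y": [18.0, 20.0, 22.0]},   # unit effect 10
}

# OLS on the stacked data, ignoring the unit effects.
pooled_x = [xi for unit in panel.values() for xi in unit["x"]]
pooled_y = [yi for unit in panel.values() for yi in unit["y"]]
pooled_slope = slope(pooled_x, pooled_y)

# Fixed effects (within) estimator: OLS on unit-demeaned data.
within_x = [d for unit in panel.values() for d in demean(unit["x"])]
within_y = [d for unit in panel.values() for d in demean(unit["y"])]
within_slope = slope(within_x, within_y)

print(f"pooled slope = {pooled_slope:.3f}, within slope = {within_slope:.3f}")
```

The within slope is exactly 2, while the slope from the stacked data is about 4.57, because the unit with the larger effect also has the larger values of x. Demeaning removes the unit effects and with them this source of bias.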