Introduction to Econometrics

Chapter 2 : Least squares principle

By Lund University

This chapter introduces the least squares principle. The basic problem is how to fit a straight line through a scatter plot. We will cover the ordinary least squares (OLS) formula, which provides us with an intercept and a slope. We will then derive the OLS formula from the least squares principle. This chapter focuses on the algebra of least squares; there is no probability theory or statistics in it. Important concepts introduced in this chapter: trendline, residuals, fitted values and R-squared. In addition to Excel, we will also demonstrate how to find trendlines using EViews and Stata.

Ordinary least squares (OLS)

This section introduces the OLS formula, the formula we use to find a straight trendline through a scatter plot. In this section, the formula is presented without any derivation. We will also look at how to find trendlines in EViews, Stata and Excel. Once we have our scatter plot and trendline, we can define residuals and fitted values. Using the residuals, we can introduce the least squares principle, the principle behind the OLS formula. Finally, we look at the special case where the intercept of our trendline is zero ("no intercept").
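The ideas in this section can be sketched in a few lines of Python. This is a minimal illustration, not the course's own material: the function names `ols_trendline` and `ols_no_intercept` and the five data points are made up for the example, and the formulas used are the standard textbook ones (slope equal to the sum of cross-deviations divided by the sum of squared deviations of x, intercept equal to the mean of y minus the slope times the mean of x).

```python
def ols_trendline(x, y):
    """Return (intercept, slope) of the OLS trendline through the points (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def ols_no_intercept(x, y):
    """Return the slope of the OLS trendline that is forced through the origin."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)


# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

intercept, slope = ols_trendline(x, y)

# Fitted values lie on the trendline; residuals are the vertical distances to it.
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# The least squares principle: OLS picks the line with the smallest
# residual sum of squares (RSS).
rss = sum(e ** 2 for e in residuals)

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, RSS = {rss:.3f}")
print(f"slope with no intercept = {ols_no_intercept(x, y):.3f}")
```

For these made-up points the trendline has an intercept of about 2.2 and a slope of about 0.6, and the residuals sum to zero up to rounding error.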

The OLS formula

Introduction to equations in EViews

Residuals and fitted values

The least squares principle

Trendline with no intercept

Problem: OLS trendline in Excel

Problem: A few observations on the OLS formula

Problem: OLS trendline in EViews

Problem: Fitted values and residuals in EViews

Problem: A few observations on fitted values and residuals

Deriving the OLS formula

This section presents the least squares principle mathematically as a minimization problem in two variables (intercept and slope). We will solve this problem analytically, which will result in the OLS formula. Based on the first-order conditions from the optimization problem, we can derive several important OLS results.
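As a rough sketch of the derivation, the minimization problem and its solution can be written as follows. The notation is an assumption made for this illustration and may differ from the notation used in the lectures: the data are $(x_i, y_i)$ for $i = 1, \dots, n$, the candidate intercept and slope are $a$ and $b$, the sample means are $\bar{x}$ and $\bar{y}$, and the residuals are $e_i = y_i - a - b x_i$.

```latex
\begin{align*}
% The residual sum of squares as a function of intercept a and slope b
RSS(a, b) &= \sum_{i=1}^{n} (y_i - a - b x_i)^2 \\[4pt]
% First-order conditions: both partial derivatives equal zero
\frac{\partial RSS}{\partial a} &= -2 \sum_{i=1}^{n} (y_i - a - b x_i) = 0 \\
\frac{\partial RSS}{\partial b} &= -2 \sum_{i=1}^{n} x_i (y_i - a - b x_i) = 0 \\[4pt]
% Solving the two equations gives the OLS formula
b &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad a = \bar{y} - b \bar{x} \\[4pt]
% Two results that follow directly from the first-order conditions
\sum_{i=1}^{n} e_i &= 0, \qquad \sum_{i=1}^{n} x_i e_i = 0
\end{align*}
```

The first condition says that the residuals sum to zero, so the trendline passes through the point $(\bar{x}, \bar{y})$. The second says that the residuals are uncorrelated with $x$ in the sample. The solution requires that the $x_i$ are not all equal, since otherwise the denominator of $b$ is zero.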

Deriving the OLS formula

Global minimum of RSS

Some OLS results

Problem: Minimize RSS by hand

Problem: Derivative of RSS

Problem: Deriving the OLS formula for a model with no intercept

Measures of fit

In some cases, our trendline will fit our data well and, in some cases, it will not. In order to derive a measure of fit, we begin by identifying an important result: the total variation in the data is equal to the sum of the variation that we can explain (with the trendline) and the variation that we cannot explain. From this, we define the measure of fit, R-squared, as the proportion of the total variation that we can explain.
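The decomposition can be checked numerically with a short Python sketch. The data points are hypothetical, and the names TSS, ESS and RSS (total, explained and residual sum of squares) are labels assumed for this illustration; the lectures may use different terms. The decomposition relies on the trendline including an intercept.

```python
# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS trendline with an intercept (standard textbook formula).
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

# Total variation, explained variation and unexplained variation.
tss = sum((yi - y_bar) ** 2 for yi in y)
ess = sum((fi - y_bar) ** 2 for fi in fitted)
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# R-squared: the proportion of the total variation that the trendline explains.
r_squared = ess / tss

print(f"TSS = {tss:.3f}, ESS = {ess:.3f}, RSS = {rss:.3f}")
print(f"ESS + RSS = {ess + rss:.3f}")
print(f"R-squared = {r_squared:.3f}")
```

Because TSS = ESS + RSS in this setting, R-squared can equivalently be computed as 1 - RSS/TSS, and it always lies between 0 and 1.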

Measure of fit

Residuals and fit in EViews