Probability theory and statistics

Chapter 6: Statistics

By Lund University

The final chapter of this course is an introduction to statistics. The basic idea of statistics is to make inferences about a population given a sample drawn from that population. In section one, we begin by translating the population/sample concepts into a framework consistent with formal probability theory. Section two looks at some important distributions that we often end up using in statistics. Section three looks at the simplest problems in statistics: making inferences about the mean and the variance. We then move on to a more general study of statistics, introducing estimators and their small-sample properties. The following section is devoted to the more important, but also more difficult, large-sample properties of estimators. The final section generalizes what we have done so far in this chapter by looking at several estimators collected in a vector.

Sample and statistic

The first section contains two topics. First, we define a sample as a sequence of random variables. This definition of a sample turns out to be more general and more useful than defining a sample as something you draw from a population. Any function of these random variables constituting our sample is called a statistic. Based on probability theory, we can find properties such as moments of a statistic from the PDF/PMF of our sample.
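As a rough illustration of these two ideas, the sketch below treats a sample as a list of random draws and a statistic as any function of that list. The normal distribution, the sample size, and the helper names `draw_sample` and `sample_range` are assumptions made for the example; they are not taken from the course material.

```python
import random
import statistics


def draw_sample(n, rng):
    """Return one realization of a sample X_1, ..., X_n.

    Assumption for this sketch: the X_i are independent N(0, 1) variables.
    """
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def sample_range(xs):
    """A statistic is any function of the sample; here, max minus min."""
    return max(xs) - min(xs)


rng = random.Random(0)          # fixed seed so the run is reproducible
sample = draw_sample(10, rng)   # one realization of the sample

mean_value = statistics.fmean(sample)   # the sample mean is a statistic
range_value = sample_range(sample)      # so is the sample range

print(mean_value, range_value)
```

Because the sample is random, each statistic is itself a random variable: rerunning with a different seed gives different values of `mean_value` and `range_value`.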

Sample as a sequence of random variables


Important distributions

The distribution of a statistic depends on the distribution of the random variables making up our sample. In many situations, this distribution turns out to be one of three important distributions, which are analyzed in this section: the chi-square distribution, the t-distribution and the F-distribution.
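One way to get a feel for these three distributions is to build them from standard normal draws, using their standard textbook constructions. The sketch below does this by simulation; the degrees of freedom, the number of draws, and the function names are choices made for the example.

```python
import random
import statistics


def chi_square_draw(k, rng):
    """One draw from chi-square(k): sum of k squared independent N(0, 1) variables."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def t_draw(k, rng):
    """One draw from t(k): Z / sqrt(V / k), with Z ~ N(0, 1) independent of V ~ chi-square(k)."""
    z = rng.gauss(0.0, 1.0)
    v = chi_square_draw(k, rng)
    return z / (v / k) ** 0.5


def f_draw(k1, k2, rng):
    """One draw from F(k1, k2): ratio of two independent chi-squares, each divided by its degrees of freedom."""
    return (chi_square_draw(k1, rng) / k1) / (chi_square_draw(k2, rng) / k2)


rng = random.Random(1)
reps = 20000

chi2_draws = [chi_square_draw(5, rng) for _ in range(reps)]
t_draws = [t_draw(10, rng) for _ in range(reps)]
f_draws = [f_draw(3, 10, rng) for _ in range(reps)]

chi2_mean = statistics.fmean(chi2_draws)  # theoretical mean of chi-square(5) is 5
t_mean = statistics.fmean(t_draws)        # theoretical mean of t(10) is 0

print(chi2_mean, t_mean, min(f_draws))
```

The simulated averages should land close to the theoretical means noted in the comments, and every F draw is positive because it is a ratio of positive quantities.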

The chi-square distribution

The t-distribution

The F-distribution

Critical values

Properties of the sample mean and the sample variance

Probably the most important statistic is the sample mean. We will look at the properties of the sample mean under simplifying assumptions on our sample. We will also look at the properties of another important statistic, namely the sample variance.
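A small simulation can make the properties of the sample mean concrete. The sketch below assumes an i.i.d. normal sample with mean `mu` and standard deviation `sigma` (an assumption for this example) and checks two standard results: the sample mean has expected value `mu` and variance `sigma**2 / n`.

```python
import random
import statistics

# Assumed setup for this sketch: X_1, ..., X_n i.i.d. N(mu, sigma^2).
mu, sigma, n = 2.0, 3.0, 25
reps = 20000  # number of simulated samples

rng = random.Random(2)

# Draw many samples and record the sample mean of each one.
sample_means = [
    statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)     # should be close to mu
var_of_means = statistics.variance(sample_means)   # should be close to sigma^2 / n
theoretical_var = sigma ** 2 / n

print(mean_of_means, var_of_means, theoretical_var)
```

The spread of the sample mean shrinks as `n` grows, which is why larger samples give more precise estimates of the expected value.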

Properties of the sample mean

Properties of the sample variance

Theory of estimators

Given a random sample, the sample mean is often used to estimate the common expected value, and the sample variance is used to estimate the common variance. In this section, we take a closer look at exactly what we mean by an estimator. Good estimators have desirable properties, and one such property we will look at is unbiasedness. Another desirable property that an unbiased estimator can have is efficiency.
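Unbiasedness can be illustrated by comparing two estimators of the variance: one that divides the sum of squared deviations by n - 1 and one that divides by n. The sketch below is a simulation under assumed values (normal data, n = 5, true variance 4); the helper `variance_estimate` is a name invented for the example.

```python
import random
import statistics


def variance_estimate(xs, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / divisor


# Assumed setup for this sketch: X_1, ..., X_n i.i.d. N(0, sigma^2).
sigma2, n, reps = 4.0, 5, 40000
rng = random.Random(3)

est_n_minus_1 = []  # divisor n - 1 (the usual sample variance)
est_n = []          # divisor n
for _ in range(reps):
    xs = [rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
    est_n_minus_1.append(variance_estimate(xs, n - 1))
    est_n.append(variance_estimate(xs, n))

avg_unbiased = statistics.fmean(est_n_minus_1)  # close to sigma2 = 4
avg_biased = statistics.fmean(est_n)            # close to (n - 1) / n * sigma2 = 3.2

print(avg_unbiased, avg_biased)
```

Averaged over many samples, the n - 1 version centers on the true variance, while the n version systematically underestimates it by the factor (n - 1) / n.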


Unbiased estimator


Large-sample properties of estimators

Unbiasedness and efficiency, discussed in the previous section, are what are called small-sample properties of an estimator. This section looks at large-sample properties of an estimator. These properties are defined in terms of the performance of the estimator as the sample size goes to infinity. First, we define the general concept called convergence in probability. An infinite sequence of random variables can be said to converge to a constant if certain conditions are satisfied. If our estimator converges in probability to the true value, we say that the estimator is consistent. We have several important convergence rules, called plim rules, that we often use in order to show that an estimator is consistent. In the last section, we look at the central limit theorem and convergence in distribution.
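Consistency can be sketched numerically by estimating how often the sample mean misses the true value by more than a fixed tolerance as the sample size grows. The example below assumes i.i.d. N(0, 1) data and a tolerance of 0.1; the function names and the chosen sample sizes are assumptions made for illustration.

```python
import random


def sample_mean(n, mu, sigma, rng):
    """Sample mean of n independent N(mu, sigma^2) draws."""
    return sum(rng.gauss(mu, sigma) for _ in range(n)) / n


def miss_probability(n, eps, reps, rng, mu=0.0, sigma=1.0):
    """Simulated estimate of P(|sample mean - mu| > eps) for sample size n."""
    misses = sum(
        abs(sample_mean(n, mu, sigma, rng) - mu) > eps for _ in range(reps)
    )
    return misses / reps


rng = random.Random(4)
probs = {
    n: miss_probability(n, eps=0.1, reps=1000, rng=rng)
    for n in (10, 100, 1000)
}

for n, p in probs.items():
    print(n, p)
```

The estimated probability of a miss falls toward zero as n increases, which is what convergence in probability of the sample mean to the true expected value requires for every fixed tolerance.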

Convergence in probability



Convergence in distribution

Vector-valued estimator

So far in this chapter, we have only looked at a single estimator estimating a single parameter. In general, we have several parameters to estimate, and our estimator will be vector-valued. This section is an introduction to vector-valued estimators, looking at unbiasedness and consistency of such estimators.

Vector-valued estimator