Teach/Me Data Analysis

You are working with the text-only light edition of "H.Lohninger: Teach/Me Data Analysis, Springer-Verlag, Berlin-New York-Tokyo, 1999. ISBN 3-540-14743-8".

See also: sample space, representative samples, random number generators

Random Sampling

The way samples are selected from a population is very important for statistical inference, since we use the probability of a sample to infer the characteristics of the population from which it was drawn. The most frequently applied sampling technique is random sampling.

Definition: If n elements are selected from a population in such a way that every set of n elements in the population has an equal probability of being selected, the n elements are said to be a (simple) random sample.

When the population is not too large, one can mix it thoroughly and blindly pick a sample. When playing cards, for example, we shuffle the deck before each game and then deal the cards, which should give a random selection of the cards. For large populations, however, direct mixing is not practical; one should instead use a random number generator to select a random sample.

The number of possible samples determines the probability of each sample being drawn and is therefore important for inference about the whole population. Listing all possible samples is often tedious or prohibitive in time and space; in most lotteries, for example, the chance of winning the major prize is 1 in several million. We therefore need a more efficient method to determine the number of all possible samples (combinatorial mathematics).
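
As a minimal sketch of this counting step, assume that samples are drawn without replacement and that the order of the elements does not matter. The number of possible samples of n elements from a population of N is then the binomial coefficient N! / (n! (N-n)!). The 6-out-of-49 lottery below is an illustrative assumption, not a figure taken from the text, and the function name count_samples is hypothetical.

```python
import math

def count_samples(N, n):
    """Number of distinct unordered samples of n elements from a population of N."""
    return math.comb(N, n)

# Illustrative assumption: a lottery that draws 6 numbers out of 49.
total = count_samples(49, 6)

# Under simple random sampling every sample is equally likely.
probability = 1 / total

print(total)  # 13983816
```

Under these assumptions a single ticket wins the major prize with a probability of about 1 in 14 million, which agrees with the "1 in several million" order of magnitude without listing any of the samples.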

Random samples are drawn by assigning each of the sample points a number (from 1 to N) and then selecting n of these numbers with a random number generator. A number that has already been selected is discarded and another random number is drawn in its place.
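
The procedure described above could be sketched as follows, assuming Python's standard random module serves as the random number generator. The sample size is called n here, as in the definition; the function name draw_random_sample and the seed parameter are hypothetical names introduced only for illustration.

```python
import random

def draw_random_sample(N, n, seed=None):
    """Draw n distinct numbers from 1..N, redrawing any number selected twice."""
    if not 0 <= n <= N:
        raise ValueError("n must be between 0 and N")
    rng = random.Random(seed)  # seed is optional; it only makes runs repeatable
    selected = []
    seen = set()
    while len(selected) < n:
        candidate = rng.randint(1, N)  # randint includes both end points
        if candidate in seen:
            continue  # already selected: discard it and draw another number
        seen.add(candidate)
        selected.append(candidate)
    return selected

# Example: pick 10 of 1000 numbered elements.
sample = draw_random_sample(N=1000, n=10, seed=42)
print(sample)
```

When n is close to N, most draws are rejected as duplicates and the loop slows down. In that case the standard library call random.sample(range(1, N + 1), n) yields the same kind of sample without explicit redraws.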

Last Update: 2006-Jan-17