You are working with the text-only light edition of "H.Lohninger: Teach/Me Data Analysis, Springer-Verlag, Berlin-New York-Tokyo, 1999. ISBN 3-540-14743-8".
Representative Samples
Drawing representative samples can be quite demanding. An experimenter
should always ask whether the drawn samples are representative of the
population of interest. A few examples should clarify the
situation:
-
If you have a truckload of iron ore and you are the chemist responsible
for checking the quality of the ore, you have to make sure that the samples
drawn from the truckload are representative of the whole load.
-
Suppose you are a researcher who wants to collect information on
car usage in several countries around the world. Drawing a random sample
of the French population by selecting people at random from telephone books
might work (virtually all inhabitants of France have a telephone), but
the same procedure would certainly not yield a representative sample for
Bangladesh.
-
The Literary Digest poll for the 1936 presidential election in the United
States predicted that Landon would defeat Roosevelt, a prediction that the
actual result refuted. The vote in that election split along economic lines, with
wealthier people favoring Landon and poorer people favoring Roosevelt.
The samples for the investigation were taken from telephone books, which
resulted in a non-representative sample (in 1936, telephone subscribers
tended to be wealthier than the general population, and thus the sampling
procedure oversampled Landon voters and undersampled Roosevelt voters).
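The effect of such a biased sampling frame can be illustrated with a small simulation. The following Python sketch is only an illustration: the function name `simulate_poll` and all percentages (share of wealthy voters, telephone ownership, voting preferences) are invented for this example and are not historical 1936 figures.

```python
import random

def simulate_poll(n_voters=100_000, sample_size=5_000, seed=0):
    """Compare a telephone-book sample with a simple random sample.

    All probabilities below are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    voters = []
    for _ in range(n_voters):
        wealthy = rng.random() < 0.30                            # assumed share of wealthy voters
        has_phone = rng.random() < (0.80 if wealthy else 0.20)   # assumed telephone ownership
        votes_landon = rng.random() < (0.70 if wealthy else 0.30)  # assumed preference
        voters.append((has_phone, votes_landon))

    def landon_share(group):
        # Fraction of the group voting for Landon
        return sum(vote for _, vote in group) / len(group)

    # Sampling frame of the poll: telephone subscribers only
    phone_frame = [v for v in voters if v[0]]
    return {
        "true_share": landon_share(voters),
        "phone_sample": landon_share(rng.sample(phone_frame, sample_size)),
        "random_sample": landon_share(rng.sample(voters, sample_size)),
    }

result = simulate_poll()
print(result)
```

Under these assumed numbers, the sample drawn from telephone subscribers points to a Landon majority although he has a minority in the simulated population, while the sample drawn at random from all voters stays close to the true share.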
One prerequisite for a representative sample is that the sampling
process is done randomly. An example may clarify this:
A gardener changed the method of cultivation of tulips. In order to
find out whether the new method was successful, she wanted to perform some
statistical tests. As the size of the population of tulips (= all available tulips)
was approx. 4000, she decided to draw a sample of 100 flowers to calculate
an estimate of the average length of the tulips grown with the new method.
How could she select 100 out of about 4000 flowers without distorting
the measurement by subjective influences? Note: sampling by personal "standards"
almost always causes errors for psychological reasons. Maybe she was
convinced of the new method, or rejected it for some reason. Even if she
tried to be objective, she could not rule out that she would manipulate
the sampling unconsciously.
A common method for creating representative samples is to use random
numbers for the selection of individual test objects:
-
Assign consecutive numbers to each object
-
Generate as many distinct random numbers as the size of the sample requires. If
you don't have a reliable random number generator, use random numbers from a table.
-
Pick the objects with the corresponding numbers.
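The three steps above can be sketched in Python. This is a minimal illustration, assuming the objects are numbered 1 to 4000 as in the tulip example; the function name `draw_sample_ids` and the fixed seed are illustrative choices, and the standard library generator stands in for whatever random number source is available.

```python
import random

def draw_sample_ids(population_size, sample_size, seed=None):
    """Return the numbers of the objects to pick, drawn without replacement."""
    rng = random.Random(seed)
    # Step 1: the objects carry the consecutive numbers 1..population_size
    object_numbers = range(1, population_size + 1)
    # Step 2: draw as many distinct random numbers as the sample requires
    chosen = rng.sample(object_numbers, sample_size)
    # Step 3: these are the objects to pick (sorted for easier collection)
    return sorted(chosen)

# Tulip example: 100 flowers out of about 4000
selected = draw_sample_ids(4000, 100, seed=1)
print(len(selected), selected[:5])
```

Drawing without replacement ensures that no flower is selected twice; fixing the seed merely makes the selection reproducible and can be omitted.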
Last Update: 2004-Jul-03