You are working with the text-only light edition of "H.Lohninger: Teach/Me Data Analysis, Springer-Verlag, Berlin-New York-Tokyo, 1999. ISBN 3-540-14743-8".

Outlier Tests
Basic Rules

Based on the standard deviation

If we assume a normal distribution, a single value may be considered an outlier if it lies further from the mean than a certain multiple of the standard deviation s. In many cases a factor of 2.5 is used, which means that approx. 99 % (more precisely 98.8 %) of the data belonging to a normal distribution fall inside this range:

mean +/- 2.5 s
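
The rule above can be sketched in a few lines of Python. This is only an illustration: the function name sd_outliers, the sample data, and the use of the sample standard deviation (divisor n - 1) are assumptions made for this example and are not taken from the book.

```python
import statistics

def sd_outliers(values, factor=2.5):
    """Return the values lying more than factor * s away from the mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (assumed, divisor n - 1)
    return [x for x in values if abs(x - mean) > factor * s]

# Hypothetical measurements: eleven values near 10 and one suspicious value.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9, 10.0, 14.0]
print(sd_outliers(data))  # only 14.0 lies outside mean +/- 2.5 s
```

Note that the suspicious value itself inflates both the mean and s, so with very few observations even an extreme value may stay inside the range.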

If the data values do not belong to a normal distribution, we have to be more careful in selecting the thresholds for outliers. According to Chebyshev's theorem, at least a fraction of 1 - 1/k^2 of the data of an arbitrary distribution (with finite variance) falls within +/- k standard deviations of the mean. We therefore have to use an interval of +/- 4 standard deviations to ensure that approx. 94 % of the data (at least 1 - 1/16 = 93.75 %) fall inside this interval. Please note that these basic tests require at least 10 observations (preferably 25 or more).
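
To compare the two situations numerically, the following minimal sketch computes the guaranteed coverage from Chebyshev's theorem next to the coverage of a normal distribution. The function names are chosen for this example only; the normal coverage is obtained from the error function as erf(k / sqrt(2)).

```python
import math

def chebyshev_lower_bound(k):
    """Minimum fraction of any finite-variance distribution within +/- k std. deviations."""
    return 1.0 - 1.0 / k**2

def normal_coverage(k):
    """Fraction of a normal distribution within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (2.5, 4.0):
    print(f"k = {k}: Chebyshev >= {chebyshev_lower_bound(k):.4f}, "
          f"normal = {normal_coverage(k):.4f}")
```

For k = 2.5 Chebyshev's theorem guarantees only 84 % coverage, which is why the wider interval of +/- 4 standard deviations is needed when the distribution is unknown.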

Based on the interquartile range

The above-mentioned strategies for identifying outliers are most appropriate for symmetric unimodal distributions. If a distribution is skewed, it is recommended to calculate the thresholds for outliers from the interquartile range, i.e. the distance between the upper and the lower quartile. A value xi is not regarded as an outlier if it satisfies:

x0.25 - 1.5 [x0.75 - x0.25] < xi < x0.75 + 1.5 [x0.75 - x0.25]
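
The inequality above can be turned into a short Python sketch. The function name iqr_outliers and the sample data are hypothetical. Several slightly different definitions of the quartiles x0.25 and x0.75 are in use; the "inclusive" method of statistics.quantiles (Python 3.8 or later) is chosen here as an assumption, so other software may give slightly different thresholds.

```python
import statistics

def iqr_outliers(values, factor=1.5):
    """Return the values that violate q1 - factor*IQR < x < q3 + factor*IQR."""
    # Quartile definition is an assumption; see the note on "inclusive" above.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - factor * iqr
    upper = q3 + factor * iqr
    return [x for x in values if not (lower < x < upper)]

# Hypothetical right-skewed sample with one extreme value.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
print(iqr_outliers(data))  # quartiles 2.25 and 4.0 give thresholds -0.375 and 6.625
```

Because the quartiles are hardly affected by a single extreme value, the thresholds remain stable even when the outlier is very large.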


Last Update: 2006-Jan-17