Teach/Me Data Analysis

You are working with the text-only light edition of "H.Lohninger: Teach/Me Data Analysis, Springer-Verlag, Berlin-New York-Tokyo, 1999. ISBN 3-540-14743-8".

See also: autocorrelation, Challenger disaster, correlation coefficient, covariance

Correlation

The aim of correlation analysis is to detect relationships among variables. Keep in mind that the term "correlation" usually means "linear correlation", which is measured by the correlation coefficient. The importance of correlation is sometimes overestimated, since people are often unaware of the restrictions on, and the implications of, a high correlation. On the other hand, a correlation coefficient of zero does not necessarily mean that there is no relationship between two variables, as the figure below shows: the left diagram shows an uncorrelated data set, whereas the right diagram shows a perfect parabolic relationship, yet the correlation coefficient is close to zero in both cases.
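
The parabolic case can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the helper pearson_r and the sample data are assumptions chosen for illustration. It computes the correlation coefficient for an exact parabola sampled symmetrically around zero and, for comparison, for an exact straight line.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: x is sampled symmetrically around zero.
xs = [i / 10 for i in range(-50, 51)]

# y depends on x exactly, but the dependence is parabolic, not linear.
ys_parabola = [x * x for x in xs]

# For comparison: an exact linear dependence.
ys_line = [2 * x + 1 for x in xs]

r_parabola = pearson_r(xs, ys_parabola)
r_line = pearson_r(xs, ys_line)

print(f"r for y = x^2:    {r_parabola:.6f}")
print(f"r for y = 2x + 1: {r_line:.6f}")
```

Up to floating-point rounding, the coefficient for the parabola is zero even though y is completely determined by x, while the straight line gives a coefficient of one. Note that the result for the parabola depends on the symmetric sampling: if x covered only positive values, the same parabola would yield a high coefficient.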

Conversely, a high correlation coefficient does not necessarily reflect a relationship that runs through the data as a whole (as in the left figure below); it may be caused by a single outlier located far away from the otherwise uncorrelated data samples (right figure below).
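
The effect of a single outlier can be sketched in the same way. The example below is an illustration under stated assumptions, not taken from the original text: the nine grid points, the outlier at (20, 20) and the helper pearson_r are all hypothetical choices.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical uncorrelated data: a 3 x 3 grid centered on the origin.
# Every x value is paired with every y value, so the covariance is zero.
grid = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
xs = [p[0] for p in grid]
ys = [p[1] for p in grid]
r_without = pearson_r(xs, ys)

# Add one outlier far away from the rest of the samples.
xs_out = xs + [20]
ys_out = ys + [20]
r_with = pearson_r(xs_out, ys_out)

print(f"r without outlier: {r_without:.4f}")
print(f"r with outlier:    {r_with:.4f}")  # about 0.98
```

The nine grid points alone give a coefficient of zero. Adding the single point at (20, 20) raises it to roughly 0.98, although the remaining nine samples are exactly as unrelated as before. Inspecting a scatter plot before interpreting the coefficient is a simple way to spot such cases.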

Further real-world examples of correlations can be found elsewhere.

Last Update: 2006-Jan-17