PCA - Different Forms
See also: PCA, eigenvectors, loadings and scores
The principal components may be calculated by eigenanalysis of one of
three different matrices:
- the scatter matrix,
- the covariance matrix, or
- the correlation matrix.
Which of these matrices is used for the PCA depends on the problem
at hand; quite often the best results are found by experimenting
with all three approaches. Generally speaking, the choice is
determined by whether the absolute numbers in the data are important
(scatter matrix) or the relationships between the variables are
important (correlation matrix). If a fixed offset in the variables
causes problems, one may use the covariance matrix.
Details about these matrices
can be obtained on a separate page.
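The following sketch, which is not part of Teach/Me, shows how the three forms of the eigenanalysis could be carried out with NumPy; the data matrix X (rows = objects, columns = variables) and the helper name pca_eigen are introduced here only for illustration.

import numpy as np

def pca_eigen(X, form="covariance"):
    """Eigenvalues and eigenvectors (loadings) of the selected matrix."""
    X = np.asarray(X, dtype=float)
    if form == "scatter":            # cross products about the origin: X'X
        M = X.T @ X
    elif form == "covariance":       # cross products about the column means
        M = np.cov(X, rowvar=False)
    elif form == "correlation":      # centered and scaled to unit variance
        M = np.corrcoef(X, rowvar=False)
    else:
        raise ValueError("form must be 'scatter', 'covariance' or 'correlation'")
    eigval, eigvec = np.linalg.eigh(M)        # eigenanalysis of a symmetric matrix
    order = np.argsort(eigval)[::-1]          # sort by decreasing eigenvalue
    return eigval[order], eigvec[:, order]

The scores are then obtained by projecting the data (raw, centered, or standardized, depending on the chosen matrix) onto these eigenvectors.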
In order to see the effects of the different scalings, take the data set WORLDPOP as an example; it contains some demographic data on all countries of the world (as of 1988). It is quite natural that the absolute numbers are important in this case, so start the PCA and look at the first two principal components obtained from each of the three matrices. For this data set, standardization prior to the PCA does not make sense and results in badly differentiated PC plots. However, keep in mind that the opposite may be true for other data sets.
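A comparison of this kind could be sketched as follows; the file name "worldpop.csv" and its layout are assumptions (the data would have to be exported from the program first), and pca_eigen is the helper defined above.

import numpy as np
import matplotlib.pyplot as plt

X = np.loadtxt("worldpop.csv", delimiter=",", skiprows=1)   # hypothetical export

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, form in zip(axes, ("scatter", "covariance", "correlation")):
    _, vec = pca_eigen(X, form)
    if form == "correlation":                  # scores of standardized data
        Z = (X - X.mean(axis=0)) / X.std(axis=0)
    elif form == "covariance":                 # scores of centered data
        Z = X - X.mean(axis=0)
    else:                                      # raw data for the scatter matrix
        Z = X
    scores = Z @ vec[:, :2]
    ax.scatter(scores[:, 0], scores[:, 1], s=10)
    ax.set_title(form)
    ax.set_xlabel("PC 1")
    ax.set_ylabel("PC 2")
plt.tight_layout()
plt.show()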
Another approach worth trying is the 3D rotational display of the
first three principal components: start the PCA, copy the scores
into the data matrix, and view the first three PCs with the command "3D Rotation".
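Outside the program, a comparable 3D view of the first three PC scores could be sketched with matplotlib, which here merely stands in for the "3D Rotation" command; X and pca_eigen are taken from the sketches above.

import matplotlib.pyplot as plt

_, vec = pca_eigen(X, "covariance")
scores = (X - X.mean(axis=0)) @ vec[:, :3]     # scores on the first three PCs

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(scores[:, 0], scores[:, 1], scores[:, 2], s=10)
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
ax.set_zlabel("PC 3")
plt.show()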
Last Update: 2006-Jan-17