You are working with the text-only light edition of "H.Lohninger: Teach/Me Data Analysis, Springer-Verlag, Berlin-New York-Tokyo, 1999. ISBN 3-540-14743-8".

Variable Selection

Sometimes a large number of independent variables, Xi, is available for a given modeling problem, and not all of these predictor variables contribute equally well to the explanation of the predicted variable Y. Some of the independent variables may not contribute to the model at all. Thus we have to select from these variables to obtain a model which contains as few variables as possible while still being the "best" model. In principle, all possible combinations of independent variables should be tried in order to find a suitable model; with p candidate variables there are 2^p - 1 non-empty subsets. This can turn out to be a formidable task, even if high-performance computers are available. Apart from the practicability of this approach, there are also several theoretical considerations which should be taken into account:

Depending on the type of model being used, there are several strategies to (partially) solve the problem:

Using all possible subsets of variables:

Stepwise procedures:
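
The two strategies named above can be sketched in a few lines of Python for the case of an ordinary least-squares model. This is only an illustration under stated assumptions, not the procedure used in the book: the selection criterion (adjusted R squared), the function names rss, adjusted_r2, best_subset and forward_stepwise, and the stopping rule of the stepwise search are all choices made for this example. The sketch also assumes more observations than variables (n > p + 1).

```python
from itertools import combinations

import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def adjusted_r2(X, y, cols):
    """Adjusted R squared: penalizes every additional variable (assumed criterion)."""
    n = len(y)
    tss = float(((y - y.mean()) ** 2).sum())
    return 1.0 - (rss(X, y, cols) / (n - len(cols) - 1)) / (tss / (n - 1))


def best_subset(X, y):
    """Try all 2^p - 1 non-empty subsets and return the one with the best criterion."""
    p = X.shape[1]
    subsets = (c for k in range(1, p + 1) for c in combinations(range(p), k))
    return max(subsets, key=lambda c: adjusted_r2(X, y, c))


def forward_stepwise(X, y):
    """Start with no variables; repeatedly add the one that improves the criterion most."""
    selected, remaining = [], list(range(X.shape[1]))
    best = -np.inf
    while remaining:
        score, j = max((adjusted_r2(X, y, selected + [j]), j) for j in remaining)
        if score <= best:  # no candidate improves the model any further: stop
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return sorted(selected)


if __name__ == "__main__":
    # Synthetic data (made up for this example): only columns 0 and 2 influence y.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 5))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=60)
    print("all subsets:     ", best_subset(X, y))
    print("forward stepwise:", forward_stepwise(X, y))
```

The exhaustive search fits 2^p - 1 models (more than a million for p = 20), whereas the forward search fits at most p(p + 1)/2. The price is that a stepwise search is greedy and is not guaranteed to find the subset that the exhaustive search would return.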

