You are working with the text-only light edition of "H. Lohninger: Teach/Me Data Analysis, Springer-Verlag, Berlin-New York-Tokyo, 1999. ISBN 3-540-14743-8".
Common Questions about ANNs
See also: ANN - Introduction, Interpolation and Extrapolation
The following paragraphs contain some commonly asked questions about ANNs.
When should neural networks be used?
In order to avoid solving simple tasks with complex
models, traditional statistical methods dealing with linear mappings should
be exploited first. To check whether non-linear methods provide better
results and reveal more information, experiments with simple standard versions
of neural networks should be conducted. Later, the model can be improved
by adding more layers, more units, or feedback loops.
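As a minimal sketch of the "linear methods first" advice, the following Python fragment fits an ordinary least-squares baseline and reports its held-out R^2 on two synthetic data sets. The data, the function names (fit_linear, predict_linear, r_squared) and the use of NumPy are illustrative assumptions, not part of the original text. A high score suggests the linear model already suffices; a low score is a hint that a non-linear model may be worth trying.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y = X w + b; returns the coefficients (bias last)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply the fitted linear model to new inputs."""
    return np.column_stack([X, np.ones(len(X))]) @ coef

def r_squared(y_true, y_pred):
    """Fraction of the variance explained by the predictions."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic example data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y_linear = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 0.1, 200)
y_curved = np.sin(2.0 * X[:, 0]) + rng.normal(0, 0.1, 200)

train, test = slice(0, 150), slice(150, 200)
for name, y in [("linear", y_linear), ("curved", y_curved)]:
    coef = fit_linear(X[train], y[train])
    score = r_squared(y[test], predict_linear(coef, X[test]))
    print(f"{name}: held-out R^2 of the linear baseline = {score:.2f}")
```

Only when the baseline clearly fails, as it does for the curved data here, is it worth moving on to a simple standard neural network.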
What about pre-processing the data?
Inexperienced users of neural networks tend to
overestimate the capabilities of ANNs. In practice, pre-processing is very
important and may be the key to success (see also the section about the
[...] of data space).
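The text does not say which pre-processing steps are meant. One common example, shown here purely as an assumption, is to standardize each input variable to zero mean and unit variance so that variables on very different scales contribute comparably. The function names and the sample values are hypothetical.

```python
import numpy as np

def fit_standardizer(X_train):
    """Compute per-column mean and standard deviation from the training data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0.0] = 1.0  # leave constant columns unscaled
    return mean, std

def standardize(X, mean, std):
    """Scale data using statistics taken from the training set only."""
    return (X - mean) / std

# Two input variables on very different scales (made-up values).
X_train = np.array([[1000.0, 0.01], [2000.0, 0.03], [3000.0, 0.02]])
X_new = np.array([[2500.0, 0.02]])

mean, std = fit_standardizer(X_train)
Z_train = standardize(X_train, mean, std)
Z_new = standardize(X_new, mean, std)  # new data reuses the training statistics
```

The scaling parameters are estimated on the training set and then reused for new data, so that the network sees inputs on the same scale during training and application.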
Is it true that neural networks can approximate almost any function?
Even though it can be proven that certain types
of models can actually approximate almost any given function, this does
not automatically imply that the implemented neural network can always
be trained appropriately. Typically, neural networks get caught in local
minima during the training process, so they may not come close to the [...]
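The effect of local minima can be illustrated with a toy example. The sketch below runs plain gradient descent on a one-dimensional function with two minima; the function, learning rate and starting points are assumptions chosen for illustration and stand in for the much higher-dimensional error surface of a real network.

```python
def loss(x):
    """Toy error function with a global minimum near x = -1.30
    and a shallower local minimum near x = 1.13."""
    return x**4 - 3 * x**2 + x

def grad(x):
    """Derivative of the toy error function."""
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=500):
    """Follow the negative gradient from the starting point x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# The minimum that is reached depends on where the search starts.
starts = [-2.0, -0.5, 0.5, 2.0]
results = [(x0, gradient_descent(x0)) for x0 in starts]
for x0, x_end in results:
    print(f"start {x0:+.1f} -> x = {x_end:+.3f}, loss = {loss(x_end):+.3f}")

# Keeping the best of several runs is one common remedy (an assumption,
# not something the text prescribes).
best_start, best_x = min(results, key=lambda r: loss(r[1]))
```

Runs started on the right-hand side settle in the shallower minimum and never find the better solution on the left, which is the one-dimensional analogue of a network that trains to a poor approximation.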
How many data points are necessary to train a neural network?
Since neural networks are trained through examples, large data sets are required. Before starting the experiments, try to collect as many examples as possible. Models with many degrees of freedom (e.g. many connections in the network) in particular require a large number of examples. There are heuristics for estimating the maximum number of degrees of freedom for a given number of examples, or the minimum number of examples required for a given number of degrees of freedom. However, these criteria are hardly ever met in practice. When the available data set is not large enough, the results are not reliable.
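To relate the number of examples to the degrees of freedom, it helps to count the weights of the network. The sketch below assumes a fully connected network with a single hidden layer and, optionally, one bias weight per hidden and output unit; the function names and the layer sizes are hypothetical, and no particular threshold for the ratio is implied by the text.

```python
def count_weights(n_inputs, n_hidden, n_outputs, bias=True):
    """Number of adjustable weights in a fully connected network
    with one hidden layer (assumed architecture)."""
    extra = 1 if bias else 0
    return (n_inputs + extra) * n_hidden + (n_hidden + extra) * n_outputs

def examples_per_weight(n_examples, n_inputs, n_hidden, n_outputs):
    """Ratio of training examples to degrees of freedom."""
    return n_examples / count_weights(n_inputs, n_hidden, n_outputs)

# 10 inputs, 5 hidden units, 1 output: (10 + 1) * 5 + (5 + 1) * 1 = 61 weights
n_weights = count_weights(10, 5, 1)
ratio = examples_per_weight(200, 10, 5, 1)
print(f"{n_weights} weights, {ratio:.1f} examples per weight")
```

A ratio close to or below one means there are about as many free parameters as examples, which is the situation in which the results are least reliable.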
Is the number of examples per class important?
The relative number of examples per class influences the resulting network. The more often a type of pattern is presented, the better it is learned. If you want all classes to be learned equally well, use the same number of examples per class. If the numbers of examples per class are similar, you can usually ignore this issue. But if 90% of the examples belong to one class, the network may not learn to recognize the other class, because only 10% of the examples belong to it.
Warning: If you change
the number of examples per class, the performance of the net may not be
adequate for the original problem. For example, if the distribution of
classes in the natural environment differs from that in the training set,
you have to test the trained network with an independent test set taken
from the natural environment to find out how it performs there.
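One simple way to obtain the same number of examples per class is to draw a random subset of each class with the size of the smallest class. The sketch below shows this under the assumption that discarding examples is acceptable; the function name, the 90/10 split and the placeholder examples are made up for illustration.

```python
import random
from collections import Counter

def balance_by_undersampling(examples, labels, seed=0):
    """Return (example, label) pairs with equally many examples per class,
    obtained by randomly sampling each class down to the smallest one."""
    rng = random.Random(seed)
    by_class = {}
    for x, label in zip(examples, labels):
        by_class.setdefault(label, []).append(x)
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for x in rng.sample(items, n_min):
            balanced.append((x, label))
    rng.shuffle(balanced)  # avoid presenting the classes in blocks
    return balanced

# 90% of the examples belong to class "a", 10% to class "b".
labels = ["a"] * 90 + ["b"] * 10
examples = list(range(100))  # placeholders for real feature vectors

balanced = balance_by_undersampling(examples, labels)
print(Counter(labels))
print(Counter(label for _, label in balanced))
```

Undersampling throws data away, which conflicts with the need for large data sets, so it is a trade-off. As the warning above states, the balanced set should be used for training only; the test set should keep the class distribution of the natural environment.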
What about special cases in the training data set?
Usually, the neural network does not learn to treat the special cases correctly, because they are not presented to the network often enough. The neural network takes the statistical distribution of the data into account, and tends to neglect outliers. Here are a few tips on how to handle special cases:
Can ANNs be re-trained with new examples?
Whether this is possible depends on the neural
network model. The multi-layer perceptron, for example, is not well suited
to re-training; starting from scratch is usually faster and provides better
results. Other models may allow additional examples to be integrated later.
Can a neural network handle several tasks at once?
Whether this is reasonable depends on the tasks.
If the tasks are closely related, this can improve the performance, because
the weights leading to the hidden units pre-structure the task appropriately.
However, tasks which are too different usually interfere. In general,
use separate neural networks with single output units for each task. This
provides a better overview, and allows smaller networks to be used.
How many hidden units should be used?
The hidden units pre-structure the inputs so that
they are useful for solving the task. You should try to use as few hidden
units as possible. When there are many hidden units, the network tends
to adapt too well to the training set. Thus, it is less suited to generalizing.
The removal of a single hidden unit considerably reduces the size of the
network (and thus the number of degrees of freedom), because a single hidden
unit is connected with all the input units and all the output units.
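As a rough count, assume a fully connected network with one hidden layer of h units, n_in input units and n_out output units, plus one bias weight per hidden and output unit (the bias terms are an assumption; the text only mentions the connections to the input and output units). The number of weights W and the saving from removing one hidden unit are then:

```latex
W(h) = (n_{\mathrm{in}} + 1)\,h + (h + 1)\,n_{\mathrm{out}},
\qquad
W(h) - W(h-1) = n_{\mathrm{in}} + n_{\mathrm{out}} + 1
```

For a network with 10 inputs and 1 output, for instance, each hidden unit removed saves 12 weights under these assumptions.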
Last Update: 2006-Jan-17