L. Parra et al., "Statistical Independence and Novelty Detection with Information Preserving Nonlinear Maps," Neural Computation, 8(2), 1996, pp. 260-269
According to Barlow (1989), feature extraction can be understood as finding a statistically independent representation of the probability distribution underlying the measured signals. The search for a statistically independent representation can be formulated as the criterion of minimal mutual information, which reduces to decorrelation in the case of gaussian distributions. If nongaussian distributions are to be considered, minimal mutual information is the appropriate generalization of the decorrelation used in linear Principal Component Analysis (PCA).
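To spell out the gaussian special case (the notation below is ours, not the abstract's): the mutual information of the output coordinates is the difference between the sum of the marginal entropies and the joint entropy,

    I(y_1, \ldots, y_n) = \sum_{i=1}^{n} H(y_i) - H(y_1, \ldots, y_n),

and for a zero-mean gaussian y with covariance \Sigma this becomes

    I = \frac{1}{2} \log \frac{\prod_{i} \Sigma_{ii}}{\det \Sigma},

which vanishes exactly when \Sigma is diagonal, so that minimizing mutual information reduces to decorrelating the outputs.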
We also generalize to nonlinear transformations by demanding only perfect transmission of information. This leads to a general class of nonlinear transformations, namely symplectic maps. Conservation of information allows us to consider only the statistics of single coordinates.
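To illustrate why such maps transmit information perfectly (a sketch in our own notation; the paper's specific parametrization of the symplectic maps may differ): a symplectic map y = f(x) has unit Jacobian determinant, |\det \partial y / \partial x| = 1, so the densities before and after the transformation satisfy

    p_y(y) = p_x(x) \, / \, |\det \partial y / \partial x| = p_x(x),

and the joint entropy, hence the information content, is preserved. A simple volume-preserving example in two coordinates is the shear y_1 = x_1, y_2 = x_2 + g(x_1), whose Jacobian is triangular with unit diagonal for any smooth g.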
The resulting factorial representation of the joint probability distribution yields a density estimate. We apply this concept to the real-world problem of electrical motor fault detection, treated as a novelty detection task.
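A hedged sketch of how the factorial density estimate supports novelty detection (the marginal estimators and the threshold below are illustrative assumptions, not details given in the abstract): because the map is volume preserving, the joint density of the measurements factorizes through the transformed coordinates,

    \hat{p}(x) = \prod_{i=1}^{n} \hat{p}_i\bigl(y_i(x)\bigr),

with each marginal \hat{p}_i estimated from fault-free training data. A new measurement x can then be flagged as novel, e.g. as a possible motor fault, when its negative log-likelihood exceeds a threshold \theta chosen on the training data:

    -\log \hat{p}(x) = -\sum_{i} \log \hat{p}_i\bigl(y_i(x)\bigr) > \theta.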