REDUCTION OF THE SIZE OF THE LEARNING DATA IN A PROBABILISTIC NEURAL-NETWORK BY HIERARCHICAL-CLUSTERING - APPLICATION TO THE DISCRIMINATION OF SEEDS BY ARTIFICIAL VISION

Citation
Y. Chtioui et al., REDUCTION OF THE SIZE OF THE LEARNING DATA IN A PROBABILISTIC NEURAL-NETWORK BY HIERARCHICAL-CLUSTERING - APPLICATION TO THE DISCRIMINATION OF SEEDS BY ARTIFICIAL VISION, Chemometrics and intelligent laboratory systems, 35(2), 1996, pp. 175-186
Citations number
21
Subject Categories
Computer Application, Chemistry & Engineering; Instrument & Instrumentation; Chemistry Analytical; Computer Science Artificial Intelligence; Robotics & Automatic Control
ISSN journal
0169-7439
Volume
35
Issue
2
Year of publication
1996
Pages
175 - 186
Database
ISI
SICI code
0169-7439(1996)35:2<175:ROTSOT>2.0.ZU;2-9
Abstract
The control of seed batches is necessary before their commercialization. In the present work, we attempted to apply computer vision to this goal. A pattern recognition system formed by a color image analysis device combined with a neural network classifier was tested on a practical problem which consisted of the discrimination between 4 seed species (2 cultivated and 2 adventitious seed species). A probabilistic neural network (PNN) was used as a classifier. PNN has many advantages, but it requires the storage of all the learning patterns. The main goal of this work was the reduction of the learning data in order to decrease the memory and time requirements of this kind of network. This was achieved by reducing the number of both features and learning patterns. Principal component analysis (PCA) was used for feature extraction. A small number of relevant components were selected as inputs for the PNN. A further data reduction was performed by a hierarchical clustering technique based on reciprocal neighbors (RN). The effects of reducing the training set size on the classification performances of the PNN were tested. From color images of seeds, seventy-three features (including size, shape, and textural features) were measured. By considering the sum of their eigenvalues, the first 4 principal components were selected. The training set size was then reduced by RN from 1600 patterns to 1176 patterns after one iteration, and to 543 after 5 iterations. Without any reduction of the training set, PNN correctly classified 93.0% and 91.9% of the training and the test sets, respectively. After 5 reductions, the classification results were 91.9% and 89.1% for the training and the test sets. The classification results decreased only slightly after 5 reductions of the training set.
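The PCA-plus-PNN pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the smoothing parameter `sigma` and the function names are assumptions, and a standard Gaussian Parzen-window PNN is used.

```python
import numpy as np

def pca_reduce(X, n_components=4):
    """Project patterns onto the leading principal components
    (eigenvectors of the covariance matrix, sorted by decreasing
    eigenvalue), as in the feature-extraction step of the paper."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigval)[::-1][:n_components]
    return Xc @ eigvec[:, order]

def pnn_classify(X_train, y_train, X_test, sigma=0.5):
    """Minimal probabilistic neural network (Parzen-window classifier).
    Every stored training pattern contributes a Gaussian kernel; a test
    pattern is assigned to the class with the largest mean kernel
    response. sigma is an assumed smoothing width, not a value from
    the paper."""
    classes = np.unique(y_train)
    scores = np.empty((len(X_test), len(classes)))
    for j, c in enumerate(classes):
        Xc = X_train[y_train == c]                # patterns of class c
        d2 = ((X_test[:, None, :] - Xc[None, :, :]) ** 2).sum(-1)
        scores[:, j] = np.exp(-d2 / (2 * sigma**2)).mean(axis=1)
    return classes[scores.argmax(axis=1)]
```

Note that the PNN stores every training pattern, which is exactly why the paper reduces the training set: both memory use and classification time grow linearly with the number of stored patterns.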
It was concluded from simulations that the beneficial effect of reductions is only valid when a few reductions have been performed, because the classification performances notably decreased when many iterations of RN were applied. The combination of PCA and RN (5 iterations) made it possible to reduce the learning data to 1.85% of the initial available data, with only a slight decrease in classification performance.
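One iteration of the reciprocal-neighbors reduction can be sketched as below: any two patterns that are each other's nearest neighbor are merged into their centroid, shrinking the training set while roughly preserving its distribution. This is a sketch of the general RN idea only; the paper's exact merging rule (e.g. whether merging is restricted to same-class pairs, as assumed here) may differ.

```python
import numpy as np

def rn_reduce(X, y):
    """One reciprocal-neighbors (RN) pass: two same-class patterns that
    are mutual nearest neighbors are replaced by their centroid; all
    other patterns are kept unchanged. Repeated passes shrink the
    training set further, as in the paper's 5-iteration experiment."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # a pattern is not its own neighbor
    nn = d2.argmin(axis=1)                # nearest neighbor of each pattern
    new_X, new_y, merged = [], [], set()
    for i in range(len(X)):
        if i in merged:
            continue
        j = nn[i]
        if nn[j] == i and y[i] == y[j] and j not in merged:
            new_X.append((X[i] + X[j]) / 2)   # replace the RN pair by its centroid
            new_y.append(y[i])
            merged.update((i, j))
        else:
            new_X.append(X[i])
            new_y.append(y[i])
    return np.array(new_X), np.array(new_y)
```

Because at most half the patterns can pair up in one pass, each iteration removes a diminishing fraction of the set, which is consistent with the reported drop from 1600 to 1176 patterns after one iteration and to 543 after five.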