L. Al-haddad et al., Training radial basis function neural networks: effects of training set size and imbalanced training sets, J MICROB M, 43(1), 2000, pp. 33-44
Obtaining training data for constructing artificial neural networks (ANNs)
to identify microbiological taxa is not always easy. Often, only small data
sets with different numbers of observations per taxon are available. Here,
the effect of both the size of the training data set and an imbalanced number
of training patterns for different taxa is investigated using radial basis
function ANNs to identify up to 60 species of marine microalgae. The best
networks trained to discriminate 20, 40 and 60 species respectively gave
overall correct identification rates of 92, 84 and 77%. From 100 to 200
patterns per species were sufficient in networks trained to discriminate 20,
40 or 60 species. For the 40 and 60 species data sets, an imbalance in the
number of training patterns per species always affected training success: the
greater the imbalance, the greater the effect. However, this could be largely
compensated for by adjusting the networks using a posteriori probabilities,
estimated as network output values. (C) 2000 Elsevier Science B.V. All
rights reserved.
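
The abstract does not spell out how the networks were adjusted. As an illustration only, the sketch below shows one common way to compensate for an imbalanced training set: read the network outputs as a posteriori probability estimates, divide each by its class's share of the training patterns, and renormalise. The function names `rebalance_outputs` and `predict`, and the example numbers, are hypothetical and are not taken from the paper.

```python
def rebalance_outputs(outputs, train_counts):
    """Rescale per-class network outputs to offset training-set imbalance.

    Assumption (not from the paper): outputs approximate posterior
    probabilities under the training-set class proportions, and the
    classes are treated as equally likely at identification time.
    """
    if len(outputs) != len(train_counts):
        raise ValueError("one training count is needed per output")
    if any(c <= 0 for c in train_counts):
        raise ValueError("training counts must be positive")

    total = sum(train_counts)
    # Share of the training patterns contributed by each class.
    priors = [c / total for c in train_counts]
    # Dividing by the training prior removes the bias towards
    # classes that had many training patterns.
    scaled = [o / p for o, p in zip(outputs, priors)]
    norm = sum(scaled)
    if norm == 0:
        # No class has any support; fall back to a uniform answer.
        return [1.0 / len(outputs)] * len(outputs)
    return [s / norm for s in scaled]


def predict(outputs, train_counts):
    """Return the index of the class with the largest adjusted output."""
    adjusted = rebalance_outputs(outputs, train_counts)
    return max(range(len(adjusted)), key=adjusted.__getitem__)


if __name__ == "__main__":
    # Two species, trained on 200 and 50 patterns respectively.
    raw = [0.55, 0.45]
    counts = [200, 50]
    print(rebalance_outputs(raw, counts))
    print(predict(raw, counts))
```

With these example numbers the raw outputs favour the first species, while the adjusted outputs favour the second, which had far fewer training patterns. When every species has the same number of training patterns, the adjustment leaves the normalised outputs unchanged.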