The average probability of error is used to demonstrate the performance of
a Bayesian classification test, referred to as the Combined Bayes Test
(CBT), when the training data of each class are mislabeled. The CBT
combines the information in discrete training and test data to infer
symbol probabilities, where a uniform Dirichlet prior (i.e., a
noninformative prior of complete ignorance) is assumed for all classes.
Using the CBT, classification performance is shown to degrade when
mislabeling exists in the training data, with a severity that depends on
the mislabeling probabilities. It is further shown that as the mislabeling
probabilities increase, M*, the best quantization fineness related to the
Hughes phenomenon of pattern recognition, also increases. Notably, even
when the actual mislabeling probabilities are known by the CBT, it is not
possible to achieve the classification performance obtainable without
mislabeling. However, the negative effect of mislabeling can be
diminished, with more success for smaller mislabeling probabilities, if a
data reduction method called the Bayesian Data Reduction Algorithm (BDRA)
is applied to the training data.
Published by Elsevier Science Ltd.
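
As a rough illustration of how discrete training and test counts can be combined under a uniform Dirichlet prior, the following minimal sketch scores a test sample by its Dirichlet-multinomial predictive probability given each class's training counts and picks the larger score. It is an assumption-laden sketch and not necessarily the exact form of the CBT: the function names `log_predictive` and `classify`, the class labels, and the example counts are hypothetical, and neither mislabeling nor the BDRA is modeled.

```python
from math import lgamma


def log_predictive(train_counts, test_counts):
    """Log probability of test_counts given train_counts for one class.

    Assumes a uniform Dirichlet prior (all parameters equal to 1) over the
    M discrete symbols, which gives a Dirichlet-multinomial predictive.
    """
    m = len(train_counts)
    n_train = sum(train_counts)
    n_test = sum(test_counts)
    # log[ N_y! (N_k + M - 1)! / (N_k + N_y + M - 1)! ]
    out = lgamma(n_test + 1) + lgamma(n_train + m) - lgamma(n_train + n_test + m)
    for x, y in zip(train_counts, test_counts):
        # log[ (x_i + y_i)! / (x_i! y_i!) ]
        out += lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
    return out


def classify(train_by_class, test_counts):
    """Return the label whose training counts best predict the test counts."""
    scores = {
        label: log_predictive(counts, test_counts)
        for label, counts in train_by_class.items()
    }
    return max(scores, key=scores.get)


# Hypothetical symbol counts for two classes over M = 3 symbols.
train = {"k": [8, 1, 1], "l": [1, 1, 8]}
print(classify(train, [3, 0, 0]))
```

Working in log space with `lgamma` avoids overflow in the factorials when the symbol counts are large.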