E. L. Eisenstein and F. Alemi, "An Evaluation of Factors Influencing Bayesian Learning Systems," Journal of the American Medical Informatics Association, 1(3), 1994, pp. 272-284
Citations: 34
Subject Categories: Information Science & Library Science; Medicine, Miscellaneous; Computer Science, Information Systems
Objectives: To examine the influences of situational and model factors on the accuracy of Bayesian learning systems.

Design: This study examines the impacts of variations in two situational factors, training sample size and number of attributes, and in two model factors, choice of Bayesian model and criteria for excluding model attributes, on the overall accuracy of Bayesian learning systems.

Measurements: The test data were derived from myocardial infarction patients who were admitted to eight hospitals in New Orleans during 1985. The test sample consisted of 339 cases; the training samples included 100, 400, and 800 cases. APACHE II variables were used as the model attributes and patient discharge status as the predicted outcome. Attribute sets were selected in sizes of 4, 8, and 12. The authors varied the Bayesian models (proper and simple) and the attribute exclusion criteria (optimism and pessimism).

Results: The simple Bayes model, which assumes conditional independence, consistently equalled or outperformed the proper (maximally dependent) Bayes model, which assumes conditional dependence, across all training sample and attribute set sizes. Not excluding model attributes was preferable to using sample theory as an attribute exclusion criterion in both the simple and the proper models.

Conclusion: In the domain tested, the simple Bayes model with optimistic exclusion was more robust than previously assumed, and increasing the number of attributes in a model had a greater relative impact on model accuracy than did increasing the number of training sample cases. Assessment of the applicability of these findings to other domains will require further study. In addition, other models that lie between these two extremes must be investigated. These include models that approximate proper Bayes' conditional-dependence computations while requiring fewer training sample cases, attribute exclusion criteria between optimism and pessimism that improve accuracy, and ordering techniques for introducing attributes into Bayes models that optimize the information value associated with the attributes in test-sample cases.