AN EVALUATION OF FACTORS INFLUENCING BAYESIAN LEARNING-SYSTEMS

Citation
E.L. Eisenstein and F. Alemi, AN EVALUATION OF FACTORS INFLUENCING BAYESIAN LEARNING-SYSTEMS, Journal of the American Medical Informatics Association, 1(3), 1994, pp. 272-284
Number of citations
34
Subject Categories
Information Science & Library Science; Medicine Miscellaneous; Computer Science Information Systems
Journal ISSN
1067-5027
Volume
1
Issue
3
Year of publication
1994
Pages
272 - 284
Database
ISI
SICI code
1067-5027(1994)1:3<272:AEOFIB>2.0.ZU;2-Q
Abstract
Objectives: To examine the influences of situational and model factors on the accuracy of Bayesian learning systems.
Design: This study examines the impacts of variations in two situational factors, training sample size and number of attributes, and in two model factors, choice of Bayesian model and criteria for excluding model attributes, on the overall accuracy of Bayesian learning systems.
Measurements: The test data were derived from myocardial infarction patients who were admitted to eight hospitals in New Orleans during 1985. The test sample consisted of 339 cases; the training samples included 100, 400, and 800 cases. APACHE II variables were used for the model attributes and patient discharge status as the outcome predicted. Attribute sets were selected in sizes of 4, 8, and 12. The authors varied the Bayesian models (proper and simple) and the attribute exclusion criteria (optimism and pessimism).
Results: The simple Bayes model, which assumes conditional independence, consistently equalled or outperformed the proper (maximally dependent) Bayes model, which assumes conditional dependence, across all training sample and attribute set sizes. Not excluding model attributes was found to be preferable to using sample theory as an attribute exclusion criterion in both the simple and the proper models.
Conclusion: In the domain tested, the simple Bayes model with optimistic exclusion is more robust than previously assumed, and increasing the number of attributes in a model had a greater relative impact on model accuracy than did increasing the number of training sample cases. Assessment of applicability of these findings to other domains will require further study. In addition, other models that are between these two extremes must be investigated. These include models that approximate proper Bayes' conditional dependence computations while requiring fewer training sample cases, attribute exclusion criteria between optimism and pessimism that improve accuracy, and ordering techniques for introducing attributes into Bayes models that optimize the information value associated with the attributes in test-sample cases.
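
Illustrative sketch (not from the paper): the abstract contrasts a simple Bayes model, which assumes conditional independence among attributes, with a proper Bayes model, which estimates the likelihood of the full attribute combination. The Python below is a minimal sketch of that contrast under stated assumptions. The function names, the additive smoothing constant `alpha`, the restriction to discrete attributes, and the four-case toy data set are all hypothetical and are not taken from the study; the optimism and pessimism exclusion criteria are not modeled here.

```python
from collections import Counter, defaultdict

def train_simple(rows, labels):
    """Simple Bayes: count each attribute value separately within each class."""
    class_counts = Counter(labels)
    attr_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    for row, y in zip(rows, labels):
        for j, value in enumerate(row):
            attr_counts[(y, j)][value] += 1
    return class_counts, attr_counts

def predict_simple(model, row, n_values=2, alpha=1.0):
    """Posterior over classes, multiplying per-attribute likelihoods
    (the conditional independence assumption). alpha is an assumed
    additive smoothing constant, not a value from the paper."""
    class_counts, attr_counts = model
    total = sum(class_counts.values())
    scores = {}
    for y, n_y in class_counts.items():
        p = n_y / total  # class prior
        for j, value in enumerate(row):
            p *= (attr_counts[(y, j)][value] + alpha) / (n_y + alpha * n_values)
        scores[y] = p
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}

def train_proper(rows, labels):
    """Proper Bayes: count whole attribute combinations within each class."""
    class_counts = Counter(labels)
    joint_counts = Counter((y, tuple(row)) for row, y in zip(rows, labels))
    return class_counts, joint_counts

def predict_proper(model, row, n_values=2, alpha=1.0):
    """Posterior over classes using the likelihood of the full attribute
    vector, so no independence is assumed. The number of cells grows as
    n_values ** len(row), which is why this model needs more training cases."""
    class_counts, joint_counts = model
    total = sum(class_counts.values())
    n_cells = n_values ** len(row)
    scores = {}
    for y, n_y in class_counts.items():
        likelihood = (joint_counts[(y, tuple(row))] + alpha) / (n_y + alpha * n_cells)
        scores[y] = (n_y / total) * likelihood
    z = sum(scores.values())
    return {y: s / z for y, s in scores.items()}

# Hypothetical toy data: two binary attributes, two discharge outcomes.
rows = [(1, 1), (1, 0), (0, 1), (0, 0)]
labels = ["died", "died", "survived", "survived"]

simple_posterior = predict_simple(train_simple(rows, labels), (1, 1))
proper_posterior = predict_proper(train_proper(rows, labels), (1, 1))
print(simple_posterior)
print(proper_posterior)
```

With only a handful of cases, the proper model's per-combination counts are sparse, which is one plausible reading of why the abstract reports the simple model holding up well at small training sample sizes.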