Data mining techniques applied to medical information

Authors

Lee, IN; Liao, SC; Embrechts, M

Citation

I.N. Lee et al., Data mining techniques applied to medical information, MED INF IN, 25(2), 2000, pp. 81-102

Subject categories

Research/Laboratory Medicine & Medical Technology

Journal title

MEDICAL INFORMATICS AND THE INTERNET IN MEDICINE

ISSN journal

1463-9238

Volume

25

Issue

2

Year of publication

2000

Pages

81 - 102

Database

ISI

SICI code

1463-9238(200004/06)25:2<81:DMTATM>2.0.ZU;2-M

Abstract

Knowledge discovery from the dramatically increased data of an auto-stored medical information system is still in its infancy. The purpose of this study is to use widely available and easily operated techniques that can satisfy general users in extracting specific knowledge to make the medical information system more functional. Data mining techniques, including data visualisation, correlation analysis, discriminant analysis, and neural networks supervised classification, were applied to heart disease databases. These techniques can help to identify high risk patients, define the most important factors (variables) in heart disease, and build a multivariate relationship model to show the relationship between any two variables in a way that such relationships are easy to view. Simple visualisation techniques were utilised to construct this model, which corresponds with current medical knowledge. Two nonparametric (distribution assumption free) classification tools were employed to identify high risk heart disease patients. Both the neural networks supervised classification methods and the discriminant analysis method produced reliable classification rates for heart disease patients. However, neural networks yielded a higher percentage of correct classifications (averaging 89%) than discriminant analysis (79%). Data visualisation and correlation analysis resulted in similar conclusions regarding the most important factors in heart disease. These data mining tools provide simple and effective methods of extracting knowledge from general medical information. The treatment of missing data is also discussed.
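
The correlation analysis step named in the abstract can be illustrated with a minimal sketch. It ranks candidate variables by the absolute value of their Pearson correlation with a binary disease outcome. The variable names, the toy values, and the pairwise deletion of missing entries are assumptions made for illustration; none of them is taken from the paper, whose actual data and missing-data treatment are not given in this record.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable carries no linear information
    return cov / sqrt(var_x * var_y)


def rank_factors(records, outcome):
    """Rank variables by |correlation| with the outcome, strongest first.

    Missing values (None) are handled by pairwise deletion, which is an
    assumption of this sketch and not necessarily the paper's approach.
    """
    ranking = []
    for name, values in records.items():
        pairs = [(v, o) for v, o in zip(values, outcome) if v is not None]
        xs = [v for v, _ in pairs]
        ys = [o for _, o in pairs]
        ranking.append((name, pearson(xs, ys)))
    ranking.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranking


# Hypothetical toy data, invented for illustration only (not from the paper).
TOY_RECORDS = {
    "age": [45, 62, 51, 70, 38, 66],
    "max_heart_rate": [172, 140, 160, None, 180, 132],
    "resting_bp": [120, 130, 130, 120, 130, 120],
}
TOY_OUTCOME = [0, 1, 0, 1, 0, 1]  # 1 = heart disease present

if __name__ == "__main__":
    for name, r in rank_factors(TOY_RECORDS, TOY_OUTCOME):
        print(f"{name}: r = {r:+.3f}")
```

A ranking of this kind only captures linear, one-variable-at-a-time association, which is why the abstract pairs it with visualisation and with multivariate classifiers.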