This study examined the characteristics of knowledge discovery and data
mining algorithms to demonstrate how they can be used to predict health
outcomes and provide policy information for hypertension management using
the Korea Medical Insurance Corporation database. Specifically, this study
validated the predictive power of data mining algorithms by comparing the
performance of logistic regression and two decision tree algorithms, CHAID
(Chi-squared Automatic Interaction Detection) and C5.0 (a variant of C4.5),
using a training set of 13,689 beneficiaries and a test set of 4,588
beneficiaries. Contrary to the previous study, the CHAID algorithm performed
better than logistic regression in predicting hypertension, and C5.0 had the
lowest predictive power. In addition, the CHAID algorithm and association
rules also provided segment-specific information on risk factors and target
groups that may be used in policy analysis for hypertension management.
(C) 2001 Elsevier Science Ireland Ltd. All rights reserved.
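
As an illustration of the chi-squared criterion named in the algorithm's expansion (Chi-squared Automatic Interaction Detection), the following is a minimal sketch of how a Pearson chi-squared statistic can rank candidate predictors against a binary hypertension outcome. It is not the study's implementation: the predictor names and all counts are hypothetical, and the sketch compares raw statistics only, which is a simplification that is reasonable here because both tables have the same dimensions. A full implementation would also merge categories and compare adjusted p-values.

```python
def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table (list of rows).

    Assumes every expected cell count is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of predictor and outcome.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def best_split(candidates):
    """Return the predictor whose table has the largest chi-squared statistic."""
    return max(candidates, key=lambda name: chi_squared(candidates[name]))


# Hypothetical counts, NOT from the study.
# Rows: predictor category; columns: (hypertensive, not hypertensive).
candidates = {
    "age_group": [[30, 70], [60, 40]],
    "sex": [[45, 55], [45, 55]],
}

print(best_split(candidates))
```

With these made-up counts, the outcome rates differ across the two age groups but are identical across the two sexes, so the age predictor is selected as the first split.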