Predicting cesarean delivery with decision tree models

Citation
Cj. Sims et al., Predicting cesarean delivery with decision tree models, AM J OBST G, 183(5), 2000, pp. 1198-1206
Number of citations
14
Subject categories
Reproductive Medicine (to be verified)
Journal title
AMERICAN JOURNAL OF OBSTETRICS AND GYNECOLOGY
ISSN journal
0002-9378
Volume
183
Issue
5
Year of publication
2000
Pages
1198 - 1206
Database
ISI
SICI code
0002-9378(200011)183:5<1198:PCDWDT>2.0.ZU;2-G
Abstract
OBJECTIVE: The purpose of this study was to determine whether decision tree-based methods can be used to predict cesarean delivery.
STUDY DESIGN: This was a historical cohort study of women delivered of live-born singleton neonates in 1995 through 1997 (22,157). The frequency of cesarean delivery was 17%; 78 variables were used for analysis. Decision tree rule-based methods and logistic regression models were each applied to the same 50% of the sample to develop the predictive training models, and these models were tested on the remaining 50%.
RESULTS: Decision tree receiver operating characteristic curve areas were as follows: nulliparous, 0.82; parous, 0.93. Logistic receiver operating characteristic curve areas were as follows: nulliparous, 0.86; parous, 0.93. Decision tree methods and logistic regression methods used similar predictive variables; however, logistic methods required more variables and yielded less intelligible models. Among the 6 decision tree building methods tested, the strict minimum message length criterion yielded decision trees that were small yet accurate. Risk factor variables were identified in 676 nulliparous cesarean deliveries (69%) and 419 parous cesarean deliveries (47.6%).
CONCLUSION: Decision tree models can be used to predict cesarean delivery. Models built with strict minimum message length decision trees have the following attributes: Their performance is comparable to that of logistic regression; they are small enough to be intelligible to physicians; they reveal causal dependencies among variables not detected by logistic regression; they can handle missing values more easily than can logistic methods; they predict cesarean deliveries that lack a categorized risk factor variable.
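The evaluation protocol in the abstract (develop on one half of the sample, test on the other half, compare models by receiver operating characteristic curve area) can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_half` and `roc_auc`, the random seed, and the toy scores and labels are all assumptions made for illustration, and the area is computed with the standard pairwise-ranking formulation rather than any method named in the record.

```python
import random


def split_half(records, seed=0):
    """Shuffle a copy of `records` and return (training half, test half).

    Hypothetical helper: the abstract only says 50% was used for
    development and 50% for testing, not how the split was drawn.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def roc_auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = cesarean, 0 = not).

    Computed as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (invented): predicted risk scores and observed outcomes.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)  # 8 of 9 pairs ranked correctly

train, test = split_half(range(10))
```

In this reading, an area of 0.93 means that in about 93% of pairs consisting of one cesarean and one non-cesarean delivery, the model assigns the higher score to the cesarean case. The pairwise loop is quadratic in sample size, so a rank-based implementation would be preferable for a cohort of this size.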