Coding neuroradiology reports for the Northern Manhattan Stroke Study: A comparison of natural language processing and manual review

Citation
J.S. Elkins et al., Coding neuroradiology reports for the Northern Manhattan Stroke Study: A comparison of natural language processing and manual review, COMPUT BIOM, 33(1), 2000, pp. 1-10
Number of citations
24
Subject Categories
Multidisciplinary
Journal title
COMPUTERS AND BIOMEDICAL RESEARCH
ISSN journal
00104809
Volume
33
Issue
1
Year of publication
2000
Pages
1 - 10
Database
ISI
SICI code
0010-4809(200002)33:1<1:CNRFTN>2.0.ZU;2-L
Abstract
Automated systems using natural language processing may greatly speed chart review tasks for clinical research, but their accuracy in this setting is unknown. The objective of this study was to compare the accuracy of automated and manual coding in the data acquisition tasks of an ongoing clinical research study, the Northern Manhattan Stroke Study (NOMASS). We identified 471 neuroradiology reports of brain images used in the NOMASS study. Using both automated and manual coding, we completed a standardized NOMASS imaging form with the information contained in these reports. We then generated ROC curves for both manual and automated coding by comparing our results to the original NOMASS data, where study investigators directly coded their interpretations of brain images. The areas under the ROC curves for both manual and automated coding were the main outcome measure. The overall predictive value of the automated system (ROC area 0.85, 95% CI 0.84-0.87) was not statistically different from the predictive value of the manual coding (ROC area 0.87, 95% CI 0.83-0.91). Measured in terms of accuracy, the automated system performed slightly worse than manual coding. The overall accuracy of the automated system was 84% (CI 83-85%). The overall accuracy of manual coding was 86% (CI 84-88%). The difference in accuracy between the two methods was small but statistically significant (P = 0.026). Errors in manual coding appeared to be due to differences between neurologists' and neuroradiologists' interpretations, different use of detailed anatomic terms, and lack of clinical information. Automated systems can use natural language processing to rapidly perform complex data acquisition tasks. Although there is a small decrease in the accuracy of the data as compared to traditional methods, automated systems may greatly expand the power of chart review in clinical research design and implementation. (C) 2000 Academic Press.
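The two outcome measures named in the abstract, ROC area and overall accuracy with a confidence interval, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy data, the Wald (normal-approximation) interval, and the rank-based (Mann-Whitney) ROC area. The abstract does not state how the study computed its intervals or ROC areas, so this should not be read as the authors' method.

```python
import math

def accuracy_with_ci(coded, reference, z=1.96):
    """Overall accuracy with a Wald (normal-approximation) interval.

    Hypothetical helper: the interval method is an assumption,
    not taken from the abstract.
    """
    n = len(reference)
    correct = sum(c == r for c, r in zip(coded, reference))
    acc = correct / n
    half_width = z * math.sqrt(acc * (1 - acc) / n)
    return acc, max(0.0, acc - half_width), min(1.0, acc + half_width)

def roc_area(scores, labels):
    """ROC area as the Mann-Whitney statistic.

    Equals the probability that a randomly chosen positive item
    scores higher than a randomly chosen negative one, with ties
    counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy data (invented): 1 = finding present on the imaging form, 0 = absent.
reference = [1, 0, 1, 1, 0, 0, 1, 0]   # investigators' direct coding
coded     = [1, 0, 1, 0, 0, 1, 1, 0]   # coding derived from the report text

acc, lower, upper = accuracy_with_ci(coded, reference)
auc = roc_area(coded, reference)
print(f"accuracy = {acc:.2f} (95% CI {lower:.2f}-{upper:.2f}), ROC area = {auc:.2f}")
```

With binary codes the ROC curve has a single operating point, so the area reduces to the average of sensitivity and specificity. A graded coding score would give a fuller curve.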