UNLOCKING CLINICAL-DATA FROM NARRATIVE REPORTS - A STUDY OF NATURAL-LANGUAGE PROCESSING

Citation
G. Hripcsak et al., UNLOCKING CLINICAL-DATA FROM NARRATIVE REPORTS - A STUDY OF NATURAL-LANGUAGE PROCESSING, Annals of internal medicine, 122(9), 1995, pp. 681-688
Number of citations
20
Subject Categories
Medicine, General & Internal
Journal title
Annals of internal medicine
ISSN journal
00034819
Volume
122
Issue
9
Year of publication
1995
Pages
681 - 688
Database
ISI
SICI code
0003-4819(1995)122:9<681:UCFNR->2.0.ZU;2-0
Abstract
Objective: To evaluate the automated detection of clinical conditions described in narrative reports. Design: Automated methods and human experts detected the presence or absence of six clinical conditions in 200 admission chest radiograph reports. Study Subjects: A computerized, general-purpose natural language processor; 6 internists; 6 radiologists; 6 lay persons; and 3 other computer methods. Main Outcome Measures: Intersubject disagreement was quantified by "distance" (the average number of clinical conditions per report on which two subjects disagreed) and by sensitivity and specificity with respect to the physicians. Results: Using a majority vote, physicians detected 101 conditions in the 200 reports (0.51 per report); the most common condition was acute bacterial pneumonia (prevalence, 0.14), and the least common was chronic obstructive pulmonary disease (prevalence, 0.03). Pairs of physicians disagreed on the presence of at least 1 condition for an average of 20% of reports. The average intersubject distance among physicians was 0.24 (95% CI, 0.19 to 0.29) out of a maximum possible distance of 6. No physician had a significantly greater distance than the average. The average distance of the natural language processor from the physicians was 0.26 (CI, 0.21 to 0.32; not significantly greater than the average among physicians). Lay persons and alternative computer methods had significantly greater distance from the physicians (all >0.5). The natural language processor had a sensitivity of 81% (CI, 73% to 87%) and a specificity of 98% (CI, 97% to 99%); physicians had an average sensitivity of 85% and an average specificity of 98%. Conclusions: Physicians disagreed on the interpretation of narrative reports, but this was not caused by outlier physicians or a consistent difference in the way internists and radiologists read reports. The natural language processor was not distinguishable from the physicians and was superior to all other comparison subjects. Although the domain of this study was restricted (six clinical conditions in chest radiographs), natural language processing seems to have the potential to extract clinical information from narrative reports in a manner that will support automated decision-support and clinical research.
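
The outcome measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes each subject's reading is stored as one 0/1 flag per condition per report, and that sensitivity and specificity are scored against the physicians' majority vote, which the abstract implies but does not spell out. The function names and the toy data (3 physicians, 4 reports, 3 conditions) are hypothetical and far smaller than the study's 12 physicians, 200 reports, and 6 conditions.

```python
def distance(a, b):
    """Average number of conditions per report on which two subjects disagree."""
    assert len(a) == len(b)
    disagreements = sum(
        sum(x != y for x, y in zip(report_a, report_b))
        for report_a, report_b in zip(a, b)
    )
    return disagreements / len(a)


def majority_vote(raters):
    """Condition counts as present when more than half of the raters flag it."""
    n_reports = len(raters[0])
    n_conditions = len(raters[0][0])
    return [
        [int(2 * sum(r[i][j] for r in raters) > len(raters))
         for j in range(n_conditions)]
        for i in range(n_reports)
    ]


def sensitivity_specificity(subject, reference):
    """Score one subject against a reference standard, pooled over all conditions."""
    tp = fn = tn = fp = 0
    for subject_report, reference_report in zip(subject, reference):
        for s, r in zip(subject_report, reference_report):
            if r and s:
                tp += 1
            elif r:
                fn += 1
            elif s:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy readings: physicians[k][i][j] is physician k's flag
# for condition j in report i.
physicians = [
    [[1, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[1, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1]],
]
# Hypothetical output of an automated method on the same four reports.
nlp = [[1, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 1]]

reference = majority_vote(physicians)
avg_distance = sum(distance(nlp, p) for p in physicians) / len(physicians)
sens, spec = sensitivity_specificity(nlp, reference)
print(avg_distance, sens, spec)
```

With six conditions per report, the largest value `distance` can return is 6, which matches the maximum possible distance stated in the abstract.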