Improving the reliability of stroke subgroup classification using the Trial of ORG 10172 in Acute Stroke Treatment (TOAST) criteria

Citation
L.B. Goldstein et al., Improving the reliability of stroke subgroup classification using the Trial of ORG 10172 in Acute Stroke Treatment (TOAST) criteria, STROKE, 32(5), 2001, pp. 1091-1096
Number of citations
7
Subject Categories
Neurology; Cardiovascular & Hematology Research
Journal title
STROKE
Journal ISSN
0039-2499
Volume
32
Issue
5
Year of publication
2001
Pages
1091 - 1096
Database
ISI
SICI code
0039-2499(200105)32:5<1091:ITROSS>2.0.ZU;2-J
Abstract
Background and Purpose-We sought to improve the reliability of the Trial of ORG 10172 in Acute Stroke Treatment (TOAST) classification of stroke subtype for retrospective use in clinical, health services, and quality of care outcome studies. The TOAST investigators devised a series of 11 definitions to classify patients with ischemic stroke into 5 major etiologic/pathophysiological groupings. Interrater agreement was reported to be substantial in a series of patients who were independently assessed by pairs of physicians. However, the investigators cautioned that disagreements in subtype assignment remain despite the use of these explicit criteria and that trials should include measures to ensure the most uniform diagnosis possible.
Methods-In preparation for a study of outcomes and management practices for patients with ischemic stroke within Department of Veterans Affairs hospitals, 2 neurologists and 2 internists first retrospectively classified a series of 14 randomly selected stroke patients on the basis of the TOAST definitions to provide a baseline assessment of interrater agreement. A 2-phase process was then used to improve the reliability of subtype assignment. In the first phase, a computerized algorithm was developed to assign the TOAST diagnostic category. The reliability of the computerized algorithm was tested with a series of synthetic cases designed to provide data fitting each of the 11 definitions. In the second phase, critical disagreements in the data abstraction process were identified and remaining variability was reduced by the development of standardized procedures for retrieving relevant information from the medical record.
Results-The 4 physicians agreed in subtype diagnosis for only 2 of the 14 baseline cases (14%) using all 11 TOAST definitions and for 4 of the 14 cases (29%) when the classifications were collapsed into the 5 major etiologic/pathophysiological groupings (kappa = 0.42; 95% CI, 0.32 to 0.53). There was 100% agreement between classifications generated by the computerized algorithm and the intended diagnostic groups for the 11 synthetic cases. The algorithm was then applied to the original 14 cases, and the diagnostic categorization was compared with each of the 4 physicians' baseline assignments. For the 5 collapsed subtypes, the algorithm-based and physician-assigned diagnoses disagreed for 29% to 50% of the cases, reflecting variation in the abstracted data and/or its interpretation. The use of an operations manual designed to guide data abstraction improved the reliability of subtype assignment (kappa = 0.54; 95% CI, 0.26 to 0.82). Critical disagreements in the abstracted data were identified, and the manual was revised accordingly. Reliability with the use of the 5 collapsed groupings then improved for both interrater (kappa = 0.68; 95% CI, 0.44 to 0.91) and intrarater (kappa = 0.74; 95% CI, 0.61 to 0.87) agreement. Examining each remaining disagreement revealed that half were due to ambiguities in the medical record and half were related to otherwise unexplained errors in data abstraction.
Conclusions-Ischemic stroke subtype based on published TOAST classification criteria can be reliably assigned with the use of a computerized algorithm with data obtained through standardized medical record abstraction procedures. Some variability in stroke subtype classification will remain because of inconsistencies in the medical record and errors in data abstraction. This residual variability can be addressed by having 2 raters classify each case and then identifying and resolving the reason(s) for the disagreement.
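Note on the agreement statistics-The abstract reports percent agreement and kappa values but does not state which kappa variant was computed. As an illustration only, the sketch below assumes Fleiss' kappa, a common choice when more than 2 raters classify each case. The function names fleiss_kappa and full_agreement_rate and the example ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def full_agreement_rate(ratings):
    """Fraction of cases in which every rater chose the same category."""
    return sum(len(set(case)) == 1 for case in ratings) / len(ratings)


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of one label per rater.

    Assumption: every case is rated by the same number of raters.
    Returns NaN when all ratings fall in a single category, because
    chance-corrected agreement is undefined in that situation.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    category_totals = Counter()
    observed_sum = 0.0
    for case in ratings:
        counts = Counter(case)
        category_totals.update(counts)
        # Ordered rater pairs that agree, over all ordered rater pairs.
        agreeing = sum(c * (c - 1) for c in counts.values())
        observed_sum += agreeing / (n_raters * (n_raters - 1))

    p_observed = observed_sum / n_cases
    total = n_cases * n_raters
    # Chance agreement from the pooled category proportions.
    p_expected = sum((t / total) ** 2 for t in category_totals.values())
    if p_expected == 1.0:
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings: 3 cases, 4 raters, made-up category labels.
example = [
    ["cardioembolic"] * 4,
    ["small-vessel", "small-vessel", "undetermined", "undetermined"],
    ["large-artery", "large-artery", "large-artery", "undetermined"],
]
print(full_agreement_rate(example))
print(fleiss_kappa(example))
```

The percent figures in the Results follow from simple proportions: 2 of 14 cases is 14% and 4 of 14 cases is 29% after rounding. Confidence intervals such as those reported would need an additional variance estimate or a bootstrap, which this sketch does not attempt.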