D.R. Dirschl and G.L. Adams, A CRITICAL ASSESSMENT OF FACTORS INFLUENCING RELIABILITY IN THE CLASSIFICATION OF FRACTURES, USING FRACTURES OF THE TIBIAL PLAFOND AS A MODEL, Journal of orthopaedic trauma, 11(7), 1997, pp. 471-476
Objective: To investigate three factors that may influence the
reliability of a fracture classification system: (a) the quality of
the radiographs; (b) the ability of observers to identify the fracture
fragments; and (c) the use of binary decision making. Design:
Assessment of interobserver reliability of blinded observers. Setting:
Medical school
department of orthopaedics. Participants: Two attending orthopaedists,
two PGY-5 orthopaedic residents, and two PGY-3 orthopaedic residents
served as observers. Intervention: Observers classified radiographs of
twenty-five tibial plafond fractures according to the Ruedi-Allgower
and binary classification systems, and also rated the quality of each
radiograph as adequate or inadequate for accurately classifying the
fracture. At a second session, observers classified the same radiographs
after marking the fragments of the tibial articular surface, as well
as radiographs that had the articular fragments premarked by the
senior author. Main Outcome Measures: Pairwise interobserver
reliability was analyzed by kappa statistics, and mean kappa values
were compared for each method of fracture classification. Results: No
difference in interobserver reliability was detected between the
Ruedi-Allgower and binary classification systems. Interobserver
agreement on the adequacy of the radiographs was poorer than agreement
on the classification of the fractures themselves. Having observers
mark the fragments of the tibial articular surface had no effect on
interobserver reliability; having the articular fragments premarked,
however, significantly improved
interobserver reliability in classifying the fractures. Conclusions:
The results of this study underscore the complexity of tibial plafond
fractures and the difficulty observers have in reliably interpreting
fracture radiographs. Fracture classification systems, such as the
Ruedi-Allgower, predicated on identification of the number and
displacement of articular fragments, may inherently perform poorly on
reliability analyses because of observer difficulty in reliably
identifying the fragments. Because binary decision making did not
improve the reliability of fracture classification in this study,
further investigation of the effectiveness of binary decision making
may be advisable before such strategies are put into widespread use.
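
Illustrative note: the outcome measure described above (pairwise interobserver reliability by kappa statistics, summarized as a mean kappa per classification method) can be sketched as follows. This is a minimal sketch, assuming unweighted Cohen's kappa for each pair of observers; the abstract does not state which kappa variant or software the authors used. The function names cohen_kappa and mean_pairwise_kappa are hypothetical, and the sketch contains no data from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two observers rating the same cases.

    Assumption: the abstract only says "kappa statistics"; the unweighted
    two-rater form is used here for illustration.
    """
    if len(a) != len(b) or not a:
        raise ValueError("both observers must rate the same non-empty set of cases")
    n = len(a)
    # Observed agreement: fraction of cases given the same category.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each observer's marginal category frequencies.
    count_a, count_b = Counter(a), Counter(b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both observers used one identical category throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(ratings):
    """Mean of Cohen's kappa over every pair of observers.

    ratings maps a (hypothetical) observer label to that observer's list
    of categories, one entry per radiograph, in the same case order.
    """
    kappas = {
        (p, q): cohen_kappa(ratings[p], ratings[q])
        for p, q in combinations(sorted(ratings), 2)
    }
    if not kappas:
        raise ValueError("at least two observers are required")
    return sum(kappas.values()) / len(kappas), kappas
```

With six observers, as in the study design, this would produce 15 pairwise kappa values per classification method, whose mean could then be compared between methods.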