P.D. Molyneux et al., Visual analysis of serial T2-weighted MRI in multiple sclerosis: intra- and interobserver reproducibility, Neuroradiology, 41(12), 1999, pp. 882-888
We evaluated the effect of consensus formation and training on the
agreement between observers in scoring the number of new and enlarging
multiple sclerosis (MS) lesions on serial T2-weighted MRI studies. The
baseline and month 9 MRI studies of 16 patients with a range of MRI
activity were used (dual-echo conventional spin-echo sequence, TR 2000,
TE 34 and 90 ms, 5 mm contiguous slices, in-plane resolution 1 mm). First,
the serial studies were visually analysed for the presence of new and
enlarging lesions, on two occasions, by five experienced observers, without
adopting any consensus strategy and in isolation. Next, the observers met
to identify the common sources of inconsistencies in reporting between
observers and formulate consensus rules. Finally, a further independent
reading session was performed on the same MRI dataset, this time applying
the consensus rules. Agreement between observers was assessed using kappa
scores. Without the consensus rules, interobserver kappa scores for the
first and second reading sessions for new lesions were only 0.51 and 0.39
respectively; agreement for enlarging lesions was even worse. The mean
intraobserver kappa score for new lesions was higher at 0.72, reflecting
the fact that the observers were consistently applying their individual
assessment strategies. Application of the consensus rules did not lead to
a significant improvement in interobserver kappas; the kappa scores
adopting the guidelines were 0.46 and 0.21 for new and enlarging lesions
respectively. Consensus guidelines thus did not improve the
reproducibility of visual analysis of serial T2-weighted MRI, and the
level of agreement between observers remained only moderate. Suboptimal
repositioning is likely to be a major source of residual variability and
this suggests a future role for image registration strategies; until then,
a single observer, or pair of observers working in consensus, should be
used in MS studies.
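
The abstract reports agreement as kappa scores but does not say which kappa variant was used or how lesion counts were categorised. As an illustration only, the sketch below assumes unweighted Cohen's kappa between two observers, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance, averaged over all observer pairs for a multi-observer summary. The function names, the binning of per-patient counts, and the data are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items given identical scores.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (p_o - p_e) / (1.0 - p_e)


def mean_pairwise_kappa(ratings):
    """Average Cohen's kappa over every pair of raters (assumed summary)."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical scores: new-lesion count per patient, binned as 0, 1, 2 (two or more).
obs_1 = [0, 1, 2, 0, 1, 2, 0, 0]
obs_2 = [0, 1, 2, 0, 2, 2, 0, 1]
obs_3 = [0, 0, 2, 0, 1, 2, 1, 0]

print(round(cohen_kappa(obs_1, obs_2), 3))  # 0.619
print(mean_pairwise_kappa([obs_1, obs_2, obs_3]))
```

Under these assumptions, an intraobserver kappa would be obtained by passing the same observer's scores from two reading sessions to cohen_kappa, and an interobserver figure by averaging over pairs of different observers within one session.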