In studies on atopic dermatitis (AD), different scoring systems are
used to evaluate the severity of the disease. The objective of this
study was to investigate agreement between observers in the assessment of
the overall severity of AD, and interobserver variation in the
assessment of severity of AD for each scoring item separately, using
the Simple Scoring System (SSS), the Scoring Atopic Dermatitis (SCORAD) index,
and the Basic Clinical Scoring System (BCSS), and, furthermore, to
investigate agreement between these three scoring systems in the
assessment of the overall severity of AD. Eighty-two patients (42 male)
with AD, mean age 13.4 years (range 0.2-67.0), were included. Agreement
between observers in assessing the overall AD severity scores, and
interobserver variation in assessing AD severity of each scoring item
separately, were determined in 34 of these 82 patients by two
physicians scoring the severity of AD by the three scoring systems. To
determine agreement between the scoring systems, one physician scored
the severity of
AD in all patients with the three scoring systems. Agreement between
observers and agreement between the three scoring systems were
calculated by Cohen's kappa (kappa) and by the measure of agreement according
to Bland & Altman. A kappa>0.4 represents fair agreement; a kappa>0.75,
excellent agreement. In addition, interobserver variation for each
scoring item separately was calculated by the Wilcoxon signed rank test. The
mean differences (d) and the limits of agreement (d+/-2 SD of the
differences) between observers by the SSS and the SCORAD were
-0.82+/-5.58 and -0.28+/-7.49, respectively. The kappa between
observers for the BCSS
was 0.90 (95% CI 0.79-1.03). By the SSS, significant interobserver
variation was found in assessing the severity of excoriations (P=0.02)
and scales (P=0.02). By the SCORAD, significant interobserver variation
was found in assessing the severity of edema/papulation (P=0.04),
erythema (P=0.04), and excoriations (P=0.01). No significant
interobserver variation was found in assessing the extent of AD. The
mean difference and the limits of agreement between the SSS and the
SCORAD were -4.17+/-9.52. The kappa between the SSS and the BCSS was
0.21 (95% CI 0.09-0.33), and the kappa between the SCORAD and the BCSS
was 0.38 (95% CI 0.26-0.51). We found good agreement between observers
assessing the overall
severity of AD in the lower and higher scoring rates by the SSS and
the SCORAD, and excellent agreement by the BCSS. Significant
interobserver variation was found for the following isolated intensity
items: scales, excoriations, edema/papulation, and erythema. We found
poor agreement between
the three scoring systems in assessing the overall severity of AD,
indicating that the SSS, the SCORAD, and the BCSS cannot be used
interchangeably to assess the overall severity of AD.
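
The two agreement measures named above can be illustrated with a minimal
Python sketch. It is not the authors' analysis code: the function names
and all scores and grades below are hypothetical, the kappa shown is the
unweighted form (the abstract does not say whether a weighted variant
was used), and the limits of agreement follow the d+/-2 SD definition
given in the text.

```python
from collections import Counter
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Return (mean difference d, 2*SD, lower limit, upper limit)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    d = mean(diffs)
    half_width = 2 * stdev(diffs)  # sample SD of the paired differences
    return d, half_width, d - half_width, d + half_width


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two sets of categorical ratings."""
    n = len(ratings_a)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical continuous scores from two observers (not study data).
observer_1 = [10, 12, 15, 20, 8]
observer_2 = [11, 12, 13, 22, 9]
d, half_width, lower, upper = bland_altman(observer_1, observer_2)
print(f"d = {d:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")

# Hypothetical categorical grades from two observers (not study data).
grades_1 = ["mild", "mild", "moderate", "severe", "moderate", "mild"]
grades_2 = ["mild", "moderate", "moderate", "severe", "moderate", "mild"]
print(f"kappa = {cohen_kappa(grades_1, grades_2):.2f}")
```

A Bland & Altman summary suits continuous totals such as those reported
for the SSS and the SCORAD, whereas kappa suits categorical grades,
which is consistent with how the two measures are reported above.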