M. Birenbaum et al., ON AGREEMENT OF DIAGNOSTIC CLASSIFICATIONS FROM PARALLEL SUBTESTS - SCORE RELIABILITY AT THE MICRO LEVEL, Educational and psychological measurement, 57(4), 1997, pp. 541-558
Number of citations
22
Subject Categories
"Psychology, Educational","Psychology, Experimental","Mathematical, Methods, Social Sciences","Mathematics, Miscellaneous"
The purpose of the present study was to examine the agreement of
diagnostic classifications from two parallel subtests assessing a
procedural skill in mathematics using three levels of scoring: (a)
observed item scores (correct/incorrect), (b) underlying rules of
operation, and (c) underlying task attributes. A bug analysis and a rule
space analysis were employed to assess categories b and c, respectively.
The results indicated that even when the parallel form reliability
coefficient of a given test is relatively high, less agreement is
evidenced when performance is evaluated at the micro level. This suggests
that incorrect responses to equivalent items may result from application
of different underlying mal-rules ("bugs"), which in turn may result from
nonmastery of the same task attribute(s). The results are discussed in
light of their implications for diagnostic assessment and remediation.