The purpose of this study is to examine whether the reviewers on item review committees can accurately identify test items that exhibit a variety of flaws. An instrument with 75 items was constructed and administered to 39 reviewers who were operational members of an item review committee. After undergoing training, the 39 reviewers were asked to examine the 75 items and indicate whether each item exhibited cultural or technical flaws. There were 8 cultural flaw categories (e.g., "Does the item unfairly favor males or females?") and 8 technical flaw categories (e.g., "Is the item content inaccurate or factually incorrect?"). The accuracy of the reviewers was defined in terms of the match between the judged classifications and the a priori classifications of the items into flaw categories. A new approach based on item response theory for examining rater accuracy was used to analyze the data (Engelhard, 1996). The data suggest that it is easier to identify some types of item flaws than others; specifically, the reviewers were more accurate in identifying items with cultural flaws than items with technical flaws. The reviewers exhibited fairly high overall accuracy rates, ranging from 83% to 94%, and there were statistically significant differences in judgmental accuracy between the reviewers. Suggestions for future research on judgmental accuracy and the implications of this study for identifying biased items are discussed.
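As a concrete illustration of the accuracy measure described above (the match between a reviewer's judged classifications and the a priori classifications), the following minimal Python sketch computes accuracy as a simple proportion of matching items. The function name, category labels, and data are hypothetical and not drawn from the study; note that the study itself applies an item response theory approach (Engelhard, 1996) rather than raw proportions.

```python
# Illustrative sketch only: accuracy here is the proportion of items for which
# a reviewer's judged flaw classification matches the a priori classification.
# All names and data below are hypothetical, not from the study.

def reviewer_accuracy(judged, a_priori):
    """Proportion of items whose judged category matches the a priori one."""
    if len(judged) != len(a_priori):
        raise ValueError("classification lists must be the same length")
    matches = sum(j == a for j, a in zip(judged, a_priori))
    return matches / len(judged)

# Hypothetical classifications for five items.
judged   = ["cultural", "none", "technical", "none", "cultural"]
a_priori = ["cultural", "none", "technical", "cultural", "cultural"]

print(reviewer_accuracy(judged, a_priori))  # 4 of 5 items match -> 0.8
```

A proportion like this treats every item as equally difficult to judge; an IRT-based approach instead models item and rater characteristics jointly, which is why it can separate reviewer accuracy from item difficulty.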