L. L. Barnes and M. W. Barnes, Academic Discipline and Generalizability of Student Evaluations of Instruction, Research in Higher Education, 34(2), 1993, pp. 135-149
Previous research on the generalizability of student ratings of instruction has raised questions about the effects of academic discipline and item type on the generalizability of these data for making relative decisions about instructors and about courses. In particular, although student evaluation data appear to provide a reasonable basis for making decisions about instructors when generalizing across courses and students, the data appear to be less generalizable when course is the object of measurement. The literature has suggested that this may be due either to the type of evaluation items used or to academic-discipline differences in the courses selected for study. This study used Biglan's (1973a) model for classifying disciplines along the dimensions of paradigmatic/preparadigmatic (hard/soft) and pure/applied. A nested sampling procedure yielded two sample types: courses within teachers, in which individual instructors taught more than one course; and teachers within courses, in which individual courses were taught by more than one instructor. For each sample type, evaluation forms for twenty courses within each discipline classification were sought. The evaluation items were classified as measuring six dimensions of instruction: organization, breadth of coverage, group interaction, enthusiasm, grading, and individual rapport. Generalizability and decision studies were conducted in which teacher was the object of measurement for one sample and course was the object of measurement for the second. Results indicated that reliable decisions about instructors could reasonably be made from all six evaluation dimensions; however, reliability for course decisions varied greatly with the evaluation dimension, being highest for breadth of coverage and lowest for grading. The same general pattern was noted for the paradigmatic disciplines and the preparadigmatic-applied disciplines but not for the preparadigmatic-pure disciplines. It is suggested that a single evaluation instrument may not be uniformly applicable to all discipline areas.