A. M. Shapiro and D. S. McNamara, The use of latent semantic analysis as a tool for the quantitative assessment of understanding and knowledge, Journal of Educational Computing Research, 22(1), 2000, pp. 1-36
Latent Semantic Analysis (LSA) is a statistical model of word usage that has been used for a variety of applications. One of these applications is the quantitative assessment of the semantic content of written text. While the technology has been successful in correlating with the qualitative ratings of human experts, it is unclear what aspect of knowledge is reflected in an LSA output. The two experiments presented here were designed to address this general question. We were particularly interested in whether an LSA analysis more accurately reflects the factual or the conceptual knowledge contained in written material. Experiment 1 explored this issue by comparing LSA analyses of essays to human-generated scores; it also compared the LSA output to several measures of conceptual structure. Experiment 2 correlated LSA analyses of transcribed recall protocols with a series of comprehension measures designed to vary in the degree to which they reflect conceptual or factual knowledge. We found compelling evidence that LSA analyses more strongly reflect the text-based knowledge represented by essays and recall protocols than the conceptual knowledge. Both studies also explored a methodological issue pertaining to the use of LSA: does LSA have to be "trained" in the particular content area of the text to be analyzed? This question was addressed by running multiple LSA analyses, each performed with a different "semantic space" created through training on domain-specific or general content. We found that LSA performed best when trained in a content area specific to the material being analyzed. These results are discussed with respect to the application of LSA analyses in the classroom and laboratory.
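
For readers unfamiliar with the mechanics behind such analyses, the sketch below illustrates the general LSA technique the abstract refers to: a term-document matrix built from a training corpus (the "semantic space"), reduced by truncated SVD, with texts compared by cosine similarity in the reduced space. This is a minimal illustration using scikit-learn as a stand-in toolchain, not the authors' actual pipeline; the training documents, the essay and reference texts, and the very small dimensionality (2 instead of the 100-300 dimensions typical in practice) are placeholder assumptions.

    # Minimal LSA-style similarity sketch (illustrative only, not the paper's pipeline).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    # Hypothetical domain-specific training corpus defining the "semantic space";
    # the paper's finding suggests domain-specific material works best.
    training_docs = [
        "Text describing how the heart pumps blood through the circulatory system.",
        "Text describing the function of arteries, veins, and blood pressure.",
        "Text describing oxygen exchange in the lungs and capillaries.",
    ]
    essay = "The heart moves blood through arteries and veins."          # placeholder student essay
    reference = "Blood is pumped by the heart through the circulatory system."  # placeholder expert text

    # Build a term-document matrix and reduce it to a low-rank semantic space.
    vectorizer = TfidfVectorizer(stop_words="english")
    X = vectorizer.fit_transform(training_docs)
    svd = TruncatedSVD(n_components=2)  # real applications use far more dimensions
    svd.fit(X)

    # Project both texts into the semantic space and compare them.
    essay_vec = svd.transform(vectorizer.transform([essay]))
    ref_vec = svd.transform(vectorizer.transform([reference]))
    score = cosine_similarity(essay_vec, ref_vec)[0, 0]
    print(f"LSA cosine similarity: {score:.3f}")

In an assessment setting of the kind the abstract describes, a score like this would be correlated with human expert ratings or other comprehension measures, and the choice of training corpus corresponds to the "semantic space" manipulation studied in both experiments.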