In another article (Wolfe et al., 1998/this issue) we showed how Latent
Semantic Analysis (LSA) can be used to assess student knowledge: how
essays can be graded by LSA and how LSA can match students with
appropriate instructional texts. We did this by comparing an essay
written by a student with one or more target instructional texts in
terms of the cosine between the vector representation of the student's
essay and that of the instructional text in question. This simple method
was effective for the purpose, but questions remain about how LSA
achieves its results and how the results might be improved. Here, we
address four such questions: (a) What role does the use of technical
vocabulary play? (b) How long should the student essays be? (c) Is the
cosine the optimal measure of semantic relatedness? and (d) How does one
deal with the directionality of knowledge in the high-dimensional space?
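
The cosine comparison described above can be illustrated with a minimal sketch. It assumes the essay and each instructional text have already been mapped to vectors in the LSA space; the three-dimensional vectors and the names `essay`, `text_A`, and `text_B` are hypothetical values chosen only for illustration, not data from the study.

```python
import math


def cosine(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical vectors (assumption): real LSA vectors typically have
# a few hundred dimensions and come from a trained semantic space.
essay = [0.2, 0.5, 0.1]
texts = {
    "text_A": [0.1, 0.4, 0.2],
    "text_B": [0.6, -0.1, 0.3],
}

# Compare the essay with every candidate text and keep the closest one.
scores = {name: cosine(essay, vec) for name, vec in texts.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Because the cosine depends only on the angle between the two vectors, multiplying either vector by a positive constant leaves the score unchanged; the measure is insensitive to vector length.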