Researchers currently seek to improve validity in speech and language
studies by adopting naturalistic procedures. In emotion-display research,
validity is threatened by standard experimental controls, which diminish
the naturalism of stimuli and response ranges. We report two experiments
comparing the adequacy of naturalistic and standard procedures. In
Experiment 1, 158 judges coded 89 samples of naturally occurring emotional
speech with free-choice emotion labels, and later with labels from a
standard set. When free-choice labels were similar across judges, they
were consistent with standard
labels, but showed a range of intensity and contextual relevance. We
recommend that future studies include wider options for judges when coding
emotions. Experiment 2 compared valency ratings of words when presented
in, or out of, context. Standard procedures score lexical valencies using
affective
dictionaries, disregarding natural contexts. Experiment 2 compared 23
judges' valency ratings of words presented individually, and later in
their original context. Between 30% and 44% of words were rated
differently in context (depending on the statistical significance level
adopted). We concluded from Experiment 2 that, where small corpora
adequately model a domain, the improved accuracy of valency rating
achieved by presenting words in their natural context justifies the extra
procedures required.