P. Iverson et al., Modeling the interaction of phonemic intelligibility and lexical structure in audiovisual word recognition, SPEECH COMM, 26(1-2), 1998, pp. 45-63
Studies of audiovisual perception of spoken language have mostly modeled phoneme identification in nonsense syllables, but it is doubtful that models or theories of phonetic processing can adequately account for audiovisual word recognition. The present study took a computational approach to examine how lexical structure may additionally constrain word recognition, given the phonetic information available under vocoded audio, visual and audiovisual stimulus conditions. Subjects made phonemic identification judgments on recordings of spoken nonsense syllables. Hierarchical cluster analysis was used first to select classes of perceptually equivalent phonemes for each of the stimulus conditions, and then a machine-readable phonemically transcribed lexicon was retranscribed in terms of these phonemic equivalence classes. Several statistics were computed for each of the transcriptions, including percent information extracted, percent words unique and expected class size. The findings suggest that superadditive levels of audiovisual enhancement are more likely for monosyllabic than for multisyllabic words. That is, impoverished phonetic information may be sufficient to recognize multisyllabic words, but the recognition of monosyllabic words seems to require additional phonetic information. (C) 1998 Elsevier Science B.V. All rights reserved.
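The retranscription-and-statistics procedure described in the abstract can be sketched as follows. The equivalence classes below are purely illustrative placeholders, not the paper's empirically derived clusters, and "percent words unique" and "expected class size" are computed under one plausible reading of those statistics: the fraction of lexicon entries whose retranscription is unambiguous, and the average number of lexical competitors a randomly drawn word faces.

```python
from collections import Counter

# Hypothetical equivalence classes (illustrative only, NOT the paper's actual
# clusters): phonemes a stimulus condition renders perceptually
# indistinguishable are mapped onto a single class label.
EQUIV_CLASSES = {
    "p": "PBM", "b": "PBM", "m": "PBM",  # bilabials, often confused visually
    "f": "FV", "v": "FV",
    "t": "T", "d": "T",
}

def retranscribe(word):
    """Rewrite a phonemic transcription in terms of equivalence classes."""
    return tuple(EQUIV_CLASSES.get(ph, ph) for ph in word)

def lexicon_stats(lexicon):
    """Return (percent words unique, expected class size) after retranscription."""
    classes = Counter(retranscribe(w) for w in lexicon)
    n = len(lexicon)
    # A word is "unique" if no other word shares its retranscription.
    pct_unique = 100.0 * sum(1 for c in classes.values() if c == 1) / n
    # Expected class size: average size of the class containing a randomly
    # drawn word (size-weighted mean of class sizes).
    expected_size = sum(c * c for c in classes.values()) / n
    return pct_unique, expected_size

# Toy lexicon of phoneme tuples.
lex = [("p", "a", "t"), ("b", "a", "d"), ("m", "a", "t"),
       ("f", "i", "t"), ("v", "i", "d")]
pct, exp = lexicon_stats(lex)
# Here all five words collapse into two ambiguous classes, so pct == 0.0
# and the expected class size is (3*3 + 2*2) / 5 = 2.6.
```

Under this toy mapping every word loses its identity, mirroring the abstract's point: the coarser the perceptual equivalence classes a condition induces, the fewer words remain uniquely recoverable from lexical structure alone.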