The McGurk illusion effectively demonstrates the audiovisual nature of speech perception. When an auditory syllable is dubbed onto an incongruent visual syllable, the resulting percept is usually neither of the components but their combination or fusion. The present experiment investigated the persistence of the McGurk effect when the facial configuration context of the audiovisual stimuli was manipulated. Two congruent and two incongruent audiovisual syllables were created from spoken /ipi/ and /iki/. These audiovisual tokens were uttered by seven facial configurations of five talkers. All facial configurations produced a clear McGurk effect (reported /iti/ for heard /ipi/ + seen /iki/), but the effect was significantly weaker when the syllables were uttered by an asymmetrically scrambled facial configuration. The results also showed significant differences in the persistence of the McGurk effect between talkers. In sum, facial configural information can be used in the audiovisual integration of certain speech segments. This information is not necessary for integration to occur, but the integration process can be disrupted by face stimuli that violate normal configural information.