Three experiments were performed to examine listeners' thresholds for
identifying stimuli whose spectra were modeled after the vowels /ɪ/ and
/ɛ/, with the differences between these stimuli restricted to
the frequency of the first formant. The stimuli were presented in a
low-pass masking noise that spectrally overlapped the first formant but
not the higher formants. Identification thresholds were lower when the
higher formants were present than when they were not, even though the
first formant contained the only distinctive information for stimulus
identification. This indicates that listeners were more sensitive in
identifying the first formant energy through its contribution to the
vowel than as an independent percept; this effect is termed
coherence masking protection. The first experiment showed this effect for
synthetic vowels in which the distinctive first formant was supported
by a series of harmonics that progressed through the higher formants.
In the second and third experiments, the harmonics in the first formant
region were removed, and the first formant was simulated by a narrow band
of noise. This was done so that harmonic relations did not provide a
basis for grouping the lower formant with the higher formants;
coherence masking protection was still observed. However, when the temporal
alignment of the onsets and offsets of the higher and lower formants
was disrupted, the effect was eliminated, although the stimuli were still
perceived as vowels. These results are interpreted as indicating that
general principles of auditory grouping that can exploit regularities
in temporal patterns cause acoustic energy belonging to a coherent
speech sound to stand out in the auditory scene.