The cerebral representation of the temporal envelope of sounds was studied
in five normal-hearing subjects using functional magnetic resonance
imaging. The stimuli were white noise, sinusoidally amplitude-modulated at
frequencies ranging from 4 to 256 Hz. This range includes low
amplitude-modulation (AM) frequencies (up to 32 Hz) essential for the
perception of the manner of articulation and syllabic rate, and high AM
frequencies (above 64 Hz) essential for the perception of voicing and
prosody. The right lower brainstem (superior olivary complex), the right
inferior colliculus, the left medial geniculate body, Heschl's gyrus, the
superior temporal gyrus, the superior temporal sulcus, and
the inferior parietal lobule were specifically responsive to AM. Global
tuning curves in these regions suggest that the human auditory system is
organized as a hierarchical filter bank, each processing level responding
preferentially to a given AM frequency: 256 Hz for the lower brainstem,
32-256 Hz
for the inferior colliculus, 16 Hz for the medial geniculate body, 8 Hz
for the primary auditory cortex, and 4-8 Hz for secondary regions. The
time course of the hemodynamic responses showed sustained and transient
components with opposite frequency-dependent patterns: the lower the AM
frequency, the
better the fit with a sustained response model; the higher the AM
frequency, the better the fit with a transient response model. Using
cortical maps of best modulation frequency, we demonstrate that the
spatial representation
of AM frequencies varies according to the response type. Sustained
responses yield maps of low frequencies organized in large clusters.
Transient responses yield maps of high frequencies represented by a mosaic
of small clusters. Very few voxels were tuned to intermediate frequencies
(32-64 Hz). We
did not find spatial gradients of AM frequencies associated with any
response type. Our results suggest that two frequency ranges (up to 16 Hz,
and 128 Hz and above) are represented in the cortex by different response
types. However, the spatial segregation of these two ranges is not
systematic. Most cortical regions were tuned to low frequencies and only a
few to high frequencies. Yet voxels that showed a preference for low
frequencies were also responsive to high frequencies. Overall, our study
shows that the temporal envelope of sounds is processed by both distinct
(hierarchically organized series of filters) and shared (high and low AM
frequencies eliciting different
responses at the same cortical locus) neural substrates. This layout
suggests that the human auditory system is organized in a parallel fashion
that
allows a degree of separate routing for groups of AM frequencies conveying
different information and preserves a possibility for integration of
complementary features in cortical auditory regions.
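
The stimulus described above (white noise with a sinusoidal amplitude
envelope) can be sketched as follows. This is only an illustration, not
the authors' stimulus code: the function name `sam_noise`, the sampling
rate, the 100% modulation depth, the Gaussian carrier, and the
octave-spaced frequency list are all assumptions, since the abstract
states only the 4 to 256 Hz range.

```python
import math
import random


def sam_noise(fm_hz, duration_s=1.0, fs_hz=8000, depth=1.0, seed=0):
    """Return Gaussian white noise multiplied by a sinusoidal envelope.

    fm_hz      : AM (envelope) frequency in Hz
    duration_s : stimulus duration in seconds (assumed value)
    fs_hz      : sampling rate in Hz (assumed value)
    depth      : modulation depth, 0..1 (assumed; not given in the abstract)
    seed       : seed for the noise carrier, so runs are reproducible
    """
    rng = random.Random(seed)
    n_samples = int(duration_s * fs_hz)
    samples = []
    for i in range(n_samples):
        t = i / fs_hz
        # Envelope oscillates between (1 - depth) and (1 + depth).
        envelope = 1.0 + depth * math.sin(2.0 * math.pi * fm_hz * t)
        # Broadband carrier: one independent Gaussian draw per sample.
        carrier = rng.gauss(0.0, 1.0)
        samples.append(envelope * carrier)
    return samples


# Hypothetical octave-spaced AM frequencies covering 4 to 256 Hz.
AM_FREQUENCIES_HZ = [4 * 2 ** k for k in range(7)]

# One stimulus per AM frequency, keyed by frequency in Hz.
stimuli = {fm: sam_noise(fm, duration_s=0.25) for fm in AM_FREQUENCIES_HZ}
```

With a modulation depth of 1, the envelope reaches zero once per
modulation cycle, so the noise is fully silenced at each envelope trough;
a smaller depth would leave a residual noise floor between peaks.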