Gaussian mixtures provide a convenient method of density estimation that lies somewhere between parametric models and kernel density estimators. When the number of components of the mixture is allowed to increase as the sample size increases, the model is called a mixture sieve. We establish a bound on the rate of convergence in Hellinger distance for density estimation using the Gaussian mixture sieve, assuming that the true density is itself a mixture of Gaussians; the underlying mixing measure of the true density is not necessarily assumed to have finite support. Computing the rate involves some delicate calculations, since the size of the sieve, as measured by bracketing entropy, and the saturation rate cannot be found using standard methods. When the mixing measure has compact support, using k_n ~ n^{2/3}/(log n)^{1/3} components in the mixture yields a rate of order (log n)^{(1+eta)/6}/n^{1/6} for every eta > 0. The rates depend heavily on the tail behavior of the true density. The sensitivity to the tail behavior is diminished by using a robust sieve which includes a long-tailed component in the mixture. In the compact case, we obtain an improved rate of (log n/n)^{1/4}. In the noncompact case, a spectrum of interesting rates arises depending on the thickness of the tails of the mixing measure.
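The rate statements above can be restated in display form. This is only a typeset summary of the abstract's claims; the notation d_H for Hellinger distance, f-hat_n for the sieve estimator, and f_0 for the true density is assumed here, not fixed by the text.

```latex
% Compact mixing measure, plain Gaussian sieve with k_n components:
\[
  k_n \sim \frac{n^{2/3}}{(\log n)^{1/3}}
  \quad\Longrightarrow\quad
  d_H\bigl(\hat{f}_n, f_0\bigr)
  = O\!\left(\frac{(\log n)^{(1+\eta)/6}}{n^{1/6}}\right)
  \quad \text{for every } \eta > 0.
\]
% Compact case, robust sieve with a long-tailed component added:
\[
  d_H\bigl(\hat{f}_n, f_0\bigr)
  = O\!\left(\left(\frac{\log n}{n}\right)^{1/4}\right).
\]
```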