K. Gopalan et al., A comparison of speaker identification results using features based on cepstrum and Fourier-Bessel expansion, IEEE SPEECH, 7(3), 1999, pp. 289-294
A compact representation of speech is possible using Bessel functions
because of the similarity between voiced speech and the Bessel functions.
Both voiced speech and the Bessel functions exhibit quasiperiodicity and
decaying amplitude with time. This paper presents the results of speaker
identification experiments using features obtained from 1) the
Fourier-Bessel expansion and 2) the cepstral representation of speech
frames. Identification scores of 65% and 76% were achieved using features
based on the J_1(t) expansion of air-to-ground speech transmission
databases of 143 and 1054 test utterances, respectively. The corresponding
scores for the two databases using cepstral coefficients of a comparable
size were 80% and 88%. A comparison of the two sets of features indicates
that J_1(t) can be used to model hearing perception much like the mel
cepstral coefficients.
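
As an illustration of the first feature type, the sketch below computes
Fourier-Bessel coefficients of a single frame using the textbook series
x(t) = sum_m C_m J_1(lambda_m t / a) on (0, a), where lambda_m are the
positive zeros of J_1 and C_m = 2 / (a^2 J_2(lambda_m)^2) times the integral
of t x(t) J_1(lambda_m t / a) over (0, a). It is not the authors'
implementation: the function names (bessel_j, j1_zeros,
fourier_bessel_coeffs), the choice a = frame length in samples, and the
simple quadrature are all assumptions made for this example, and the
abstract does not say how the coefficients were turned into features.

```python
import math


def bessel_j(order, x, steps=100):
    """Integer-order Bessel function J_order(x).

    Uses the integral form (1/pi) * int_0^pi cos(order*tau - x*sin(tau)) dtau
    with a midpoint rule. Assumption: 100 steps is adequate for the modest
    arguments used here (x up to a few tens).
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += math.cos(order * tau - x * math.sin(tau))
    return total / steps


def j1_zeros(count):
    """First `count` positive zeros of J_1, found by bisection.

    Each bracket is centred on the asymptotic estimate (m + 1/4) * pi.
    """
    zeros = []
    for m in range(1, count + 1):
        guess = (m + 0.25) * math.pi
        lo, hi = guess - 0.5, guess + 0.5
        f_lo = bessel_j(1, lo)
        for _ in range(50):
            mid = 0.5 * (lo + hi)
            f_mid = bessel_j(1, mid)
            if f_lo * f_mid <= 0.0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        zeros.append(0.5 * (lo + hi))
    return zeros


def fourier_bessel_coeffs(frame, num_coeffs):
    """Fourier-Bessel (J_1) coefficients C_1..C_num_coeffs of one frame.

    Assumption: sample k sits at t = k and the interval length is
    a = len(frame), so the integral becomes a unit-spacing sum.
    """
    n = len(frame)
    coeffs = []
    for lam in j1_zeros(num_coeffs):
        integral = sum(
            k * frame[k] * bessel_j(1, lam * k / n) for k in range(n)
        )
        norm = 0.5 * n * n * bessel_j(2, lam) ** 2
        coeffs.append(integral / norm)
    return coeffs
```

Calling fourier_bessel_coeffs(frame, 20) on a list of samples would return
the first 20 coefficients for that frame. A frame that is itself a single
basis function J_1(lambda_m t / a) should produce a coefficient near 1 at
index m - 1 and values near 0 elsewhere, which is a convenient sanity check.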