SPEAKER IDENTIFICATION BASED ON THE USE OF ROBUST CEPSTRAL FEATURES OBTAINED FROM POLE-ZERO TRANSFER-FUNCTIONS

Authors

ZILOVIC MS RAMACHANDRAN RP MAMMONE RJ

Citation

M.S. Zilovic et al., SPEAKER IDENTIFICATION BASED ON THE USE OF ROBUST CEPSTRAL FEATURES OBTAINED FROM POLE-ZERO TRANSFER-FUNCTIONS, IEEE transactions on speech and audio processing, 6(3), 1998, pp. 260-267

Citations number

Subject categories

Engineering, Electrical & Electronic; Acoustics

Journal title

IEEE transactions on speech and audio processing

ISSN journal

1063-6676

Volume

6

Issue

3

Year of publication

1998

Pages

260 - 267

Database

ISI

SICI code

1063-6676(1998)6:3<260:SIBOTU>2.0.ZU;2-4

Abstract

A common problem in speaker identification systems is that a mismatch in the training and testing conditions sacrifices much performance. We attempt to alleviate this problem by proposing new features that show less variation when speech is corrupted by convolutional noise (channel) and/or additive noise. The conventional feature used is the linear predictive (LP) cepstrum that is derived from an all-pole transfer function which, in turn, achieves a good approximation to the spectral envelope of the speech. Recently, a new cepstral feature based on a pole-zero function (called the adaptive component weighted or ACW cepstrum) was introduced. We propose four additional new cepstral features based on pole-zero transfer functions. One is an alternative way of doing adaptive component weighting and is called the ACW2 cepstrum. Two others (known as the PFL1 cepstrum and the PFL2 cepstrum) are based on a pole-zero postfilter used in speech enhancement. Finally, an autoregressive moving-average (ARMA) analysis of speech results in a pole-zero transfer function describing the spectral envelope. The cepstrum of this transfer function is the feature. Experiments involving a closed set, text-independent and vector quantizer based speaker identification system are done to compare the various features. The TIMIT and King databases are used. The ACW and PFL1 features are the preferred features, since they do as well or better than the LP cepstrum for all the test conditions. The corresponding spectra show a clear emphasis of the formants and no spectral tilt. To enhance robustness, it is important to emphasize the formants. An accurate description of the spectral envelope is not required.
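
As an illustration of the conventional baseline named in the abstract, the LP cepstrum can be computed from the all-pole coefficients with the standard textbook recursion. The sketch below is not taken from the paper: the function name `lp_cepstrum`, the sign convention H(z) = 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, and the omission of the gain term c_0 are all assumptions made here. The pole-zero features (ACW, ACW2, PFL1, PFL2, ARMA) are not reproduced, since the abstract does not give their definitions.

```python
import numpy as np

def lp_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of H(z) = 1 / A(z).

    `a` holds a_1..a_p of A(z) = 1 + sum_k a_k z^-k (assumed convention).
    The gain term c_0 is not computed.
    """
    a = np.asarray(a, dtype=float)
    p = len(a)
    c = np.zeros(n_ceps)
    for n in range(1, n_ceps + 1):
        # Sum over k of k * c_k * a_{n-k}, restricted to 1 <= n-k <= p.
        acc = 0.0
        for k in range(max(1, n - p), n):
            acc += k * c[k - 1] * a[n - k - 1]
        a_n = a[n - 1] if n <= p else 0.0
        c[n - 1] = -a_n - acc / n
    return c

# Sanity check on a single real pole at r, where c_n = r**n / n in closed form.
r = 0.5
ceps = lp_cepstrum([-r], 4)
```

For a single pole the recursion can be checked against the closed form r^n / n, which is a convenient way to confirm the sign convention before applying it to real LP coefficients.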