A unified neural-network-based speaker localization technique

Citation
G. Arslan and F.A. Sakarya, A unified neural-network-based speaker localization technique, IEEE NEURAL, 11(4), 2000, pp. 997-1002
Citations number
9
Subject Categories
AI Robotics and Automatic Control
Journal title
IEEE TRANSACTIONS ON NEURAL NETWORKS
ISSN journal
1045-9227
Volume
11
Issue
4
Year of publication
2000
Pages
997 - 1002
Database
ISI
SICI code
1045-9227(200007)11:4<997:AUNSLT>2.0.ZU;2-L
Abstract
Locating and tracking a speaker in real time using microphone arrays is important in many applications such as hands-free video conferencing, speech processing in large rooms, and acoustic echo cancellation. A speaker can be moving from the far field to the near field of the array, or vice versa. Many neural-network-based localization techniques exist, but they are applicable to either far-field or near-field sources, and are computationally intensive for real-time speaker localization applications because of the wide-band nature of speech. We propose a unified neural-network-based source localization technique, which is simultaneously applicable to wide-band and narrow-band signal sources that are in the far field or near field of a microphone array. The technique exploits a multilayer perceptron feedforward neural network structure and forms the feature vectors by computing the normalized instantaneous cross-power spectrum samples between adjacent pairs of sensors. Simulation results indicate that our technique is able to locate a source with an absolute error of less than 3.5 degrees at a signal-to-noise ratio of 20 dB and a sampling rate of 8000 Hz at each sensor.
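
Illustrative sketch: the abstract describes feature vectors built from normalized instantaneous cross-power spectrum samples between adjacent sensor pairs, which are then fed to a multilayer perceptron. The Python snippet below is a minimal, hypothetical rendering of that feature-extraction step; the function name, FFT size, number of retained bins, and unit-magnitude normalization are assumptions for illustration and are not taken from the paper.

import numpy as np

def crosspower_features(frames, n_fft=256, n_bins=32):
    """Hypothetical feature extraction for an MLP-based localizer.

    frames : array of shape (n_sensors, frame_len), one time frame per sensor.
    Returns a 1-D vector of real/imaginary parts of the normalized
    cross-power spectra of adjacent sensor pairs (assumed layout).
    """
    spectra = np.fft.rfft(frames, n=n_fft, axis=1)        # per-sensor spectra
    feats = []
    for i in range(spectra.shape[0] - 1):                  # adjacent pairs only
        cross = spectra[i] * np.conj(spectra[i + 1])       # instantaneous cross-power spectrum
        cross /= np.abs(cross) + 1e-12                     # normalize samples to unit magnitude (assumption)
        band = cross[1:n_bins + 1]                         # keep a band of low-frequency bins (assumption)
        feats.append(np.concatenate([band.real, band.imag]))
    return np.concatenate(feats)

Such a vector could then serve as input to any feedforward MLP regressor that maps features to the source direction; the specific network architecture and training setup are not specified in the abstract.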