WORD BOUNDARY HYPOTHESIZATION FOR CONTINUOUS SPEECH IN HINDI BASED ON F-0 PATTERNS

Citation
S. Rajendran and B. Yegnanarayana, WORD BOUNDARY HYPOTHESIZATION FOR CONTINUOUS SPEECH IN HINDI BASED ON F-0 PATTERNS, Speech Communication, 18(1), 1996, pp. 21-46
Number of citations
32
Subject Categories
Communication; Language & Linguistics
Journal title
Speech Communication
Journal ISSN
0167-6393
Volume
18
Issue
1
Year of publication
1996
Pages
21 - 46
Database
ISI
SICI code
0167-6393(1996)18:1<21:WBHFCS>2.0.ZU;2-Y
Abstract
This paper proposes an algorithm based on F-0 patterns to hypothesize word boundaries and function words in continuous speech in Hindi. It makes use of properties of the F-0 contour such as the declination tendency, resetting and fall-rise patterns in Hindi. The syllabic units are identified by using the energy contour, pitch and the first-order LP coefficient. Each syllabic unit is assigned an accent value L (Low), H or h (High) by (i) comparing the F-0 value at the midpoint of each syllabic nucleus with that of the previous syllabic unit and (ii) comparing the F-0 values at two different points within each syllabic unit in a sequence having an accent value L. Word boundaries are placed between the adjacent syllabic units (i) H and L, (ii) h and L, (iii) L and L, (iv) L and h and (v) H and h. An evaluation conducted on a corpus of 50 sentences in Hindi read aloud by five native speakers in an ordinary office environment showed that about 74 percent of the word boundaries and about 28 percent of the function words were correctly identified. The results of the word boundary hypothesization can be used to improve the performance of the acoustic-phonetic, lexical and syntactic modules in a speech-to-text conversion system. The robustness of the algorithm in handling noisy speech input and telephone speech is also discussed.
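
The boundary-placement rule listed in the abstract can be illustrated with a minimal Python sketch. It assumes the accent labels (L, H, h) have already been assigned to each syllabic unit; the abstract does not give enough detail to reproduce the accent assignment itself, so that step is left out. The function name `hypothesize_boundaries` and the example label sequence are hypothetical and are not taken from the paper.

```python
# Adjacent accent pairs (left unit, right unit) between which the abstract
# says a word boundary is placed: H-L, h-L, L-L, L-h and H-h.
BOUNDARY_PAIRS = {("H", "L"), ("h", "L"), ("L", "L"), ("L", "h"), ("H", "h")}


def hypothesize_boundaries(accents):
    """Return every index i where a boundary is hypothesized between
    accents[i] and accents[i + 1].

    `accents` is a list of per-syllable accent labels ("L", "H" or "h").
    This is an illustrative helper, not the authors' implementation.
    """
    pairs = zip(accents, accents[1:])
    return [i for i, pair in enumerate(pairs) if pair in BOUNDARY_PAIRS]


# Hypothetical accent sequence for six syllabic units (not real data).
example = ["L", "H", "L", "h", "H", "L"]
print(hypothesize_boundaries(example))
```

In this made-up sequence, boundaries are hypothesized after the units at indices 1, 2 and 4 (the pairs H-L, L-h and H-L), while the pairs L-H and h-H are treated as word-internal because they are not in the list of boundary pairs.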