WORD BOUNDARY HYPOTHESIZATION FOR CONTINUOUS SPEECH IN HINDI BASED ON F-0 PATTERNS

Authors

RAJENDRAN S; YEGNANARAYANA B

Citation

S. Rajendran and B. Yegnanarayana, WORD BOUNDARY HYPOTHESIZATION FOR CONTINUOUS SPEECH IN HINDI BASED ON F-0 PATTERNS, Speech Communication, 18(1), 1996, pp. 21-46

Subject Categories

Communication; Language & Linguistics

Journal title

Speech Communication

ISSN journal

0167-6393

Volume

18

Issue

1

Year of publication

1996

Pages

21 - 46

Database

ISI

SICI code

0167-6393(1996)18:1<21:WBHFCS>2.0.ZU;2-Y

Abstract

This paper proposes an algorithm based on F-0 patterns to hypothesize word boundaries and function words in continuous speech in Hindi. It makes use of the properties of the F-0 contour such as declination tendency, resetting and fall-rise patterns in Hindi. The syllabic units are identified by using the energy contour, pitch and the first order LP coefficient. Each syllabic unit is assigned an accent value L (Low), H or h (High) by (i) comparing the F-0 value at the mid point of each syllabic nucleus with that of the previous syllabic unit and (ii) comparing the F-0 values at two different points within each syllabic unit in a sequence having an accent value L. Word boundaries are placed between the adjacent syllabic units (i) H and L, (ii) h and L, (iii) L and L, (iv) L and h and (v) H and h. An evaluation conducted on a corpus of 50 sentences in Hindi read aloud by five native speakers in an ordinary office environment showed that about 74 percent of the word boundaries and about 28 percent of the function words were correctly identified. The results of the word boundary hypothesization can be used to improve the performance of the acoustic-phonetic, lexical and syntactic modules in a speech-to-text conversion system. Robustness of the algorithm in handling noisy speech input conditions and telephone speech is also discussed.
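
The boundary-placement rule stated in the abstract can be illustrated with a short sketch. This is not the authors' implementation: it assumes the accent values (L, H, h) have already been assigned to each syllabic unit, and the function name `hypothesize_boundaries` and the index convention for the output are hypothetical choices made here for illustration. The accent-assignment step is not reproduced because the abstract does not give its thresholds.

```python
# Adjacent accent pairs between which a word boundary is placed,
# exactly as enumerated in the abstract: (i) H-L, (ii) h-L, (iii) L-L,
# (iv) L-h, (v) H-h.
BOUNDARY_PAIRS = {("H", "L"), ("h", "L"), ("L", "L"), ("L", "h"), ("H", "h")}


def hypothesize_boundaries(accents):
    """Return each index i where a boundary falls between unit i and unit i+1.

    `accents` is a sequence of accent labels ("L", "H" or "h"), one per
    syllabic unit. The name and return format are assumptions for this sketch.
    """
    return [
        i
        for i, pair in enumerate(zip(accents, accents[1:]))
        if pair in BOUNDARY_PAIRS
    ]


if __name__ == "__main__":
    # Hypothetical accent sequence for seven syllabic units.
    print(hypothesize_boundaries(["L", "H", "L", "L", "h", "H", "h"]))
```

Pairs not listed in the abstract, such as L followed by H or h followed by H, produce no boundary in this sketch.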