DERIVING GESTURAL SCORES FROM ARTICULATOR-MOVEMENT RECORDS USING WEIGHTED TEMPORAL DECOMPOSITION

Citation
Tp. Jung et al., DERIVING GESTURAL SCORES FROM ARTICULATOR-MOVEMENT RECORDS USING WEIGHTED TEMPORAL DECOMPOSITION, IEEE Transactions on Speech and Audio Processing, 4(1), 1996, pp. 2-18
Number of citations
23
Subject Categories
Engineering, Electrical & Electronic; Acoustics
Journal ISSN
1063-6676
Volume
4
Issue
1
Year of publication
1996
Pages
2 - 18
Database
ISI
SICI code
1063-6676(1996)4:1<2:DGSFAR>2.0.ZU;2-D
Abstract
A computational model to map from articulatory data to an articulatory-phonetic representation is examined in this paper. The approach uses positional values tracked by X-ray microbeam of two lip pellets and four tongue pellets nonlinearly transformed to a new Cartesian space in which the new x and y values represent the distance of the pellets going back along the opposing vocal tract wall and the distance perpendicular to the tract wall. The transformed articulatory data, as well as the simultaneously recorded electroglottograph data, then serve as the input representation for the computational model that makes use of temporal decomposition to model multichannel trajectories. Temporal decomposition constructs a set of target functions from data-derived basis functions using a form of adaptive Gauss-Seidel iteration. The resultant target functions, in conjunction with the weights for each basis function, are then used to derive the articulatory-phonetic representation called a gestural score. This method is applied to the task of estimating the gestural score for various CVC syllables embedded in frame sentences in the stimulus set. To determine the adequacy of the derived gestural scores, two evaluations were performed: a perception test and a classification test using an automatic recognizer based on a neural network model. High recognition rates from both the perceptual experiments and the automatic recognizers support the hypothesis that sufficient information is available in the resultant gestural scores to allow accurate identification of the phonetic elements.
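
Illustrative sketch (not from the paper): the abstract does not give the details of the weighted, adaptive procedure, so the Python code below only illustrates the general idea of temporal decomposition under stated assumptions. It approximates a multichannel trajectory matrix Y (channels x frames) as a product A @ Phi, where the rows of Phi play the role of time-localized functions and A holds the per-channel weights, and it solves each least-squares subproblem with plain Gauss-Seidel sweeps. The function names, the alternating scheme, the small ridge term, and the synthetic data are all hypothetical choices made for this illustration.

```python
import numpy as np

def gauss_seidel(M, b, x0, sweeps=20):
    """Approximately solve M x = b (M symmetric positive definite)."""
    x = np.array(x0, dtype=float)
    for _ in range(sweeps):
        for i in range(len(b)):
            # Sum over the other coordinates, using their newest values.
            s = M[i] @ x - M[i, i] * x[i]
            x[i] = (b[i] - s) / M[i, i]
    return x

def decompose(Y, Phi0, outer=10, sweeps=20, ridge=1e-8):
    """Alternately refine weights A and functions Phi so that Y ~ A @ Phi.

    Generic alternating least squares with Gauss-Seidel inner solves;
    an assumption for illustration, not the paper's algorithm.
    """
    Phi = np.array(Phi0, dtype=float)
    C, T = Y.shape
    K = Phi.shape[0]
    A = np.zeros((C, K))
    for _ in range(outer):
        # Weights: normal equations (Phi Phi^T) a_c = Phi y_c per channel.
        G = Phi @ Phi.T + ridge * np.eye(K)
        for c in range(C):
            A[c] = gauss_seidel(G, Phi @ Y[c], A[c], sweeps)
        # Functions: normal equations (A^T A) phi_t = A^T y_t per frame.
        H = A.T @ A + ridge * np.eye(K)
        for t in range(T):
            Phi[:, t] = gauss_seidel(H, A.T @ Y[:, t], Phi[:, t], sweeps)
    return A, Phi

def bumps(centers, width, T):
    """Gaussian bumps over T frames, one row per center."""
    t = np.arange(T)
    return np.exp(-((t[None, :] - np.array(centers)[:, None]) ** 2)
                  / (2.0 * width ** 2))

# Synthetic demo: 6 channels built from 3 overlapping bumps.
T = 100
Phi_true = bumps([20, 50, 80], 8.0, T)
A_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                   [1, 1, 0], [0, 1, 1], [1, 0, 1]], dtype=float)
Y = A_true @ Phi_true

# Start from deliberately shifted, wider bumps.
A_hat, Phi_hat = decompose(Y, bumps([23, 53, 83], 10.0, T))
rel_err = np.linalg.norm(Y - A_hat @ Phi_hat) / np.linalg.norm(Y)
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this toy setting the factorization is only unique up to an invertible mixing of the rows of Phi, so the recovered functions need not match the generating bumps exactly. Practical temporal decomposition schemes add constraints such as time localization to make the components interpretable.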