This article reports results from a program that produces high-quality animation of facial expressions and head movements, as automatically as possible, in conjunction with meaning-based speech synthesis, including spoken intonation. The goal of the research is as much to test and define our theories of the formal semantics for such gestures as to produce convincing animation. Toward this end, we have produced a high-level programming language for three-dimensional (3-D) animation of facial expressions. We have been concerned primarily with expressions conveying information correlated with the intonation of the voice: these include the differences of timing, pitch, and emphasis that are related to such semantic distinctions of discourse as "focus," "topic" and "comment," "theme" and "rheme," or "given" and "new" information. We are also interested in the relation of affect or emotion to facial expression. Until now, systems have not embodied such rule-governed translation from spoken utterance meaning to facial expressions. Our system embodies rules that describe and coordinate these relations: intonation/information, intonation/affect, and facial expressions/affect. A meaning representation includes discourse information: what is contrastive/background information in the given context, and what is the "topic" or "theme" of the discourse? The system maps the meaning representation into how accents and their placement are chosen, how they are conveyed over facial expression, and how speech and facial expressions are coordinated. This determines a sequence of functional groups: lip shapes, conversational signals, punctuators, regulators, and manipulators. Our algorithms then impose synchrony, create coarticulation effects, and determine affectual signals, eye and head movements. The lowest-level representation is the Facial Action Coding System (FACS), which makes the generation system portable to other facial models.
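The rule-governed mapping described above, from discourse information structure to pitch accents and facial action units, can be sketched roughly as follows. This is an illustrative simplification, not the paper's implementation: the rule table, the accent labels, and the choice of FACS action units (AU1: inner brow raiser, AU2: outer brow raiser, AU4: brow lowerer) are assumptions for demonstration only.

```python
# Hypothetical sketch: map each word's discourse role (theme/rheme) and
# information status (given/new/contrastive) to a pitch accent and a set
# of FACS action units acting as conversational signals.
ACCENT_RULES = {
    # (role, status) -> (pitch accent, facial action units)
    ("rheme", "new"): ("H*", ["AU1", "AU2"]),            # accent + brow raise
    ("rheme", "contrastive"): ("L+H*", ["AU1", "AU2", "AU4"]),
    ("theme", "given"): (None, []),                       # deaccented, neutral
}

def annotate(words):
    """Attach an accent and action units to each (word, role, status) triple."""
    out = []
    for word, role, status in words:
        accent, aus = ACCENT_RULES.get((role, status), (None, []))
        out.append({"word": word, "accent": accent, "aus": aus})
    return out

utterance = [
    ("Anna", "theme", "given"),
    ("married", "rheme", "new"),
]
for token in annotate(utterance):
    print(token["word"], token["accent"], token["aus"])
```

In a full system of the kind the abstract describes, a layer like this would feed the later stages that impose synchrony and coarticulation before the FACS-level output drives the face model.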