T. Fukuhara and T. Murakami, 3-D MOTION ESTIMATION OF HUMAN HEAD FOR MODEL-BASED IMAGE-CODING, IEE proceedings. Part I. Communications, speech and vision, 140(1), 1993, pp. 26-35
Model-based image coding applied to interpersonal communication achieves very low bit-rate image transmission. To accomplish it, accurate three-dimensional (3-D) motion estimation of a speaker is necessary. A new method of 3-D motion estimation is presented, consisting of two steps. In the first, facial contours and feature points of a speaker are extracted using filtering and Snake algorithms. Five feature points on a speaker's facial image are tracked between consecutive picture frames, which gives 2-D motion vectors of the feature points. Then, in the second step, the 3-D motion of a speaker's head is estimated using a three-layered neural network model, after training with many possible motion patterns of the human head using an existing 3-D general shape model. Experimental results show that our method not only achieves good results but is also more robust than existing methods, even when the motion of an object is rather large or complicated. Accurately estimated 3-D motion parameters can realise image transmission at a very low bit rate.
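
The second step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature-point coordinates, the shape model, the projection, the network size, or the training procedure, so everything below is an assumption made for illustration. The five 3-D points in `MODEL_POINTS` are hypothetical, the motion is restricted to rotation, the projection is orthographic, and the network (10 inputs, 16 tanh hidden units, 3 linear outputs) is trained by plain gradient descent on synthetic motion patterns.

```python
import numpy as np

# Hypothetical 3-D positions of five facial feature points (arbitrary units).
# The paper's actual points and shape model are not given in the abstract.
MODEL_POINTS = np.array([
    [-0.3,  0.3, 0.00],   # left eye
    [ 0.3,  0.3, 0.00],   # right eye
    [ 0.0,  0.0, 0.30],   # nose tip
    [-0.2, -0.3, 0.05],   # left mouth corner
    [ 0.2, -0.3, 0.05],   # right mouth corner
])

def rotation_matrix(rx, ry, rz):
    """Rotation about x, then y, then z (angles in radians)."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def motion_vectors(angles):
    """2-D displacements of the five points (assumed orthographic projection)."""
    moved = MODEL_POINTS @ rotation_matrix(*angles).T
    return (moved[:, :2] - MODEL_POINTS[:, :2]).ravel()   # shape (10,)

def make_dataset(n, rng, max_angle=0.3):
    """Synthetic motion patterns: random head rotations and their 2-D vectors."""
    angles = rng.uniform(-max_angle, max_angle, size=(n, 3))
    vectors = np.array([motion_vectors(a) for a in angles])
    return vectors, angles

def train(X, Y, hidden=16, lr=0.05, epochs=2000, seed=0):
    """Three-layer network (input, tanh hidden, linear output), batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 1 / np.sqrt(X.shape[1]), (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    n = len(X)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        err = (H @ W2 + b2) - Y
        # Gradients of 0.5 * mean squared error
        gW2 = H.T @ err / n
        gb2 = err.mean(axis=0)
        dH = (err @ W2.T) * (1 - H ** 2)
        gW1 = X.T @ dH / n
        gb1 = dH.mean(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return W1, b1, W2, b2

def predict(params, X):
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(0)
X_train, Y_train = make_dataset(2000, rng)
X_test, Y_test = make_dataset(200, rng)
scale = X_train.std()                     # single global input scale
params = train(X_train / scale, Y_train)

test_mse = np.mean((predict(params, X_test / scale) - Y_test) ** 2)
baseline_mse = np.mean(Y_test ** 2)       # error of always predicting "no motion"
print(f"test MSE {test_mse:.5f} vs. baseline {baseline_mse:.5f}")
```

The sketch covers only the idea of learning the mapping from 2-D feature-point motion vectors to 3-D rotation parameters from model-generated examples. Translation, perspective projection, and the contour and feature extraction of the first step are left out.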