MODELING TRANSPOSITIONAL INVARIANCY OF MELODY RECOGNITION WITH AN ATTRACTOR NEURAL-NETWORK

Authors
Citation
L. Benuskova, MODELING TRANSPOSITIONAL INVARIANCY OF MELODY RECOGNITION WITH AN ATTRACTOR NEURAL-NETWORK, Network, 6(3), 1995, pp. 313-331
Citations number
38
Subject Categories
Mathematical Methods, Biology & Medicine; Neurosciences; Engineering, Electrical & Electronic; Computer Science, Artificial Intelligence
Journal title
ISSN journal
0954898X
Volume
6
Issue
3
Year of publication
1995
Pages
313 - 331
Database
ISI
SICI code
0954-898X(1995)6:3<313:MTIOMR>2.0.ZU;2-E
Abstract
We present the attractor neural network (ANN) model that accounts for invariancy of melody recognition under transposition, modulated by transposition distance effect, while serving as a memory for tone sequences. Recognition is performed by an ANN with fast and slow synapses designed for storage and recognition of sequences of patterns where the recognition is defined as a completed set of transitions from one quasi-attractor to another. In our model, the sequence of ANN states evoked by the transposed melody is transformed into the sequence of perceptual templates of tones composing the original untransposed melody. A transposed tone first initiates a process of transposition-invariant recall of the original tone pattern. If this transposition-invariant recall was successful, the recalled state serves for auto-associative retrieval of the corresponding pattern in a predetermined sequence. The tone patterns are combinations of parallel stripes of active neurons representing the active isofrequency bands in the auditory cortex which are orthogonal to the low-to-high frequency gradient. Such a representation allows for treating the problem of transposition-invariant recognition of the tone in the sequence as a translation-invariant retrieval of its stripe representation. The translation-invariant retrieval of the tone pattern is accomplished by means of the modified algorithm of Dotsenko (1988 J. Phys. A: Math. Gen. 21 L783-7) proposed for translation-, rotation- and size-invariant pattern recognition, which uses relaxation of neuronal firing thresholds to guide the ANN evolution in the state space towards the desired memory attractor. The dynamics of neuronal relaxation is modified for storage and retrieval of low-activity patterns and the original gradient optimization of threshold dynamics is replaced with optimization by simulated annealing. The proposed ANN model can be generalized for the transposition-invariant recognition of unharmonic sounds, for instance speech.
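Illustrative note: the abstract's central idea, that transposing a tone amounts to translating its stripe pattern along the frequency axis, can be sketched in a few lines of Python. This is a toy illustration only, not the model from the paper. It swaps in a brute-force search over shifts for the threshold-relaxation and simulated-annealing dynamics, and the function names, grid size, and circular shift are all assumptions made for the example.

```python
import numpy as np

def tone_pattern(active_bands, n_bands=24, n_rows=4):
    """Binary grid with one stripe (full column) per active isofrequency band.
    Grid dimensions are arbitrary choices for this sketch."""
    grid = np.zeros((n_rows, n_bands), dtype=int)
    grid[:, list(active_bands)] = 1
    return grid

def transpose(pattern, shift):
    """Model transposition as a shift along the frequency axis.
    The shift is circular here purely for simplicity (an assumption)."""
    return np.roll(pattern, shift, axis=1)

def recall_untransposed(probe, templates):
    """Brute-force stand-in for translation-invariant retrieval:
    try every shift and return (template index, shift) with the largest overlap."""
    best_overlap, best_idx, best_shift = -1, None, None
    n_bands = probe.shape[1]
    for idx, template in enumerate(templates):
        for shift in range(n_bands):
            # Undo a candidate shift and count coinciding active units.
            overlap = int(np.sum(np.roll(probe, -shift, axis=1) * template))
            if overlap > best_overlap:
                best_overlap, best_idx, best_shift = overlap, idx, shift
    return best_idx, best_shift

# Two stored low-activity tone patterns and a transposed probe.
templates = [tone_pattern([2, 4, 7]), tone_pattern([3, 8, 15])]
probe = transpose(templates[1], 5)
print(recall_untransposed(probe, templates))  # (1, 5)
```

In this sketch the recovered shift plays the role of the transposition distance, and the recovered index identifies the stored tone template that would then cue the next pattern in the sequence.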