Application of multidimensional scaling to subjective evaluation of coded speech

Authors
Citation
J. L. Hall, Application of multidimensional scaling to subjective evaluation of coded speech, J. Acoust. Soc. Am., 110(4), 2001, pp. 2167-2182
Number of citations
15
Subject Categories
Multidisciplinary, Optics & Acoustics
Journal title
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA
Journal ISSN
0001-4966
Volume
110
Issue
4
Year of publication
2001
Pages
2167 - 2182
Database
ISI
SICI code
0001-4966(200110)110:4<2167:AOMSTS>2.0.ZU;2-S
Abstract
We present results from a pilot study directed at developing an anchorable subjective speech quality test. The test uses multidimensional scaling techniques to obtain quantitative information about the perceptual attributes of speech. In the first phase of the study, subjects ranked perceptual distances between samples of speech produced by two different talkers, one male and one female, processed by a variety of codecs. The resulting distance matrices were processed to obtain, for each talker, a stimulus space for the various speech samples. This stimulus space has the properties that distances between stimuli in this space correspond to perceptual distances between stimuli and that the dimensions of this space correspond to attributes used by the subjects in determining perceptual distances. Mean opinion scores (MOS) obtained in an earlier study were found to be highly correlated with position in the stimulus space, and the three dimensions of the stimulus space were found to have identifiable physical and perceptual correlates. In the second phase of the study, we developed techniques for fitting speech generated by a new codec under investigation into a previously established stimulus space. The user is provided with a collection of speech samples and with the stimulus space for these speech samples as determined by a large-scale listening test. The user then carries out a much smaller listening test to determine the position of the new stimulus in the previously established stimulus space. This system is anchorable, so that different versions of a codec under development can be compared directly, and it provides more detailed information than the single number provided by MOS testing. We suggest that this information could be used to advantage in algorithm development and in development of objective measures of speech quality. (C) 2001 Acoustical Society of America.
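The two phases described in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not name the scaling algorithm or the fitting technique, so classical (Torgerson) metric MDS stands in for the first phase and a linearized least-squares placement stands in for the second. The function names `classical_mds` and `place_new_stimulus` are hypothetical, and the sketch assumes numeric distances rather than the rank-order judgments collected in the study.

```python
import numpy as np


def classical_mds(dist, n_dims=3):
    """Embed stimuli from a symmetric distance matrix (classical metric MDS).

    Assumption: a stand-in for whatever scaling method the study used.
    """
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double-center the squared distances to get a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the n_dims largest eigenvalues; clip tiny negatives from round-off.
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def place_new_stimulus(anchors, dists_to_anchors):
    """Place a new stimulus in an existing space from its distances to anchors.

    Assumption: a linearized least-squares fit, chosen for illustration only.
    Needs at least n_dims + 1 anchors in general position.
    """
    anchors = np.asarray(anchors, dtype=float)
    d = np.asarray(dists_to_anchors, dtype=float)
    # Subtracting the first anchor's sphere equation from the others
    # removes the quadratic term and leaves a linear system in x.
    a_mat = 2.0 * (anchors[1:] - anchors[0])
    b_vec = (np.sum(anchors[1:] ** 2, axis=1) - np.sum(anchors[0] ** 2)
             - d[1:] ** 2 + d[0] ** 2)
    x, *_ = np.linalg.lstsq(a_mat, b_vec, rcond=None)
    return x
```

In this sketch, the established stimulus space would come from `classical_mds` applied to the large-scale listening-test distances, and a new codec sample would then be positioned with `place_new_stimulus` using only its distances to the anchor samples, which mirrors the smaller follow-up test described above.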