TESTING THE EQUIVALENCE OF TRANSLATIONS OF WIDELY USED RESPONSE CHOICE LABELS - RESULTS FROM THE IQOLA PROJECT

Citation
S.D. Keller et al., TESTING THE EQUIVALENCE OF TRANSLATIONS OF WIDELY USED RESPONSE CHOICE LABELS - RESULTS FROM THE IQOLA PROJECT, Journal of Clinical Epidemiology, 51(11), 1998, pp. 933-944
Number of citations
60
Subject Categories
Public, Environmental & Occupational Health
ISSN journal
08954356
Volume
51
Issue
11
Year of publication
1998
Pages
933 - 944
Database
ISI
SICI code
0895-4356(1998)51:11<933:TTEOTO>2.0.ZU;2-D
Abstract
The similarity in meaning assigned to response choice labels from the SF-36 Health Survey (SF-36) was evaluated across countries. Convenience samples of judges (range, 10 to 117; median = 48) from 13 countries rated translations of response choice labels, using a variation of the Thurstone method of equal-appearing intervals. Judges marked a point on a 10-cm line representing the magnitude of a response choice label (e.g., "good" relative to the anchors of "poor" and "excellent"). Ratings were evaluated to determine the ordinal consistency of response choice labels within a response scale; the degree to which differences between adjacent response choice labels were equal interval; and the amount of variance due to response choice label, country, judge, and interaction between response choice label and country. Results confirmed the hypothesized ordering of response choice labels; the percentage of ordinal pairs ranged from 88.7% to 100% (median = 98.2%) across countries and response scales. Examination of the average magnitudes of response choice labels supported the "quasi-interval" nature of the scales. Analysis of variance (ANOVA) results supported the generalizability of response choice magnitudes across countries; labels explained 64% to 77% of the variance in ratings, and country explained 1% to 3%. These results support the equivalence of SF-36 response choice labels across countries. Departures from the assumption of equal intervals, when observed, were similar across countries and were greatest for the two response scales that are recalibrated under standard SF-36 scoring. Results provide justification for scoring translations of individual items using standard SF-36 scoring; whether these items form the same scales in other countries as they do in the United States is evaluated with tests of scaling assumptions. (C) 1998 Elsevier Science Inc.