H.R. Rubin et al., HOW RELIABLE IS PEER-REVIEW OF SCIENTIFIC ABSTRACTS - LOOKING BACK AT THE 1991 ANNUAL-MEETING OF THE SOCIETY OF GENERAL INTERNAL-MEDICINE, Journal of General Internal Medicine, 8(5), 1993, pp. 255-258
Objective: To evaluate the interrater reproducibility of scientific abstract review.
Design: Retrospective analysis.
Setting: Review for the 1991 Society of General Internal Medicine (SGIM) annual meeting.
Subjects: 426 abstracts in seven topic categories evaluated by 55 reviewers.
Measurements: Reviewers rated abstracts from 1 (poor) to 5 (excellent), globally and on three specific dimensions: interest to the SGIM audience, quality of methods, and quality of presentation. Each abstract was reviewed by five to seven reviewers. Each reviewer's ratings of the three dimensions were added to compute that reviewer's summary score for a given abstract. The mean of all reviewers' summary scores for an abstract, the final score, was used by SGIM to select abstracts for the meeting.
Results: Final scores ranged from 4.6 to 13.6 (mean = 9.9). Although 222 abstracts (52%) were accepted for publication, the 95% confidence interval around the final score of 300 (70.4%) of the 426 abstracts overlapped with the threshold for acceptance of an abstract. Thus, these abstracts were potentially misclassified. Only 36% of the variance in summary scores was associated with an abstract's identity, 12% with the reviewer's identity, and the remainder with idiosyncratic reviews of abstracts. Global ratings were more reproducible than summary scores.
Conclusion: Reviewers disagreed substantially when evaluating the same abstracts. Future meeting organizers may wish to rank abstracts using global ratings, and to experiment with structured review criteria and other ways to improve raters' agreement.
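The scoring procedure described under Measurements is simple arithmetic: sum one reviewer's three dimension ratings to get that reviewer's summary score, then average the summary scores across reviewers to get the final score. The following is a minimal sketch of that arithmetic, assuming hypothetical ratings and an assumed acceptance threshold (neither is given in the abstract); it is an illustration, not the authors' actual analysis code.

    # Minimal sketch of the SGIM scoring arithmetic described in the abstract.
    # Ratings and the acceptance threshold below are hypothetical example values.
    import math
    import statistics

    # Each inner list holds one reviewer's 1-5 ratings on the three dimensions:
    # interest to the SGIM audience, quality of methods, quality of presentation.
    reviewer_ratings = [
        [4, 3, 4],
        [3, 3, 3],
        [5, 4, 4],
        [2, 3, 3],
        [4, 4, 5],
    ]

    # Summary score: one reviewer's three dimension ratings added together (range 3-15).
    summary_scores = [sum(r) for r in reviewer_ratings]

    # Final score: mean of all reviewers' summary scores for this abstract.
    final_score = statistics.mean(summary_scores)

    # 95% confidence interval around the final score (normal approximation on the
    # standard error of the mean), used to check overlap with the acceptance cutoff.
    se = statistics.stdev(summary_scores) / math.sqrt(len(summary_scores))
    ci_low, ci_high = final_score - 1.96 * se, final_score + 1.96 * se

    ACCEPTANCE_THRESHOLD = 10.0  # assumed for illustration; not reported in the abstract
    potentially_misclassified = ci_low <= ACCEPTANCE_THRESHOLD <= ci_high

    print(f"final score = {final_score:.1f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
    print(f"CI overlaps acceptance threshold: {potentially_misclassified}")

An abstract whose confidence interval straddles the threshold, as in this example, is what the Results section counts as potentially misclassified.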