Objectives: The purpose of this study was to compare the performance of transient evoked otoacoustic emissions (TEOAEs), distortion product otoacoustic emissions (DPOAEs), and auditory brain stem responses (ABRs) as tools for identification of neonatal hearing impairment.
Design: A total of 4911 infants, including 4478 graduates of neonatal intensive care units, 353 well babies with one or more risk factors for hearing loss (Joint Committee on Infant Hearing, 1994), and 80 well babies without risk factors who did not pass one or more neonatal tests, were targeted as the potential subject pool on which test performance would be assessed. During the neonatal period, they were evaluated using TEOAEs in response to an 80 dB pSPL click, DPOAE responses to two stimulus conditions (L1 = L2 = 75 dB SPL and L1 = 65 dB SPL, L2 = 50 dB SPL), and ABR elicited by a 30 dB nHL click. In an effort to describe test performance, these "at-risk" infants were asked to return for behavioral audiologic assessments, using visual reinforcement audiometry (VRA) at 8 to 12 mo corrected age, regardless of neonatal test results. Sixty-four percent of these subjects returned, and reliable VRA data were obtained on 95.6% of these returnees. This approach is in contrast to previous studies in which, by necessity, efforts were made to follow only those infants who "failed" the neonatal screening tests. The accuracy of the neonatal measures in predicting hearing status at 8 to 12 mo corrected age was determined. Only those infants who provided reliable, monaural VRA test results were included in the analysis. Separate analyses were performed without regard to intercurrent events (i.e., events between the neonatal and VRA tests that could cause their results to disagree), and then after accounting for the possible influence of intercurrent events such as otitis media and late-onset or progressive hearing loss.
Results: Low refer rates were achieved for the stopping criteria used in the present study, especially when a protocol similar to the one recommended in the National Institutes of Health (1993) Consensus Conference Report was followed. These analyses, however, do not completely describe test performance because they did not compare neonatal screening test results with a gold standard test of hearing. Test performance, as measured by the area under a relative operating characteristic curve, was similar for all three neonatal tests when neonatal test results were compared with VRA data obtained at 8 to 12 mo corrected age. However, ABRs were more successful than the otoacoustic emission (OAE) tests at determining auditory status at 1 kHz. Performance was more similar across all three tests when they were used to identify hearing loss at 2 and 4 kHz. No test performed perfectly. Using either the two- or three-frequency pure-tone average (PTA), with a fixed false alarm rate of 20%, hit rates for the neonatal tests, in general, exceeded 80% when hearing impairment was defined as behavioral thresholds greater than or equal to 30 dB HL. All three tests performed similarly when a two-frequency (2 and 4 kHz) PTA was used as the gold standard; OAE test performance decreased when a three-frequency PTA (adding 1 kHz) was used as the gold standard definition. For both PTAs and all three neonatal screening measures, however, hit rate increased as the magnitude of hearing loss increased.
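The two performance measures named in the Results (area under the relative operating characteristic curve, and hit rate at a fixed false alarm rate) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the convention that a higher score means stronger evidence of impairment, and the score values themselves, which are invented and are not data from this study. It is not the analysis procedure used by the authors.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false_alarm_rate, hit_rate)


def roc_points(impaired: Sequence[float], normal: Sequence[float]) -> List[Point]:
    """One (false alarm rate, hit rate) point per candidate criterion.

    Assumption: an ear is referred when its score is >= the criterion,
    i.e. higher scores mean stronger evidence of hearing impairment.
    """
    criteria = sorted(set(impaired) | set(normal), reverse=True)
    points: List[Point] = [(0.0, 0.0)]  # strictest criterion: nothing referred
    for c in criteria:
        hit = sum(s >= c for s in impaired) / len(impaired)
        false_alarm = sum(s >= c for s in normal) / len(normal)
        points.append((false_alarm, hit))
    return points  # the last point is (1.0, 1.0): everything referred


def roc_area(points: Sequence[Point]) -> float:
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def hit_rate_at_false_alarm(points: Sequence[Point], max_false_alarm: float) -> float:
    """Best hit rate among criteria whose false alarm rate stays within the limit."""
    return max(hit for fa, hit in points if fa <= max_false_alarm)


# Invented scores, for illustration only (not study data).
impaired_scores = [0.9, 0.85, 0.8, 0.7, 0.4]
normal_scores = [0.75, 0.6, 0.5, 0.45, 0.35, 0.3, 0.25, 0.2, 0.15, 0.1]

points = roc_points(impaired_scores, normal_scores)
print(round(roc_area(points), 3))             # area under the curve
print(hit_rate_at_false_alarm(points, 0.20))  # hit rate at a 20% false alarm rate
```

Fixing the false alarm rate before reading off the hit rate puts the three tests on a common footing, because each test's hit rate depends on where its own pass/refer criterion is set.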
Conclusions: Singly, all three neonatal hearing screening tests resulted in low refer rates, especially if referrals for follow-up were made only for the cases in which stopping criteria were not met in both ears. Following a protocol similar to that recommended in the National Institutes of Health (1993) Consensus Conference report resulted in refer rates that were less than 4%. TEOAEs at 80 dB pSPL, DPOAE at L1 = 65, L2 = 50 dB SPL and ABR at 30 dB nHL measured during the neonatal period, and as implemented in the current study, performed similarly at predicting behavioral hearing status at 8 to 12 mo corrected age. Although perfect test performance was never achieved, sensitivity for each measure increased with the magnitude of hearing loss. This latter finding is important because it suggests that all three tests performed better at identifying hearing losses for which intervention would be immediately recommended.