Ten tests of olfactory function (including tests of odor
identification, detection, discrimination, memory, and suprathreshold
odor intensity and pleasantness perception) were administered on two
test occasions to 57 subjects ranging in age from 18 to 83 years. The
stability of the average test scores was determined across the two
test sessions for
14 measures derived from these 10 tests and for subcomponents of the
Japanese T&T olfactometer threshold test. In addition, the test-retest
reliability (Pearson r) of each test measure was established. With the
exception of a response bias measure, the average test scores did not
differ significantly across the two test sessions. Statistically, the
reliability coefficients of the primary test measures fell into three
general classes bound by the following r values: 0.43-0.53; 0.67-0.71;
0.76-0.90. Detection threshold values were more reliable than
recognition threshold values; those based upon a single ascending
presentation series were much less reliable than those based upon a
staircase procedure. The relationship between test length and
reliability was examined for several of the tests and mathematically
modeled. For example, within the staircase series incorporating the
odorant phenyl ethyl alcohol, reliability was related (R² = 0.984) to
the number of reversals included in the threshold estimate by a
function derived from the
Spearman-Brown formula; namely, reliability = 0.455 × (# reversals) /
[1 + 0.455 × (# reversals - 1)]. Reversal location, per se, had little
influence on reliability. Overall, this study suggests that (i)
considerable
variation is present in the reliability of olfactory tests, (ii)
reliability is a function of test length, and (iii) caution is
warranted in comparing results from nominally different olfactory
tests in applied settings since the findings may, in some instances,
simply reflect the differential reliability of the tests.
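
The Spearman-Brown relationship reported for the phenyl ethyl alcohol staircase can be explored numerically. The following is a minimal sketch, not the authors' analysis code: it takes only the coefficient 0.455 from the text (the value the fitted function yields for a single reversal), and the function names `predicted_reliability` and `reversals_needed` are hypothetical, introduced here for illustration.

```python
import math

# Coefficient from the fitted function quoted in the abstract; it equals
# the predicted reliability when only one reversal is used.
SINGLE_REVERSAL_R = 0.455


def predicted_reliability(n_reversals: int, r_single: float = SINGLE_REVERSAL_R) -> float:
    """Spearman-Brown prediction: k * r / (1 + r * (k - 1))."""
    if n_reversals < 1:
        raise ValueError("n_reversals must be at least 1")
    return n_reversals * r_single / (1 + r_single * (n_reversals - 1))


def reversals_needed(target: float, r_single: float = SINGLE_REVERSAL_R) -> int:
    """Smallest number of reversals whose predicted reliability reaches `target`.

    Obtained by solving the Spearman-Brown formula for k:
    k = target * (1 - r) / (r * (1 - target)).
    """
    if not 0 < target < 1:
        raise ValueError("target must lie strictly between 0 and 1")
    k = target * (1 - r_single) / (r_single * (1 - target))
    # Subtract a tiny tolerance so floating-point noise does not push an
    # exact integer solution up to the next integer.
    return max(1, math.ceil(k - 1e-9))


if __name__ == "__main__":
    for k in (1, 2, 4, 7):
        print(f"{k} reversal(s): predicted r = {predicted_reliability(k):.3f}")
    print("reversals needed for r >= 0.80:", reversals_needed(0.80))
```

Under this model each added reversal raises predicted reliability by a smaller amount than the one before, which is the quantitative form of conclusion (ii) that reliability is a function of test length.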