The process by which individuals learn from feedback when making
recurrent choices among ambiguous alternatives is explored. We describe an
experiment in which subjects solve a variant of the classic armed-bandit
problem of dynamic decision theory, set in the context of airline choice.
Subjects are asked to make repeated choices between two hypothetical
airlines, one having an on-time departure probability that is known a
priori and the other having an ambiguous probability whose true value can
be discovered only by making sample trips on the airline. Subjects attempt
to choose so as to maximize the total number of on-time departures over a
fixed planning horizon. We examine
the extent to which actual choice patterns over time are consistent with
those that would be made by a decision maker acting as an optimal
Bernoulli sampler. The data offer support for a number of expected, and
some unexpected, departures from optimality, including a tendency to
underexperiment with promising options and overexperiment with
unpromising options, and a tendency to increasingly switch between
airlines as the average base rate of departures decreases. Implications of
the work for the descriptive validity of normative dynamic decision
models, and for the generalizability of previous findings about choice
under ambiguity to dynamic settings, are explored.
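
The benchmark of an optimal Bernoulli sampler can be illustrated with a
small dynamic program. This is only a sketch under stated assumptions, not
the procedure used in the experiment: it assumes that beliefs about the
ambiguous airline follow a Beta(a, b) distribution updated after each
sample trip, and the function name optimal_choice and its parameters are
hypothetical.

```python
from functools import lru_cache


def optimal_choice(p_known, trips_left, a=1, b=1):
    """Return (expected on-time departures, airline to fly next).

    Assumptions (not taken from the source): the ambiguous airline's
    on-time probability has a Beta(a, b) prior, and each trip on that
    airline updates the prior by one success or one failure.
    """

    @lru_cache(maxsize=None)
    def value(a, b, t):
        if t == 0:
            return 0.0, None  # no trips remain, nothing to choose
        mean = a / (a + b)  # current belief about the ambiguous airline
        # Flying the known airline yields p_known and teaches nothing.
        known = p_known + value(a, b, t - 1)[0]
        # Flying the ambiguous airline yields an outcome and updates beliefs.
        ambiguous = (mean * (1.0 + value(a + 1, b, t - 1)[0])
                     + (1.0 - mean) * value(a, b + 1, t - 1)[0])
        if ambiguous > known:
            return ambiguous, "ambiguous"
        return known, "known"

    return value(a, b, trips_left)


if __name__ == "__main__":
    for horizon in (1, 2, 10):
        total, airline = optimal_choice(0.55, horizon)
        print(horizon, airline, round(total, 4))
```

With a known on-time probability of 0.55 and a uniform prior, the sketch
chooses the known airline when one trip remains but samples the ambiguous
airline when two trips remain, because what is learned on the first trip
can be used on the second. Comparing subjects' choices against a rule of
this kind is one way to read the terms underexperiment and overexperiment.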