In many textbook solutions to system failure diagnosis problems
studied using reliability theory and artificial intelligence, the
prior probabilities of different failure states can be estimated and
used to
guide the sequential search for failed components after the whole
system fails. In practice, however, both the component failure
probabilities and the structure function of the system being examined
(i.e., the mapping between the states of its components and the state
of the system) may not be known with certainty. At best, the
probabilities of different hypothesized system descriptions, each
specifying the component failure probabilities and the system's
structure function, may be known
to a useful approximation, perhaps based on sample data and previous
experience. Cost-effective diagnosis of the system's failure state is
then a challenging problem. Although the probabilities of component
failures are aleatory, uncertainties about these probabilities and about
the system structure function are epistemic. This paper examines how
to make the best use of both epistemic prior probabilities for system
descriptions and the information gleaned from costly inspections of
component states after the system fails, so as to minimize the average
cost of identifying the failure state. Two approaches are introduced
for systems
dominated by aleatory uncertainties, one motivated by information
theory and the other based on the idea of trying to prove a hypothesis
about the identity of the failure state as efficiently as possible.
While
the general problem of cost-effective failure diagnosis is
computationally intractable (NP-hard), both heuristics provide useful
approximations on small to moderately sized problems and optimal
results for certain common types of reliability systems, including
series, parallel, parallel-series, and k-out-of-n systems. A hybrid
heuristic that adaptively chooses which heuristic to apply next after
any sequence of observations (component test results) appears to give
excellent results. Several computational experiments are summarized in
support of these conclusions, and extensions to reliability systems
with repair are briefly
considered. Next, it is shown that diagnosis can proceed when aleatory
and epistemic uncertainties are both present, using the same
techniques developed for aleatory probabilities alone. If only the
epistemic probability distribution of system descriptions is known,
then the same
heuristics that are used to diagnose a system's failure state for
systems with known descriptions can also be used to identify the
system and diagnose its failure state when there is epistemic
uncertainty about the identity of the system. This result suggests a
unified approach
to least-cost failure diagnosis in reliability systems with both
aleatory probabilities of component failures and epistemic
probabilities for system descriptions. © 1996 Elsevier Science Limited.
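
As an illustration of the information-theoretic idea summarized above,
the following minimal Python sketch picks the next component to inspect
in a failed k-out-of-n system. It is not taken from the paper. The
function names (k_out_of_n, failure_posterior, next_inspection), the
assumption of independent component failures, the example
probabilities and costs, and the scoring rule itself (expected entropy
reduction per unit inspection cost) are all assumptions made for this
example. The enumeration is brute force and exponential in the number
of components, so it is only suitable for very small systems.

```python
from itertools import product
from math import log2


def k_out_of_n(k):
    """Structure function: the system works (1) iff at least k components work."""
    return lambda x: int(sum(x) >= k)


def failure_posterior(phi, p_fail):
    """Distribution over component-state vectors x (1 = working, 0 = failed),
    conditioned on system failure. Assumes independent component failures."""
    weights = {}
    for x in product((0, 1), repeat=len(p_fail)):
        if phi(x) == 0:  # keep only states in which the system is down
            w = 1.0
            for xi, p in zip(x, p_fail):
                w *= (1.0 - p) if xi else p
            weights[x] = w
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}


def entropy(dist):
    """Shannon entropy (in bits) of a distribution given as {state: probability}."""
    return -sum(q * log2(q) for q in dist.values() if q > 0)


def next_inspection(dist, costs, inspected=frozenset()):
    """Hypothetical greedy rule: choose the uninspected component with the
    largest expected entropy reduction per unit cost. Returns None when the
    failure state is already identified (zero remaining entropy)."""
    h = entropy(dist)
    if h <= 1e-12:
        return None
    best, best_score = None, -1.0
    for i, cost in enumerate(costs):
        if i in inspected:
            continue
        expected_h = 0.0
        for value in (0, 1):  # possible inspection outcomes for component i
            mass = sum(q for x, q in dist.items() if x[i] == value)
            if mass > 0:
                post = {x: q / mass for x, q in dist.items() if x[i] == value}
                expected_h += mass * entropy(post)
        score = (h - expected_h) / cost
        if score > best_score:
            best, best_score = i, score
    return best


# Usage: a 3-component series system (3-out-of-3) that has failed.
series3 = k_out_of_n(3)
dist = failure_posterior(series3, [0.1, 0.1, 0.1])
print(next_inspection(dist, costs=[2.0, 1.0, 3.0]))
```

In this example the three components are statistically identical, so
the rule simply selects the cheapest one to inspect. After an
inspection, the distribution would be restricted to the states
consistent with the observed result and the rule applied again.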