A knowledge-based system uses its database (also known as its "theory") to produce answers to the queries it receives. Unfortunately, these answers may be incorrect if the underlying theory is faulty. Standard "theory revision" systems use a given set of "labeled queries" (each a query paired with its correct answer) to transform the given theory, by adding and/or deleting rules and antecedents, into a related theory that is as accurate as possible. After formally defining the theory revision task, this paper provides both sample and computational complexity bounds for this process. It first specifies the number of labeled queries necessary to identify, with high probability, a revised theory whose error is close to minimal.
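To convey the flavor of such a bound (a generic agnostic-PAC statement, not the paper's exact result): if H is the finite set of candidate revised theories under consideration, then by Hoeffding's inequality and a union bound,

    m \;\ge\; \frac{1}{2\varepsilon^{2}} \, \ln\frac{2|H|}{\delta}

labeled queries suffice for every theory in H to have empirical error within \varepsilon of its true error with probability at least 1 - \delta, so the empirically best revision has true error within 2\varepsilon of optimal. When H contains only the theories reachable from the initial theory by at most k single edits, drawn from s possible edits, \ln|H| = O(k \ln s), which is typically far smaller than the corresponding term for learning an accurate theory from scratch.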
It then considers the computational complexity of finding this best theory, and proves that, unless P = NP, no polynomial-time algorithm can identify this optimal revision, even given the exact distribution of queries, except in certain simple situations. It also shows that, except in such simple situations, no polynomial-time algorithm can even produce a theory whose error is close to (i.e., within a particular polynomial factor of) optimal. The first (sample complexity) results suggest why theory revision can be more effective than learning from scratch, while the second (computational complexity) results explain many aspects of standard theory revision systems, including the practice of hill-climbing to a locally optimal theory based on a given set of labeled queries.
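That hill-climbing strategy can be sketched concretely. The following Python fragment is a minimal illustration over propositional Horn theories, assuming a simple representation (rules as (head, antecedents) pairs, labeled queries as (facts, goal, answer) triples) and a coarse edit set; it is an illustrative stand-in, not the algorithm of any particular revision system.

    def derive(rules, facts):
        """Forward-chain the Horn rules from the given facts."""
        known = set(facts)
        changed = True
        while changed:
            changed = False
            for head, body in rules:
                if head not in known and set(body) <= known:
                    known.add(head)
                    changed = True
        return known

    def error(rules, labeled_queries):
        """Fraction of labeled queries answered incorrectly, where a
        query (facts, goal, answer) asks whether goal is derivable."""
        wrong = sum(1 for facts, goal, answer in labeled_queries
                    if (goal in derive(rules, facts)) != answer)
        return wrong / len(labeled_queries)

    def neighbors(rules, literals):
        """Theories one syntactic edit away from the current one."""
        rules = list(rules)
        for i, (head, body) in enumerate(rules):
            rest = rules[:i] + rules[i + 1:]
            yield rest                                  # delete a rule
            for lit in set(body):
                yield rest + [(head, body - {lit})]     # delete an antecedent
            for lit in literals - set(body) - {head}:
                yield rest + [(head, body | {lit})]     # add an antecedent
        for head in literals:                           # add a new rule;
            for lit in literals - {head}:               # single-antecedent only,
                yield rules + [(head, {lit})]           # as a coarse stand-in

    def hill_climb(rules, labeled_queries, literals):
        """Greedily take the single edit that most reduces empirical
        error; stop at a locally optimal theory."""
        best, best_err = list(rules), error(rules, labeled_queries)
        improved = True
        while improved:
            improved = False
            for cand in neighbors(best, literals):
                cand_err = error(cand, labeled_queries)
                if cand_err < best_err:
                    best, best_err, improved = cand, cand_err, True
        return best

For example, hill_climb([("q", {"p"})], [({"p"}, "q", True), ({"p"}, "r", True)], {"p", "q", "r"}) greedily adds an edit (here, a rule deriving r) and stops once no single edit lowers the empirical error, mirroring the locally-optimal behavior the hardness results help justify.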