ASKING QUESTIONS TO MINIMIZE ERRORS

Citation
Nh. Bshouty et al., ASKING QUESTIONS TO MINIMIZE ERRORS, Journal of computer and system sciences, 52(2), 1996, pp. 268-286
Citations number
28
Categorie Soggetti
System Science","Computer Science Hardware & Architecture","Computer Science Theory & Methods
ISSN journal
00220000
Volume
52
Issue
2
Year of publication
1996
Pages
268 - 286
Database
ISI
SICI code
0022-0000(1996)52:2<268:AQTME>2.0.ZU;2-7
Abstract
A number of efficient learning algorithms achieve exact identification of an unknown function from some class using membership and equivalen ce queries. Using a standard transformation such algorithms can easily be converted to on-line learning algorithms that use membership queri es. Under such a transformation the number of equivalence queries made by the query algorithm directly corresponds to the number of mistakes made by the on-line algorithm. In this paper we consider several of t he natural classes known to be learnable in this setting, and investig ate the minimum number of equivalence queries with accompanying counte rexamples (or equivalently the minimum number of mistakes in the on-li ne model) that can be made by a learning algorithm that makes a polyno mial number of membership queries and uses polynomial computation time . We are able both to reduce the number of equivalence queries used by the previous algorithms and often to prove matching lower bounds. As an example, consider the class of DNF formulas over n variables with a t most k = O(log n) terms. Previously, the algorithm of Plum and Rudic h provided the best known upper bound of 2(O(k)) log n for the minimum number of equivalence queries needed for exact identification. We gre atly improve on this upper bound showing that exactly k counterexample s are needed if the learner knows k a priori and exactly k + 1 counter examples are needed if the learner does not know k a priori, This exac tly matches known lower bounds of Bshouty and Cleve. For many of our r esults we obtain a complete characterization of the trade-off between the number of membership and equivalence queries needed for exact iden tification. The classes we consider here are monotone DNF formulas, Ho rn sentences, O(log n)-term DNF formulas, read-k sat-j DNF formulas, r eadonce formulas over various bases, and deterministic finite automata . (C) 1996 Academic Press, Inc.