We consider the complexity of learning classes of smooth functions formed by bounding different norms of a function's derivative. The learning model is the generalization of the mistake-bound model to continuous-valued functions. Suppose $F_q$ is the set of all absolutely continuous functions $f$ from $[0,1]$ to $\mathbb{R}$ such that $\|f'\|_q \le 1$, and $\mathrm{opt}(F_q, m)$ is the best possible bound on the worst-case sum of absolute prediction errors over sequences of $m$ trials. We show that for all $q \ge 2$, $\mathrm{opt}(F_q, m) = \Theta(\sqrt{\log m})$, and that $\mathrm{opt}(F_2, m) \le \frac{\sqrt{\log_2 m}}{2} + O(1)$, matching a known lower bound of $\frac{\sqrt{\log_2 m}}{2} - O(1)$ to within an additive constant.

© 2000 Elsevier Science B.V. All rights reserved.