It seems widely agreed that IE (Information Extraction) is now a tested
language technology that has reached precision and recall values that put
it in about the same position as Information Retrieval and Machine
Translation, both of which are widely used commercially. There is also a
clear range of practical applications that would be eased by the sort of
template-style data that IE provides. The problem for wider deployment of
the technology is adaptability: the ability to customize IE rapidly to new
domains.
In this paper we discuss some methods that have been tried to ease this
problem, and to create something more rapid than the benchmark one-month
figure, which was roughly what ARPA teams in IE needed to adapt an existing
system by hand to a new domain of corpora and templates. An important
distinction in discussing the issue is the degree to which a user can be
assumed to know what is wanted and to have preexisting templates ready to
hand, as opposed to a user who has only a vague idea of what is needed from
a corpus.
We shall discuss attempts to derive templates directly from corpora, and to
derive knowledge structures and lexicons directly from corpora, including
the recent LE project ECRAN, which attempted to tune existing lexicons to
new corpora. An important issue is how far established methods in
Information Retrieval of tuning to a user's needs with feedback at an
interface can be transferred to IE.