Can we make information extraction more adaptive?

Citation
Y. Wilks and R. Catizone, Can we make information extraction more adaptive?, LECT N A I, 1714, 1999, pp. 1-16
Number of citations
55
Subject Categories
Current Book Contents
ISSN journal
03029743
Volume
1714
Year of publication
1999
Pages
1 - 16
Database
ISI
SICI code
0302-9743(1999)1714:<1:CWMIEM>2.0.ZU;2-2
Abstract
It seems widely agreed that IE (Information Extraction) is now a tested language technology that has reached precision + recall values that put it in about the same position as Information Retrieval and Machine Translation, both of which are widely used commercially. There is also a clear range of practical applications that would be eased by the sort of template-style data that IE provides. The problem for wider deployment of the technology is adaptability: the ability to customize IE rapidly to new domains.

In this paper we discuss some methods that have been tried to ease this problem, and to create something more rapid than the bench-mark one-month figure, which was roughly what ARPA teams in IE needed to adapt an existing system by hand to a new domain of corpora and templates. An important distinction in discussing the issue is the degree to which a user can be assumed to know what is wanted, to have preexisting templates ready to hand, as opposed to a user who has a vague idea of what is needed from a corpus.

We shall discuss attempts to derive templates directly from corpora; to derive knowledge structures and lexicons directly from corpora, including discussion of the recent LE project ECRAN which attempted to tune existing lexicons to new corpora. An important issue is how far established methods in Information Retrieval of tuning to a user's needs with feedback at an interface can be transferred to IE.