Can we make information extraction more adaptive?

Authors

Wilks, Y; Catizone, R

Citation

Y. Wilks and R. Catizone, Can we make information extraction more adaptive?, LECT N A I, 1714, 1999, pp. 1-16

Citations number

Subject categories

Current Book Contents

Journal title

INFORMATION EXTRACTION: TOWARDS SCALABLE, ADAPTABLE SYSTEMS

ISSN journal

0302-9743

Volume

1714

Year of publication

1999

Pages

1 - 16

Database

ISI

SICI code

0302-9743(1999)1714:<1:CWMIEM>2.0.ZU;2-2

Abstract

It seems widely agreed that IE (Information Extraction) is now a tested language technology that has reached precision + recall values that put it in about the same position as Information Retrieval and Machine Translation, both of which are widely used commercially. There is also a clear range of practical applications that would be eased by the sort of template-style data that IE provides. The problem for wider deployment of the technology is adaptability: the ability to customize IE rapidly to new domains.

In this paper we discuss some methods that have been tried to ease this problem, and to create something more rapid than the benchmark one-month figure, which was roughly what ARPA teams in IE needed to adapt an existing system by hand to a new domain of corpora and templates. An important distinction in discussing the issue is the degree to which a user can be assumed to know what is wanted, to have preexisting templates ready to hand, as opposed to a user who has a vague idea of what is needed from a corpus.

We shall discuss attempts to derive templates directly from corpora; to derive knowledge structures and lexicons directly from corpora, including discussion of the recent LE project ECRAN which attempted to tune existing lexicons to new corpora. An important issue is how far established methods in Information Retrieval of tuning to a user's needs with feedback at an interface can be transferred to IE.