2-STEP EXTRACTION OF BILINGUAL COLLOCATIONS BY USING WORD-LEVEL SORTING

Citation
M. Haruno and S. Ikehara, 2-STEP EXTRACTION OF BILINGUAL COLLOCATIONS BY USING WORD-LEVEL SORTING, IEICE Transactions on Information and Systems, E81D(10), 1998, pp. 1103-1110
Number of citations
19
Subject Categories
Computer Science Information Systems
ISSN journal
09168532
Volume
E81D
Issue
10
Year of publication
1998
Pages
1103 - 1110
Database
ISI
SICI code
0916-8532(1998)E81D:10<1103:2EOBCB>2.0.ZU;2-T
Abstract
This paper describes a new method for learning bilingual collocations from sentence-aligned parallel corpora. Our method comprises two steps: (1) extracting useful word chunks (n-grams) in each language by word-level sorting and (2) constructing bilingual collocations by combining the word chunks acquired in step (1). We apply the method to two kinds of Japanese-English texts: (1) scientific articles that comprise relatively literal translations and (2) more challenging texts: a stock market bulletin in Japanese and its abstract in English. In both cases, domain-specific collocations are well captured even if they were not contained in the dictionaries of specialized terms.
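
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how chunks are judged "useful" or how they are combined, so the frequency cutoff (`min_count`), the Dice-coefficient scoring, the `threshold` value, the function names, and the toy romanized corpus are all assumptions made for illustration.

```python
from itertools import groupby


def repeated_ngrams(sentences, max_n=3, min_count=2):
    """Step 1 (sketch): find repeated word n-grams via word-level sorting.

    Every word-level suffix (cut to max_n words) is sorted, so identical
    n-word prefixes become adjacent and can be counted in a single pass.
    The frequency cutoff min_count is an assumed notion of "useful".
    """
    suffixes = sorted(
        tuple(words[i:i + max_n])
        for words in sentences
        for i in range(len(words))
    )
    chunks = set()
    for n in range(2, max_n + 1):
        long_enough = (s for s in suffixes if len(s) >= n)
        for prefix, run in groupby(long_enough, key=lambda s: s[:n]):
            if sum(1 for _ in run) >= min_count:
                chunks.add(prefix)
    return chunks


def contains(words, chunk):
    """True if chunk occurs as a contiguous word sequence in words."""
    n = len(chunk)
    return any(tuple(words[i:i + n]) == chunk
               for i in range(len(words) - n + 1))


def pair_chunks(src_sents, tgt_sents, src_chunks, tgt_chunks, threshold=0.8):
    """Step 2 (sketch): pair chunks that co-occur in aligned sentences.

    The Dice coefficient is an assumed association measure, not
    necessarily the one used in the paper.
    """
    src_hits = {c: {k for k, s in enumerate(src_sents) if contains(s, c)}
                for c in src_chunks}
    tgt_hits = {c: {k for k, s in enumerate(tgt_sents) if contains(s, c)}
                for c in tgt_chunks}
    pairs = []
    for s, s_ids in src_hits.items():
        for t, t_ids in tgt_hits.items():
            total = len(s_ids) + len(t_ids)
            if total == 0:
                continue  # chunk never occurs in these sentences
            dice = 2 * len(s_ids & t_ids) / total
            if dice >= threshold:
                pairs.append((s, t, dice))
    return sorted(pairs, key=lambda p: -p[2])


# Toy sentence-aligned corpus, invented for illustration only.
ja = [["kabushiki", "shijou", "wa", "joushou"],
      ["kabushiki", "shijou", "wa", "geraku"],
      ["kinri", "wa", "joushou"]]
en = [["the", "stock", "market", "rose"],
      ["the", "stock", "market", "fell"],
      ["interest", "rates", "rose"]]

ja_chunks = repeated_ngrams(ja)
en_chunks = repeated_ngrams(en)
pairs = pair_chunks(ja, en, ja_chunks, en_chunks)
```

On this toy data, `pairs` links chunks such as `("kabushiki", "shijou")` and `("stock", "market")`, which occur in exactly the same aligned sentence pairs. A real corpus would need tokenization of the Japanese text and a stricter chunk filter than a raw frequency cutoff.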