M. Haruno and S. Ikehara, 2-STEP EXTRACTION OF BILINGUAL COLLOCATIONS BY USING WORD-LEVEL SORTING, IEICE transactions on information and systems, E81D(10), 1998, pp. 1103-1110
This paper describes a new method for learning bilingual collocations
from sentence-aligned parallel corpora. Our method comprises two steps:
(1) extracting useful word chunks (n-grams) in each language by
word-level sorting and (2) constructing bilingual collocations by
combining the word chunks acquired in step (1). We apply the method to
two kinds of Japanese-English texts: (1) scientific articles that
comprise relatively literal translations and (2) more challenging
texts: a stock market bulletin in Japanese and its abstract in English.
In both cases, domain-specific collocations are well captured even if
they were not contained in the dictionaries of specialized terms.
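
The two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how chunks are scored or combined, so the function names (`frequent_chunks`, `pair_chunks`), the parameters (`max_n`, `min_count`, `threshold`), and the use of the Dice coefficient over aligned sentence pairs are all assumptions made for illustration. Step 1 sorts every word-level suffix of every sentence so that repeated n-grams become adjacent runs; step 2 pairs chunks that tend to occur in the same aligned sentences.

```python
from itertools import groupby


def frequent_chunks(sentences, max_n=3, min_count=2):
    """Step 1 (sketch): find repeated word n-grams by word-level sorting."""
    # Each word position starts a suffix; truncate it to max_n words.
    suffixes = sorted(
        tuple(words[i:i + max_n])
        for words in sentences
        for i in range(len(words))
    )
    chunks = set()
    for n in range(1, max_n + 1):
        # In sorted order, suffixes sharing the same first n words are
        # contiguous, so each n-gram's frequency is the length of its run.
        long_enough = (s for s in suffixes if len(s) >= n)
        for prefix, run in groupby(long_enough, key=lambda s: s[:n]):
            if sum(1 for _ in run) >= min_count:
                chunks.add(prefix)
    return chunks


def contains(words, chunk):
    """True if the word list contains the chunk as a contiguous n-gram."""
    n = len(chunk)
    return any(tuple(words[i:i + n]) == chunk
               for i in range(len(words) - n + 1))


def pair_chunks(src_sents, tgt_sents, src_chunks, tgt_chunks, threshold=0.8):
    """Step 2 (sketch): pair chunks that co-occur in aligned sentences.

    Assumption: the Dice coefficient is used as the association score;
    the abstract does not specify the actual measure.
    """
    def hits(sents, chunks):
        # Map each chunk to the set of sentence indices containing it.
        return {c: {k for k, s in enumerate(sents) if contains(s, c)}
                for c in chunks}

    src_hits = hits(src_sents, src_chunks)
    tgt_hits = hits(tgt_sents, tgt_chunks)
    pairs = []
    for s, s_ids in src_hits.items():
        for t, t_ids in tgt_hits.items():
            dice = 2 * len(s_ids & t_ids) / (len(s_ids) + len(t_ids))
            if dice >= threshold:
                pairs.append((s, t, dice))
    return sorted(pairs, key=lambda p: -p[2])


# Invented toy data (romanized Japanese / English), sentence-aligned.
ja = [["kabushiki", "shijou", "wa", "joushou"],
      ["kabushiki", "shijou", "wa", "geraku"],
      ["kinri", "wa", "joushou"]]
en = [["stock", "market", "rose"],
      ["stock", "market", "fell"],
      ["interest", "rates", "rose"]]

ja_chunks = frequent_chunks(ja, max_n=2)
en_chunks = frequent_chunks(en, max_n=2)
for src, tgt, score in pair_chunks(ja, en, ja_chunks, en_chunks):
    print(" ".join(src), "<->", " ".join(tgt), round(score, 2))
```

Sorting is what makes step 1 cheap: once the suffixes are in order, each n-gram's frequency is read off as a run length in a single pass. On a corpus this small the loose threshold also admits spurious pairs, so a real system would need more data and a stricter filter.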