TOWARD THE AUTOMATIC IDENTIFICATION OF SUBLANGUAGE VOCABULARY

Authors

HAAS SW; HE S

Citation

S.W. Haas and S. He, TOWARD THE AUTOMATIC IDENTIFICATION OF SUBLANGUAGE VOCABULARY, Information Processing & Management, 29(6), 1993, pp. 721-732

Citations number

Subject categories

Information Science & Library Science; Computer Applications & Cybernetics

Journal title

Information Processing & Management

ISSN journal

0306-4573

Volume

29

Issue

6

Year of publication

1993

Pages

721 - 732

Database

ISI

SICI code

0306-4573(1993)29:6<721:TTAIOS>2.0.ZU;2-Z

Abstract

A sublanguage is the language used in a restricted or specialized domain or field, such as computer science. Information about the vocabulary and structure of a sublanguage is used in any domain-related natural language processing application; however, such information is very time-consuming to gather, and much of it must be found and organized manually. Additionally, information retrieval strategies using lexical information depend on finding the appropriate dictionary entry for general and technical words. The ability to automatically identify terms belonging to a sublanguage could aid in these and other applications. In this paper, a simple but effective method is developed for automatic identification of sublanguage vocabulary words as they occur in abstracts. This procedure may significantly reduce the effort required to extract sublanguage vocabulary for sublanguage analysis and other applications, such as information retrieval. First, the sublanguage vocabulary identification procedures are described using abstracts from computer science and library and information science as the sublanguage sources. The results of these experiments are evaluated using three different criteria. Finally, the practical and theoretical significance of this research is discussed along with plans for further experiments on the vocabulary and structure of sublanguages.