TEXT WINDOWS AND PHRASES DIFFERING BY DISCIPLINE, LOCATION IN DOCUMENT, AND SYNTACTIC STRUCTURE

Authors
Citation
R.M. Losee, TEXT WINDOWS AND PHRASES DIFFERING BY DISCIPLINE, LOCATION IN DOCUMENT, AND SYNTACTIC STRUCTURE, Information Processing & Management, 32(6), 1996, pp. 747-767
Number of citations
55
Subject Categories
Information Science & Library Science; Computer Science Information Systems
ISSN journal
03064573
Volume
32
Issue
6
Year of publication
1996
Pages
747 - 767
Database
ISI
SICI code
0306-4573(1996)32:6<747:TWAPDB>2.0.ZU;2-D
Abstract
Knowledge of window style, content, location, and grammatical structure may be used to classify documents as originating within a particular discipline or may be used to place a document on a theory vs practice spectrum. This distinction is also studied here using the type-token ratio to differentiate between sublanguages. The statistical significance of windows is computed, based on the presence of terms in titles, abstracts, citations, and section headers, as well as binary-independent and inverse-document-frequency weightings. The characteristics of windows are studied by examining their within-window density and the S concentration, the concentration of terms from various document fields (e.g. title, abstract) in the fulltext. The rate of window occurrences from the beginning to the end of document fulltext differs between academic fields. Different syntactic structures in sublanguages are examined, and their use is considered for discriminating between specific academic disciplines and, more generally, between theory vs practice or knowledge vs applications-oriented documents. Copyright (C) 1996 Elsevier Science Ltd
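
Illustrative note
The type-token ratio mentioned in the abstract is, in its usual definition, the number of distinct word forms (types) divided by the total number of word occurrences (tokens). The sketch below is only an illustration of that general definition; the function name and the simple lowercase alphanumeric tokenizer are assumptions, not the procedure used in the paper.

```python
import re


def type_token_ratio(text):
    """Return distinct word forms (types) divided by word occurrences (tokens).

    Tokenization here is an assumption made for illustration:
    lowercase the text and keep runs of letters and digits.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if not tokens:
        # No tokens at all: define the ratio as 0.0 to avoid dividing by zero.
        return 0.0
    return len(set(tokens)) / len(tokens)


# A repetitive text yields a low ratio, a varied text a high one.
repetitive = "the window the window the window"
varied = "each sentence introduces new vocabulary"
print(type_token_ratio(repetitive))
print(type_token_ratio(varied))
```

Under this definition a text that reuses a small vocabulary scores lower than one that keeps introducing new words, which is the kind of contrast the abstract describes using to separate sublanguages. The ratio depends on text length, so comparisons are most meaningful between samples of similar size.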