Rm. Losee, TEXT WINDOWS AND PHRASES DIFFERING BY DISCIPLINE, LOCATION IN DOCUMENT, AND SYNTACTIC STRUCTURE, Information Processing & Management, 32(6), 1996, pp. 747-767
Number of citations
55
Subject Categories
Information Science & Library Science; Computer Science Information Systems
Knowledge of window style, content, location, and grammatical structure
may be used to classify documents as originating within a particular
discipline or may be used to place a document on a theory vs practice
spectrum. This distinction is also studied here using the type-token
ratio to differentiate between sublanguages. The statistical
significance of windows is computed, based on the presence of terms in
titles, abstracts, citations, and section headers, as well as
binary-independent and inverse-document-frequency weightings. The
characteristics of windows are studied by examining their within-window
density and the S concentration, the concentration of terms from various
document fields (e.g. title, abstract) in the fulltext. The rate of
window occurrences from the beginning to the end of document fulltext
differs between academic fields. Different syntactic structures in
sublanguages are examined, and their use is considered for
discriminating between specific academic disciplines and, more
generally, between theory vs practice or knowledge vs
applications-oriented documents. Copyright (C) 1996 Elsevier Science Ltd
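
Two of the measures named in the abstract, the type-token ratio and inverse-document-frequency weighting, can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything here is an assumption for illustration: the tokenizer (lowercased alphabetic runs), the plain log(N / df) form of the IDF weight, and the function names tokenize, type_token_ratio, and idf_weights are hypothetical and may differ from what the paper actually uses.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens.

    Assumption: the paper's own tokenization is not described in the abstract.
    """
    return re.findall(r"[a-z]+", text.lower())


def type_token_ratio(text):
    """Return distinct word types divided by total tokens (0.0 if empty)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def idf_weights(documents):
    """Map each term to log(N / df), with df the number of documents containing it.

    Assumption: this is one common IDF variant, not necessarily the paper's.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each term at most once per document.
        doc_freq.update(set(tokenize(doc)))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}
```

Under these assumptions, a text that repeats its vocabulary more heavily yields a lower type-token ratio, and a term that occurs in every document receives an IDF weight of zero, so it contributes nothing to distinguishing one document from another.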