A STUDY OF A STATISTICAL-MODEL OF NATURAL-LANGUAGE

Authors

OBOYLE P; OWENS M; SMITH FJ

Citation

P. Oboyle et al., A STUDY OF A STATISTICAL-MODEL OF NATURAL-LANGUAGE, Irish journal of psychology, 14(3), 1993, pp. 382-396

Subject categories

Psychology

Journal title

Irish journal of psychology

ISSN journal

03033910

Volume

14

Issue

3

Year of publication

1993

Pages

382 - 396

Database

ISI

SICI code

0303-3910(1993)14:3<382:ASOASO>2.0.ZU;2-Q

Abstract

A statistical model of language is described and shown to be surprisingly successful in two experiments based on a statistical analysis of two text corpora. One experiment trained the model on the domain-specific VODIS corpus of 70,000 words, while the other trained it on the Brown corpus of 1 million words, containing text from a wide range of domains. In each experiment the model was tested using unseen phrases from the appropriate corpus and results show that a statistical model can be remarkably successful, even though there is no knowledge of syntax included in the model. Our results also show that the model is most effective when trained and tested on the domain-specific VODIS corpus, in spite of its small size. It is noted that the VODIS corpus is a great deal smaller than the total amount of language heard by a child in its first few years of life, which suggests that in the restricted domain of interest to a child there is more than sufficient sample language to build a successful statistical model containing no knowledge of grammar.