Reference point logistic regression and the identification of DNA functional sites

Authors
Citation
P.M. Hooper, Reference point logistic regression and the identification of DNA functional sites, J CLASSIF, 18(1), 2001, pp. 81-107
Citations number
38
Subject Categories
Library & Information Science
Journal title
JOURNAL OF CLASSIFICATION
ISSN journal
0176-4268
Volume
18
Issue
1
Year of publication
2001
Pages
81 - 107
Database
ISI
SICI code
0176-4268(2001)18:1<81:RPLRAT>2.0.ZU;2-4
Abstract
Logistic regression is frequently used in pattern recognition problems to model conditional probabilities of class membership given features observed. While performing well in many applications, logistic regression is limited to a relatively simple parametric model and is often not suitable for complex applications. This article describes a generalization of logistic regression based on reference point logistic (RPL) functions; i.e., normalized exponential functions of squared distance between the vector of observed features and reference points in the feature space. This generalization is closely related to a recently developed method for constructing classification rules. RPL regression and classification methods are based on the same parametric family of functions and the same optimization technique. The methods differ primarily in their optimality criterion and interpretation. Both methods are highly flexible. By adjusting the number of reference points, the complexity of conditional probability models and classification boundaries can be adapted to the problem at hand. Comparisons are made with related techniques from statistics and neural networks. As an illustration RPL regression is applied to the problem of identifying functional sites at the boundaries of protein coding regions in genomic DNA.
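The abstract describes RPL functions only in words, so a small sketch may help make "normalized exponential functions of squared distance" concrete. The Python below is an illustration under stated assumptions, not the article's parametrization: the function names, the single `scale` parameter, the assignment of one class label to each reference point, and the toy reference points are all hypothetical, and the fitting (optimization) step is not shown.

```python
import math

def rpl_weights(x, ref_points, scale=1.0):
    """Normalized exponential of negative scaled squared distances.

    Assumption: a single shared `scale`; the article's exact
    parametrization is not given in the abstract.
    """
    sq_dists = [sum((xi - ri) ** 2 for xi, ri in zip(x, r)) for r in ref_points]
    logits = [-scale * d for d in sq_dists]
    # Subtract the maximum logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_probabilities(x, ref_points, ref_labels, scale=1.0):
    """Sum the normalized weights of the reference points sharing a label.

    Assumption: each reference point is tied to exactly one class.
    """
    probs = {}
    for w, label in zip(rpl_weights(x, ref_points, scale), ref_labels):
        probs[label] = probs.get(label, 0.0) + w
    return probs

# Hypothetical two-dimensional reference points and labels.
ref_points = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
ref_labels = ["site", "non-site", "non-site"]

probs_near_site = class_probabilities((0.0, 0.0), ref_points, ref_labels)
weights_midway = rpl_weights((1.0, 0.0), ref_points)
```

With one reference point per class, the squared terms in the feature vector cancel in the normalization and the model reduces to ordinary multinomial logistic regression; adding reference points is what lets the fitted class boundaries become more complex, which matches the flexibility the abstract attributes to the method.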