Family pairwise search with embedded motif models

Citation
W.N. Grundy and T.L. Bailey, Family pairwise search with embedded motif models, Bioinformatics, 15(6), 1999, pp. 463-470
Number of citations
35
Subject Categories
Multidisciplinary
Journal title
BIOINFORMATICS
Journal ISSN
1367-4803
Volume
15
Issue
6
Year of publication
1999
Pages
463 - 470
Database
ISI
SICI code
1367-4803(199906)15:6<463:FPSWEM>2.0.ZU;2-Y
Abstract
Motivation: Statistical models of protein families, such as position-specific scoring matrices, profiles and hidden Markov models, have been used effectively to find remote homologs when given a set of known protein family members. Unfortunately, training these models typically requires a relatively large set of training sequences. Recent work (Grundy, J. Comput. Biol., 5, 479-492, 1998) has shown that, when only a few family members are known, several theoretically justified statistical modeling techniques fail to provide homology detection performance on a par with Family Pairwise Search (FPS), an algorithm that combines scores from a pairwise sequence similarity algorithm such as BLAST.
Results: The present paper provides a model-based algorithm that improves FPS by incorporating hybrid motif-based models of the form generated by Cobbler (Henikoff and Henikoff, Protein Sci., 6, 698-705, 1997). For the 73 protein families investigated here, this cobbled FPS algorithm provides better homology detection performance than either Cobbler or FPS alone. This improvement is maintained when BLAST is replaced with the full Smith-Waterman algorithm.
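Illustration: the score-combining idea behind FPS can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the pairwise scores are combined, so taking the maximum is an assumption made here, and `toy_similarity` is a hypothetical stand-in for a real pairwise scorer such as BLAST or Smith-Waterman. The function and variable names are likewise illustrative.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def toy_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in for a pairwise score (NOT BLAST or Smith-Waterman).

    Counts the distinct 3-letter substrings shared by the two sequences.
    """
    kmers_a = {a[i:i + 3] for i in range(len(a) - 2)}
    kmers_b = {b[i:i + 3] for i in range(len(b) - 2)}
    return float(len(kmers_a & kmers_b))


def family_pairwise_search(
    family: List[str],
    database: Dict[str, str],
    score: Callable[[str, str], float] = toy_similarity,
    combine: Callable[[Iterable[float]], float] = max,  # assumption: max
) -> List[Tuple[str, float]]:
    """Rank database sequences by a combined score against all family members."""
    ranked = []
    for name, seq in database.items():
        # One pairwise score per known family member, merged into one value.
        combined = combine([score(member, seq) for member in family])
        ranked.append((name, combined))
    # Highest combined score first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Made-up example sequences, for illustration only.
family = ["MKVLAAGIVG", "MKVLSAGIAG"]
database = {"miss": "QQQWWWEEERRR", "hit": "TTMKVLAAGIPP"}
result = family_pairwise_search(family, database)
```

Passing a different `combine` function (for example an average) changes only how the per-member scores are merged; the ranking loop stays the same.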