Domain size distributions can predict domain boundaries

Citation
S.J. Wheelan et al., Domain size distributions can predict domain boundaries, Bioinformatics, 16(7), 2000, pp. 613-618
Number of citations
19
Subject categories
Multidisciplinary
Journal title
BIOINFORMATICS
Journal ISSN
1367-4803
Volume
16
Issue
7
Year of publication
2000
Pages
613 - 618
Database
ISI
SICI code
1367-4803(200007)16:7<613:DSDCPD>2.0.ZU;2-3
Abstract
Motivation: The sizes of protein domains observed in the 3D-structure database follow a surprisingly narrow distribution. Structural domains are furthermore formed from a single-chain continuous segment in over 80% of instances. These observations imply that some choices of domain boundaries on an otherwise uncharacterized sequence are more likely than others, based solely on the size and segment number of predicted domains. This property might be used to guess the locations of protein domain boundaries.
Results: To test this possibility we enumerate putative domain boundaries and calculate their relative likelihood under a probability model that considers only the size and segment number of predicted domains. We ask, in a cross-validated test using sequences with known 3D structure, whether the most likely guesses agree with the observed domain structure. We find that domain boundary predictions are surprisingly successful for sequences up to 400 residues long and that guessing domain boundaries in this way can improve the sensitivity of threading analysis.
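The enumerate-and-score idea described in the abstract can be illustrated with a small sketch. This is not the authors' model: the abstract does not give the distribution or its parameters, so the Gaussian score, the values of MEAN_SIZE, SD_SIZE and MIN_SIZE, the 10-residue grid, and all function names below are hypothetical placeholders. The sketch also considers only continuous single-segment domains and ignores the segment-number term and any prior on the number of domains.

```python
from itertools import combinations

# Hypothetical placeholder parameters, NOT values taken from the paper.
MEAN_SIZE = 150.0   # assumed typical domain size (residues)
SD_SIZE = 60.0      # assumed spread of domain sizes
MIN_SIZE = 40       # assumed smallest domain worth considering


def log_size_score(size):
    """Unnormalized Gaussian log-score standing in for an empirical
    domain-size distribution (an assumption made for illustration)."""
    z = (size - MEAN_SIZE) / SD_SIZE
    return -0.5 * z * z


def domain_sizes(length, cuts):
    """Sizes of the continuous domains obtained by cutting at `cuts`."""
    edges = (0, *cuts, length)
    return [b - a for a, b in zip(edges, edges[1:])]


def enumerate_boundaries(length, max_domains=3, step=10):
    """Yield every set of boundary positions on a coarse grid,
    giving between 1 and `max_domains` continuous domains."""
    candidates = range(step, length, step)
    for n_cuts in range(max_domains):  # n_cuts boundaries -> n_cuts + 1 domains
        yield from combinations(candidates, n_cuts)


def rank_boundaries(length, max_domains=3, step=10):
    """Return (score, cuts) pairs sorted from most to least likely
    under the size-only score."""
    scored = []
    for cuts in enumerate_boundaries(length, max_domains, step):
        sizes = domain_sizes(length, cuts)
        if min(sizes) < MIN_SIZE:
            continue  # skip implausibly small domains
        scored.append((sum(log_size_score(s) for s in sizes), cuts))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


if __name__ == "__main__":
    # Show the three best-scoring guesses for a 300-residue sequence.
    for score, cuts in rank_boundaries(300)[:3]:
        print(cuts, domain_sizes(300, cuts), round(score, 3))
```

With these placeholder parameters, a 300-residue sequence is most favourably split into two 150-residue domains. A faithful implementation would replace the Gaussian with the size distribution observed in the structure database and would add the segment-number term that the abstract mentions.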