IMPROVED SENSITIVITY OF PROFILE SEARCHES THROUGH THE USE OF SEQUENCE WEIGHTS AND GAP EXCISION

Citation
J.D. Thompson et al., IMPROVED SENSITIVITY OF PROFILE SEARCHES THROUGH THE USE OF SEQUENCE WEIGHTS AND GAP EXCISION, Computer applications in the biosciences, 10(1), 1994, pp. 19-29
Number of citations
33
Subject categories
Mathematical Methods, Biology & Medicine; Computer Sciences, Special Topics; Computer Science Interdisciplinary Applications; Biology Miscellaneous
ISSN journal
0266-7061
Volume
10
Issue
1
Year of publication
1994
Pages
19 - 29
Database
ISI
SICI code
0266-7061(1994)10:1<19:ISOPST>2.0.ZU;2-G
Abstract
Position-specific substitution matrices, known as profiles, derived from multiple sequence alignments are currently used to search sequence databases for distantly related members of protein families. The performance of the database searches is enhanced by using (i) a sequence weighting scheme which assigns higher weights to more distantly related sequences based on branch lengths derived from phylogenetic trees, (ii) exclusion of positions with mainly padding characters at sites of insertions or deletions and (iii) the BLOSUM62 residue comparison matrix. A natural consequence of these modifications is an improvement in the alignment of new sequences to the profiles. However, the accuracy of the alignments can be further increased by employing a similarity residue comparison matrix. These developments are implemented in a program called PROFILEWEIGHT which runs on Unix and Vax computers. The only input required by the program is the multiple sequence alignment. The output from PROFILEWEIGHT is a profile designed to be used by existing searching and alignment programs. Test results from database searches with four different families of proteins show the improved sensitivity of the weighted profiles.
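As a rough illustration of two of the ideas in the abstract, gap excision and sequence weighting, the following Python sketch drops alignment columns that consist mainly of padding characters and then computes weighted residue frequencies for the remaining columns. It is not the PROFILEWEIGHT implementation. The function names, the 50% gap threshold, the toy alignment and the weight values are all assumptions made for illustration; in the paper the weights come from phylogenetic tree branch lengths, which this sketch simply takes as given, and the scoring step with a residue comparison matrix such as BLOSUM62 is omitted.

```python
from collections import defaultdict

# Characters treated as alignment padding (an assumption for this sketch).
GAP_CHARS = {"-", "."}


def excise_gap_columns(alignment, max_gap_fraction=0.5):
    """Return indices of columns whose gap fraction is at most max_gap_fraction.

    The 0.5 default is one hypothetical reading of "mainly padding characters".
    """
    n_seqs = len(alignment)
    kept = []
    for col in range(len(alignment[0])):
        gaps = sum(1 for seq in alignment if seq[col] in GAP_CHARS)
        if gaps / n_seqs <= max_gap_fraction:
            kept.append(col)
    return kept


def weighted_column_frequencies(alignment, weights, columns):
    """Compute weighted residue frequencies for the selected columns.

    Each sequence contributes its weight (normalised by the weight total)
    to the residue it carries in a column; gap characters are skipped.
    """
    total = sum(weights)
    profile = []
    for col in columns:
        freqs = defaultdict(float)
        for seq, weight in zip(alignment, weights):
            residue = seq[col]
            if residue not in GAP_CHARS:
                freqs[residue] += weight / total
        profile.append(dict(freqs))
    return profile


# Toy alignment: two identical sequences, one close relative, one distant one.
alignment = [
    "MK-LV",
    "MK-LV",
    "MR-IV",
    "MRALI",
]
# Hypothetical weights: the more divergent sequences get higher weights.
weights = [0.5, 0.5, 1.0, 2.0]

kept = excise_gap_columns(alignment)
profile = weighted_column_frequencies(alignment, weights, kept)

for col, freqs in zip(kept, profile):
    print(col, freqs)
```

In this toy input the third column is a gap in three of the four sequences, so it is excised. The two identical sequences share their influence on the remaining columns, while the divergent fourth sequence counts for more than it would under uniform weighting.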