Sequence search algorithm assessment and testing toolkit (SAT)

Citation
J. Park et al., Sequence search algorithm assessment and testing toolkit (SAT), Bioinformatics, 16(2), 2000, pp. 104-110
Number of citations
28
Subject categories
Multidisciplinary
Journal title
BIOINFORMATICS
ISSN journal
1367-4803
Volume
16
Issue
2
Year of publication
2000
Pages
104 - 110
Database
ISI
SICI code
1367-4803(200002)16:2<104:SSAAAT>2.0.ZU;2-S
Abstract
Motivation: The Sequence Search Algorithm Assessment and Testing Toolkit (SAT) aims to be a complete package for the comparison of different protein homology search algorithms. The structural classification of proteins can provide us with a clear criterion for judgement in homology detection. There have been several assessments based on structural sequences with classifications, but a good deal of similar work is now being repeated with locally developed procedures and programs. The SAT will provide developers with a complete package which will save time and produce more comparable performance assessments for search algorithms. The package is complete in the sense that it provides a large non-redundant sequence resource database, a well-characterized query database of protein domains, all the parsers, and some previous results from PSI-BLAST and a hidden Markov model algorithm.
Results: An analysis of two different data sets was carried out using the SAT package. It compared the performance of a full protein sequence database (RSDB100) with a non-redundant representative sequence database derived from it (RSDB50). The performance measurement indicated that the full database is sub-optimal for a homology search. This result justifies the use of the much smaller and faster RSDB50 rather than RSDB100 for the SAT.