Sequence search algorithm assessment and testing toolkit (SAT)

Authors

Park, J Holm, L Chothia, C

Citation

J. Park et al., Sequence search algorithm assessment and testing toolkit (SAT), BIOINFORMAT, 16(2), 2000, pp. 104-110

Citations number

Subject categories

Multidisciplinary

Journal title

BIOINFORMATICS

ISSN journal

1367-4803

Volume

16

Issue

2

Year of publication

2000

Pages

104 - 110

Database

ISI

SICI code

1367-4803(200002)16:2<104:SSAAAT>2.0.ZU;2-S

Abstract

Motivation: The Sequence Search Algorithm Assessment and Testing Toolkit (SAT) aims to be a complete package for the comparison of different protein homology search algorithms. The structural classification of proteins can provide us with a clear criterion for judgement in homology detection. There have been several assessments based on structural sequences with classifications, but a good deal of similar work is now being repeated with locally developed procedures and programs. The SAT will provide developers with a complete package which will save time and produce more comparable performance assessments for search algorithms. The package is complete in the sense that it provides a large non-redundant sequence resource database, a well-characterized query database of protein domains, all the parsers, and some previous results from PSI-BLAST and a hidden Markov model algorithm.

Results: An analysis of two different data sets was carried out using the SAT package. It compared the performance of a full protein sequence database (RSDB100) with a non-redundant representative sequence database derived from it (RSDB50). The performance measurement indicated that the full database is sub-optimal for a homology search. This result justifies using the much smaller and faster RSDB50, rather than RSDB100, for the SAT.