FISH - A GUIDE TO PROTEIN-CODING DNA-SEQUENCES IN THE GENBANK DATABASE

Authors
Citation
Dw. Collins, FISH - A GUIDE TO PROTEIN-CODING DNA-SEQUENCES IN THE GENBANK DATABASE, Computer applications in the biosciences, 9(3), 1993, pp. 337-342
Number of citations
16
Subject Categories
Mathematical Methods, Biology & Medicine; Computer Sciences, Special Topics; Computer Applications & Cybernetics; Biology Miscellaneous
Journal ISSN
0266-7061
Volume
9
Issue
3
Year of publication
1993
Pages
337 - 342
Database
ISI
SICI code
0266-7061(1993)9:3<337:F-AGTP>2.0.ZU;2-0
Abstract
FISH (Fast Index Search for Homologous coding sequences) consists of a database and associated software and is intended to function as a directory of protein-coding gene sequences. The FISH index contains descriptions of 22 361 DNA sequences from release 69.0 of the GenBank genetic sequence database. Complete coding sequences are represented numerically with counts of nucleotides and synonymous codons, and with GenBank LOCUS names and short descriptions. The software permits the database to be queried by GenBank LOCUS name, sequence length (expressed as total number of codons), or by comparison with a DNA sequence. In the latter case, the numerical descriptions are compared with simple distance measures in place of actual DNA sequences. The FISH package can be used to rapidly assemble lists of similar coding sequences, without regard to functional annotation or sequence alignments. Typical search times are well under a minute on widely available IBM-compatible microcomputers.
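The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the FISH software. The abstract does not say which distance measure FISH uses, so the Manhattan distance over the 64 codon counts is an assumption, as are the function names `describe`, `distance` and `search` and the LOCUS names in the usage example.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
CODONS = ["".join(p) for p in product(BASES, repeat=3)]  # all 64 codons


def describe(cds):
    """Summarise a coding sequence as (nucleotide counts, codon counts, codon total)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    bases = Counter(cds[:usable])
    return ([bases[b] for b in BASES],        # counts of A, C, G, T
            [codons[c] for c in CODONS],      # counts of each of the 64 codons
            usable // 3)                      # sequence length in codons


def distance(a, b):
    """Manhattan distance between codon-count vectors (an assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a[1], b[1]))


def search(query, index, top=5):
    """Return the names of the `top` index entries closest to the query sequence."""
    q = describe(query)
    ranked = sorted(index.items(), key=lambda item: distance(q, item[1]))
    return [name for name, _ in ranked[:top]]


# Hypothetical index keyed by invented LOCUS-style names.
index = {
    "HYPO1": describe("ATGGCCAAATAA"),
    "HYPO2": describe("ATGGCCAAGTAA"),
    "HYPO3": describe("ATG" + "TTT" * 4 + "TAA"),
}
hits = search("ATGGCCAAATAA", index, top=3)
```

Because only the precomputed count vectors are compared, no sequence alignment is needed at query time, which is consistent with the short search times the abstract reports.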