ONLINE TOOLS FOR SEQUENCE RETRIEVAL AND MULTIVARIATE-STATISTICS IN MOLECULAR-BIOLOGY

Citation
G. Perriere and J. Thioulouse, ONLINE TOOLS FOR SEQUENCE RETRIEVAL AND MULTIVARIATE-STATISTICS IN MOLECULAR-BIOLOGY, Computer Applications in the Biosciences, 12(1), 1996, pp. 63-69
Number of citations
40
Subject categories
Mathematical Methods, Biology & Medicine; Computer Sciences, Special Topics; Computer Science Interdisciplinary Applications; Biology Miscellaneous
Journal ISSN
0266-7061
Volume
12
Issue
1
Year of publication
1996
Pages
63 - 69
Database
ISI
SICI code
0266-7061(1996)12:1<63:OTFSRA>2.0.ZU;2-B
Abstract
We have developed a World Wide Web server for browsing sequence collections structured under the ACNUC format and for performing multivariate analyses on sequences. General collections (like GenBank or EMBL), as well as specialized data banks (like Hovergen and NRSub), can be accessed. This system allows complex queries to be constructed, and the result of each query, represented by a list of sequences, is stored on the server. It is then possible to reuse this list to compute multivariate analyses on the sequences. Two examples of applications are shown. The first is a study of codon usage with correspondence analysis on all the protein genes of Haemophilus influenzae Rd. This study allows the highly expressed genes and the integral membrane proteins of this organism to be identified. The second is an ordering of 70 aligned protein sequences of growth hormone with principal coordinate analysis. With this method, we are able to re-establish the patterns of relationships between the sequences previously determined with tree-building programs.
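
Illustrative sketch
The abstract names correspondence analysis of codon usage but gives no computational detail. The following is a minimal sketch of textbook correspondence analysis applied to a genes-by-codons count table, using NumPy's singular value decomposition. It is not the code of the server described above. The function name correspondence_analysis, the parameter n_axes, and the toy counts are assumptions made purely for illustration, and the sketch assumes that no row or column of the table sums to zero.

```python
import numpy as np

def correspondence_analysis(counts, n_axes=2):
    """Textbook correspondence analysis of a contingency table.

    counts: 2-D array of non-negative counts (rows = genes, columns = codons).
    Assumes every row and every column has a non-zero total.
    Returns row coordinates, column coordinates, and the inertia of each axis.
    """
    counts = np.asarray(counts, dtype=float)
    P = counts / counts.sum()          # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    expected = np.outer(r, c)          # frequencies expected under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates: singular vectors scaled by singular values and masses
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    inertia = sv ** 2                  # inertia (eigenvalue) carried by each axis
    return row_coords[:, :n_axes], col_coords[:, :n_axes], inertia

# Hypothetical toy data: 4 genes x 3 codons (invented counts, not from the paper)
counts = np.array([
    [20, 5, 5],
    [18, 6, 6],
    [4, 15, 11],
    [5, 12, 13],
])
row_coords, col_coords, inertia = correspondence_analysis(counts)
print(row_coords)
print(inertia / inertia.sum())         # share of total inertia per axis
```

In a codon usage study of this kind, genes with similar codon preferences would be expected to lie close together on the first axes; how the authors interpreted their axes is described in the paper itself, not in this sketch.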