NEURAL NETWORKS FOR MOLECULAR SEQUENCE CLASSIFICATION

Citation
C. Wu et al., NEURAL NETWORKS FOR MOLECULAR SEQUENCE CLASSIFICATION, Mathematics and Computers in Simulation, 40(1-2), 1995, pp. 23-33
Number of citations
30
Subject Categories
Computer Sciences; Mathematics; Computer Science, Interdisciplinary Applications; Computer Science, Software, Graphics, Programming
Journal ISSN
0378-4754
Volume
40
Issue
1-2
Year of publication
1995
Pages
23 - 33
Database
ISI
SICI code
0378-4754(1995)40:1-2<23:NNFMSC>2.0.ZU;2-K
Abstract
A neural network classification method has been developed as an alternative approach to the search/organization problem of large molecular databases. Two artificial neural systems have been implemented on a Cray supercomputer for rapid protein/nucleic acid sequence classification. The neural networks used are three-layered, feed-forward networks that employ the back-propagation learning algorithm. The molecular sequences are encoded into neural input vectors by applying an n-gram hashing method or an SVD (singular value decomposition) method. Once trained with known sequences in the molecular databases, the neural system becomes an associative memory capable of classifying unknown sequences based on the class information embedded in its neural interconnections. The protein system, which classifies proteins into PIR (Protein Identification Resource) superfamilies, showed a sensitivity ranging from 82% to close to 100% at a speed about an order of magnitude faster than other search methods. The pilot nucleic acid system, which classifies ribosomal RNA sequences according to phylogenetic groups, achieved 100% classification accuracy. The system could be used to reduce database search time and help organize molecular sequence databases. The tool is generally applicable to any database that is organized according to family relationships.
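
The n-gram hashing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the encoding details, so everything here is an assumption made for illustration: the function name `ngram_count_vector`, the 20-letter amino acid alphabet, the choice n = 2, and normalization by the total count are hypothetical and may differ from the authors' method. The idea shown is only the general one: each n-gram maps to a fixed position in a vector, so sequences of any length become input vectors of the same size.

```python
from itertools import product

# Assumed alphabet: the 20 standard amino acids (not specified in the abstract).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def ngram_count_vector(sequence, n=2, alphabet=AMINO_ACIDS):
    """Map a sequence to a fixed-length vector of normalized n-gram counts.

    Hypothetical helper: each possible n-gram is assigned one vector
    position, so the output length is len(alphabet) ** n regardless of
    the sequence length.
    """
    # Position table: n-gram string -> index in the output vector.
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=n))}
    counts = [0.0] * len(index)

    # Slide a window of width n over the sequence and count each n-gram.
    for start in range(len(sequence) - n + 1):
        slot = index.get(sequence[start:start + n])
        if slot is not None:  # skip n-grams containing unknown symbols
            counts[slot] += 1.0

    # Assumed normalization: divide by the total so the entries sum to 1.
    total = sum(counts)
    return [c / total for c in counts] if total else counts

vector = ngram_count_vector("ACAC")
print(len(vector))  # 400 positions for 2-grams over 20 letters
```

A vector of this kind would serve as the input layer of a feed-forward network, with one output unit per class (for example, per superfamily).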