NEURAL NETWORKS FOR MOLECULAR SEQUENCE CLASSIFICATION

Authors

WU C SHIVAKUMAR S LIN HP VELDURTI S BHATIKAR Y

Citation

C. Wu et al., NEURAL NETWORKS FOR MOLECULAR SEQUENCE CLASSIFICATION, Mathematics and computers in simulation, 40(1-2), 1995, pp. 23-33

Citations number

Subject Categories

"Computer Sciences", "Mathematics", "Computer Science Interdisciplinary Applications", "Computer Science Software Graphics Programming"

Journal title

Mathematics and computers in simulation

ISSN journal

03784754

Volume

40

Issue

1-2

Year of publication

1995

Pages

23 - 33

Database

ISI

SICI code

0378-4754(1995)40:1-2<23:NNFMSC>2.0.ZU;2-K

Abstract

A neural network classification method has been developed as an alternative approach to the search/organization problem of large molecular databases. Two artificial neural systems have been implemented on a Cray supercomputer for rapid protein/nucleic acid sequence classification. The neural networks used are three-layered, feed-forward networks that employ the back-propagation learning algorithm. The molecular sequences are encoded into neural input vectors by applying an n-gram hashing method or an SVD (singular value decomposition) method. Once trained with known sequences in the molecular databases, the neural system becomes an associative memory capable of classifying unknown sequences based on the class information embedded in its neural interconnections. The protein system, which classifies proteins into PIR (Protein Identification Resource) superfamilies, showed a sensitivity ranging from 82% to close to 100%, at a speed about an order of magnitude faster than other search methods. The pilot nucleic acid system, which classifies ribosomal RNA sequences according to phylogenetic groups, has achieved 100% classification accuracy. The system could be used to reduce database search time and help organize the molecular sequence databases. The tool is generally applicable to any database that is organized according to family relationships.
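
To illustrate the n-gram encoding step mentioned in the abstract, the following is a minimal sketch of how a sequence might be turned into a fixed-length input vector by counting overlapping n-grams. The abstract does not give the hashing function, alphabet, n-gram size, or scaling, so the 20-letter amino acid alphabet, the choice n = 2, the max-count normalization, and the names `ngram_index` and `encode_ngrams` are all assumptions made here for illustration, not the authors' implementation.

```python
from itertools import product

# Assumption: the 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def ngram_index(alphabet, n):
    """Map every possible n-gram over `alphabet` to a vector position."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=n))}


def encode_ngrams(sequence, alphabet=AMINO_ACIDS, n=2):
    """Count overlapping n-grams and scale the counts into [0, 1].

    Hypothetical encoding: n-grams containing letters outside the
    alphabet are skipped, and counts are divided by the largest count.
    """
    index = ngram_index(alphabet, n)
    counts = [0.0] * len(index)
    for i in range(len(sequence) - n + 1):
        pos = index.get(sequence[i:i + n])
        if pos is not None:
            counts[pos] += 1.0
    peak = max(counts)
    return [c / peak for c in counts] if peak > 0 else counts


# A sequence of any length becomes a vector of fixed size len(alphabet) ** n.
vector = encode_ngrams("MKVLAAGIVGL")
```

Under these assumptions, the vector length depends only on the alphabet and n, which is what allows sequences of different lengths to feed a network with a fixed number of input units.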