Neural network input representations that produce accurate consensus sequences from DNA fragment assemblies

Citation
Cf. Allex et al., Neural network input representations that produce accurate consensus sequences from DNA fragment assemblies, BIOINFORMAT, 15(9), 1999, pp. 723-728
Number of citations
17
Subject Categories
Multidisciplinary
Journal title
BIOINFORMATICS
ISSN journal
1367-4803
Volume
15
Issue
9
Year of publication
1999
Pages
723 - 728
Database
ISI
SICI code
1367-4803(199909)15:9<723:NNIRTP>2.0.ZU;2-U
Abstract
Motivation: Given inputs extracted from an aligned column of DNA bases and the underlying Perkin Elmer Applied Biosystems (ABI) fluorescent traces, our goal is to train a neural network to determine correctly the consensus base for the column. Choosing an appropriate network input representation is critical to success in this task. We empirically compare five representations; one uses only base calls and the others include trace information.
Results: We attained the most accurate results from networks that incorporate trace information into their input representations. Based on estimates derived from using 10-fold cross-validation, the best network topology produces consensus accuracies ranging from 99.26% to >99.98% for coverages from two to six aligned sequences. With a coverage of six, it makes only three errors in 20 000 consensus calls. In contrast, the network that only uses base calls in its input representation has over double that error rate: eight errors in 20 000 consensus calls.
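The error counts quoted in the abstract can be converted to accuracy percentages with a short calculation. The sketch below is illustrative only: it uses just the figures stated above (3 and 8 errors in 20 000 calls at a coverage of six), and the function and variable names are hypothetical, not taken from the paper.

```python
def accuracy_percent(errors: int, calls: int) -> float:
    """Percentage of consensus calls that were correct."""
    return 100.0 * (calls - errors) / calls


# Figures quoted in the abstract for a coverage of six aligned sequences.
CALLS = 20_000
TRACE_ERRORS = 3      # best network, input includes trace information
BASE_ONLY_ERRORS = 8  # network whose input uses base calls only

trace_acc = accuracy_percent(TRACE_ERRORS, CALLS)
base_only_acc = accuracy_percent(BASE_ONLY_ERRORS, CALLS)

# Ratio of error counts; the call totals are equal, so this is also
# the ratio of error rates.
error_ratio = BASE_ONLY_ERRORS / TRACE_ERRORS

print(f"trace-based accuracy:    {trace_acc:.3f}%")
print(f"base-call-only accuracy: {base_only_acc:.3f}%")
print(f"error-rate ratio:        {error_ratio:.2f}x")
```

Three errors in 20 000 calls corresponds to an accuracy just above 99.98%, consistent with the upper bound reported in the abstract, and eight errors against three is a ratio of roughly 2.7, which matches the "over double" wording.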