Y. Fukunishi and Y. Hayashizaki, Amino acid translation program for full-length cDNA sequences with frameshift errors, PHYSIOL GEN, 5(2), 2001, pp. 81-87
Here we present an amino acid translation program designed to suggest the
position of experimental frameshift errors and predict amino acid sequences
for full-length cDNA sequences with phred quality scores. Our program
generates artificial insertions into, and artificial deletions from,
low-accuracy positions
of the original sequence, thereby generating many candidate sequences. The
validity of the most probable sequence (the likelihood that it represents
the actual protein) is evaluated by using a score (Va) that is calculated in
light of the Kozak consensus, preferred codon usage, and position of the
initiation codon. To evaluate the software, we have used a database in which,
out of 612 cDNA sequences, 524 (86%) carried 773 frameshift errors in the
coding sequence. Our software detected and corrected 48% of the total
frameshift errors in 62% of the total cDNA sequences with frameshift errors.
The false positive rate of frameshift correction was 9%, and 91% of the
suggested frameshifts were true.
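
The candidate-generation step described above can be illustrated with a minimal sketch. It is not the authors' program: the function names, the phred threshold of 20, the use of "N" as the inserted base, and the scoring rule are all assumptions made for illustration. In particular, toy_score is a stand-in that only rewards a stop-terminated reading frame from the first ATG; it does not compute the Va score, which the abstract says also draws on the Kozak consensus, preferred codon usage, and the position of the initiation codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO[i]
    for i, codon in enumerate(product(BASES, repeat=3))
}


def translate(seq):
    """Translate from the first ATG; ends in '*' only if a stop was reached."""
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        # Codons containing an unknown base (e.g. N) become 'X'.
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


def candidates(seq, quals, threshold=20):
    """Yield the original sequence plus one deletion and one insertion
    candidate for every position whose phred score is below threshold
    (threshold is an assumed value, not taken from the paper)."""
    yield seq
    for i, q in enumerate(quals):
        if q < threshold:
            yield seq[:i] + seq[i + 1:]        # artificial deletion
            yield seq[:i] + "N" + seq[i:]      # artificial insertion


def toy_score(seq):
    """Hypothetical stand-in for Va: length of a stop-terminated ORF, else 0."""
    protein = translate(seq)
    return len(protein) if protein.endswith("*") else 0


def best_candidate(seq, quals, threshold=20):
    """Return the candidate sequence with the highest toy score."""
    return max(candidates(seq, quals, threshold), key=toy_score)
```

For example, with the read "ATGGCTTGATAAAGCTGCTGCTTAA" and a low phred score only at index 6 (a spurious extra T), the unmodified read translates to "MA*", while the candidate with that base deleted translates to "MADKAAA*" and is the one best_candidate returns.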