Detection of spurious interruptions of protein-coding regions in cloned cDNA sequences by GeneMark analysis

Citation
M. Hirosawa et al., Detection of spurious interruptions of protein-coding regions in cloned cDNA sequences by GeneMark analysis, GENOME RES, 10(9), 2000, pp. 1333-1341
Number of citations
29
Subject categories
Molecular Biology & Genetics
Journal title
GENOME RESEARCH
Journal ISSN
1088-9051
Volume
10
Issue
9
Year of publication
2000
Pages
1333 - 1341
Database
ISI
SICI code
1088-9051(200009)10:9<1333:DOSIOP>2.0.ZU;2-H
Abstract
cDNA is an artificial copy of mRNA and, therefore, no cDNA can be completely free from suspicion of cloning errors. Because overlooking these cloning errors results in serious misinterpretation of cDNA sequences, development of an alerting system targeting spurious sequences in cloned cDNAs is an urgent requirement for massive cDNA sequence analysis. We describe here the application of a modified GeneMark program, originally designed for prokaryotic gene finding, for detection of artifacts in cDNA clones. This program serves to provide a warning when any spurious split of protein-coding regions is detected through statistical analysis of cDNA sequences based on Markov models. In this study, 817 cDNA sequences deposited in public databases by us were subjected to analysis using this alerting system to assess its sensitivity and specificity. The results indicated that any spurious split of protein-coding regions in cloned cDNAs could be sensitively detected and systematically revised by means of this system after the experimental validation of the alerts. Furthermore, this study offered us, for the first time, statistical data regarding the rates and types of errors causing protein-coding splits in cloned cDNAs obtained by conventional cloning methods.
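
Illustrative sketch (not part of the original record)
The idea of raising an alert when a protein-coding region appears to be split can be illustrated with a much simpler heuristic than the one in the abstract. The Python sketch below is not GeneMark and does not use Markov models: it only assumes that a long stop-free stretch in one forward reading frame stands in for "coding potential", and it flags pairs of such stretches that lie in different frames and overlap or nearly abut, which is the pattern a frameshift error would leave. The function names and the thresholds min_codons and max_gap are hypothetical, chosen for illustration only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def open_segments(seq, frame, min_codons):
    """Return (start, end) spans of stop-free stretches in one forward frame.

    Spans are half-open nucleotide coordinates and contain whole codons only.
    """
    segments = []
    start = frame
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            if (i - start) // 3 >= min_codons:
                segments.append((start, i))
            start = i + 3
    # Close the last stretch at the final complete codon of this frame.
    end = frame + ((len(seq) - frame) // 3) * 3
    if (end - start) // 3 >= min_codons:
        segments.append((start, end))
    return segments


def split_alerts(seq, min_codons=30, max_gap=30):
    """Flag pairs of long open stretches in different frames that overlap
    or lie within max_gap nucleotides of each other (candidate frameshifts).

    Simplifying assumption: stop-free length replaces the statistical
    coding-potential score that a Markov-model program would compute.
    """
    seq = seq.upper()
    segs = sorted(
        (s, e, f) for f in range(3) for s, e in open_segments(seq, f, min_codons)
    )
    alerts = []
    for a in range(len(segs)):
        for b in range(a + 1, len(segs)):
            s1, e1, f1 = segs[a]
            s2, e2, f2 = segs[b]
            if f1 != f2 and e1 < e2 and s2 - e1 <= max_gap:
                alerts.append((segs[a], segs[b]))
    return alerts


if __name__ == "__main__":
    # Synthetic repeat: open in frame 0, with stop codons in frames 1 and 2.
    unit = "CTGACTGAC"
    intact = unit * 40
    shifted = unit * 20 + "A" + unit * 20  # one inserted base shifts the frame
    print(split_alerts(intact))
    print(split_alerts(shifted))
```

Such a heuristic would only produce candidates; as the abstract notes for the real system, alerts still need experimental validation before a sequence is revised.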