M. Hirosawa et al., Identification of novel transcribed sequences on human chromosome 22 by expressed sequence tag mapping, DNA RES, 8(1), 2001, pp. 1-9
To identify sequences on the human genome that are actually transcribed, we
mapped expressed sequence tags (ESTs) of long cDNAs ranging from 4 kb to 7
kb along a 33.4-Mb sequence of human chromosome 22, the first human
chromosome entirely sequenced. By the EST mapping of 30,683 long cDNAs in silico,
603 cDNA sequences were found to map to chromosome 22 and were classified
into 169 clusters. Comparison of the genomic loci of these cDNA sequences
with 679 genes already annotated on chromosome 22q revealed that 46 clusters
represented newly identified transcribed sequences. To further characterize
these sequences, we sequenced 12 cDNAs from the 46 clusters in their
entirety. Of these 12 cDNAs, 6 were predicted to include a protein-coding
region while the remaining 6 were unlikely to encode proteins. Interestingly, 3 out
of the 12 cDNAs corresponded to the opposite strands of previously annotated
genes, which suggested that these genomic regions were transcribed
bidirectionally. In addition to these 12 newly identified cDNAs, another
12 cDNAs were entirely sequenced because they were likely
to contain new information about the predicted protein-coding sequences
previously annotated. In the cases of KIAA1670 and KIAA1672, a single cDNA
sequence covered two separately annotated transcribed regions. For example,
the sequence of a clone for KIAA1670 indicated that the CHKL and CPT1B
genes were co-transcribed as a contiguous transcript without fusion of the
two protein-coding regions. In conclusion, the mapping of ESTs derived
from long cDNAs, followed by sequencing of the entire cDNAs, provided
information that, together with ESTs derived from short cDNAs, is
indispensable for the precise annotation of genes on the genome.
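
The clustering and comparison steps summarized above can be illustrated with a small sketch. The abstract does not state the criteria actually used, so everything here is an assumption made for illustration: mapped ESTs are grouped when their genomic intervals overlap, and a cluster is labeled "novel" if it overlaps no annotated gene, "antisense" if it overlaps only genes on the opposite strand, and "known" otherwise. The names Hit, cluster_hits and classify_cluster, and all coordinates, are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """A mapped EST or an annotated gene (hypothetical record layout)."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def cluster_hits(hits):
    """Group hits whose genomic intervals overlap (assumed criterion, strand ignored)."""
    clusters = []
    cluster_end = None
    for hit in sorted(hits, key=lambda h: h.start):
        if clusters and hit.start <= cluster_end:
            clusters[-1].append(hit)
            cluster_end = max(cluster_end, hit.end)
        else:
            clusters.append([hit])
            cluster_end = hit.end
    return clusters


def classify_cluster(cluster, genes):
    """Label a cluster as 'novel', 'antisense' or 'known' (illustrative labels)."""
    start = min(h.start for h in cluster)
    end = max(h.end for h in cluster)
    overlapping = [g for g in genes if g.start <= end and start <= g.end]
    if not overlapping:
        return "novel"
    cluster_strands = {h.strand for h in cluster}
    gene_strands = {g.strand for g in overlapping}
    if cluster_strands.isdisjoint(gene_strands):
        return "antisense"
    return "known"


# Toy coordinates, not taken from the paper.
hits = [
    Hit("est1", 100, 500, "+"),
    Hit("est2", 400, 900, "+"),
    Hit("est3", 2000, 2600, "-"),
    Hit("est4", 5000, 5400, "+"),
]
genes = [
    Hit("geneA", 50, 1000, "+"),
    Hit("geneB", 1900, 2700, "+"),
]

clusters = cluster_hits(hits)
labels = [classify_cluster(c, genes) for c in clusters]
```

On this toy input the four hits fall into three clusters. In the study itself, the cDNAs lying on the opposite strand of annotated genes were counted among the newly identified sequences, so the separate "antisense" label is only a convenience of this sketch.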