A. Pavesi et al., IDENTIFICATION OF NEW EUKARYOTIC TRANSFER-RNA GENES IN GENOMIC DNA DATABASES BY A MULTISTEP WEIGHT MATRIX ANALYSIS OF TRANSCRIPTIONAL CONTROL REGIONS, Nucleic acids research, 22(7), 1994, pp. 1247-1256
A linear method for the search of eukaryotic nuclear tRNA genes in DNA
databases is described. Based on a modified version of the general
weight matrix procedure, our algorithm relies on the recognition of two
intragenic control regions known as A and B boxes, a transcription
termination signal, and on the evaluation of the spacing between these
elements. The scanning of the eukaryotic nuclear DNA database using
this search algorithm correctly identified 933 of the 940 known tRNA
genes (0.74% of false negatives). Thirty new potential tRNA genes were
identified, and the transcriptional activity of two of them was
directly verified by in vitro transcription. The total false positive
rate of the algorithm was 0.014%. Structurally unusual tRNA genes, like
those coding for selenocysteine tRNAs, could also be recognized using a
set of rules concerning their specific properties, and one human gene
coding for such tRNA was identified. Some of the newly identified tRNA
genes were found in rather uncommon genomic positions: 2 in centromeric
regions and 3 within introns. Furthermore, the presence of
extragenically located B boxes in tRNA genes from various organisms
could be detected through a specific subroutine of the standard search
program.
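
The multistep search summarized above (weight matrix scoring of the A and B
boxes, a check on their spacing, then a search for a termination signal) can
be illustrated with a minimal Python sketch. It is not the authors' program.
The aligned sites, score thresholds, spacing range and downstream search
window are all hypothetical placeholders chosen for illustration, and the
terminator is assumed to be a simple run of T residues; the abstract does not
give the actual matrices or cutoffs.

```python
import math

BASES = "ACGT"

# Toy aligned sites, made up for this sketch (NOT the paper's training set).
A_BOX_SITES = ["TGGCTCAGTGG", "TAGCTCAGTGG", "TGGCGCAGTGG", "TAGCGTAGTGG"]
B_BOX_SITES = ["GGTTCGAATCC", "GGTTCGATTCC", "GGTTCAAATCC", "GGTTCGAGTCC"]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds weight matrix from equal-length aligned sites."""
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for column in zip(*sites):
        pwm.append({
            base: math.log2((column.count(base) + pseudocount) / total / background)
            for base in BASES
        })
    return pwm


def scan(seq, pwm, min_score):
    """Return (start, score) for each window scoring at least min_score."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguity codes such as N
        score = sum(col[base] for col, base in zip(pwm, window))
        if score >= min_score:
            hits.append((start, score))
    return hits


def find_candidates(seq, a_pwm, b_pwm, a_min, b_min,
                    gap_range=(20, 70), term_search=40, t_run=4):
    """Pair A- and B-box hits with allowed spacing and a downstream T run.

    gap_range, term_search and t_run are illustrative assumptions only.
    """
    candidates = []
    a_hits = scan(seq, a_pwm, a_min)
    b_hits = scan(seq, b_pwm, b_min)
    for a_start, a_score in a_hits:
        a_end = a_start + len(a_pwm)
        for b_start, b_score in b_hits:
            gap = b_start - a_end  # bases between the two boxes
            if not gap_range[0] <= gap <= gap_range[1]:
                continue
            b_end = b_start + len(b_pwm)
            # Assumed terminator model: first run of t_run T's after the B box.
            offset = seq[b_end:b_end + term_search].find("T" * t_run)
            if offset == -1:
                continue
            candidates.append({
                "a_start": a_start,
                "b_start": b_start,
                "gap": gap,
                "terminator_start": b_end + offset,
                "score": a_score + b_score,
            })
    return candidates


A_PWM = build_pwm(A_BOX_SITES)
B_PWM = build_pwm(B_BOX_SITES)
```

Calling `find_candidates(seq, A_PWM, B_PWM, a_min=10.0, b_min=10.0)` on an
uppercase DNA string returns one dictionary per A box/B box pair that passes
all three steps. Chaining the steps this way means the spacing and terminator
checks are evaluated only for pairs of windows that already passed the weight
matrix thresholds.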