A 2341-nucleotide region of the Adelaide River virus (ARV) genome,
located immediately downstream of the second glycoprotein (G(NS)) gene,
has been cloned and sequenced. The region contains four long open
reading frames (ORFs), the last of which represents a 1088-nucleotide
fragment at the start of the ARV L gene. Between the G(NS) and L genes are
two coding regions, separated by a single nucleotide (C), and each
bounded by recognized transcription initiation (AACAG) and
termination/polyadenylation (CATG[A](7)) sequences. The first coding
region comprises 682 nucleotides and contains two long ORFs (alpha 1 and
alpha 2) which are in the same reading frame but separated by two
consecutive stop
codons. The alpha 1 ORF encodes a 12,545-Da polypeptide which contains
highly hydrophobic and highly basic domains. The alpha 2 ORF includes a
potential initiation codon 18 nucleotides downstream of the tandem stop
codons and encodes a polypeptide of 11,951 Da. In ARV-infected cells,
the alpha region is transcribed primarily as a long 4.7-kb polycistronic
mRNA containing the G, G(NS), alpha 1, and alpha 2 ORFs. Direct sequence
analysis of the mRNA indicated that the tandem stop codons between the
alpha 1 and alpha 2 ORFs are retained in the transcript. The
second coding region contains a single long ORF (beta) comprising 493
nucleotides which encodes a polypeptide with a calculated pI of 6.614
and molecular weight of 17,102 Da. The putative beta protein is similar
in size to a protein which has been reported as a minor component of
virions. The beta gene is transcribed as a 0.65-kb monocistronic mRNA
for which the putative transcription termination/polyadenylation signal
overlaps the L gene by 22 nucleotides. (C) 1994 Academic Press, Inc.
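
The gene boundaries described above rest on locating the initiation
(AACAG) and termination/polyadenylation (CATG followed by seven A
residues) motifs and on noticing in-frame tandem stop codons. The
following is a minimal Python sketch of how such a scan might be done,
not the method used in the study. It assumes the sequence is written in
the mRNA (positive) sense using DNA letters. The sequence TOY and the
helper names find_motif and tandem_stops are hypothetical and purely
illustrative; TOY is synthetic and is not ARV sequence data.

```python
import re

# Motifs as quoted in the abstract, written in the mRNA (positive) sense.
INITIATION = "AACAG"
TERMINATION = "CATG" + "A" * 7
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Synthetic toy sequence (NOT real ARV data): an initiation motif, a short
# ORF, two consecutive stop codons, a second short ORF in the same frame,
# and a termination/polyadenylation motif.
TOY = "AACAG" + "ATGGCCGCC" + "TAATAG" + "ATGGGCGGC" + "CATGAAAAAAA"


def find_motif(seq, motif):
    """Return 0-based start positions of every motif hit, overlaps included."""
    pattern = "(?=" + re.escape(motif) + ")"
    return [m.start() for m in re.finditer(pattern, seq)]


def tandem_stops(seq, frame):
    """Return start positions of two consecutive stop codons in the given frame."""
    hits = []
    for i in range(frame, len(seq) - 5, 3):
        first, second = seq[i:i + 3], seq[i + 3:i + 6]
        if first in STOP_CODONS and second in STOP_CODONS:
            hits.append(i)
    return hits


if __name__ == "__main__":
    print("initiation sites:", find_motif(TOY, INITIATION))
    print("termination sites:", find_motif(TOY, TERMINATION))
    # Frame 2 is the frame of the first ATG in the toy sequence (position 5).
    print("tandem stops in frame 2:", tandem_stops(TOY, 2))
```

Applied to a real sequence, the same scan would report every candidate
motif position; deciding which hits are functional signals still
requires transcript mapping of the kind described in the abstract.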