NUCLEOTIDE-SEQUENCE OF THE MURINE LEUKEMIA-VIRUS AMPHOTROPIC STRAIN 4070A INTEGRASE (IN) CODING REGION AND COMPARATIVE STRUCTURAL-ANALYSIS OF THE INFERRED POLYPEPTIDE
Pl. Ey et al., NUCLEOTIDE-SEQUENCE OF THE MURINE LEUKEMIA-VIRUS AMPHOTROPIC STRAIN 4070A INTEGRASE (IN) CODING REGION AND COMPARATIVE STRUCTURAL-ANALYSIS OF THE INFERRED POLYPEPTIDE, Archives of virology, 142(9), 1997, pp. 1757-1770
The complete nucleotide sequence of the integrase (IN) protein coding
region of the murine leukaemia virus (MLV) amphotropic strain 4070A is
presented. The sequence comprises 1,224 nucleotides, encoding a
408-residue polypeptide of M_r 46,312. Alignment of the inferred 4070A
IN amino acid sequence with the IN proteins of other MLV showed that
substitutions are confined largely to segments within the N- and
C-terminal domains. In the N-terminal domain the majority of
substitutions occur as contiguous 2- to 6-residue blocks, whereas in
the C-terminal domain they occur as isolated entities except within a
short segment characterized by deletions/insertions. Selection appears
to act on the C-terminal 19 residues of IN rather than on the
N-terminal residues of ENV (encoded by overlapping reading frames),
suggesting a functional role for this segment. Phylogenetic analyses
grouped the sequences into two clusters, one comprising IN from the
amphotropic strain 4070A and three ecotropic MLV (CAS-BR-E, Moloney
and Friend), the other consisting of IN from three ecotropic MLV (two
radiation-induced viruses and AKV) and a mink cell focus-forming (MCF)
MLV. The same dichotomy and cluster composition were obtained from
analysis of the long terminal repeat (LTR) regions from these viruses
(consistent with the functional interrelationship of IN and LTR) but
not from analysis of envelope protein sequences (consistent with the
functional independence of ENV proteins from both IN and LTR).
Secondary structure predictions supported features determined from the
catalytic domain of human immunodeficiency virus and avian sarcoma
virus IN, and identified probable structures within the relatively
long N- and C-terminal domains of MLV IN proteins.
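
The figures quoted in the abstract (1,224 nucleotides, 408 residues, a
relative molecular mass of 46,312) can be cross-checked with simple
arithmetic. The sketch below is illustrative only and is not part of
the original study: the function names are hypothetical, and it assumes
that the 1,224-nucleotide count excludes any stop codon, which the
abstract does not state.

```python
# Figures taken from the abstract.
NT_LENGTH = 1224   # nucleotides in the IN coding region
RESIDUES = 408     # residues in the inferred polypeptide
MR = 46312         # relative molecular mass of the polypeptide


def codon_count(nt_length: int) -> int:
    """Return the number of whole codons in a coding region.

    Assumption: the length excludes any stop codon, so every
    triplet corresponds to one residue.
    """
    whole, remainder = divmod(nt_length, 3)
    if remainder:
        raise ValueError("length is not a multiple of three")
    return whole


def mean_residue_mass(mr: float, residues: int) -> float:
    """Return the average mass contributed by each residue."""
    return mr / residues


if __name__ == "__main__":
    print(codon_count(NT_LENGTH) == RESIDUES)
    print(round(mean_residue_mass(MR, RESIDUES), 1))
```

Under that assumption the nucleotide and residue counts agree
(1,224 / 3 = 408), and the mean mass per residue comes to about 113.5,
which is in the range expected for an ordinary protein.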