HIV-1 and HIV-2 LTR nucleotide sequences: Assessment of the alignment by N-block presentation, "Retroviral signatures" of overrepeated oligonucleotides, and a probable important role of scrambled stepwise duplications/deletions in molecular evolution
I. Laprevotte et al., HIV-1 and HIV-2 LTR nucleotide sequences: Assessment of the alignment by N-block presentation, "Retroviral signatures" of overrepeated oligonucleotides, and a probable important role of scrambled stepwise duplications/deletions in molecular evolution, Mol. Biol. Evol., 18(7), 2001, pp. 1231-1245
Previous analyses of retroviral nucleotide sequences suggest a so-called
"scrambled duplicative stepwise molecular evolution" (many sectors with
successive duplications/deletions of short and longer motifs) that could
have stemmed from one or several starter tandemly repeated short
sequence(s). In the present report, we tested this hypothesis by focusing
on the long terminal repeats (LTRs) (and flanking sequences) of 24 human
and 3 simian immunodeficiency viruses. By using a calculation strategy
applicable to short sequences, we found consensus overrepresented motifs
(often containing CTG or GAG) that were congruent with the previously
defined "retroviral signature." We also show many local repetition
patterns that are significant when compared with simply shuffled
sequences. First- and second-order Markov chain analyses demonstrate that
a major portion of the overrepresented oligonucleotides can be predicted
from the dinucleotide compositions of the sequences, but by no means can
biological mechanisms be deduced from these results: some of the listed
local repetitions remain significant against dinucleotide-conserving
shuffled sequences; together with previous results, this suggests that
interspersed and/or local mononucleotide and oligonucleotide repetitions
could have biased the dinucleotide compositions of the sequences. We
searched for suggestive evolutionary patterns by scrutinizing a reliable
multiple alignment of the 27 sequences. A manually constructed alignment
based on homology blocks was in good agreement with the polypeptide
alignment in the coding sectors and has been exhaustively assessed by
using a multiplied alphabet obtained by the promising mathematical
strategy called the N-block presentation (taking into account the
environment of each nucleotide in a sequence). Sector by sector, we
hypothesize many successive duplication/deletion scenarios that fit our
previous evolutionary hypotheses. This suggests an important
duplication/deletion role for the reverse transcriptase, particularly in
inducing stuttering cryptic simplicity patterns.
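
As a rough illustration of how an oligonucleotide count can be compared with what dinucleotide composition alone would predict, the sketch below uses the standard first-order Markov estimate: the product of the dinucleotide counts along the word, divided by the counts of its internal nucleotides. This is a generic textbook estimator, not necessarily the calculation strategy used in the paper; the function names and the toy sequence are hypothetical.

```python
from collections import Counter


def count_words(seq, k):
    """Count overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_first_order(seq, word):
    """Expected count of `word` under a first-order Markov model.

    Assumption (not taken from the paper): the estimate is
    N(w1w2) * N(w2w3) * ... / (N(w2) * N(w3) * ...),
    using the mono- and dinucleotide counts observed in seq.
    """
    if len(word) < 2:
        raise ValueError("word must contain at least two nucleotides")
    mono = count_words(seq, 1)
    di = count_words(seq, 2)
    expected = 1.0
    # Multiply the counts of each overlapping dinucleotide in the word.
    for i in range(len(word) - 1):
        expected *= di[word[i:i + 2]]
    # Divide by the counts of the internal nucleotides.
    for base in word[1:-1]:
        if mono[base] == 0:
            return 0.0
        expected /= mono[base]
    return expected


def overrepresentation(seq, word):
    """Return (observed count, first-order expected count) for word."""
    observed = count_words(seq, len(word))[word]
    return observed, expected_first_order(seq, word)


if __name__ == "__main__":
    toy = "CTGCTGCTGCTG"  # hypothetical toy sequence, not real LTR data
    print(overrepresentation(toy, "CTG"))
```

An observed count well above the expected value would flag a word as overrepresented beyond what dinucleotide composition explains; judging significance would still require a comparison against dinucleotide-conserving shuffled sequences, as described in the abstract.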