Hypervariability is a prominent feature of large gene families that mediate
interactions between organisms, such as venom-derived toxins or
immunoglobulins. In order to study mechanisms for evolution of
hypervariability, we examined an EST-generated assemblage of 170 distinct
conopeptide sequences from the venoms of five species of marine Conus
snails. These sequences were
assigned to eight gene families, defined by conserved elements in the
signal domain and untranslated regions. Order-of-magnitude differences were
observed in the expression levels of individual conopeptides, with five to
seven transcripts typically comprising over 50% of the sequenced clones in
a given species. The conopeptide precursor alignments revealed four
striking features peculiar to the mature peptide domain: (1) an accelerated
rate of nucleotide substitution, (2) a bias for transversions over
transitions in nucleotide substitutions, (3) a position-specific
conservation of cysteine codons within the hypervariable region, and (4) a
preponderance of nonsynonymous substitutions over synonymous
substitutions. We propose that the first
three observations argue for a mutator mechanism targeted to mature domains
in conopeptide genes, combining a protective activity specific for
cysteine codons and a mutagenic polymerase that exhibits transversion bias,
such as DNA polymerase V. The high Dn/Ds ratio is consistent with positive
or diversifying selection, and further analyses by
intraspecific/interspecific
gene tree contingency tests weakly support recent diversifying selection in
the evolution of conopeptides. Since only the most highly expressed
transcripts segregate in gene trees according to the feeding specificity of
the species, diversifying selection might be acting primarily on these
sequences. The combination of a targeted mutator mechanism to generate high
variability with the subsequent action of diversifying selection on highly
expressed variants might explain both the hypervariability of conopeptides
and the large number of unique sequences per species.
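
As an illustration of the transversion bias noted in feature (2), the
following minimal sketch classifies the differences between two aligned
nucleotide sequences as transitions (purine to purine, or pyrimidine to
pyrimidine) or transversions (purine to pyrimidine, or the reverse). It is
not the procedure used in this study; the function name and the example
sequences are hypothetical, and gapped or ambiguous positions are simply
skipped.

```python
# Illustrative sketch only: the function name and inputs are assumptions,
# not part of the study's analysis pipeline.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitutions(seq_a, seq_b):
    """Return (transitions, transversions) for two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites, gaps, and ambiguity codes.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # A transition stays within one chemical class of base.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Hypothetical aligned fragments: A->G is a transition; G->T and T->A are
# transversions.
ts, tv = classify_substitutions("ACGT", "GCTA")
```

Of the twelve possible single-nucleotide changes, four are transitions and
eight are transversions, so counts like these are normally compared against
that baseline or against a flanking region such as the signal domain.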