Cw. Hill et al., RHS ELEMENTS OF ESCHERICHIA-COLI - A FAMILY OF GENETIC COMPOSITES EACH ENCODING A LARGE MOSAIC PROTEIN, Molecular microbiology, 12(6), 1994, pp. 865-871
The Rhs family comprises a set of composite elements found in the
chromosomes of many natural Escherichia coli strains. Five Rhs elements
occur in strain K-12. The most prominent Rhs component is a giant core
open reading frame (core ORF) whose features are suggestive of a cell
surface ligand-binding protein. This hypothetical protein contains a
peptide motif, xxGxxxRYxYDxxGRL(I or T)xxxx, that is repeated 28 times.
A similar repeated motif is found in a Bacillus subtilis
wall-associated protein. The Rhs core ORFs consist of two distinct
parts: a large N-terminal core that is conserved in all Rhs elements,
and a smaller C-terminus that is highly variable. Distinctive G+C
contents of Rhs components indicate that the elements have a recent
origin outside the E. coli species, and that they are composites
assembled from segments with very different evolutionary histories.
The Rhs cores fall into three
sub-families that are mutually more than 20% divergent. Downstream of
the core ORF is a second, much shorter ORF. Like the adjacent core
extension, it is highly variable. In most examples, the hypothetical
product of this ORF has a candidate signal sequence for transport
across the cytoplasmic membrane. Another Rhs component, the 1.3 kb
H-rpt, has features typical of insertion sequences. Structures
homologous to H-rpt have been detected in other bacterial genera, such
as Vibrio and Salmonella, where they are associated with loci that
determine O-antigen variation.
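
The repeated peptide motif quoted in the abstract, xxGxxxRYxYDxxGRL(I or T)xxxx, can be read as a fixed-length pattern in which x stands for any residue. The following is a minimal sketch, not the authors' method, of how such a pattern might be scanned for in a protein sequence. The function name `find_motif_hits` and the toy sequence are hypothetical and invented for illustration; the toy sequence is not a real Rhs protein. A strict pattern like this would probably miss copies that deviate from the consensus, so it should not be expected to recover all 28 repeats in a real core ORF.

```python
import re
from typing import List, Tuple

# Motif from the abstract: xxGxxxRYxYDxxGRL(I or T)xxxx, where x is any residue.
# "." stands for x and "[IT]" for "(I or T)". The lookahead wrapper lets
# overlapping occurrences be reported as well.
RHS_MOTIF = re.compile(r"(?=(..G...RY.YD..GRL[IT]....))")


def find_motif_hits(protein: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for every strict motif match."""
    return [(m.start(), m.group(1)) for m in RHS_MOTIF.finditer(protein.upper())]


# Hypothetical toy sequence containing two exact copies of the pattern.
toy = "MK" + "AAGAAARYAYDAAGRLIAAAA" + "PPP" + "SSGSSSRYSYDSSGRLTSSSS"

for start, text in find_motif_hits(toy):
    print(start, text)
```

Because the pattern is anchored only by its conserved positions (G, RY, YD, GRL, and I or T), relaxing any of them, for example with a wider character class, is one way to tolerate degenerate copies, at the cost of more spurious matches.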