R.L.H. Nelissen et al., Cloning and characterization of 2 processed pseudogenes and the cDNA for the murine U1 snRNP-specific protein-C, Gene, 184(2), 1997, pp. 273-278
Genes for the snRNP proteins U1-70K, U1-A, Sm-B'/B, Sm-D1 and Sm-E
have been isolated from various metazoan species. The genes for Sm-D1 and
Sm-E, which were isolated from a murine and a human source,
respectively, appear to belong to a multigene family. It has been
suggested that such a multigene family also exists for the mammalian
U1-C protein. With the human U1-C cDNA as a probe, two genes
containing sequences homologous to the probe sequence were isolated
from a mouse genomic library. Simultaneously, a murine U1-C cDNA was
isolated from a mouse cDNA library. This 0.74 kb cDNA contains an open
reading frame (ORF) of 477 bp encoding a polypeptide of 159 amino
acids (aa) which differs at only
one position (position 65) from the human U1-C protein. One of the
isolated U1-C genes also contains an ORF and shares 92% nucleotide
sequence identity with the mouse U1-C cDNA. The features of this gene,
in particular the absence of introns, the acquisition of a 3' poly(A)
tail and the presence of flanking direct repeats, indicate that it
represents a processed pseudogene. At the predicted aa sequence level,
substitutions of conserved residues at functionally important
positions are observed, strongly suggesting that expression of this
gene would not lead to a functional polypeptide. The second U1-C gene
appears to be a pseudogene as well, because it is also intronless and
contains a frameshift mutation relative to the ORF in the mouse U1-C
cDNA. The characterization of
these two pseudogenes points to the existence of a U1-C multigene
family in mice. Furthermore, comparison of aa sequences of the murine,
human and Xenopus U1-C shows that the protein is highly conserved through
evolution. Since the Xenopus U1-C differs from the two mammalian
counterparts solely at a number of positions in the C-terminal region,
it can be concluded that aa changes are less well tolerated in the
N-terminal region of U1-C than in the rest of the protein.
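
The figures quoted in the abstract rest on simple sequence arithmetic: an ORF of 477 bp corresponds to 477 / 3 = 159 codons, and the identity and "differs at one position" statements come from position-by-position comparison of aligned sequences. The following is a minimal sketch of those calculations, not the authors' method. The function names and the toy sequences are hypothetical, and the sketch assumes the two sequences have already been aligned to equal length, with "-" marking a gap and counted as a mismatch.

```python
def codon_count(orf_bp: int) -> int:
    """Return the number of complete codons in an ORF of orf_bp base pairs."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3


def differing_positions(a: str, b: str) -> list[int]:
    """Return 1-based positions at which two pre-aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # A gap character never equals a residue, so gaps count as differences.
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


def percent_identity(a: str, b: str) -> float:
    """Return percent identity over all aligned columns."""
    if not a:
        raise ValueError("sequences must not be empty")
    mismatches = len(differing_positions(a, b))
    return 100.0 * (len(a) - mismatches) / len(a)


if __name__ == "__main__":
    # ORF length taken from the abstract.
    print(codon_count(477))

    # Toy, made-up sequences for illustration only (not U1-C data).
    toy_cdna = "ACGTACGTAC"
    toy_gene = "ACGTACGTAA"
    print(percent_identity(toy_cdna, toy_gene))
    print(differing_positions("MPKF", "MPRF"))
```

With real data, the aligned mouse cDNA and pseudogene sequences would replace the toy strings. How gaps are scored is a design choice that changes the reported percentage, so the convention should be stated alongside any identity value.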