Ey. Chen et al., LONG-RANGE SEQUENCE-ANALYSIS IN XQ28 - 13 KNOWN AND 6 CANDIDATE GENES IN 219.4 KB OF HIGH GC DNA BETWEEN THE RCP/GCP AND G6PD LOCI, Human molecular genetics, 5(5), 1996, pp. 659-668
DNA comprising 219 447 bp was sequenced in nine cosmids and verified at >99.9% precision. Of the standard repetitive elements, 187 Alus make up 20.6% of the sequence, but there were only 27 MERs (2.9%) and 17 L1 fragments (1.6%). This may be characteristic of such high GC (57%) regions. The sequence also includes an 11.3 kb tract duplicated with 99.2% identity at a distance of 38 kb. The region is 80-90% transcribed and 12.5% translated. Thirteen known genes and their exon-intron borders are all accurately predicted, at least in part, by GRAIL programs, as are six additional genes. From centromere to telomere, the orientation of transcription varies among the first eight genes, then runs centromeric to telomeric for the next five, and is in the opposite sense for the last six. Eighteen of the 19 genes are associated with CpG islands. Two islands are exact copies in the 11.3 kb repeat units, and could thus give rise to double dosage levels of an X-linked gene. Another island is associated with two genes transcribed in opposite directions. From the sequence data, three genes and their exon structure are inferred. One of them, previously associated with HEX2, is shown to be a different gene unrelated to hexokinases; a second gene, previously known by an EST, is plexin, from its 65.5% identity with the Xenopus analog; and a third is a subunit of a vacuolar H-ATPase, and is named VATPS1.
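
The percentages quoted in the abstract can be turned into approximate base counts with simple arithmetic. The sketch below is illustrative only and is not the authors' analysis pipeline. It takes the reported totals (219 447 bp; 187 Alus at 20.6%, 27 MERs at 2.9%, 17 L1 fragments at 1.6%) and estimates how many bases each repeat class covers and the mean length per element. The helper names `covered_bp` and `gc_fraction` and the toy sequence are assumptions made for this example. The per-element lengths are rough averages derived from rounded percentages.

```python
TOTAL_BP = 219_447  # sequenced length reported in the abstract

# (element count, percent of sequence) as reported in the abstract
REPEATS = {"Alu": (187, 20.6), "MER": (27, 2.9), "L1": (17, 1.6)}


def covered_bp(total_bp: int, percent: float) -> float:
    """Approximate number of bases that a given percentage of the sequence represents."""
    return total_bp * percent / 100.0


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases of seq.

    Ambiguous symbols such as N are ignored; an input with no
    unambiguous bases yields 0.0.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


if __name__ == "__main__":
    for name, (count, pct) in REPEATS.items():
        bp = covered_bp(TOTAL_BP, pct)
        print(f"{name}: ~{bp:.0f} bp in total, ~{bp / count:.0f} bp per element")
    # Toy sequence (hypothetical), used only to show how a GC percentage is computed.
    print(f"GC fraction of toy sequence: {gc_fraction('GGCCGATAGC'):.2f}")
```

Applied to the full 219 447 bp sequence, which is not reproduced here, a function like `gc_fraction` would be expected to return a value near the reported 57%.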