Genes for intermediate filament proteins and the draft sequence of the human genome: novel keratin genes and a surprisingly high number of pseudogenes related to keratin genes 8 and 18
M. Hesse et al., Genes for intermediate filament proteins and the draft sequence of the human genome: novel keratin genes and a surprisingly high number of pseudogenes related to keratin genes 8 and 18, J CELL SCI, 114(14), 2001, pp. 2569-2575
We screened the draft sequence of the human genome for genes that encode intermediate filament (IF) proteins in general, and keratins in particular. The draft covers nearly all previously established IF genes including the recent cDNA and gene additions, such as pancreatic keratin 23, synemin and the novel muscle protein syncoilin. In the draft, seven novel type II keratins were identified, presumably expressed in the hair follicle/epidermal appendages. In summary, 65 IF genes were detected, placing IF among the 100 largest gene families in humans. All functional keratin genes map to the two known keratin clusters on chromosomes 12 (type II plus keratin 18) and 17 (type I), whereas other IF genes are not clustered. Of the 208 keratin-related DNA sequences, only 49 reflect true keratin genes, whereas the majority describe inactive gene fragments and processed pseudogenes. Surprisingly, nearly 90% of these inactive genes relate specifically to the genes of keratins 8 and 18. Other keratin genes, as well as those that encode non-keratin IF proteins, lack either gene fragments/pseudogenes or have only a few derivatives. As parasitic derivatives of mature mRNAs, the processed pseudogenes of keratins 8 and 18 have invaded most chromosomes, often at several positions. We describe the limits of our analysis and discuss the striking unevenness of pseudogene derivation in the IF multigene family. Finally, we propose to extend the nomenclature of Moll and colleagues to any novel keratin.
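
The counts quoted in the abstract can be combined in a short back-of-the-envelope calculation. The sketch below uses only the figures stated above (208 keratin-related sequences, 49 true genes); treating "nearly 90%" as exactly 0.90 is an assumption made for illustration, so the keratin 8/18 figure is an approximate upper bound and not a number reported by the authors.

```python
# Counts quoted in the abstract.
keratin_related_sequences = 208
functional_keratin_genes = 49

# Sequences that are not true genes: inactive fragments and processed pseudogenes.
inactive = keratin_related_sequences - functional_keratin_genes

# ASSUMPTION: "nearly 90%" is taken as 0.90, so this is only a rough upper bound.
k8_k18_share = 0.90
k8_k18_related = round(inactive * k8_k18_share)

# Fraction of all keratin-related sequences that are inactive.
inactive_fraction = inactive / keratin_related_sequences

print(inactive, k8_k18_related, round(100 * inactive_fraction, 1))
```

Under these assumptions, roughly three quarters of the keratin-related sequences are inactive, and most of those derive from keratins 8 and 18.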