Human cathepsin K is a recently described cysteine protease with high
sequence homology to cathepsins S and L, members of the papain superfamily
of cysteine proteases. Cathepsin K is abundantly and selectively
expressed in osteoclasts and may perform a specialized role in
osteoclast-mediated bone resorption. In the present study, the genomic
organization and chromosomal localization of human cathepsin K
(HGMW-approved symbol CTSK) were determined. Intron-exon boundaries were identified
by PCR on human genomic DNA, and subsequently a P1 genomic clone
containing the full-length gene was isolated. Cathepsin K spans
approximately 12.1 kb of genomic DNA and is composed of eight exons and
seven introns. The genomic organization of cathepsin K is similar to that
of cathepsins S and L. The gene was mapped to chromosome 1q21 by
fluorescence in situ hybridization. Primer walking on the P1 genomic clone
identified 1108 bp of 5' flanking sequence and 459 bp of 3' flanking
sequence. Ribonuclease protection assay and 5' RACE indicated a single
transcriptional start site 49 bp upstream of the initiator Met codon.
Analysis of the 5' flanking region indicates that this gene lacks
canonical TATA and CAAT boxes and contains multiple potential
transcription regulatory sites. The characterization of the cathepsin K
gene and its promoter may provide valuable insights not only into its
osteoclast-selective expression, but also into the molecular mechanisms
responsible
for osteoclast activation. (C) 1997 Academic Press.