The Cancer Genome Anatomy Project (CGAP) is a large cooperative effort
sponsored by the US National Institutes of Health designed to find, catalog and
annotate genes that are expressed during cancer development. In the past 2
years, the CGAP has sequenced over 700,000 clones from approximately 140
cDNA libraries, resulting in the identification of over 30,000 new human
genes. As a first step in applying this project to oral cancer, we entered four
cell lines: two from oral cancer, one from primary oral keratinocytes, and
one from oral keratinocytes which had been immortalized by human
papillomavirus. Libraries of cDNA were made and sequenced and the data were deposited
in GenBank. The expressed genes were then identified where possible. The
cell lines, and the total number of expressed genes that were cloned from
each, were: HN3 (oral cancer), 263 genes; HN4 (oral cancer), 550 genes; HN5
(primary keratinocytes), 237 genes; HN6 (immortalized keratinocytes), 408
genes. The total number of different genes that were found was 1160. A total of
38 new genes, of unknown function, were discovered. The data presented
here represent a beginning of the application of the CGAP technology to oral
cancer. Even though the data are still quite incomplete, they already
represent a large quantity of new information and clones of potential utility to
the oral cancer community, and provide a glimpse of the data sets to be
forthcoming from the Project. It must therefore be expected that there will
soon be a large expansion in the volume of data regarding the genetics of
oral cancer. Those who study this disease must be prepared to develop new
methods of analysis and storage for handling the oncoming volumes of
information. (C) 2000 Elsevier Science Ltd. All rights reserved.