Most mammalian genes will soon be characterized as cDNA sequences with
little information about their function. To utilize this sequence
information for large-scale functional studies, a gene trap retrovirus
shuttle vector has been developed to disrupt genes expressed in murine
embryonic stem (ES) cells. A library of mutant clones was isolated, and
regions of genomic DNA adjacent to 400 independent provirus inserts
were cloned and sequenced. The flanking sequences, designated
'promoter-proximal sequence tags', or PSTs, identified 63 specific genes
and anonymous cDNAs disrupted as a result of virus integration. The
efficiency of tagged sequence mutagenesis suggests that many of the
10,000-20,000 genes expressed in ES cells can be targeted, providing
defined mutations for the analysis of gene functions in vivo. In
addition, PSTs provide the first expressed sequence tags derived from
genomic DNA, and
define gene features such as exon boundaries and promoters that are
missing from cDNA sequences.