Background: A central issue in genome analysis is the identification
and characterization of coding regions. Estimating the coding
complexity of vertebrate genomes by measuring the kinetic complexity
of mRNA populations and by sequence analysis of cDNAs is limited by
the fact that any given source of mRNA represents a very biased sample
of all genes. Exon trapping is a method that enables the
identification of genes irrespective of their transcriptional status.

Results: Exons were trapped from the entire mouse genome, and the
resulting fragments were cloned. About 7% of a random sample of exons
taken from this library have significant structural homology or
sequence similarity to previously sequenced genes. Using cDNAs derived
from several stages of mouse development, evidence for expression of
about 62% of this sample of exons was found. These data suggest that
the great majority of exons in the library are derived from genes. We
estimate that the fraction of the genome contained in trapped exons is
2.4%; this corresponds to a sequence complexity of about 72 megabases.

Conclusions: The library of exons trapped from the entire mouse genome
probably represents one of the least biased and most comprehensive
libraries of mouse coding regions, and should therefore prove very
useful for finding genes during genome mapping and sequencing.
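
The two figures quoted in the Results (2.4% of the genome, about 72 megabases) can be cross-checked with simple arithmetic. The sketch below is illustrative only: the abstract does not state a genome size, so the value computed here is merely the size implied by those two numbers, and the variable and function names are hypothetical.

```python
# Figures quoted in the abstract.
trapped_fraction = 0.024       # 2.4% of the genome contained in trapped exons
trapped_complexity_mb = 72.0   # corresponding sequence complexity, in megabases

# Genome size implied by the two figures above.
# Assumption: this value is derived here, not stated in the abstract.
implied_genome_mb = trapped_complexity_mb / trapped_fraction


def trapped_complexity(genome_mb: float, fraction: float) -> float:
    """Return the sequence complexity (Mb) covered by a fraction of a genome."""
    return genome_mb * fraction


print(f"Implied genome size: about {implied_genome_mb:.0f} Mb")
print(f"Check: {trapped_complexity(implied_genome_mb, trapped_fraction):.0f} Mb trapped")
```

Under these assumptions the two quoted numbers are mutually consistent with a genome of roughly 3,000 megabases.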