We describe a new method for gene discovery and analysis, CD-tagging,
that puts specific molecular tags on a gene, its transcript and its
protein product. The method has been successfully tested in two
organisms, the haploid unicellular alga Chlamydomonas reinhardtii and
the metazoan Drosophila melanogaster. The method uses a specially
designed DNA molecule, the CD-cassette, that contains splice acceptor
and donor sites surrounding a short open reading frame. Insertion of
the CD-cassette into an intron in a target gene introduces a new exon,
represented by the open reading frame of the CD-cassette, surrounded by
two functional hybrid introns. As a result, (i) the gene is tagged by a
specific nucleotide sequence, (ii) the mRNA is tagged by a specific
nucleotide sequence and (iii) the protein is tagged by a specific
peptide sequence. Because these tags are unique, specific nucleotide or
antibody probes can be used to obtain and/or analyze the gene,
transcript or protein. As a gene discovery technology, CD-tagging has
two unique advantages: (1) Genes can be identified through a primary
screen at the protein level, and so the very process by which a gene is
identified provides specific empirical information about its biological
function. (2) The cassette arms, which are spliced out of the
transcript of the target gene, are available to carry a wide variety of
DNA sequences, such as genes encoding drug resistance that can be used
to select for the presence of the CD-cassette in the genome.
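
The tagging logic described above can be illustrated with a minimal toy model. This is a sketch under stated assumptions, not the authors' implementation: all sequences, the names Segment, insert_cassette, splice and translate, and the partial codon table are hypothetical and chosen only for illustration. The model assumes introns that begin with GT and end with AG, a cassette that lands in a phase-0 intron, and a guest exon whose length is a multiple of three so that the downstream reading frame is preserved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    kind: str  # "exon" or "intron"
    seq: str


# Partial codon table covering only the codons used in this toy example.
TOY_CODONS = {"ATG": "M", "GCT": "A", "GAC": "D", "TAC": "Y",
              "AAG": "K", "GGA": "G", "TAA": "*"}


def insert_cassette(gene, intron_index, offset, acceptor_arm, guest_exon, donor_arm):
    """Insert a cassette `offset` bases into the chosen intron.

    The target intron is split into two hybrid introns with the guest exon
    between them; the cassette arms become intronic sequence.
    """
    target = gene[intron_index]
    if target.kind != "intron":
        raise ValueError("this model only handles insertion into an intron")
    if not 0 < offset < len(target.seq):
        raise ValueError("offset must fall strictly inside the intron")
    upstream = Segment("intron", target.seq[:offset] + acceptor_arm)
    downstream = Segment("intron", donor_arm + target.seq[offset:])
    tagged_part = [upstream, Segment("exon", guest_exon), downstream]
    return gene[:intron_index] + tagged_part + gene[intron_index + 1:]


def splice(gene):
    """Return the mature transcript: exons joined, introns removed."""
    return "".join(s.seq for s in gene if s.kind == "exon")


def translate(mrna):
    """Translate from the first base until a stop codon (toy table only)."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = TOY_CODONS[mrna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# Hypothetical two-exon target gene with one intron (GT...AG).
wild_type = [Segment("exon", "ATGGCT"),
             Segment("intron", "GTAAGTTTTCAG"),
             Segment("exon", "GGATAA")]

# Hypothetical cassette: acceptor arm (ends in AG), guest exon, donor arm (starts with GT).
tagged = insert_cassette(wild_type, intron_index=1, offset=6,
                         acceptor_arm="TTTTTCAG",
                         guest_exon="GACTACAAG",
                         donor_arm="GTAAGTAT")

print(translate(splice(wild_type)))  # MAG
print(splice(tagged))                # ATGGCTGACTACAAGGGATAA
print(translate(splice(tagged)))     # MADYKG
```

In this sketch the guest exon sequence appears in the gene and in the spliced transcript, and its encoded peptide appears in the protein, while the cassette arms are removed with the hybrid introns and so never reach the transcript.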