We have performed a survey of the active genes in the important human
pathogen Trypanosoma cruzi by analyzing 5013 expressed sequence tags (ESTs)
generated from a normalized epimastigote cDNA library. Clustering of all
sequences resulted in 771 clusters, comprising 54% of the ESTs. In total,
the ESTs corresponded to 3054 transcripts that might represent one-fourth
of the total gene repertoire in T. cruzi. About 33% of the T. cruzi
transcripts showed similarity to sequences in the public databases, and a
large number of hitherto undiscovered genes predicted to be involved in
transcription, cell cycle control, cell division, signal transduction,
secretion, and metabolism were identified. More than 140 full-length gene
sequences were derived from the ESTs. Comparisons with all open reading
frames in yeast and in Caenorhabditis elegans showed that only 12% of the
T. cruzi transcripts were shared among diverse eukaryotic organisms.
Comparison with other kinetoplastid sequences identified 237 orthologous
genes that are shared between these evolutionarily divergent organisms.
The generated data are a useful resource for further studies of the
biology of the parasite and for development of new means to combat
Chagas' disease.