The vast body of Expressed Sequence Tag (EST) data in the public databases
provides an important resource for comparative and functional genomics
studies and an invaluable tool for the annotation of genomic sequences. We have
developed a rigorous protocol for reconstructing the sequences of
transcribed genes from EST and gene sequence fragments. A key element in
developing this protocol has been the evaluation of a number of sequence
assembly programs to determine which most faithfully reproduce transcript
sequences from
EST data. The TIGR Gene Indices, constructed using this protocol for human,
mouse, rat and a variety of other animal and plant models, have demonstrated
their utility in a range of applications and are freely available to the
scientific research community.