C.J. Stoeckert et al., EpoDB: a prototype database for the analysis of genes expressed during vertebrate erythropoiesis, Nucleic Acids Research, 27(1), 1999, pp. 200-203
EpoDB is a database of genes expressed in vertebrate red blood cells. It is
also a prototype for the creation of cell and tissue-specific databases
from multiple external sources. The information in EpoDB, obtained from
GenBank, SWISS-PROT, Transfac, TRRD and GERD, is curated to provide
high-quality data for sequence analysis aimed at understanding gene
regulation during erythropoiesis. New protocols have been developed for
data integration and updating entries. Using a BLAST-based algorithm, we
have grouped together GenBank entries representing the same gene. This
sequence similarity protocol was also used to identify new entries to be
included in EpoDB. We have recently implemented our database in Sybase
(relational tables) in addition to SICStus Prolog to provide us with
greater flexibility in asking complex queries that utilize information
from multiple sources. New additions to the public web site
(http://www.cbil.upenn.edu/epodb) for accessing EpoDB are the ability to
retrieve groups of entries representing different variants of the same
gene and to retrieve gene expression data. The BLAST query has been
enhanced by incorporating BLAST-View, an interactive and graphical display
of BLAST results. We have also enhanced the queries for retrieving
sequences from specified genes by the addition of MEME, a motif discovery
tool, to the integrated analysis tools, which include CLUSTALW and TESS.
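
The abstract does not spell out how the BLAST-based grouping works, so the following is only a minimal sketch of one common way to group entries by pairwise similarity, not the authors' protocol. It assumes that hits have already been reduced to (query, subject, percent identity) tuples and that entries linked by a hit at or above a cutoff belong to the same gene; groups are then formed as connected components with a union-find structure. The function name group_entries, the 95% cutoff, and the accession labels are all hypothetical.

```python
from collections import defaultdict


def group_entries(hits, min_identity=95.0):
    """Cluster entry IDs linked by hits at or above min_identity.

    hits: iterable of (query_id, subject_id, percent_identity) tuples.
    The tuple layout and the default cutoff are assumptions for illustration.
    """
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk to the root,
        # shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity in hits:
        root_q, root_s = find(query), find(subject)  # registers both IDs
        if identity >= min_identity and root_q != root_s:
            parent[root_s] = root_q  # merge the two groups

    groups = defaultdict(set)
    for entry in parent:
        groups[find(entry)].add(entry)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical entry IDs; the weak A3/B1 hit does not merge the two groups.
hits = [
    ("A1", "A2", 99.1),
    ("A2", "A3", 97.5),
    ("B1", "B2", 98.0),
    ("A3", "B1", 62.0),
]
groups = group_entries(hits)
```

Because grouping is transitive in this sketch, A1 and A3 end up together even without a direct hit between them. A real pipeline would likely also weigh alignment length and coverage before linking two entries.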