The parasitic protozoan Giardia lamblia represents one of the earliest
diverging lineages in the evolutionary history of eukaryotic organisms
as well as an important human pathogen. A representative sampling of
gene sequences from this early diverging protozoan could provide
insights into genotypic and phenotypic innovations associated with the
origin of eukaryotes. Currently, known giardial gene sequences are
heavily biased toward a few gene families, including variant surface
proteins (VSPs), structural proteins, and ribosomal RNA genes. One-pass
sequences of Giardia genomic DNA were obtained using vector flanking
priming sequences on the ends of cosmids in two independent libraries.
Comparisons of 2304 of these sequences against the GenBank(TM) database
identified 205 potential giardial genes with BLAST scores P(n) < 10^-9.
These coding regions encompass a wide range of metabolic, repair, and
signaling enzymes, and include some genes not predicted by our current
understanding of Giardia biochemistry. The efficiency of identification
of putative genes is consistent with earlier findings that coding
regions in the Giardia genome are densely packed and do not appear to
contain introns. Our current results suggest that direct genome
sequencing is an efficient method for identifying giardial genes for
evolutionary and biochemical studies. (C) 1998 Elsevier Science B.V.
All rights reserved.