Public databases now include vast amounts of recently acquired DNA
sequences that are only partially annotated and, furthermore, are often
annotated by automated methods that are subject to errors. The maximum
information value of these databases can be derived only by further
detailed analyses that frequently require careful examination of records in
the context of biological functions. In this study we present an example of
such an analysis focused on plant glycerolipid synthesis. Public databases
were searched for sequences corresponding to 65 plant polypeptides involved
in lipid metabolism. Comprehensive search results and analysis of genes,
cDNAs, and expressed sequence tags (ESTs) are available online
(http://www.canr.msu.edu/lgc). Multiple alignments provided a method to
estimate the number of genes in gene families. Further analysis of
sequences allowed us to tentatively identify several previously undescribed
genes in Arabidopsis. For example, two genomic sequences were identified as
candidates for the palmitate-specific monogalactosyldiacylglycerol
desaturase (FAD5). A candidate genomic sequence for 3-ketoacyl-acyl-carrier
protein (ACP) synthase involved in mitochondrial fatty acid biosynthesis
was also identified. Biotin carboxyl carrier protein (BCCP) in Arabidopsis
is encoded by at least two genes, but the most abundant BCCP transcript has
not yet been characterized. The large number (>165,000) of plant ESTs also
provides an opportunity to perform "digital northern" comparisons of gene
expression levels across many genes. EST abundance in general correlated
with biochemical and flux characteristics of the enzymes in Arabidopsis
leaf tissue. In a few cases, statistically significant differences in EST
abundance levels were observed for enzymes that catalyze similar reactions
in fatty acid metabolism. For example, ESTs for the FatB acyl-ACP
thioesterase occur 21 times compared with 7 times for FatA acyl-ACP
thioesterase, although flux through the FatA reaction is several times
higher than through FatB. Such comparisons may provide initial clues toward
previously undescribed regulatory phenomena. The abundance of ESTs for ACP
compared with that of stearoyl-ACP desaturase and FatB acyl-ACP
thioesterase suggests that concentrations of some enzymes of fatty acid
synthesis may be higher than those of their acyl-ACP substrates.
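
The "digital northern" comparison of EST counts can be illustrated with a short calculation. The sketch below is not the statistical method used in this study, which the abstract does not name. It assumes, for illustration only, that ESTs for the two thioesterases were sampled from the same pool and that under the null hypothesis each EST is equally likely to come from either gene, so that an exact two-sided binomial test applies. The function name `binomial_two_sided_p` is hypothetical.

```python
from math import comb


def binomial_two_sided_p(count_a: int, count_b: int) -> float:
    """Exact two-sided binomial p-value for two EST counts.

    Assumption (not from the source): under the null hypothesis both
    transcripts are equally abundant, so each of the n = count_a + count_b
    ESTs falls on either gene with probability 0.5.
    """
    n = count_a + count_b
    k = max(count_a, count_b)
    # Probability of seeing a split at least as uneven as k of n, one tail.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The null distribution is symmetric, so double the tail and cap at 1.
    return min(1.0, 2 * upper_tail)


if __name__ == "__main__":
    # EST counts quoted in the abstract: 21 for FatB, 7 for FatA.
    p = binomial_two_sided_p(21, 7)
    print(f"two-sided p-value: {p:.4f}")
```

For the counts quoted above (21 versus 7), this gives a p-value of about 0.0125, consistent with the statement that the difference is statistically significant at the conventional 0.05 level. A fuller treatment would also need to account for library size, transcript length, and the tissue sources of the ESTs, none of which are modeled here.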