Most biological information is contained within gene and genome sequences.
However, current methods for analyzing these data are limited primarily to
the prediction of coding regions and identification of sequence
similarities. We have developed a computer algorithm, CoSMoS (for
context-sensitive motif searches), which adds context sensitivity to
sequence motif searches. CoSMoS was challenged to identify
peroxisome-associated and oleate-induced genes in the yeast
Saccharomyces cerevisiae. Specifically, we
searched for genes capable of encoding proteins with a type 1 or type 2
peroxisomal targeting signal (PTS) and for genes containing the
oleate-response element, a cis-acting element common to fatty
acid-regulated genes. CoSMoS successfully identified 7 of 8 known
PTS-containing peroxisomal proteins and 13 of 14 known oleate-regulated
genes. More importantly, CoSMoS identified an additional 18 candidate
peroxisomal proteins and 300 candidate oleate-regulated genes. Preliminary
localization studies suggest that these include at least 10 previously
unknown peroxisomal proteins. Phenotypic studies of selected gene
disruption mutants suggest that several of these new peroxisomal proteins
play roles in growth on fatty acids, one is involved in peroxisome
biogenesis, and at least two are required for synthesis of lysine, a
heretofore unrecognized role for peroxisomes. These results expand our
understanding of peroxisome content and function, demonstrate the utility
of CoSMoS for context-sensitive motif scanning, and point to the benefits
of improved in silico genome analysis.
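
The idea of a context-sensitive motif search can be illustrated with a
minimal Python sketch: a motif counts as a hit only when it also satisfies
a positional constraint. The patterns below are simplified consensus forms
commonly cited for the two targeting signals and are assumptions made for
illustration, as are the function name find_pts and the 40-residue
N-terminal window; none of them is taken from CoSMoS itself.

```python
import re

# Illustrative consensus patterns (assumptions, not the CoSMoS rule set).
PTS1 = re.compile(r"[SAC][KRH][LM]\Z")        # tripeptide anchored at the C-terminus
PTS2 = re.compile(r"[RK][LVI].{5}[HQ][LA]")   # nonapeptide expected near the N-terminus
PTS2_WINDOW = 40                               # hypothetical N-terminal window, in residues


def find_pts(protein):
    """Return the signal types whose motif and positional context both match."""
    hits = []
    # Context for PTS1: the motif must be the last three residues.
    if PTS1.search(protein):
        hits.append("PTS1")
    # Context for PTS2: the whole motif must lie inside the N-terminal window.
    if PTS2.search(protein[:PTS2_WINDOW]):
        hits.append("PTS2")
    return hits
```

A plain motif search would report the tripeptide SKL anywhere in a
sequence; here find_pts("MAAAASKL") reports a PTS1 hit, while the same
tripeptide in the middle of a protein is ignored.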