Analysis of the huge volume of data generated by large-scale sequencing
projects requires the construction of new, sophisticated computer
systems. These systems should be able to manage the biological data as
well as the results of their analysis. They should also help the user
to choose the most appropriate methods and to string them together in
order to solve a global analysis task. In this paper we present the
prototype of a software system providing an environment for the analysis
of large-scale sequence data. As a first step toward this end, this
environment has been put to the test within the Bacillus subtilis genome
sequencing project. This system integrates both the descriptive
knowledge of the entities involved (genes, regulatory signals and the like)
and the methodological knowledge comprising an extensible set of
analytical methods. A knowledge representation based on two existing
object-oriented models is used to implement this integrated system. In
addition, the present prototype provides a suitable user interface both
for displaying simultaneously the results generated by several methods
and for interacting with the objects. We present in this paper the
analysis of a B. subtilis genome fragment, present in data libraries but
not annotated. Annotation of the genes present in the fragment allowed
us to combine the results of several methods used for predicting
coding sequences, and to characterize it as comprising a cryptic phage, the
skin element. Comparison between the annotation of the skin element
and a standard region of the chromosome indicated that local features
of the nucleotide sequence could discriminate between phage and
non-phage DNA sequence.