Motivation: It is well known that the regulatory regions of genomes are
highly repetitive. They are rich in direct, symmetric and complemented
repeats, and there is no doubt about the functional significance of these
repeats.
Among known measures of complexity, the Ziv-Lempel complexity measure most
adequately reflects the repeats occurring in a text. However, this measure
does not take isomorphic repeats into account. By isomorphic repeats we mean
fragments that are identical (or symmetric) modulo some permutation of the
alphabet letters.
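To make the definition concrete, here is a minimal Python sketch of an isomorphism test between two fragments. The function names `is_isomorphic` and `is_symmetric_isomorphic` are illustrative and do not come from the paper; the sketch assumes that "symmetric" means the second fragment is read in reverse order.

```python
def is_isomorphic(a: str, b: str) -> bool:
    """True if b equals a after some permutation of the alphabet letters.

    Assumption: it is enough to find a one-to-one letter mapping on the
    letters that actually occur, since any such mapping extends to a
    permutation of the whole (finite) alphabet.
    """
    if len(a) != len(b):
        return False
    forward, backward = {}, {}
    for x, y in zip(a, b):
        # setdefault records the first pairing and returns the stored one,
        # so any later conflicting pairing is detected in either direction.
        if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
            return False
    return True


def is_symmetric_isomorphic(a: str, b: str) -> bool:
    """Same test, with the second fragment read right to left."""
    return is_isomorphic(a, b[::-1])
```

Under this sketch a complemented repeat such as ACGGT / TGCCA is a special case of an isomorphic repeat (the permutation swaps A with T and C with G), whereas AAC / GGG is not isomorphic because two different letters would have to map to G.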
Results: In this paper two complexity measures of symbolic sequences are
proposed that generalize the Ziv-Lempel complexity measure by taking into
account any isomorphic repeats in the text (rather than just direct repeats as
in Ziv-Lempel). The first of them, the complexity vector, is designed for
small alphabets such as the alphabet of nucleotides. The second is based on
a search for the longest isomorphic fragment in the history of sequence
synthesis and can be used for alphabets of arbitrary cardinality. These
measures have been used for recognition of structural regularities in DNA
sequences. Some interesting structures related to the regulatory region of
the human growth hormone are reported.
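The idea of searching the history for the longest isomorphic fragment can be illustrated with a toy sketch. It is not the paper's definition of either measure: it is a naive, quadratic-or-worse Ziv-Lempel (1976 style) component count in which the matching rule is pluggable, so that direct copying can be swapped for isomorphic copying. The names `lz_complexity`, `match` and `is_isomorphic` are assumptions made for this example.

```python
def is_isomorphic(a: str, b: str) -> bool:
    """True if b equals a up to a one-to-one renaming of letters."""
    if len(a) != len(b):
        return False
    forward, backward = {}, {}
    for x, y in zip(a, b):
        if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
            return False
    return True


def lz_complexity(s: str, match=lambda a, b: a == b) -> int:
    """Count components of a greedy left-to-right parse of s.

    Each component is the longest prefix of the unparsed text that matches
    (under `match`) a fragment starting earlier in s, plus one further
    symbol when one is available. `match` must be prefix-closed, which
    holds for both equality and is_isomorphic.
    """
    n, i, components = len(s), 0, 0
    while i < n:
        longest = 0
        for j in range(i):  # candidate start positions in the history
            k = 0
            # Extend the match one symbol at a time; overlap with the
            # current component is allowed, as in Ziv-Lempel parsing.
            while i + k < n and match(s[j:j + k + 1], s[i:i + k + 1]):
                k += 1
            longest = max(longest, k)
        i += longest + 1  # copied fragment plus one new symbol
        components += 1
    return components


direct = lz_complexity("ACACGTGT")                       # direct repeats only
isomorphic = lz_complexity("ACACGTGT", is_isomorphic)    # isomorphic repeats
```

For the string ACACGTGT this sketch yields 5 components with direct matching and 3 with isomorphic matching, because the fragment TGT can be copied from ACA under the renaming A to T, C to G. A practical implementation would replace the inner loops with a suffix-structure search.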