Analytical DNA ultracentrifugation revealed that eukaryotic genomes are
mosaics of isochores: long DNA segments (>> 300 kb on average) that are
relatively homogeneous in G + C content. Important genome features depend
on this isochore
structure, e.g. genes are found predominantly in the GC-richest isochore
classes. However, no reliable method is available to rigorously partition
the genome sequence into relatively homogeneous regions of different
composition, thereby revealing the isochore structure of chromosomes at the
sequence level. Homogeneous regions are currently ascertained by plain
statistics
on moving windows of arbitrary length, or simply by eye on G + C plots. By
contrast, the entropic segmentation method can divide a DNA sequence into
relatively homogeneous, statistically significant domains. An early version
of this algorithm produced only domains with an average length
far below the typical isochore size. Here we show that an improved
segmentation method, specifically intended to determine the most
statistically significant partition of the sequence at each scale, is able
to identify the boundaries between long homogeneous genome regions
displaying the typical features of isochores. The algorithm precisely
locates classes II and III of
the human major histocompatibility complex region, two well-characterized
isochores at the sequence level, the boundary between them being the first
isochore boundary experimentally characterized at the sequence level. The
analysis is then extended to a collection of large human contigs. The
relatively homogeneous regions we find show many of the features (G + C
range, relative proportion of isochore classes, size distribution, and
relationship with gene density) of the isochores identified through DNA
centrifugation. Isochore chromosome maps, with many potential applications
in genomics, are
then drawn for all the completely sequenced eukaryotic genomes available.
(C) 2001 Elsevier Science B.V. All rights reserved.
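
The core step of an entropic segmentation can be illustrated with a minimal sketch. It assumes, beyond what this abstract states, that the compositional difference between two adjacent subsequences is measured by the Jensen-Shannon divergence over a two-letter alphabet (G or C versus A or T). The function names `gc_entropy` and `best_split` and the `min_len` parameter are hypothetical. The significance test and the recursive, scale-dependent partitioning described above are deliberately left out, so this is not the authors' algorithm, only the cut-point search it could build on.

```python
import math


def gc_entropy(gc_count, length):
    """Shannon entropy (in bits) of a two-symbol G+C / A+T composition."""
    if length == 0:
        return 0.0
    entropy = 0.0
    for count in (gc_count, length - gc_count):
        if count:
            p = count / length
            entropy -= p * math.log2(p)
    return entropy


def best_split(seq, min_len=1):
    """Find the cut that maximizes the Jensen-Shannon divergence.

    Returns (position, divergence), where seq[:position] and seq[position:]
    are the two candidate domains. Returns (None, 0.0) when no cut yields a
    positive divergence. Assumption: only G+C content is compared.
    """
    is_gc = [base in "GCgc" for base in seq]
    n = len(seq)
    total_gc = sum(is_gc)
    h_all = gc_entropy(total_gc, n)

    best_pos, best_div = None, 0.0
    left_gc = 0
    for pos in range(1, n):
        left_gc += is_gc[pos - 1]  # running G+C count of the left part
        if pos < min_len or n - pos < min_len:
            continue
        # Divergence = entropy of the whole minus the length-weighted
        # entropies of the two parts.
        div = (
            h_all
            - (pos / n) * gc_entropy(left_gc, pos)
            - ((n - pos) / n) * gc_entropy(total_gc - left_gc, n - pos)
        )
        if div > best_div:
            best_pos, best_div = pos, div
    return best_pos, best_div
```

For example, `best_split("A" * 50 + "G" * 50)` places the cut at position 50, the boundary between the AT-only and GC-only halves. A full segmentation would accept such a cut only if it passes a significance threshold, and would then repeat the search within each resulting domain.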