The World Wide Web's extraordinary reach is based in part on its open
assimilation of document formats. Although Web transfer protocols and
addressing can accommodate any kind of resource, the unique application
context of a truly global hypermedia system favours the adoption of
certain Web-adapted formats. In this paper we consider the evolutionary
record that has led to the ascent of the eXtensible Markup Language
(XML). We present a taxonomy of document species on the Web according
to their syntax, style, structure, and semantics. We observe the
preferential adoption of SGML, CSS, HTML, and XML, respectively, which
leverage a parsimonious evolutionary strategy favouring declarative
encodings over Turing-complete languages; separable styles over inline
formatting; declarative markup over presentational markup; and
well-defined semantics over operational behavior. The paper concludes
with an evolutionary walkthrough of citation formats. Ultimately, combined with
the self-referential power of the Web to document itself, we believe
XML can catalyze a critical shift of the Web from a global information
space into a universal knowledge network. © 1998 Published by
Elsevier Science B.V. All rights reserved.