Large-scale genomic sequencing requires a software infrastructure to
support and integrate applications that are not directly compatible. We
describe a suite of software tools built around the Common Assembly
Format (CAF), a comprehensive representation of a sequence assembly as
a text file. These tools form the backbone of sequencing informatics
at the Sanger Centre and the Genome Sequencing Center. The CAF format
is intentionally flexible, and our Perl and C libraries, which parse
and manipulate it, provide powerful tools for creating new applications
as well as wrappers to incorporate other software. The tools are
available free by anonymous FTP from ftp://ftp.sanger.ac.uk/pub/badger/.