We have compared the genomes of 49 bacteriophages related to T4. PCR
analysis of six chromosomal regions reveals two types of local
sequence variation. In four loci, we found only two alternative
configurations in all the genomes that could be analyzed. In contrast,
two highly polymorphic loci exhibit variations in the number, the
order and the identity of the sequences present. In phage T4, both
highly polymorphic loci encode internal proteins (IPs) that are
encapsidated in the phage particle and injected with the viral DNA.
Among the various T4-related phages, 10 different ORFs have been
identified in the IP loci; their amino acid sequences have the
characteristics of internal proteins. At the beginning of each of
these coding sequences is a highly conserved 11-amino-acid leader
motif. In addition, both 5' and 3' to most of these ORFs, there is a
~70 bp sequence that contains a T4 early promoter sequence with an
overlapping inversely repeated sequence. The homologies within these
flanking sequences may mediate the recombinational shuffling of the IP
sequences within the locus. A role for the new IP-like sequences in
determining the phage host range is proposed, since such a role has
been previously demonstrated for the IP1 gene of T4.