We apply the Minimal Length Encoding Principle to formalize inference
about the evolution of macromolecular sequences. The Principle is shown
to imply a combination of Weighted Parsimony and Compatibility methods
that have long been used by biologists because of their good practical
performance. The background assumptions are expressed as an encoding
scheme for the observed data and as heuristic rules for selection of
diagnostic positions in the sequences. The Principle was applied to
discover new subfamilies of Alu sequences, the most numerous family of
repetitive DNA sequences in the human genome.