We explore the application of statistical techniques, borrowed from natural
language processing, to music. A probabilistic method is used to capture and
generalise from the local harmonic movement of a corpus of seventeenth-century
dance music. The probabilistic grammars so generated are then used for
experiments in generation (composition).
The corpus is preprocessed in a novel way, automatically converting the
harmonies into a normal form to capture the underlying harmonic similarities
between pieces. It is then automatically marked up with constituent boundaries
(beginnings and ends of pieces, phrases and bars), to enable the learning
process to capture some of the higher-level structure of the music.
The experiment is promising, and a sample of the results is given. We discuss
the limitations of the approach, and how they might be overcome.