The Minimum Description Length (MDL) principle is solidly based on a provably ideal method of inference using Kolmogorov complexity. We test how the theory behaves in practice on a general problem in model selection: that of learning the best model granularity. The performance of a model depends critically on its granularity, for example on the chosen precision of the parameters. Too high a precision generally leads to modeling accidental noise, while too low a precision may cause models that should be distinguished to be confused. In practice this precision is often determined ad hoc. Under MDL the best model is the one that most compresses a two-part code of the data set (first the model, then the data given the model); this embodies "Occam's Razor". In two quite different experimental settings the theoretical value determined using MDL coincides with the best value found experimentally.
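To make the two-part code concrete, the selection loop can be sketched as follows (a minimal illustration of ours, not the paper's procedure; the quadratic test function, the noise level, the coefficient range, and the crude Gaussian residual code are all assumptions made for this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: noisy samples of a quadratic (assumed for illustration).
n = 200
x = np.linspace(-1.0, 1.0, n)
y = 1.5 * x**2 - 0.7 * x + 0.3 + rng.normal(0.0, 0.1, n)

def two_part_code_length(bits_per_coef):
    """Model bits at the given precision plus data bits given the model."""
    coefs = np.polyfit(x, y, deg=2)
    # Model part: each coefficient quantized to `bits_per_coef` bits on [-2, 2].
    levels = 2 ** bits_per_coef
    q = np.round((coefs + 2.0) / 4.0 * (levels - 1))
    coefs_q = q / (levels - 1) * 4.0 - 2.0
    model_bits = coefs.size * bits_per_coef
    # Data part: residuals coded under a Gaussian model, about (n/2) log2(MSE).
    resid = y - np.polyval(coefs_q, x)
    mse = max(np.mean(resid**2), 1e-12)
    data_bits = 0.5 * n * np.log2(mse)
    return model_bits + data_bits

# MDL picks the parameter precision minimizing the two-part total.
best = min(range(1, 33), key=two_part_code_length)
print("precision chosen by two-part MDL:", best, "bits per coefficient")
```

Too few bits per coefficient inflate the data part of the code, too many inflate the model part, and the two-part total is minimized at an intermediate precision.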
In the first experiment the task is to recognize isolated handwritten characters in one subject's handwriting, irrespective of size and orientation. Based on a new modification of elastic matching that uses multiple prototypes per character, the prediction rate is optimal at the value of the learned parameter (the length of the sampling interval) that MDL considers most likely; this value is shown to coincide with the best value found experimentally.
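The same trade-off governs the sampling interval: a shorter interval stores more points per prototype (a longer model part) but leaves a smaller reconstruction error (a shorter data part). A schematic sketch under our own assumptions (a synthetic stroke, uniform resampling with linear interpolation in place of the paper's elastic matching, and 16 bits per stored coordinate):

```python
import numpy as np

rng = np.random.default_rng(1)

# A synthetic "pen stroke" standing in for handwriting data (assumed here).
t = np.linspace(0.0, 1.0, 512)
stroke = np.column_stack([np.cos(3 * t), np.sin(5 * t)])
stroke += rng.normal(0.0, 0.01, stroke.shape)

def code_length(step):
    """Bits for the prototype sampled every `step` points, plus residual bits."""
    idx = np.arange(0, len(stroke), step)
    proto = stroke[idx]
    model_bits = proto.size * 16            # assumed 16 bits per stored coordinate
    # Reconstruct the full stroke from the prototype by linear interpolation.
    recon = np.column_stack([
        np.interp(np.arange(len(stroke)), idx, proto[:, d]) for d in range(2)
    ])
    mse = max(np.mean((stroke - recon) ** 2), 1e-6)  # floor = assumed data quantization
    data_bits = 0.5 * stroke.size * np.log2(mse)
    return model_bits + data_bits

# MDL picks the sampling interval minimizing the two-part total.
best_step = min(range(1, 65), key=code_length)
print("sampling interval chosen by two-part MDL:", best_step, "points")
```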
In the second experiment the task is to model a robot arm with two degrees of freedom using a three-layer feed-forward neural network, where we need to determine the number of nodes in the hidden layer that gives the best modeling performance. The optimal model (the one that extrapolates best to unseen examples) is obtained at the number of hidden nodes that MDL considers most likely, which again coincides with the best value found experimentally.
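A schematic version of the second experiment's selection, again under assumptions of ours rather than the paper's setup (link lengths 1.0 and 0.7, Gaussian noise, scikit-learn's MLPRegressor, 16 bits per weight, and training-set residuals as the data part):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)

# Synthetic two-joint arm: joint angles -> noisy end-effector position.
n = 400
theta = rng.uniform(-np.pi, np.pi, (n, 2))
Y = np.column_stack([
    np.cos(theta[:, 0]) + 0.7 * np.cos(theta[:, 0] + theta[:, 1]),
    np.sin(theta[:, 0]) + 0.7 * np.sin(theta[:, 0] + theta[:, 1]),
]) + rng.normal(0.0, 0.02, (n, 2))

def code_length(hidden):
    """Two-part code for a 2-hidden-2 network: weight bits + residual bits."""
    net = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=5000,
                       random_state=0).fit(theta, Y)
    n_params = sum(w.size for w in net.coefs_) + sum(b.size for b in net.intercepts_)
    model_bits = 16 * n_params              # assumed fixed precision per weight
    mse = max(np.mean((Y - net.predict(theta)) ** 2), 1e-12)
    data_bits = 0.5 * Y.size * np.log2(mse)
    return model_bits + data_bits

# MDL picks the hidden-layer size minimizing the two-part total.
best_h = min([2, 4, 8, 16, 32, 64], key=code_length)
print("hidden-layer size chosen by two-part MDL:", best_h)
```

Small networks are penalized through the residual (data) part, large networks through the weight (model) part, so the two-part total favors an intermediate hidden-layer size.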