M. Shepperd and G. Kadoda, "Using simulation to evaluate prediction techniques," Proceedings of the Seventh International Software Metrics Symposium (METRICS 2001), 2001, pp. 349-359
The need for accurate software prediction systems increases as software becomes larger and more complex. A variety of techniques have been proposed; however, none has proved consistently accurate, and there is still much uncertainty as to which technique suits which type of prediction problem. We believe that the underlying characteristics of the dataset (size, number of features, type of distribution, etc.) influence the choice of prediction system. In previous work it has proved difficult to obtain significant results over small datasets; consequently, we required large validation datasets. Moreover, we wished to control the characteristics of such datasets in order to explore systematically the relationship between accuracy, choice of prediction system, and dataset characteristics. Our solution has been to simulate data, which allows both control and the possibility of large (1,000-case) validation sets. In this paper we compare regression, rule induction, and nearest neighbour (a form of case-based reasoning). The results suggest that there are significant differences in accuracy depending upon the characteristics of the dataset; consequently, researchers should consider the prediction context when evaluating competing prediction systems. We also observed that the "messier" the data and the more complex the relationship with the dependent variable, the greater the variability in the results. This became apparent because we sampled two different training sets from each simulated population of data: in the more complex cases we observed significantly different results depending upon the training set. This suggests that researchers will need to exercise caution when comparing different approaches and should utilise procedures such as bootstrapping to generate multiple samples for training purposes.
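
The evaluation procedure the abstract describes (simulate a population with known characteristics, draw multiple training samples, and score each prediction system against a large validation set) could be sketched roughly as follows. This is a minimal illustration in Python, assuming scikit-learn is available; the simulated relationship, sample sizes, the use of a decision tree as a stand-in for rule induction, and the mean-absolute-error metric are all assumptions for the sake of the example rather than details taken from the paper.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.tree import DecisionTreeRegressor  # illustrative stand-in for rule induction

    rng = np.random.default_rng(0)

    def simulate_population(n, n_features=4, noise=0.5):
        """Simulate a population with a known relationship to the dependent variable."""
        X = rng.normal(size=(n, n_features))
        # Mildly non-linear "true" relationship; purely illustrative.
        y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + 0.5 * X[:, 2] * X[:, 3]
        y += rng.normal(scale=noise, size=n)
        return X, y

    # Large simulated validation set (1,000 cases) plus a pool to draw training sets from.
    X_val, y_val = simulate_population(1000)
    X_pool, y_pool = simulate_population(500)

    models = {
        "regression": LinearRegression(),
        "rule induction (tree proxy)": DecisionTreeRegressor(max_depth=4, random_state=0),
        "nearest neighbour": KNeighborsRegressor(n_neighbors=3),
    }

    # Bootstrap several training sets so variability across samples becomes visible.
    n_boot, train_size = 20, 100
    for name, model in models.items():
        errors = []
        for _ in range(n_boot):
            idx = rng.integers(0, len(X_pool), size=train_size)  # resample with replacement
            model.fit(X_pool[idx], y_pool[idx])
            pred = model.predict(X_val)
            errors.append(np.mean(np.abs(pred - y_val)))  # mean absolute error
        print(f"{name:30s} MAE = {np.mean(errors):.3f} +/- {np.std(errors):.3f}")

The spread of the bootstrap errors is what the abstract points to: when the underlying relationship is complex or the data are "messy", different training samples can yield noticeably different accuracy for the same technique, which is why a single training/validation split can be misleading.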