Using simulation to evaluate prediction techniques

Citation
M. Shepperd and G. Kadoda, Using simulation to evaluate prediction techniques, Seventh International Software Metrics Symposium (METRICS 2001), Proceedings, 2000, pp. 349-359
Number of citations
23
Subject Categories
Current Book Contents
Year of publication
2000
Pages
349 - 359
Database
ISI
SICI code
Abstract
The need for accurate software prediction systems increases as software becomes much larger and more complex. A variety of techniques have been proposed; however, none has proved consistently accurate, and there is still much uncertainty as to what technique suits which type of prediction problem. We believe that the underlying characteristics - size, number of features, type of distribution, etc. - of the dataset influence the choice of the prediction system to be used. In previous work, it has proved difficult to obtain significant results over small datasets. Consequently, we required large validation datasets; moreover, we wished to control the characteristics of such datasets in order to systematically explore the relationship between accuracy, choice of prediction system and dataset characteristics. Our solution has been to simulate data, allowing both control and the possibility of large (1000) validation cases. In this paper we compared regression, rule induction and nearest neighbour (a form of case-based reasoning). The results suggest that there are significant differences depending upon the characteristics of the dataset. Consequently, researchers should consider prediction context when evaluating competing prediction systems. We also observed that the more "messy" the data and the more complex the relationship with the dependent variable, the more variability in the results. This became apparent since we sampled two different training sets from each simulated population of data. In the more complex cases we observed significantly different results depending upon the training set. This suggests that researchers will need to exercise caution when comparing different approaches and utilise procedures such as bootstrapping in order to generate multiple samples for training purposes.
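
To make the evaluation protocol concrete, the following is a minimal sketch of the kind of experiment the abstract describes: simulate a population with known characteristics, draw more than one training set from it, fit competing prediction systems, and score each on a large (1000-case) validation set. The data-generating process, the MMRE accuracy measure, and the use of a decision tree as a rough stand-in for rule induction are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
N_FEATURES = 5
BETA = rng.uniform(0.5, 2.0, size=N_FEATURES)  # one fixed simulated "population"

def simulate_cases(n, noise=2.5):
    # Illustrative data-generating process: linear signal plus Gaussian noise.
    # The paper systematically varies characteristics such as size, number of
    # features and distribution; this fixed form is an assumption of the sketch.
    X = rng.normal(size=(n, N_FEATURES))
    y = 50.0 + X @ BETA + rng.normal(scale=noise, size=n)
    return X, np.maximum(y, 1.0)  # keep targets positive so MMRE is defined

def mmre(y_true, y_pred):
    # Mean magnitude of relative error, a common accuracy measure in software
    # effort prediction (an assumed choice of metric for this sketch).
    return float(np.mean(np.abs(y_true - y_pred) / y_true))

systems = [
    ("regression", LinearRegression()),
    ("rule induction (tree stand-in)", DecisionTreeRegressor(max_depth=4)),
    ("nearest neighbour", KNeighborsRegressor(n_neighbors=3)),
]

# A large validation set (the paper uses 1000 validation cases) and two
# independently drawn training sets from the same simulated population,
# mirroring the paper's check that results can vary with the training sample.
X_val, y_val = simulate_cases(1000)
for sample in (1, 2):
    X_tr, y_tr = simulate_cases(100)
    for name, model in systems:
        model.fit(X_tr, y_tr)
        print(f"training set {sample}: {name:30s} "
              f"MMRE = {mmre(y_val, model.predict(X_val)):.3f}")

# Bootstrap resampling (sampling with replacement) is one way to generate
# further training samples from a single dataset, as the abstract suggests.
idx = rng.choice(len(X_tr), size=len(X_tr), replace=True)
X_boot, y_boot = X_tr[idx], y_tr[idx]
```

Varying the generator's noise level, number of features, or input distribution would reproduce the kind of systematic exploration of dataset characteristics the authors describe, and comparing the printed scores across the two training sets illustrates the sample-dependent variability the abstract reports.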