J.M. Twomey and A.E. Smith, Performance measures, consistency, and power for artificial neural network models, Mathematical and Computer Modelling, 21(1-2), 1995, pp. 243-258
Model building in artificial neural networks (ANN) refers to selecting the "optimal" network architecture, network topology, data representation, training algorithm, training parameters, and terminating criteria, such that some desired level of performance is achieved. Validation, a critical aspect of any model construction, is based upon some specified ANN performance measure of data that was not used in model construction. In addition to validating a trained ANN, this performance measure is often used to evaluate the superiority of one network architecture, learning algorithm, or neural network application over another. This paper investigates the three most frequently reported performance measures for pattern classification networks: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and percent good classification.
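As a point of reference, all three measures can be computed directly from network outputs and target vectors. The following is a minimal Python/NumPy sketch, assuming outputs and targets lie in [0, 1]; the function name and the 0.5 tolerance defining a "good" classification are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def performance_measures(outputs, targets, threshold=0.5):
    """Compute MAE, RMSE, and percent good classification.

    outputs, targets: arrays of network outputs and desired
    outputs in [0, 1]. The 0.5 tolerance for declaring a
    pattern correctly classified is an assumption.
    """
    errors = outputs - targets
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    # A pattern counts as correctly classified when every
    # output is within the tolerance of its target.
    correct = np.all(np.abs(errors) < threshold, axis=-1)
    percent_good = 100.0 * np.mean(correct)
    return mae, rmse, percent_good
```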
First, the inconsistency of the three metrics in selecting the "better" network is examined empirically. An analysis of error histograms is shown to be an effective means of investigating and resolving inconsistent network performance measures.
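One way to carry out such a histogram analysis is simply to bin each network's absolute output errors and compare the resulting distributions; for example, many small errors versus a few large ones can make RMSE and percent correct disagree. A hedged sketch, with the bin count and error range as assumptions:

```python
import numpy as np

def error_histogram(outputs, targets, bins=20):
    """Histogram of absolute output errors. Differing
    shapes between two networks' histograms can explain
    why MAE, RMSE, and percent correct rank them
    inconsistently."""
    errors = np.abs(outputs - targets).ravel()
    counts, edges = np.histogram(errors, bins=bins, range=(0.0, 1.0))
    return counts, edges
```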
Second, the focus of this paper is on percent good classification, the most often used measure of performance for classification networks. This measure is satisfactory if no particular importance is given to any single class; however, if errors on one class are deemed more serious than errors on the others, percent good classification will mask the individual class components. This deficiency is resolved through a neural network analogy to the statistical concept of power. It is shown that power as a neural network performance metric is tuneable and is a more descriptive measure than percent correct for evaluating and predicting the "goodness" of a network.
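In statistics, power is 1 - beta, the probability of correctly rejecting a false null hypothesis; the analogy suggested here is the rate at which the network correctly detects members of a class of particular concern. A minimal sketch of such a per-class measure, where the function name, the "critical class" framing, and the label encoding are assumptions rather than the paper's definition:

```python
import numpy as np

def classification_power(predicted, actual, critical_class):
    """Per-class analogue of statistical power: the fraction
    of patterns truly in the critical class that the network
    classifies correctly (1 minus the Type II error rate for
    that class). Unlike overall percent correct, this does
    not mask performance on the class that matters most."""
    mask = (actual == critical_class)
    return np.mean(predicted[mask] == actual[mask])
```

The "tuneable" property mentioned in the abstract corresponds, in this sketch's framing, to moving the output decision threshold that produces the predicted labels, trading missed detections of the critical class against false alarms on the others.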