Many evolving mission-critical systems must have high software reliability.
However, it is often difficult to identify fault-prone modules early enough
in a development cycle to guide software enhancement efforts effectively
and efficiently. Software quality models can yield timely predictions of
membership in the fault-prone class on a module-by-module basis, enabling one
to target enhancement techniques. However, it is an open empirical question,
"Can a software quality model remain useful over several releases?" Most
prior software quality studies have examined only one release of a system,
evaluating the model with modules from the same release. We conducted a
case study of a large legacy telecommunications system where measurements on
one software release were used to build models, and three subsequent
releases of the same system were used to evaluate model accuracy. This is a
realistic assessment of model accuracy, closely simulating actual use of a
software quality model. A module was considered fault-prone if any of its faults
were discovered by customers. These faults are extremely expensive due to
consequent loss of service and emergency repair efforts. We found that the
model maintained useful accuracy over several releases. These findings are
initial empirical evidence that software quality models can remain useful
as a system is maintained by a stable software development process.