While complexity and difficulty are abstract concepts and therefore
not directly observable, they are nevertheless indicated by observable
phenomena. The study described in this article was undertaken to model
the relationship between source code complexity and maintenance
difficulty. This is achieved by applying canonical correlation
analysis. Product and process measures collected during the
development of a commercial real-time product provided the data for
the analysis. Sets of these measures represent source code complexity
and maintenance difficulty. The authors hypothesize that source code
complexity exerts a causal influence on maintenance difficulty
experienced during the system test phase of the product. They
demonstrate that significant canonical correlations along two
dimensions support this hypothesis. Interpretation of these two
dimensions of canonical correlation reveals relationships between the
sets of manifest variables that were not immediately apparent from
their simple correlations. Specifically, the model suggests that two
subsets of product measures have different relationships with process
activity. One is related to design-change activity that resulted in
faults, and the other is related directly to faults. The authors
conclude that soft models of greater generality than canonical
correlation could provide more insight into relationships among
software engineering measures. However, much work remains to specify
subsets of indicators and development efforts for which the technique
could be useful as a predictive tool.
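
For readers unfamiliar with the technique, the following is a minimal sketch of canonical correlation analysis between two sets of measures. It is not the authors' implementation and uses none of the study's data: the function name `canonical_correlations`, the small regularization term `reg`, and the synthetic "product" and "process" matrices are all assumptions made for illustration. The sketch whitens each set using a Cholesky factor of its covariance matrix and takes the singular values of the whitened cross-covariance as the canonical correlations.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-8):
    """Return canonical correlations and weight matrices for X and Y.

    X is (n, p) and Y is (n, q); rows are observations (e.g. modules).
    `reg` is a hypothetical ridge term added only for numerical stability.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariance and cross-covariance matrices.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T

    # Singular values of M are the canonical correlations (descending).
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map singular vectors back to weights on the original measures.
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return np.clip(s, 0.0, 1.0), A, B


# Synthetic illustration only: two shared latent factors drive both sets.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
product = latent @ rng.normal(size=(2, 4)) + 0.5 * rng.normal(size=(200, 4))
process = latent @ rng.normal(size=(2, 3)) + 0.5 * rng.normal(size=(200, 3))

rho, A, B = canonical_correlations(product, process)
print(np.round(rho, 3))
```

Each column of `A` and `B` defines one pair of canonical variates. Inspecting which measures carry large weights in a given pair is how a dimension of canonical correlation is typically interpreted.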