With the approach of the new millennium, a primary focus in software
engineering is upgrading, migrating, and evolving existing software
systems. This environment highlights the role of careful empirical studies
as the basis for improving software maintenance processes, methods, and
tools. One of the most important processes that merits empirical
evaluation is software evolution: the dynamic behavior of software systems
as they are maintained and enhanced over their lifetimes. Software
evolution is particularly important as systems
in organizations become longer-lived. However, evolution is challenging to
study due to the longitudinal nature of the phenomenon in addition to the
usual difficulties in collecting empirical data. In this paper, we describe
a set of methods and techniques that we have developed and adapted to
empirically study software evolution. Our longitudinal empirical study involves
collecting, coding, and analyzing more than 25,000 change events to 23
commercial software systems over a 20-year period. Using data from two of
the systems, we illustrate the efficacy of flexible phase mapping and gamma
sequence analytic methods originally developed in social psychology to examine
group problem solving processes. We have adapted these techniques in the
context of our study to identify and understand the phases through which a
software system travels as it evolves over time. We contrast this approach
with time series analysis, the more traditional way of studying longitudinal
data. Our work demonstrates the advantages of applying methods and
techniques from other domains to software engineering and illustrates how, despite
difficulties, software evolution can be empirically studied.