Reverse engineering 4.7 million lines of code

Citation
P. Tonella et al., Reverse engineering 4.7 million lines of code, SOFTW PR EX, 30(2), 2000, pp. 129-150
Number of citations
28
Subject Categories
Computer Science & Engineering
Journal title
SOFTWARE-PRACTICE & EXPERIENCE
Journal ISSN
0038-0644
Volume
30
Issue
2
Year of publication
2000
Pages
129 - 150
Database
ISI
SICI code
0038-0644(200002)30:2<129:RE4MLO>2.0.ZU;2-8
Abstract
The ITC-Irst Reverse Engineering group was charged with analyzing a software application of approximately 4.7 million lines of C code. It was an old legacy system, maintained for a long time, on which several successive adaptive and corrective maintenance interventions had led to the degradation of the original structure. The company decided to re-engineer the software instead of replacing it, because the complexity and costs of re-implementing the application from scratch could not be afforded, and the associated risk could not be run. Several problems were encountered during re-engineering, including identifying dependencies and detecting redundant functions that were not used anymore. To accomplish these goals, we adopted a conservative approach. Before performing any kind of analysis on the whole code, we carefully evaluated the expected costs. To this aim, a small but representative sample of modules was preliminarily analyzed, and the costs and outcomes were extrapolated so as to obtain some indications on the analysis of the whole system. When the results of the sample modules were found to be useful as well as affordable for the entire system, the resources involved were carefully distributed among the different reverse engineering tasks to meet the customer's deadline. This paper summarizes that experience, discussing how we approached the problem, the way we managed the limited resources available to complete the task within the assigned deadlines, and the lessons we learned. Copyright (C) 2000 John Wiley & Sons, Ltd.