Legacy software systems are typically complex, geriatric, and difficult to
change, having evolved over decades and having passed through many developers.
Nevertheless, these systems are mature, heavily used, and constitute massive
corporate assets. Migrating such systems to modern platforms is a significant
challenge due to the loss of information over time. As a result, we embarked on
a research project to design and implement an environment to support software
migration. In particular, we focused on migrating legacy PL/I source code to
C++, with an initial phase of looking at redocumentation strategies. Recent
technologies such as reverse engineering tools and World Wide Web standards now
make it possible to build tools that greatly simplify the process of
redocumenting a legacy software system. In this paper we introduce the concept
of a software bookshelf as a means to capture, organize, and manage information
about a legacy software system. We distinguish three roles directly involved in
the construction, population, and use of such a bookshelf: the builder, the
librarian, and the patron. From these perspectives, we describe requirements
for the bookshelf, as well as a generic architecture and a prototype
implementation. We also discuss various parsing and analysis tools that were
developed and integrated to assist in the recovery of useful information about
a legacy system. In addition, we illustrate how a software bookshelf is
populated with the information of a given software project and how the
bookshelf can be used in a program-understanding scenario. Reported results are
based on a pilot project that developed a prototype bookshelf for a software
system consisting of approximately 300K lines of code written in a PL/I dialect.