An improved method for the indexing of software

Citation
P. Di Felice and G. Fonzi, An improved method for the indexing of software, INF SOFTW T, 41(7), 1999, pp. 413-420
Number of citations
10
Subject categories
Computer Science & Engineering
Journal title
INFORMATION AND SOFTWARE TECHNOLOGY
Journal ISSN
0950-5849
Volume
41
Issue
7
Year of publication
1999
Pages
413 - 420
Database
ISI
SICI code
0950-5849(19990515)41:7<413:AIMFTI>2.0.ZU;2-K
Abstract
Many organizations are implementing free-text indexing schemes to build software catalogs, with the aim of promoting systematic code reuse. Unfortunately, comments embedded in software systems suffer from several shortcomings, so it is unreasonable to expect that indices extracted from them will be of high quality. In the present empirical work, we implemented one such method to show what can be expected when it is applied to comments. The method uses pairs of words (called lexical affinities) as indexing units. Its authors have given numerical indications (from a limited number of experiments on text files about Unix commands) that lexical affinities provide better results than the single-word schemes traditionally adopted in information retrieval. Our findings, obtained by applying this indexing scheme to the comments of a large collection of commercial routines, account for our pessimism: in only 1.9% of the texts processed are the extracted indices semantically representative of the purpose of the routines in which the comments were embedded. A general strategy for obtaining better results is proposed in the second part of the article and evaluated against the same collection of routines. (C) 1999 Elsevier Science B.V. All rights reserved.
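
To make the idea of word pairs as indexing units concrete, the following is a minimal sketch of how such pairs might be counted in a routine's comment. It is not the method evaluated in the article: the five-token window, the small stop-word list, and the function names `tokenize` and `word_pairs` are all assumptions introduced here for illustration, and the abstract does not specify how pairs are selected or weighted.

```python
import re
from collections import Counter

# Hypothetical stop-word list (an assumption); a real indexer would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "for", "on", "with"}


def tokenize(comment):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", comment.lower()) if w not in STOP_WORDS]


def word_pairs(comment, window=5):
    """Count unordered pairs of distinct words that occur within `window` tokens.

    The window size is an assumption, not a value taken from the abstract.
    """
    tokens = tokenize(comment)
    pairs = Counter()
    for i, left in enumerate(tokens):
        # Look ahead at most `window` tokens after position i.
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs


if __name__ == "__main__":
    comment = "Sort the input array; return the sorted array."
    for pair, count in word_pairs(comment).most_common(4):
        print(pair, count)
```

A sketch like this also hints at why comment quality matters: a terse or off-topic comment yields few pairs, and those pairs may say little about what the routine does.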