An improved method for the indexing of software

Authors

Di Felice, P; Fonzi, G

Citation

P. Di Felice and G. Fonzi, An improved method for the indexing of software, INF SOFTW T, 41(7), 1999, pp. 413-420

Subject categories

Computer Science & Engineering

Journal title

INFORMATION AND SOFTWARE TECHNOLOGY

ISSN journal

0950-5849

Volume

41

Issue

7

Year of publication

1999

Pages

413 - 420

Database

ISI

SICI code

0950-5849(19990515)41:7<413:AIMFTI>2.0.ZU;2-K

Abstract

Many organizations are implementing free-text indexing schemes to build software catalogs, with the aim of promoting systematic code reuse. Unfortunately, comments embedded in software systems suffer from several shortcomings, so it is not reasonable to expect the indices extracted from them to be of high quality. In the present empirical work, we implemented one such method in order to show what can be expected when it is applied to comments. The method we refer to uses pairs of words (called lexical affinities) as indexing units. Its authors have given numerical indications (from a limited number of experiments on text files about Unix commands) that lexical affinities provide better results than the single-word schemes traditionally adopted in information retrieval. Our findings, obtained by applying this indexing scheme to the comments of a large collection of commercial routines, account for our pessimism: in only 1.9% of the texts processed are the extracted indices semantically representative of the purpose of the routines in which the comments were embedded. A general strategy for obtaining better results is proposed in the second part of the article and evaluated against the same collection of routines. (C) 1999 Elsevier Science B.V. All rights reserved.
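
As a rough illustration of what indexing by pairs of words can look like, the Python sketch below counts unordered word pairs that co-occur close together in a comment. It is not the method evaluated in the article: the abstract says only that the indexing units are pairs of words, so the window size, the stop-word list, the tokenization, and the function name `lexical_affinities` are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not specify one.
STOP_WORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "is", "it", "for", "this", "that"}
)


def lexical_affinities(comment, window=5):
    """Count unordered pairs of distinct words found within `window` positions.

    The window size of 5 is an assumption for illustration only.
    """
    # Lowercase the text, keep alphabetic tokens, drop stop words.
    words = [w for w in re.findall(r"[a-z]+", comment.lower()) if w not in STOP_WORDS]
    pairs = Counter()
    for i, first in enumerate(words):
        # Pair each word with the words that follow it inside the window.
        for second in words[i + 1 : i + 1 + window]:
            if first != second:
                pairs[tuple(sorted((first, second)))] += 1
    return pairs


comment = "Open the file and read the file header. Close the file."
pairs = lexical_affinities(comment)
for pair, count in pairs.most_common(3):
    print(pair, count)
```

Ranking the pairs by raw frequency, as done here, is the simplest possible choice; a real indexing scheme would normally weight pairs by how informative they are across the whole collection.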