ITA
ENG

COMPARISON OF STRINGS BELONGING TO THE SAME FAMILY

Authors

ELLOUMI M

Citation

M. Elloumi, COMPARISON OF STRINGS BELONGING TO THE SAME FAMILY, Information sciences, 111(1-4), 1998, pp. 49-63

Citations number

Categorie Soggetti

Computer Science Information Systems","Computer Science Information Systems

Journal title

Information sciences → ACNP

ISSN journal

00200255

Volume

111

Issue

1-4

Year of publication

1998

Pages

49 - 63

Database

ISI

SICI code

0020-0255(1998)111:1-4<49:COSBTT>2.0.ZU;2-6

Abstract

The comparison of strings belonging to the same family can be made thr ough the construction of a long(est) Common Subsequence (CS) that refl ects structural similarities that exist between these strings. The lon gest CS problem is NP-complete. In this paper, we present an O(N-2 x L (2)log(2)(L)) algorithm, where N is the number of the strings and L is the maximum length, that constructs a CS, to a family of strings, mad e up by longer words appearing, approximately, in the same positions i n all the strings. During each iteration, our algorithm looks for the longest common words that appear, approximately, in the same positions in all the strings, then, filters the common words found, to keep jus t those that assure the smallest norm between all the strings. The fil tering is based on a new distance called contextual distance. (C) 1998 Elsevier Science Inc. All rights reserved.