ITA
ENG

COUNTING DISTINCT STRINGS

Authors

MOORE D SMYTH WF MILLER D

Citation

D. Moore et al., COUNTING DISTINCT STRINGS, Algorithmica, 23(1), 1999, pp. 1-13

Citations number

Categorie Soggetti

Mathematics,"Computer Science Software Graphycs Programming",Mathematics,"Computer Science Software Graphycs Programming

Journal title

Algorithmica → ACNP

ISSN journal

01784617

Volume

Issue

Year of publication

1999

Pages

1 - 13

Database

ISI

SICI code

0178-4617(1999)23:1<1:>2.0.ZU;2-2

Abstract

This paper discusses how to count and generate strings that are ''dist inct'' in two senses: p-distinct and b-distinct. Two strings x on alph abet A and x' on alphabet A' are said to be p-distinct iff they repres ent distinct ''patterns''; that is, iff there exists no one-one mappin g from A to A' that transforms x into x'. Thus aab and baa are p-disti nct while aab and dde are p-equivalent. On the other hand, x and x' ar e said to be b-distinct iff they give rise to distinct border (failure function) arrays: thus aab with border array 010 is b-distinct from a ba with border array 001. The number of p-distinct (resp. b-distinct) strings of length n formed using exactly k different letters is the [k , n] entry in an infinite p' (resp, b') array. Column sums p[n] and b[ n] in these arrays give the number of distinct strings of length n. We present algorithms to compute, in constant time per string, all p-dis tinct (resp. b-distinct) strings of length n formed using exactly k le tters, and we also show how to compute all elements p'[k, n] and b'[k, Ir]. These ideas and results have application to the efficient genera tion of appropriate test data sets for many string algorithms.