COUNTING DISTINCT STRINGS

Citation
D. Moore et al., COUNTING DISTINCT STRINGS, Algorithmica, 23(1), 1999, pp. 1-13
Citations number
8
Categorie Soggetti
Mathematics,"Computer Science Software Graphycs Programming",Mathematics,"Computer Science Software Graphycs Programming
Journal title
ISSN journal
01784617
Volume
23
Issue
1
Year of publication
1999
Pages
1 - 13
Database
ISI
SICI code
0178-4617(1999)23:1<1:>2.0.ZU;2-2
Abstract
This paper discusses how to count and generate strings that are ''dist inct'' in two senses: p-distinct and b-distinct. Two strings x on alph abet A and x' on alphabet A' are said to be p-distinct iff they repres ent distinct ''patterns''; that is, iff there exists no one-one mappin g from A to A' that transforms x into x'. Thus aab and baa are p-disti nct while aab and dde are p-equivalent. On the other hand, x and x' ar e said to be b-distinct iff they give rise to distinct border (failure function) arrays: thus aab with border array 010 is b-distinct from a ba with border array 001. The number of p-distinct (resp. b-distinct) strings of length n formed using exactly k different letters is the [k , n] entry in an infinite p' (resp, b') array. Column sums p[n] and b[ n] in these arrays give the number of distinct strings of length n. We present algorithms to compute, in constant time per string, all p-dis tinct (resp. b-distinct) strings of length n formed using exactly k le tters, and we also show how to compute all elements p'[k, n] and b'[k, Ir]. These ideas and results have application to the efficient genera tion of appropriate test data sets for many string algorithms.