P. Ferragina et R. Grossi, The string B-tree: A new data structure for string search in external memory and its applications, J ACM, 46(2), 1999, pp. 236-280
We introduce a new text-indexing data structure, the String B-Tree, that ca
n be seen as a link between some traditional external-memory and string-mat
ching data structures. In a short phrase, it is a combination of B-trees an
d Patricia tries for internal-node indices that is made more effective by a
dding extra pointers to speed up search and update operations. Consequently
, the String B-Tree overcomes the theoretical limitations of inverted files
, B-trees, prefix B-trees, suffix arrays, compacted tries and suffix trees.
String B-trees have the same worst-case performance as B-trees but they ma
nage unbounded-length strings and perform much more powerful search operati
ons such as the ones supported by suffix trees. String B-trees are also eff
ective in main memory (RAM model) because they improve the online suffix tr
ee search on a dynamic set of strings. They also can be successfully applie
d to database indexing and software duplication.