The string B-tree: A new data structure for string search in external memory and its applications

Citation
P. Ferragina et R. Grossi, The string B-tree: A new data structure for string search in external memory and its applications, J ACM, 46(2), 1999, pp. 236-280
Citations number
53
Categorie Soggetti
Computer Science & Engineering
Journal title
Volume
46
Issue
2
Year of publication
1999
Pages
236 - 280
Database
ISI
SICI code
Abstract
We introduce a new text-indexing data structure, the String B-Tree, that ca n be seen as a link between some traditional external-memory and string-mat ching data structures. In a short phrase, it is a combination of B-trees an d Patricia tries for internal-node indices that is made more effective by a dding extra pointers to speed up search and update operations. Consequently , the String B-Tree overcomes the theoretical limitations of inverted files , B-trees, prefix B-trees, suffix arrays, compacted tries and suffix trees. String B-trees have the same worst-case performance as B-trees but they ma nage unbounded-length strings and perform much more powerful search operati ons such as the ones supported by suffix trees. String B-trees are also eff ective in main memory (RAM model) because they improve the online suffix tr ee search on a dynamic set of strings. They also can be successfully applie d to database indexing and software duplication.