HARDWARE-ASSISTED ALGORITHM FOR FULL-TEXT LARGE-DICTIONARY STRING-MATCHING USING N-GRAM HASHING

Authors
Citation
J.D. Cohen, HARDWARE-ASSISTED ALGORITHM FOR FULL-TEXT LARGE-DICTIONARY STRING-MATCHING USING N-GRAM HASHING, Information Processing & Management, 34(4), 1998, pp. 443-464
Number of citations
54
Subject Categories
Information Science & Library Science; Computer Science Information Systems
Journal ISSN
0306-4573
Volume
34
Issue
4
Year of publication
1998
Pages
443 - 464
Database
ISI
SICI code
0306-4573(1998)34:4<443:HAFFLS>2.0.ZU;2-L
Abstract
A method of full-text scanning for matches in a large dictionary is described. The method is suitable for SDI (selective dissemination of information) systems, accommodating large dictionaries (10^4-10^5 entries) and typical digital data rates (tens of megabytes per second or more). It can be implemented on a single commercially-available board hosted by a personal computer or entirely in software. The preferred approach employs a hardware primary test, followed by a software secondary test. The algorithm is described in detail, the implementation is sketched, and simulation results are presented. (C) 1998 Elsevier Science Ltd. All rights reserved.
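
Illustrative sketch
The abstract names a two-stage scheme (a fast primary test, then a secondary test) but does not give its details, so the following is only a minimal software sketch of that general idea and not the paper's algorithm. Everything specific here is an assumption made for illustration: the n-gram length `N`, the table size `TABLE_BITS`, the toy hash in `ngram_hash`, the choice to hash only each entry's leading n-gram, and the function names `build_tables` and `scan`. The primary test, which the abstract assigns to hardware, is simulated here as a lookup in a flag table.

```python
# Assumed parameters, chosen only for this illustration.
N = 4              # n-gram length
TABLE_BITS = 16    # the primary-test table holds 2**TABLE_BITS flags


def ngram_hash(gram: str) -> int:
    """Toy polynomial hash of an n-gram, reduced to TABLE_BITS bits."""
    h = 0
    for ch in gram:
        h = (h * 31 + ord(ch)) & ((1 << TABLE_BITS) - 1)
    return h


def build_tables(dictionary):
    """Build the primary-test flag table and a lookup for the secondary test."""
    table = bytearray(1 << TABLE_BITS)   # one flag per hash value
    by_prefix = {}                       # leading n-gram -> dictionary entries
    for entry in dictionary:
        if len(entry) < N:
            raise ValueError(f"entry shorter than {N} characters: {entry!r}")
        prefix = entry[:N]
        table[ngram_hash(prefix)] = 1
        by_prefix.setdefault(prefix, []).append(entry)
    return table, by_prefix


def scan(text, table, by_prefix):
    """Return (position, entry) pairs for every dictionary entry found in text."""
    matches = []
    for i in range(len(text) - N + 1):
        gram = text[i:i + N]
        # Primary test: cheap flag lookup that rejects most text positions.
        if not table[ngram_hash(gram)]:
            continue
        # Secondary test: exact comparison, which also removes hash collisions.
        for entry in by_prefix.get(gram, ()):
            if text.startswith(entry, i):
                matches.append((i, entry))
    return matches


if __name__ == "__main__":
    table, by_prefix = build_tables(["selective", "dissemination", "hashing"])
    print(scan("n-gram hashing for selective dissemination", table, by_prefix))
```

In this sketch the flag table is the only structure touched at most text positions, which is the property that would make a primary test cheap to run at high data rates; the exact comparison runs only on the positions that pass it.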