J. Bates et A. Lavie, RECOGNIZING SUBSTRINGS OF LR(K) LANGUAGES IN LINEAR-TIME, ACM transactions on programming languages and systems, 16(3), 1994, pp. 1051-1077
LR parsing techniques have long been studied as being efficient and po
werful methods for processing context-free languages. A linear-time al
gorithm for recognizing languages representable by LR(k) grammars has
long been known. Recognizing substrings of a context-free language is
at least as hard as recognizing full strings of the language, since th
e latter problem easily reduces to the former. In this article we pres
ent a linear-time algorithm for recognizing substrings of LR(k) langua
ges, thus showing that the substring recognition problem for these lan
guages is no harder than the full string recognition problem. An inter
esting data structure, the Forest-Structured Stack, allows the algorit
hm to track all possible parses of a substring without losing the effi
ciency of the original LR parser. We present the algorithm, prove its
correctness, analyze its complexity, and mention several applications
that have been constructed.