FROM REGULAR EXPRESSIONS TO DFAS USING COMPRESSED NFAS

Authors
Ch. Chang, R. Paige
Citation
Ch. Chang and R. Paige, FROM REGULAR EXPRESSIONS TO DFAS USING COMPRESSED NFAS, Theoretical Computer Science, 178(1-2), 1997, pp. 1-36
Citations number
26
Subject categories
Computer Sciences; Computer Science Theory & Methods
ISSN journal
0304-3975
Volume
178
Issue
1-2
Year of publication
1997
Pages
1 - 36
Database
ISI
SICI code
0304-3975(1997)178:1-2<1:FRETDU>2.0.ZU;2-1
Abstract
There are two principal methods for turning regular expressions into NFAs: one due to McNaughton and Yamada, and another due to Thompson. Unfortunately, both have drawbacks. Given a regular expression R of length r with s occurrences of alphabet symbols, Chang and Paige (1992) and Bruggemann-Klein (1993) gave Theta(m+r) time and O(r) space algorithms to produce a Theta(m) space representation of McNaughton and Yamada's NFA with s+1 states and m transitions. The problem with this NFA is that m = Theta(s^2) in the worst case. Thompson's method takes Theta(r) time and space to construct a Theta(r) space NFA with Theta(r) states and Theta(r) transitions. The problem with this NFA is that r can be arbitrarily larger than s. We overcome the drawbacks of both methods with a Theta(r) time, Theta(s) space algorithm to construct an O(s) space representation of McNaughton and Yamada's NFA. Given any set V of NFA states, our representation can be used to compute the set U of states one transition away from the states in V in optimal time O(|V|+|U|). McNaughton and Yamada's NFA requires Theta(|V| x |U|) time in the worst case. Using Thompson's NFA, the equivalent calculation requires Theta(r) time in the worst case. Comparative benchmarks show that an implementation of our method outperforms implementations of competing methods with respect to time for NFA construction, NFA acceptance testing, and NFA-to-DFA conversion by subset construction. Throughout this paper, program transformations are used to design algorithms and derive programs. A transformation of special importance is a form of finite differencing used previously by Douglas Smith to improve the efficiency of functional programs.
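The sketch below, in Python, illustrates the operations the abstract refers to: the one-step query "compute the set U of states one transition away from V" and the subset construction that uses it. It works over a plain adjacency-list NFA, i.e. the naive explicit representation whose move step can cost time proportional to the number of outgoing edges (Theta(|V| x |U|) for McNaughton and Yamada's NFA in the worst case), not the paper's compressed O(s) space structure, which answers the same query in O(|V|+|U|) time. All class and function names are illustrative, not taken from the paper.

    # Hedged sketch: adjacency-list NFA, one-step "move", and the standard
    # subset construction. This is the baseline representation the paper
    # improves on, not its compressed McNaughton-Yamada representation.
    from collections import defaultdict

    class NFA:
        def __init__(self):
            # delta[state][symbol] -> set of successor states
            self.delta = defaultdict(lambda: defaultdict(set))
            self.start = 0
            self.accepting = set()

        def add_transition(self, p, a, q):
            self.delta[p][a].add(q)

    def move(nfa, V, a):
        """Set U of states reachable from V by one transition on symbol a."""
        U = set()
        for p in V:
            U |= nfa.delta[p][a]
        return U

    def subset_construction(nfa, alphabet):
        """Standard subset construction from an (epsilon-free) NFA to a DFA."""
        start = frozenset([nfa.start])
        dfa_delta, worklist, seen = {}, [start], {start}
        while worklist:
            V = worklist.pop()
            for a in alphabet:
                U = frozenset(move(nfa, V, a))
                dfa_delta[(V, a)] = U
                if U and U not in seen:
                    seen.add(U)
                    worklist.append(U)
        accepting = {S for S in seen if S & nfa.accepting}
        return dfa_delta, start, accepting

    if __name__ == "__main__":
        # Hand-built NFA for (a|b)*ab over the alphabet {a, b}
        n = NFA()
        n.add_transition(0, "a", 0); n.add_transition(0, "b", 0)
        n.add_transition(0, "a", 1); n.add_transition(1, "b", 2)
        n.accepting = {2}
        delta, start, acc = subset_construction(n, "ab")
        print(len({S for (S, _) in delta}), "DFA states")

The point of the paper's compressed representation is that move-style queries dominate both acceptance testing and subset construction, so replacing the explicit edge lists used above with an O(s) space structure that supports the query in O(|V|+|U|) time speeds up all three benchmarked tasks.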