Hi Arnold,

Aharon Robbins asked relating to [1]:
> Do I remember correctly that you were going to try to merge the dfa.[ch]
> from grep, gawk, and gettext? Did that go anywhere?

It didn't go very far. The diffs in gawk (50 KB of diffs) and grep
(50 KB of diffs as well) went in different directions. I could manage
the cosmetic and syntactic changes, and those that were ported between
the two packages just to be undone a bit later. But the major difference,
that dfaexec in gawk requires write access to the string being scanned,
goes too deep. Even for someone with a book about DFA/NFA theory in front
of him, the comments in the code are not sufficient for understanding
what's going on in dfacomp and dfaexec. I gave up.

Probably what should be done in the long run is:
  - For gettext, use the plain regex module - the use of regular
    expressions in msggrep is not speed critical.
  - For gawk and grep, either rewrite the thing from scratch (for example
    in a way that combines the DFA and kwset approaches instead of having
    them as separate data structures), or at least add enough comments
    that an average developer like me can understand what's going on.

Bruno

[1] http://lists.gnu.org/archive/html/bug-gnulib/2009-02/msg00009.html