On Mon, 14 Sep 2020 00:28:32 -0700
Paul Eggert <egg...@cs.ucla.edu> wrote:

> On 9/14/20 12:13 AM, Norihiro Tanaka wrote:
> 
> > when (i >= d->follows[i].elems[j].index), it seems that
> > map[d->follows[i].elems[j].index] has been already set a value more than 0.
> >
> > What case violates this assumption?
> 
> Thank you for looking into this. I ran into the problem with the 
> dfa-heap-overrun test:
> 
> grep -E '(^| )*(a|b)*(c|d)*( |$)' < /dev/null
> 
> I can reproduce the problem by applying the attached patch to current dfa.c. 
> This patch brings back the previous algorithm, except with a runtime test of 
> the assumption. If I then run the dfa-heap-overrun test, it dumps core on my 
> platform (Ubuntu 18.04.5 x86-64, en_US.utf8 locale) because the assumption is 
> violated.

Thanks for giving me the patch.  I confirmed the crash reproduces with
the patch in GNU/Linux, and I found that a closure to be removed was not
removed.

The bug is introduced in commit cafb61533f5bfb989698e3924f97471498b2422b
which is a first patch I wrote, and I attach a patch to fix the bug.
From dec64ca4eac5bb06d892756018f842293ad20006 Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <nori...@kcn.ne.jp>
Date: Mon, 14 Sep 2020 22:21:05 +0900
Subject: [PATCH] dfa: fix failure in removal of epsilon closure

If there are a espilon in a branch and the closure is iterated, maybe fails
in removal of the node.  The bug is introduced in
commit cafb61533f5bfb989698e3924f97471498b2422b.

* lib/dfa.c (dfaanalyze): Calculate backward transition for not only
concatenation but closure.
---
 lib/dfa.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/lib/dfa.c b/lib/dfa.c
index 746c7b5..a7305ec 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2701,6 +2701,18 @@ dfaanalyze (struct dfa *d, bool searchflag)
 
         case STAR:
         case PLUS:
+          /* Every element in the lastpos of the argument is in the backward
+             set of every element in the firstpos.  */
+          if (d->epsilon)
+            {
+              tmp.elems = lastpos - stk[-1].nlastpos;
+              tmp.nelem = stk[-1].nlastpos;
+              position *p = firstpos - stk[-1].nfirstpos;
+              for (position *p = firstpos - stk[-1].nfirstpos;
+                   p < firstpos; p++)
+                merge2 (&backward[p->index], &tmp, &merged);
+            }
+
           /* Every element in the firstpos of the argument is in the follow
              of every element in the lastpos.  */
           {
-- 
1.7.1

Reply via email to