On Tue, Feb 23, 2021 at 04:16:09AM -0800, Michael Paoli wrote:

> > Synopsis:      Basic Regular Expression (BRE) bug in \{m,n\} with \(\) and \n
> > Category:      library
> > Environment:
>         System      : OpenBSD 6.7
>         Details     : OpenBSD 6.7 (GENERIC) #7: Wed Jan  6 15:19:25 MST 2021
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC
> 
>         Architecture: OpenBSD.amd64
>         Machine     : amd64
> > Description:
>         Certain BRE expressions fail/misbehave unexpectedly.
>         The failures are the same in both grep and sed (without -E).
>         The failures occur only with certain combinations of the
>         \{\}, \(\), and \n (where n is a digit) syntax; dropping any
>         one of those generally fails to trigger the bug.
>         The bug/error can be seen most clearly in unexpected
>         behavior of the \{m,n\} portion in the given context.
>         If more of the (apparently dependent) context is removed,
>         the bug doesn't show up.  E.g. some of the clearest cases
>         involve replacing * with \{0,\} in the BRE, and getting
>         quite unexpected results (one would expect the results
>         to be the same).  These same BREs work under both
>         Solaris 11 and GNU/Linux with their sed and grep.
> > How-To-Repeat:
>         This example code illustrates the problem. It shows both
>         cases where the bug appears and slightly different contexts
>         where it does not occur.
>         In each of these cases, the output should be the STRING
>         we set/echo into grep/sed where we use our BRE, but in the bug
>         cases we get no output.
>         Test cases should also be added to the code to catch
>         possible regressions, should the issue recur.  :-)
>         Example code to show where bug does (and doesn't) show up:
>         (
>                 exec 2>&1
>                 set -- \
>                         'YYxx' 'Y*\(x\)\1' \
>                         'YYxx' 'Y\{0,\}\(x\)\1' \
>                         'YYxx' 'Y\{2,\}\(x\)\1' \
>                         'YYxx' 'Y\{0,\}\(x\)' \
>                         'YYxx' 'Y\{2,\}x' \
>                         'YYxx' 'Y\{2,\}x\{1,\}' \
>                         'YYxx' 'Y\{2,\}x\{0,\}' \
>                         'YYxxz' 'Y\{2,\}x\{0,\}z' \
>                         'YYxxz' 'Y\{0,\}x\{0,\}z' \
>                         'YYxyxy' 'Y\{2,\}\(xy\)\1' \
>                         'YYxyxy' 'Y\{0,\}\(xy\)\1' \
>                         'YYxyxy' 'Y*\(xy\)\1' \
>                         'YYxyxy' 'Y\{0,\}\(xy\)xy'
>                 while [ "$#" -ge 2 ]
>                 do
>                         STRING="$1"; shift; BRE="$1"; shift
>                         set -x
>                         echo "$STRING" | grep -e "$BRE"
>                         echo "$STRING" | sed -ne "s/$BRE/&/p"
>                         set +x
>                 done
>         )
>         Example run of above code.  Bug is present where our
>         STRING echoed into grep/sed fails to appear in the
>         output:
>         + echo YYxx
>         + grep -e Y*\(x\)\1
>         YYxx
>         + echo YYxx
>         + sed -ne s/Y*\(x\)\1/&/p
>         YYxx
>         + set +x
>         + echo YYxx
>         + grep -e Y\{0,\}\(x\)\1
>         + echo YYxx
>         + sed -ne s/Y\{0,\}\(x\)\1/&/p
>         + set +x
>         + echo YYxx
>         + grep -e Y\{2,\}\(x\)\1
>         YYxx
>         + echo YYxx
>         + sed -ne s/Y\{2,\}\(x\)\1/&/p
>         YYxx
>         + set +x
>         + echo YYxx
>         + grep -e Y\{0,\}\(x\)
>         YYxx
>         + echo YYxx
>         + sed -ne s/Y\{0,\}\(x\)/&/p
>         YYxx
>         + set +x
>         + echo YYxx
>         + grep -e Y\{2,\}x
>         YYxx
>         + echo YYxx
>         + sed -ne s/Y\{2,\}x/&/p
>         YYxx
>         + set +x
>         + echo YYxx
>         + grep -e Y\{2,\}x\{1,\}
>         YYxx
>         + echo YYxx
>         + sed -ne s/Y\{2,\}x\{1,\}/&/p
>         YYxx
>         + set +x
>         + echo YYxx
>         + grep -e Y\{2,\}x\{0,\}
>         YYxx
>         + echo YYxx
>         + sed -ne s/Y\{2,\}x\{0,\}/&/p
>         YYxx
>         + set +x
>         + echo YYxxz
>         + grep -e Y\{2,\}x\{0,\}z
>         YYxxz
>         + echo YYxxz
>         + sed -ne s/Y\{2,\}x\{0,\}z/&/p
>         YYxxz
>         + set +x
>         + echo YYxxz
>         + grep -e Y\{0,\}x\{0,\}z
>         YYxxz
>         + echo YYxxz
>         + sed -ne s/Y\{0,\}x\{0,\}z/&/p
>         YYxxz
>         + set +x
>         + echo YYxyxy
>         + grep -e Y\{2,\}\(xy\)\1
>         YYxyxy
>         + echo YYxyxy
>         + sed -ne s/Y\{2,\}\(xy\)\1/&/p
>         YYxyxy
>         + set +x
>         + echo YYxyxy
>         + grep -e Y\{0,\}\(xy\)\1
>         + echo YYxyxy
>         + sed -ne s/Y\{0,\}\(xy\)\1/&/p
>         + set +x
>         + echo YYxyxy
>         + grep -e Y*\(xy\)\1
>         YYxyxy
>         + echo YYxyxy
>         + sed -ne s/Y*\(xy\)\1/&/p
>         YYxyxy
>         + set +x
>         + echo YYxyxy
>         + grep -e Y\{0,\}\(xy\)xy
>         YYxyxy
>         + echo YYxyxy
>         + sed -ne s/Y\{0,\}\(xy\)xy/&/p
>         YYxyxy
>         + set +x
> > Fix:
>         No known general work-around
> 
> 

Hi,

I can reproduce on current. Do you know whether NetBSD or FreeBSD
suffer from the same?
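
For checking another system quickly, something like the following might
do. It is only a sketch: it assumes the system's grep is first in PATH,
and check_pair is a made-up helper name, not part of the report. It feeds
the same string to the '*' form and the '\{0,\}' form of a BRE and
reports whether both select the same lines, which they should.

```shell
#!/bin/sh
# check_pair (hypothetical helper): compare a BRE written with '*'
# against the same BRE written with '\{0,\}'.  Both forms should select
# the same lines; a mismatch points at the bug described in the report.
check_pair() {
	string=$1
	star=$2
	interval=$3
	# grep exits non-zero when nothing matches; we only compare output.
	a=$(printf '%s\n' "$string" | grep -e "$star")
	b=$(printf '%s\n' "$string" | grep -e "$interval")
	if [ "$a" = "$b" ]; then
		printf '%s\n' "ok: $interval"
	else
		printf '%s\n' "MISMATCH: $interval"
		return 1
	fi
}

# The two failing pairs from the report.
check_pair 'YYxx'   'Y*\(x\)\1'  'Y\{0,\}\(x\)\1'
check_pair 'YYxyxy' 'Y*\(xy\)\1' 'Y\{0,\}\(xy\)\1'
```

A line starting with MISMATCH would mean the two forms disagree on that
system. The same comparison could be repeated with sed -n 's/BRE/&/p'
in place of grep.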

        -Otto
