On Fri, Apr 11, 2025 at 04:52:59PM +0300, Vladimir Gorsunov wrote:
>   When GNU Emacs switched to using gnulib for regular expression
>   functionality in the etags program, some features stopped working
>   (please see https://debbugs.gnu.org/cgi/bugreport.cgi?bug=76945 for
>   details). That is because the RE_SYNTAX_EMACS flag combination in
>   gnulib doesn't have the corresponding flags set. This value should
>   be updated to fix etags and to better reflect the set of features
>   GNU Emacs is using at the moment.

> From 76f937ae2eacb3649117e7f4c05819e82a7c42a9 Mon Sep 17 00:00:00 2001
> From: vg <v...@glums.kodeks.ru>
> Date: Fri, 11 Apr 2025 16:28:29 +0300
> Subject: [PATCH] Update RE_SYNTAX_EMACS to include features used by GNU Emacs
> 
> * lib/regex.h: macro update
> * doc/regex.texi: documentation update
> ---
>  doc/regex.texi | 3 ++-
>  lib/regex.h    | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/doc/regex.texi b/doc/regex.texi
> index cba1e13520..9917a418be 100644
> --- a/doc/regex.texi
> +++ b/doc/regex.texi
> @@ -316,7 +316,8 @@ regular expressions.
>  The predefined syntaxes---taken directly from @file{regex.h}---are:
>  
>  @smallexample
> -#define RE_SYNTAX_EMACS 0
> +# define RE_SYNTAX_EMACS                                             \
> +  (RE_CHAR_CLASSES | RE_INTERVALS)

Hmm.  GNU m4 1.4.19 documents that its regex engine matches Emacs,
but that holds only because m4 uses syntax 0.  If this change is made
in gnulib, then either the m4 manual needs to be patched to state that
its syntax is similar to Emacs except for lacking character classes
and intervals, or we make a non-backwards-compatible change in m4 by
actually using RE_SYNTAX_EMACS instead of 0 for the default syntax.

Since there's already another long thread on how m4 does not match
current Emacs regex, and on why enabling intervals would break at
least autoconf 2.72, I'm inclined to update the m4 manual rather than
use RE_SYNTAX_EMACS, whether or not this patch is accepted.
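
To make the compatibility concern concrete, here is a minimal sketch
of how the same pattern can change meaning between syntax 0 and the
proposed flag set.  It assumes the GNU regex API as exposed by glibc
(re_set_syntax, re_compile_pattern, re_match), which needs
_GNU_SOURCE defined before any header is included; full_match is a
hypothetical helper of mine, not something from gnulib or m4.

```c
#define _GNU_SOURCE   /* glibc only declares the GNU regex API with this */
#include <regex.h>
#include <string.h>

/* Hypothetical helper (not part of gnulib or m4): compile PATTERN under
   SYNTAX and report whether it matches the whole of TEXT.
   Returns 1 on a full match, 0 otherwise, -1 if compilation fails.  */
static int
full_match (reg_syntax_t syntax, const char *pattern, const char *text)
{
  struct re_pattern_buffer buf;
  memset (&buf, 0, sizeof buf);   /* let the library allocate everything */

  /* The syntax is global state; set it for the compile, then restore.  */
  reg_syntax_t saved = re_set_syntax (syntax);
  const char *err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (saved);
  if (err != NULL)
    return -1;

  regoff_t len = (regoff_t) strlen (text);
  regoff_t matched = re_match (&buf, text, len, 0, NULL);
  regfree (&buf);
  return matched == len;
}
```

With syntax 0, I would expect full_match (0, "a\\{2\\}", "a{2}") to
succeed, since \{ and \} are then ordinary characters, while
full_match (RE_CHAR_CLASSES | RE_INTERVALS, "a\\{2\\}", "aa") should
succeed because the same text is now an interval.  Any existing
pattern that relies on the first reading would silently change
meaning under the second.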

What's more, this patch is incomplete; if you change RE_SYNTAX_EMACS,
then you also need to change this paragraph:

/* The following bits are used to determine the regexp syntax we
   recognize.  The set/not-set meanings are chosen so that Emacs syntax
   remains the value 0.  The bits are given in alphabetical order, and
   the definitions shifted by one from the previous bit; thus, when we
   add or remove a bit, only one other definition need change.  */

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

