On Wed, 11 May 2022 at 09:15, Reuben Thomas <r...@sc3d.org> wrote:

>
> I'm happy to prepare a patch in this case. I would simply remove all
> mention of syntax tables, as that functionality is no longer available.
>

Attached. Here's the commit message to explain what I've done:

    Remove mention of both Emacs and non-Emacs syntax tables, as these are no
    longer supported by the code; instead, fixed character classes are used.
    Document the word character class (alnum + _).

    Replace mentions of #defining emacs with RE_NO_GNU_OPS (which takes effect
    in the opposite sense); merge the node “GNU Emacs Operators” into “GNU
    Operators”.

    For \` and \', refer to the “whole string” rather than the (Emacs) “buffer”.

    Leave a TODO to document the classes that can be used with \s and \S. (This
    was not previously documented, and is best left to another commit.)
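
For illustration only, the word character class described above (alnum + _)
can be sketched in C roughly as follows. The helper names
is_word_constituent and at_word_boundary are hypothetical; they are not part
of the patch or of Regex, and the sketch assumes single-byte characters, so
it says nothing about how the real code handles multibyte locales.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of Regex): true if C is word-constituent
   under the documented rule, i.e. it is in the POSIX class alnum in the
   current locale, or it is an underscore.  */
bool
is_word_constituent (unsigned char c)
{
  return isalnum (c) || c == '_';
}

/* Hypothetical helper: true if position I in the LEN-byte string S lies
   between a word-constituent character and a character that is not one
   (or an end of the string).  This is an assumed reading of the kind of
   position that \b is documented to match.  */
bool
at_word_boundary (const char *s, size_t len, size_t i)
{
  bool before = i > 0 && is_word_constituent ((unsigned char) s[i - 1]);
  bool after = i < len && is_word_constituent ((unsigned char) s[i]);
  return before != after;
}
```

Under these assumptions, in the string "foo bar" positions 0, 3, 4 and 7 are
word boundaries, and an underscore counts as part of a word just as a letter
or digit does.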

-- 
https://rrt.sc3d.org
From 72bdacccbd3e6cc3eb6e16549cf51ea9e7321ae2 Mon Sep 17 00:00:00 2001
From: Reuben Thomas <r...@sc3d.org>
Date: Wed, 11 May 2022 11:47:00 +0100
Subject: [PATCH] doc/regex.texi: remove Emacs-specific documentation; match
 code
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Remove mention of both Emacs and non-Emacs syntax tables, as these are no
longer supported by the code; instead, fixed character classes are used.
Document the word character class (alnum + _).

Replace mentions of #defining emacs with RE_NO_GNU_OPS (which takes effect
in the opposite sense); merge the node “GNU Emacs Operators” into “GNU
Operators”.

For \` and \', refer to the “whole string” rather than the (Emacs) “buffer”.

Leave a TODO to document the classes that can be used with \s and \S. (This
was not previously documented, and is best left to another commit.)
---
 doc/regex.texi | 113 +++++++++++++------------------------------------
 1 file changed, 30 insertions(+), 83 deletions(-)

diff --git a/doc/regex.texi b/doc/regex.texi
index d21052282d..7015c8a651 100644
--- a/doc/regex.texi
+++ b/doc/regex.texi
@@ -108,8 +108,8 @@ Compiling}, for more information on compiling.
 Regex considers the current syntax to be a collection of bits; we refer
 to these bits as @dfn{syntax bits}.  In most cases, they affect what
 characters represent what operators.  We describe the meanings of the
-operators to which we refer in @ref{Common Operators}, @ref{GNU
-Operators}, and @ref{GNU Emacs Operators}.
+operators to which we refer in @ref{Common Operators} and @ref{GNU
+Operators}.
 
 For reference, here is the complete list of syntax bits, in alphabetical
 order:
@@ -467,15 +467,15 @@ cases @code{RE_BK_PLUS_QM}, @code{RE_NO_BK_BRACES}, @code{RE_NO_BK_VAR},
 (@pxref{Match-non-word-constituent Operator}).
 
 @item
-@samp{\`} represents the match-beginning-of-buffer
-operator and @samp{\'} represents the match-end-of-buffer operator
-(@pxref{Buffer Operators}).
+@samp{\`} represents the match-beginning-of-string
+operator and @samp{\'} represents the match-end-of-string operator
+(@pxref{Whole-string Operators}).
 
 @item
-If Regex was compiled with the C preprocessor symbol @code{emacs}
-defined, then @samp{\s@var{class}} represents the match-syntactic-class
-operator and @samp{\S@var{class}} represents the
-match-not-syntactic-class operator (@pxref{Syntactic Class Operators}).
+@samp{\s@var{class}} represents the match-syntactic-class operator and
+@samp{\S@var{class}} represents the match-not-syntactic-class operator
+(@pxref{Syntactic Class Operators}), unless the syntax bit
+@code{RE_NO_GNU_OPS} is set.
 
 @end itemize
 
@@ -1243,22 +1243,24 @@ exactly the dual of @samp{^}'s; see the previous section.  (That is,
 @node GNU Operators
 @chapter GNU Operators
 
-Following are operators that GNU defines (and POSIX doesn't).
+Following are operators that GNU defines (and POSIX doesn't) that you
+can use unless the syntax bit @code{RE_NO_GNU_OPS} is set.
 
 @menu
 * Word Operators::
-* Buffer Operators::
+* Whole-string Operators::
 @end menu
 
 @node Word Operators
 @section Word Operators
 
 The operators in this section require Regex to recognize parts of words.
-Regex uses a syntax table to determine whether or not a character is
-part of a word, i.e., whether or not it is @dfn{word-constituent}.
+Characters that are part of words, which are called
+@dfn{word-constituent}, are letters, digits, and the underscore
+(@samp{_}); more precisely, any character in the POSIX class
+@code{alnum} in the current locale, or underscore.
 
 @menu
-* Non-Emacs Syntax Tables::
 * Match-word-boundary Operator::        \b
 * Match-within-word Operator::          \B
 * Match-beginning-of-word Operator::    \<
@@ -1267,34 +1269,6 @@ part of a word, i.e., whether or not it is @dfn{word-constituent}.
 * Match-non-word-constituent Operator:: \W
 @end menu
 
-@node Non-Emacs Syntax Tables
-@subsection Non-Emacs Syntax Tables
-
-A @dfn{syntax table} is an array indexed by the characters in your
-character set.  In the ASCII encoding, therefore, a syntax table
-has 256 elements.  Regex always uses a @code{char *} variable
-@code{re_syntax_table} as its syntax table.  In some cases, it
-initializes this variable and in others it expects you to initialize it.
-
-@itemize @bullet
-@item
-If Regex is compiled with the preprocessor symbols @code{emacs} and
-@code{SYNTAX_TABLE} both undefined, then Regex allocates
-@code{re_syntax_table} and initializes an element @var{i} either to
-@code{Sword} (which it defines) if @var{i} is a letter, number, or
-@samp{_}, or to zero if it's not.
-
-@item
-If Regex is compiled with @code{emacs} undefined but @code{SYNTAX_TABLE}
-defined, then Regex expects you to define a @code{char *} variable
-@code{re_syntax_table} to be a valid syntax table.
-
-@item
-@xref{Emacs Syntax Tables}, for what happens when Regex is compiled with
-the preprocessor symbol @code{emacs} defined.
-
-@end itemize
-
 @node Match-word-boundary Operator
 @subsection The Match-word-boundary Operator (@code{\b})
 
@@ -1347,74 +1321,47 @@ This operator (represented by @samp{\W}) matches any character that is
 not word-constituent.
 
 
-@node Buffer Operators
-@section Buffer Operators
+@node Whole-string Operators
+@section Whole-string Operators
 
-Following are operators which work on buffers.  In Emacs, a @dfn{buffer}
-is, naturally, an Emacs buffer.  For other programs, Regex considers the
-entire string to be matched as the buffer.
+Following are operators which work on the whole string.
 
 @menu
-* Match-beginning-of-buffer Operator::  \`
-* Match-end-of-buffer Operator::        \'
+* Match-beginning-of-string Operator::  \`
+* Match-end-of-string Operator::        \'
+* Syntactic Class Operators::
 @end menu
 
 
-@node Match-beginning-of-buffer Operator
-@subsection The Match-beginning-of-buffer Operator (@code{\`})
+@node Match-beginning-of-string Operator
+@subsection The Match-beginning-of-string Operator (@code{\`})
 
 @cindex @samp{\`}
 
 This operator (represented by @samp{\`}) matches the empty string at the
-beginning of the buffer.
+beginning of the string.
 
-@node Match-end-of-buffer Operator
-@subsection The Match-end-of-buffer Operator (@code{\'})
+@node Match-end-of-string Operator
+@subsection The Match-end-of-string Operator (@code{\'})
 
 @cindex @samp{\'}
 
 This operator (represented by @samp{\'}) matches the empty string at the
-end of the buffer.
-
-
-@node GNU Emacs Operators
-@chapter GNU Emacs Operators
-
-Following are operators that GNU defines (and POSIX doesn't)
-that you can use only when Regex is compiled with the preprocessor
-symbol @code{emacs} defined.
-
-@menu
-* Syntactic Class Operators::
-@end menu
+end of the string.
 
 
 @node Syntactic Class Operators
 @section Syntactic Class Operators
 
 The operators in this section require Regex to recognize the syntactic
-classes of characters.  Regex uses a syntax table to determine this.
+classes of characters.
+@c TODO: What are the valid classes?
 
 @menu
-* Emacs Syntax Tables::
 * Match-syntactic-class Operator::      \sCLASS
 * Match-not-syntactic-class Operator::  \SCLASS
 @end menu
 
-@node Emacs Syntax Tables
-@subsection Emacs Syntax Tables
-
-A @dfn{syntax table} is an array indexed by the characters in your
-character set.  In the ASCII encoding, therefore, a syntax table
-has 256 elements.
-
-If Regex is compiled with the preprocessor symbol @code{emacs} defined,
-then Regex expects you to define and initialize the variable
-@code{re_syntax_table} to be an Emacs syntax table.  Emacs' syntax
-tables are more complicated than Regex's own (@pxref{Non-Emacs Syntax
-Tables}).  @xref{Syntax, , Syntax, emacs, The GNU Emacs User's Manual},
-for a description of Emacs' syntax tables.
-
 @node Match-syntactic-class Operator
 @subsection The Match-syntactic-class Operator (@code{\s}@var{class})
 
-- 
2.25.1