Hello-

May I please ping this one (now for GCC 15)? Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html

-Lewis

On Sat, Feb 10, 2024 at 9:02 AM Lewis Hyatt <lhy...@gmail.com> wrote:
>
> Hello-
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html
>
> May I please ping this one? Thanks!
>
> On Sat, Jan 13, 2024 at 5:12 PM Lewis Hyatt <lhy...@gmail.com> wrote:
> >
> > Hello-
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704
> >
> > The below patch fixes the issue noted in the PR that extended characters
> > cannot appear in the identifier passed to a #pragma push_macro or #pragma
> > pop_macro. Bootstrap + regtest all languages on x86-64 Linux. Is it OK for
> > GCC 14 please?
> >
> > I know we just entered stage 4, however I feel this is kinda like an old
> > regression, given that the issue was not apparent until support for UCNs and
> > UTF-8 in identifiers got added. FWIW, it would be nice if it makes it into
> > GCC 14, because AFAIK all other UTF-8-related bugs are fixed in this
> > release. (The other major one was for extended characters in a user-defined
> > literal, which was fixed by r14-2629.)
> >
> > Speaking of just entering stage 4, I do have 4 really short patches sent
> > over the past several months that never got any response. Is there any
> > chance someone may have a few minutes to look at them please? They are
> > really just like 1-3 line fixes for PRs.
> >
> > libcpp (pinged once recently):
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html
> >
> > diagnostics (pinged for 3rd time last week):
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html
> >
> > -- >8 --
> >
> > The implementation of #pragma push_macro and #pragma pop_macro has to date
> > made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
> > identifier out of a string. When support was added for extended characters
> > in identifiers ($, UCNs, or UTF-8), that support was added only for the
> > "normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
> > not for the ad-hoc way. Consequently, extended identifiers are not usable
> > with these pragmas.
> >
> > The logic for lexing identifiers has become more complicated than it was
> > when _cpp_lex_identifier() was written -- it now handles things like \N{}
> > escapes in C++, for instance -- and it no longer seems practical to maintain
> > a redundant code path for lexing identifiers. Address the issue by changing
> > the implementation of #pragma {push,pop}_macro to lex identifiers in the
> > expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
> > there.
> >
> > The existing implementation has some quirks because of the ad-hoc parsing
> > logic. For example:
> >
> >  #pragma push_macro("X ")
> >  ...
> >  #pragma pop_macro("X")
> >
> > will not restore macro X (note the extra space in the first string). 
> > However:
> >
> >  #pragma push_macro("X ")
> >  ...
> >  #pragma pop_macro("X ")
> >
> > actually does successfully restore "X". This is because the key for looking
> > up the saved macro on the push stack is the original string passed, so the
> > string passed to pop_macro needs to match it exactly. It is not that easy to
> > reproduce this logic in the world of extended characters, given that for
> > example it should be valid to pass a UCN to push_macro, and the
> > corresponding UTF-8 to pop_macro. Given that this aspect of the existing
> > behavior seems unintentional and has no tests (and does not match other
> > implementations), I opted to make the new logic more straightforward. The
> > string passed needs to lex to one token, which must be a valid identifier,
> > or else no action is taken and no error is generated. Any diagnostics
> > encountered during lexing (e.g., due to a UTF-8 character not permitted to
> > appear in an identifier) are also suppressed.
> >
> > It could be nice (for GCC 15) to also add a warning if a pop_macro does not
> > match a previous push_macro.
> >
> > libcpp/ChangeLog:
> >
> >         PR preprocessor/109704
> >         * include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
> >         * errors.cc
> >         (cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
> >         function.
> >         (cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
> >         function.
> >         * charset.cc (noop_diagnostic_cb): Remove.
> >         (cpp_interpret_string_ranges): Refactor diagnostic suppression logic
> >         into new class cpp_auto_suppress_diagnostics.
> >         (count_source_chars): Likewise.
> >         * directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
> >         (lex_identifier_from_string): New static helper function.
> >         (push_pop_macro_common): Refactor common logic from
> >         do_pragma_push_macro and do_pragma_pop_macro; use
> >         lex_identifier_from_string instead of _cpp_lex_identifier.
> >         (do_pragma_push_macro): Reimplement using push_pop_macro_common.
> >         (do_pragma_pop_macro): Likewise.
> >         * internal.h (_cpp_lex_identifier): Remove.
> >         * lex.cc (lex_identifier_intern): Remove.
> >         (_cpp_lex_identifier): Remove.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR preprocessor/109704
> >         * c-c++-common/cpp/pragma-push-pop-utf8.c: New test.
> >         * g++.dg/pch/pushpop-2.C: New test.
> >         * g++.dg/pch/pushpop-2.Hs: New test.
> >         * gcc.dg/pch/pushpop-2.c: New test.
> >         * gcc.dg/pch/pushpop-2.hs: New test.
> > ---
> >  libcpp/charset.cc                             |  33 +--
> >  libcpp/directives.cc                          | 175 +++++++--------
> >  libcpp/errors.cc                              |  16 ++
> >  libcpp/include/cpplib.h                       |  13 ++
> >  libcpp/internal.h                             |   1 -
> >  libcpp/lex.cc                                 |  33 ---
> >  .../c-c++-common/cpp/pragma-push-pop-utf8.c   | 203 ++++++++++++++++++
> >  gcc/testsuite/g++.dg/pch/pushpop-2.C          |  18 ++
> >  gcc/testsuite/g++.dg/pch/pushpop-2.Hs         |   9 +
> >  gcc/testsuite/gcc.dg/pch/pushpop-2.c          |  18 ++
> >  gcc/testsuite/gcc.dg/pch/pushpop-2.hs         |   9 +
> >  11 files changed, 378 insertions(+), 150 deletions(-)
> >  create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
> >  create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.C
> >  create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.Hs
> >  create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.c
> >  create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.hs
> >
> > diff --git a/libcpp/charset.cc b/libcpp/charset.cc
> > index 54d7b9e0932..7937df7d78c 100644
> > --- a/libcpp/charset.cc
> > +++ b/libcpp/charset.cc
> > @@ -2590,19 +2590,6 @@ cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t count,
> >    return cpp_interpret_string_1 (pfile, from, count, to, type, NULL, NULL);
> >  }
> >
> > -/* A "do nothing" diagnostic-handling callback for use by
> > -   cpp_interpret_string_ranges, so that it can temporarily suppress
> > -   diagnostic-handling.  */
> > -
> > -static bool
> > -noop_diagnostic_cb (cpp_reader *, enum cpp_diagnostic_level,
> > -                   enum cpp_warning_reason, rich_location *,
> > -                   const char *, va_list *)
> > -{
> > -  /* no-op.  */
> > -  return true;
> > -}
> > -
> >  /* This function mimics the behavior of cpp_interpret_string, but
> >     rather than generating a string in the execution character set,
> >     *OUT is written to with the source code ranges of the characters
> > @@ -2642,20 +2629,10 @@ cpp_interpret_string_ranges (cpp_reader *pfile, const cpp_string *from,
> >       failing, rather than being emitted as a user-visible diagnostic.
> >       If an diagnostic does occur, we should see it via the return value of
> >       cpp_interpret_string_1.  */
> > -  bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level,
> > -                                   enum cpp_warning_reason, rich_location *,
> > -                                   const char *, va_list *)
> > -    ATTRIBUTE_FPTR_PRINTF(5,0);
> > -
> > -  saved_diagnostic_handler = pfile->cb.diagnostic;
> > -  pfile->cb.diagnostic = noop_diagnostic_cb;
> > -
> > +  cpp_auto_suppress_diagnostics suppress {pfile};
> >    bool result = cpp_interpret_string_1 (pfile, from, count, NULL, type,
> >                                         loc_readers, out);
> >
> > -  /* Restore the saved diagnostic-handler.  */
> > -  pfile->cb.diagnostic = saved_diagnostic_handler;
> > -
> >    if (!result)
> >      return "cpp_interpret_string_1 failed";
> >
> > @@ -2691,17 +2668,11 @@ static unsigned
> >  count_source_chars (cpp_reader *pfile, cpp_string str, cpp_ttype type)
> >  {
> >    cpp_string str2 = { 0, 0 };
> > -  bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level,
> > -                                   enum cpp_warning_reason, rich_location *,
> > -                                   const char *, va_list *)
> > -    ATTRIBUTE_FPTR_PRINTF(5,0);
> > -  saved_diagnostic_handler = pfile->cb.diagnostic;
> > -  pfile->cb.diagnostic = noop_diagnostic_cb;
> > +  cpp_auto_suppress_diagnostics suppress {pfile};
> >    convert_f save_func = pfile->narrow_cset_desc.func;
> >    pfile->narrow_cset_desc.func = convert_count_chars;
> >    bool ret = cpp_interpret_string (pfile, &str, 1, &str2, type);
> >    pfile->narrow_cset_desc.func = save_func;
> > -  pfile->cb.diagnostic = saved_diagnostic_handler;
> >    if (ret)
> >      {
> >        if (str2.text != str.text)
> > diff --git a/libcpp/directives.cc b/libcpp/directives.cc
> > index 479f8c716e8..019e4009dc9 100644
> > --- a/libcpp/directives.cc
> > +++ b/libcpp/directives.cc
> > @@ -137,7 +137,8 @@ static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *);
> >  static void handle_assertion (cpp_reader *, const char *, int);
> >  static void do_pragma_push_macro (cpp_reader *);
> >  static void do_pragma_pop_macro (cpp_reader *);
> > -static void cpp_pop_definition (cpp_reader *, struct def_pragma_macro *);
> > +static void cpp_pop_definition (cpp_reader *, def_pragma_macro *,
> > +                               cpp_hashnode *);
> >
> >  /* This is the table of directive handlers.  All extensions other than
> >     #warning, #include_next, and #import are deprecated.  The name is
> > @@ -1595,55 +1596,95 @@ do_pragma_once (cpp_reader *pfile)
> >    _cpp_mark_file_once_only (pfile, pfile->buffer->file);
> >  }
> >
> > -/* Handle #pragma push_macro(STRING).  */
> > -static void
> > -do_pragma_push_macro (cpp_reader *pfile)
> > +/* Helper for #pragma {push,pop}_macro.  Destringize STR and
> > +   lex it into an identifier, returning the hash node for it.  */
> > +
> > +static cpp_hashnode *
> > +lex_identifier_from_string (cpp_reader *pfile, cpp_string str)
> >  {
> > +  auto src = (const uchar *) memchr (str.text, '"', str.len);
> > +  gcc_checking_assert (src);
> > +  ++src;
> > +  const auto limit = str.text + str.len - 1;
> > +  gcc_checking_assert (*limit == '"' && limit >= src);
> > +  const auto ident = XALLOCAVEC (uchar, limit - src + 1);
> > +  auto dest = ident;
> > +  while (src != limit)
> > +    {
> > +      /* We know there is a character following the backslash.  */
> > +      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
> > +       src++;
> > +      *dest++ = *src++;
> > +    }
> > +
> > +  /* We reserved a spot for the newline with the + 1 when allocating IDENT.
> > +     Push a buffer containing the identifier to lex.  */
> > +  *dest = '\n';
> > +  cpp_push_buffer (pfile, ident, dest - ident, true);
> > +  _cpp_clean_line (pfile);
> > +  pfile->cur_token = _cpp_temp_token (pfile);
> > +  cpp_token *tok;
> > +  {
> > +    /* Suppress diagnostics during lexing so that we silently ignore invalid
> > +       input, as seems to be the common practice for this pragma.  */
> > +    cpp_auto_suppress_diagnostics suppress {pfile};
> > +    tok = _cpp_lex_direct (pfile);
> > +  }
> > +
> >    cpp_hashnode *node;
> > -  size_t defnlen;
> > -  const uchar *defn = NULL;
> > -  char *macroname, *dest;
> > -  const char *limit, *src;
> > -  const cpp_token *txt;
> > -  struct def_pragma_macro *c;
> > +  if (tok->type != CPP_NAME || pfile->buffer->cur != pfile->buffer->rlimit)
> > +    node = nullptr;
> > +  else
> > +    node = tok->val.node.node;
> >
> > -  txt = get__Pragma_string (pfile);
> > -  if (!txt)
> > +  _cpp_pop_buffer (pfile);
> > +  return node;
> > +}
> > +
> > +/* Common processing for #pragma {push,pop}_macro.  */
> > +
> > +static cpp_hashnode *
> > +push_pop_macro_common (cpp_reader *pfile, const char *type)
> > +{
> > +  const cpp_token *const txt = get__Pragma_string (pfile);
> > +  ++pfile->keep_tokens;
> > +  cpp_hashnode *node;
> > +  if (txt)
> >      {
> > -      location_t src_loc = pfile->cur_token[-1].src_loc;
> > -      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
> > -                "invalid #pragma push_macro directive");
> >        check_eol (pfile, false);
> >        skip_rest_of_line (pfile);
> > -      return;
> > +      node = lex_identifier_from_string (pfile, txt->val.str);
> >      }
> > -  dest = macroname = (char *) alloca (txt->val.str.len + 2);
> > -  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
> > -  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
> > -  while (src < limit)
> > +  else
> >      {
> > -      /* We know there is a character following the backslash.  */
> > -      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
> > -       src++;
> > -      *dest++ = *src++;
> > +      node = nullptr;
> > +      location_t src_loc = pfile->cur_token[-1].src_loc;
> > +      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
> > +                          "invalid #pragma %s_macro directive", type);
> > +      skip_rest_of_line (pfile);
> >      }
> > -  *dest = 0;
> > -  check_eol (pfile, false);
> > -  skip_rest_of_line (pfile);
> > -  c = XNEW (struct def_pragma_macro);
> > -  memset (c, 0, sizeof (struct def_pragma_macro));
> > -  c->name = XNEWVAR (char, strlen (macroname) + 1);
> > -  strcpy (c->name, macroname);
> > +  --pfile->keep_tokens;
> > +  return node;
> > +}
> > +
> > +/* Handle #pragma push_macro(STRING).  */
> > +static void
> > +do_pragma_push_macro (cpp_reader *pfile)
> > +{
> > +  const auto node = push_pop_macro_common (pfile, "push");
> > +  if (!node)
> > +    return;
> > +  const auto c = XCNEW (def_pragma_macro);
> > +  c->name = xstrdup ((const char *) NODE_NAME (node));
> >    c->next = pfile->pushed_macros;
> > -  node = _cpp_lex_identifier (pfile, c->name);
> >    if (node->type == NT_VOID)
> >      c->is_undef = 1;
> >    else if (node->type == NT_BUILTIN_MACRO)
> >      c->is_builtin = 1;
> >    else
> >      {
> > -      defn = cpp_macro_definition (pfile, node);
> > -      defnlen = ustrlen (defn);
> > +      const auto defn = cpp_macro_definition (pfile, node);
> > +      const size_t defnlen = ustrlen (defn);
> >        c->definition = XNEWVEC (uchar, defnlen + 2);
> >        c->definition[defnlen] = '\n';
> >        c->definition[defnlen + 1] = 0;
> > @@ -1660,50 +1701,24 @@ do_pragma_push_macro (cpp_reader *pfile)
> >  static void
> >  do_pragma_pop_macro (cpp_reader *pfile)
> >  {
> > -  char *macroname, *dest;
> > -  const char *limit, *src;
> > -  const cpp_token *txt;
> > -  struct def_pragma_macro *l = NULL, *c = pfile->pushed_macros;
> > -  txt = get__Pragma_string (pfile);
> > -  if (!txt)
> > -    {
> > -      location_t src_loc = pfile->cur_token[-1].src_loc;
> > -      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
> > -                "invalid #pragma pop_macro directive");
> > -      check_eol (pfile, false);
> > -      skip_rest_of_line (pfile);
> > -      return;
> > -    }
> > -  dest = macroname = (char *) alloca (txt->val.str.len + 2);
> > -  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
> > -  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
> > -  while (src < limit)
> > -    {
> > -      /* We know there is a character following the backslash.  */
> > -      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
> > -       src++;
> > -      *dest++ = *src++;
> > -    }
> > -  *dest = 0;
> > -  check_eol (pfile, false);
> > -  skip_rest_of_line (pfile);
> > -
> > -  while (c != NULL)
> > +  const auto node = push_pop_macro_common (pfile, "pop");
> > +  if (!node)
> > +    return;
> > +  for (def_pragma_macro *c = pfile->pushed_macros, *l = nullptr; c; c = c->next)
> >      {
> > -      if (!strcmp (c->name, macroname))
> > +      if (!strcmp (c->name, (const char *) NODE_NAME (node)))
> >         {
> >           if (!l)
> >             pfile->pushed_macros = c->next;
> >           else
> >             l->next = c->next;
> > -         cpp_pop_definition (pfile, c);
> > +         cpp_pop_definition (pfile, c, node);
> >           free (c->definition);
> >           free (c->name);
> >           free (c);
> >           break;
> >         }
> >        l = c;
> > -      c = c->next;
> >      }
> >  }
> >
> > @@ -2607,12 +2622,8 @@ cpp_undef (cpp_reader *pfile, const char *macro)
> >  /* Replace a previous definition DEF of the macro STR.  If DEF is NULL,
> >     or first element is zero, then the macro should be undefined.  */
> >  static void
> > -cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
> > +cpp_pop_definition (cpp_reader *pfile, def_pragma_macro *c, cpp_hashnode *node)
> >  {
> > -  cpp_hashnode *node = _cpp_lex_identifier (pfile, c->name);
> > -  if (node == NULL)
> > -    return;
> > -
> >    if (pfile->cb.before_define)
> >      pfile->cb.before_define (pfile);
> >
> > @@ -2634,29 +2645,23 @@ cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
> >      }
> >
> >    {
> > -    size_t namelen;
> > -    const uchar *dn;
> > -    cpp_hashnode *h = NULL;
> > -    cpp_buffer *nbuf;
> > -
> > -    namelen = ustrcspn (c->definition, "( \n");
> > -    h = cpp_lookup (pfile, c->definition, namelen);
> > -    dn = c->definition + namelen;
> > -
> > -    nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, true);
> > +    const auto namelen = ustrcspn (c->definition, "( \n");
> > +    const auto dn = c->definition + namelen;
> > +    const auto nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn,
> > +                                      true);
> >      if (nbuf != NULL)
> >        {
> >         _cpp_clean_line (pfile);
> >         nbuf->sysp = 1;
> > -       if (!_cpp_create_definition (pfile, h, 0))
> > +       if (!_cpp_create_definition (pfile, node, 0))
> >           abort ();
> >         _cpp_pop_buffer (pfile);
> >        }
> >      else
> >        abort ();
> > -    h->value.macro->line = c->line;
> > -    h->value.macro->syshdr = c->syshdr;
> > -    h->value.macro->used = c->used;
> > +    node->value.macro->line = c->line;
> > +    node->value.macro->syshdr = c->syshdr;
> > +    node->value.macro->used = c->used;
> >    }
> >  }
> >
> > diff --git a/libcpp/errors.cc b/libcpp/errors.cc
> > index 295496df7ed..3228dcbe7f6 100644
> > --- a/libcpp/errors.cc
> > +++ b/libcpp/errors.cc
> > @@ -350,3 +350,19 @@ cpp_errno_filename (cpp_reader *pfile, enum cpp_diagnostic_level level,
> >    return cpp_error_at (pfile, level, loc, "%s: %s", filename,
> >                        xstrerror (errno));
> >  }
> > +
> > +cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics (cpp_reader *pfile)
> > +  : m_pfile (pfile), m_cb (pfile->cb.diagnostic)
> > +{
> > +  m_pfile->cb.diagnostic
> > +    = [] (cpp_reader *, cpp_diagnostic_level, cpp_warning_reason,
> > +         rich_location *, const char *, va_list *)
> > +    {
> > +      return true;
> > +    };
> > +}
> > +
> > +cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics ()
> > +{
> > +  m_pfile->cb.diagnostic = m_cb;
> > +}
> > diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
> > index 5746aac9ea4..50705e3377a 100644
> > --- a/libcpp/include/cpplib.h
> > +++ b/libcpp/include/cpplib.h
> > @@ -1638,4 +1638,17 @@ enum cpp_xid_property {
> >
> >  unsigned int cpp_check_xid_property (cppchar_t c);
> >
> > +/* In errors.cc */
> > +
> > +/* RAII class to suppress CPP diagnostics in the current scope.  */
> > +class cpp_auto_suppress_diagnostics
> > +{
> > + public:
> > +  explicit cpp_auto_suppress_diagnostics (cpp_reader *pfile);
> > +  ~cpp_auto_suppress_diagnostics ();
> > + private:
> > +  cpp_reader *const m_pfile;
> > +  const decltype (cpp_callbacks::diagnostic) m_cb;
> > +};
> > +
> >  #endif /* ! LIBCPP_CPPLIB_H */
> > diff --git a/libcpp/internal.h b/libcpp/internal.h
> > index a20215c5709..6221ef0d1e7 100644
> > --- a/libcpp/internal.h
> > +++ b/libcpp/internal.h
> > @@ -753,7 +753,6 @@ extern cpp_token *_cpp_lex_direct (cpp_reader *);
> >  extern unsigned char *_cpp_spell_ident_ucns (unsigned char *, cpp_hashnode *);
> >  extern int _cpp_equiv_tokens (const cpp_token *, const cpp_token *);
> >  extern void _cpp_init_tokenrun (tokenrun *, unsigned int);
> > -extern cpp_hashnode *_cpp_lex_identifier (cpp_reader *, const char *);
> >  extern int _cpp_remaining_tokens_num_in_context (cpp_context *);
> >  extern void _cpp_init_lexer (void);
> >  static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have,
> > diff --git a/libcpp/lex.cc b/libcpp/lex.cc
> > index 5aa379980cf..ba97377417b 100644
> > --- a/libcpp/lex.cc
> > +++ b/libcpp/lex.cc
> > @@ -2204,39 +2204,6 @@ identifier_diagnostics_on_lex (cpp_reader *pfile, cpp_hashnode *node)
> >                  NODE_NAME (node));
> >  }
> >
> > -/* Helper function to get the cpp_hashnode of the identifier BASE.  */
> > -static cpp_hashnode *
> > -lex_identifier_intern (cpp_reader *pfile, const uchar *base)
> > -{
> > -  cpp_hashnode *result;
> > -  const uchar *cur;
> > -  unsigned int len;
> > -  unsigned int hash = HT_HASHSTEP (0, *base);
> > -
> > -  cur = base + 1;
> > -  while (ISIDNUM (*cur))
> > -    {
> > -      hash = HT_HASHSTEP (hash, *cur);
> > -      cur++;
> > -    }
> > -  len = cur - base;
> > -  hash = HT_HASHFINISH (hash, len);
> > -  result = CPP_HASHNODE (ht_lookup_with_hash (pfile->hash_table,
> > -                                             base, len, hash, HT_ALLOC));
> > -  identifier_diagnostics_on_lex (pfile, result);
> > -  return result;
> > -}
> > -
> > -/* Get the cpp_hashnode of an identifier specified by NAME in
> > -   the current cpp_reader object.  If none is found, NULL is returned.  */
> > -cpp_hashnode *
> > -_cpp_lex_identifier (cpp_reader *pfile, const char *name)
> > -{
> > -  cpp_hashnode *result;
> > -  result = lex_identifier_intern (pfile, (uchar *) name);
> > -  return result;
> > -}
> > -
> >  /* Lex an identifier starting at BASE.  BUFFER->CUR is expected to point
> >     one past the first character at BASE, which may be a (possibly multi-byte)
> >     character if STARTS_UCN is true.  */
> > diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
> > new file mode 100644
> > index 00000000000..c8665960e30
> > --- /dev/null
> > +++ b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
> > @@ -0,0 +1,203 @@
> > +/* { dg-do preprocess } */
> > +/* { dg-options "-std=c11 -pedantic" { target c } } */
> > +/* { dg-options "-std=c++11 -pedantic" { target c++ } } */
> > +/* { dg-additional-options "-Wall" } */
> > +
> > +/* PR preprocessor/109704 */
> > +
> > +/* Verify basic operations for different extended identifiers...  */
> > +
> > +/* ...dollar sign.  */
> > +#define $x 1
> > +#pragma push_macro("$x")
> > +#undef $x
> > +#define $x 0
> > +#pragma pop_macro("$x")
> > +#if !$x
> > +#error $x
> > +#endif
> > +#define $x 1
> > +_Pragma("push_macro(\"$x\")")
> > +#undef $x
> > +#define $x 0
> > +_Pragma("pop_macro(\"$x\")")
> > +#if !$x
> > +#error $x
> > +#endif
> > +#define x$ 1
> > +#pragma push_macro("x$")
> > +#undef x$
> > +#define x$ 0
> > +#pragma pop_macro("x$")
> > +#if !x$
> > +#error x$
> > +#endif
> > +#define x$ 1
> > +_Pragma("push_macro(\"x$\")")
> > +#undef x$
> > +#define x$ 0
> > +_Pragma("pop_macro(\"x$\")")
> > +#if !x$
> > +#error x$
> > +#endif
> > +
> > +/* ...UCN.  */
> > +#define \u03B1x 1
> > +#pragma push_macro("\u03B1x")
> > +#undef \u03B1x
> > +#define \u03B1x 0
> > +#pragma pop_macro("\u03B1x")
> > +#if !\u03B1x
> > +#error \u03B1x
> > +#endif
> > +#define \u03B1x 1
> > +_Pragma("push_macro(\"\\u03B1x\")")
> > +#undef \u03B1x
> > +#define \u03B1x 0
> > +_Pragma("pop_macro(\"\\u03B1x\")")
> > +#if !\u03B1x
> > +#error \u03B1x
> > +#endif
> > +#define x\u03B1 1
> > +#pragma push_macro("x\u03B1")
> > +#undef x\u03B1
> > +#define x\u03B1 0
> > +#pragma pop_macro("x\u03B1")
> > +#if !x\u03B1
> > +#error x\u03B1
> > +#endif
> > +#define x\u03B1 1
> > +_Pragma("push_macro(\"x\\u03B1\")")
> > +#undef x\u03B1
> > +#define x\u03B1 0
> > +_Pragma("pop_macro(\"x\\u03B1\")")
> > +#if !x\u03B1
> > +#error x\u03B1
> > +#endif
> > +
> > +/* ...UTF-8.  */
> > +#define πx 1
> > +#pragma push_macro("πx")
> > +#undef πx
> > +#define πx 0
> > +#pragma pop_macro("πx")
> > +#if !πx
> > +#error πx
> > +#endif
> > +#define πx 1
> > +_Pragma("push_macro(\"πx\")")
> > +#undef πx
> > +#define πx 0
> > +_Pragma("pop_macro(\"πx\")")
> > +#if !πx
> > +#error πx
> > +#endif
> > +#define xπ 1
> > +#pragma push_macro("xπ")
> > +#undef xπ
> > +#define xπ 0
> > +#pragma pop_macro("xπ")
> > +#if !xπ
> > +#error xπ
> > +#endif
> > +#define xπ 1
> > +_Pragma("push_macro(\"xπ\")")
> > +#undef xπ
> > +#define xπ 0
> > +_Pragma("pop_macro(\"xπ\")")
> > +#if !xπ
> > +#error xπ
> > +#endif
> > +
> > +/* Verify UCN and UTF-8 can be intermixed.  */
> > +#define ħ_0 1
> > +#pragma push_macro("ħ_0")
> > +#undef ħ_0
> > +#define ħ_0 0
> > +#if ħ_0
> > +#error ħ_0 ħ_0 \U00000127_0
> > +#endif
> > +#pragma pop_macro("\U00000127_0")
> > +#if !ħ_0
> > +#error ħ_0 ħ_0 \U00000127_0
> > +#endif
> > +#define ħ_1 1
> > +#pragma push_macro("\U00000127_1")
> > +#undef ħ_1
> > +#define ħ_1 0
> > +#if ħ_1
> > +#error ħ_1 \U00000127_1 ħ_1
> > +#endif
> > +#pragma pop_macro("ħ_1")
> > +#if !ħ_1
> > +#error ħ_1 \U00000127_1 ħ_1
> > +#endif
> > +#define ħ_2 1
> > +#pragma push_macro("\U00000127_2")
> > +#undef ħ_2
> > +#define ħ_2 0
> > +#if ħ_2
> > +#error ħ_2 \U00000127_2 \U00000127_2
> > +#endif
> > +#pragma pop_macro("\U00000127_2")
> > +#if !ħ_2
> > +#error ħ_2 \U00000127_2 \U00000127_2
> > +#endif
> > +#define \U00000127_3 1
> > +#pragma push_macro("ħ_3")
> > +#undef \U00000127_3
> > +#define \U00000127_3 0
> > +#if \U00000127_3
> > +#error \U00000127_3 ħ_3 ħ_3
> > +#endif
> > +#pragma pop_macro("ħ_3")
> > +#if !\U00000127_3
> > +#error \U00000127_3 ħ_3 ħ_3
> > +#endif
> > +#define \U00000127_4 1
> > +#pragma push_macro("ħ_4")
> > +#undef \U00000127_4
> > +#define \U00000127_4 0
> > +#if \U00000127_4
> > +#error \U00000127_4 ħ_4 \U00000127_4
> > +#endif
> > +#pragma pop_macro("\U00000127_4")
> > +#if !\U00000127_4
> > +#error \U00000127_4 ħ_4 \U00000127_4
> > +#endif
> > +#define \U00000127_5 1
> > +#pragma push_macro("\U00000127_5")
> > +#undef \U00000127_5
> > +#define \U00000127_5 0
> > +#if \U00000127_5
> > +#error \U00000127_5 \U00000127_5 ħ_5
> > +#endif
> > +#pragma pop_macro("ħ_5")
> > +#if !\U00000127_5
> > +#error \U00000127_5 \U00000127_5 ħ_5
> > +#endif
> > +
> > +/* Verify invalid input produces no diagnostics.  */
> > +#pragma push_macro("") /* { dg-bogus "." } */
> > +#pragma push_macro("\u") /* { dg-bogus "." } */
> > +#pragma push_macro("\u0000") /* { dg-bogus "." } */
> > +#pragma push_macro("not a single identifier") /* { dg-bogus "." } */
> > +#pragma push_macro("invalid╬character") /* { dg-bogus "." } */
> > +#pragma push_macro("\u0300invalid_start") /* { dg-bogus "." } */
> > +#pragma push_macro("#include <cstdlib>") /* { dg-bogus "." } */
> > +
> > +/* Verify end-of-line diagnostics for valid and invalid input.  */
> > +#pragma push_macro("ö") oops /* { dg-warning "extra tokens" } */
> > +#pragma push_macro("") oops /* { dg-warning "extra tokens" } */
> > +#pragma push_macro("\u") oops /* { dg-warning "extra tokens" } */
> > +#pragma push_macro("\u0000") oops /* { dg-warning "extra tokens" } */
> > +#pragma push_macro("not a single identifier") oops /* { dg-warning "extra tokens" } */
> > +#pragma push_macro("invalid╬character") oops /* { dg-warning "extra tokens" } */
> > +#pragma push_macro("\u0300invalid_start") oops /* { dg-warning "extra tokens" } */
> > +#pragma push_macro("#include <cstdlib>") oops /* { dg-warning "extra tokens" } */
> > +
> > +/* Verify expected diagnostics.  */
> > +#pragma push_macro() /* { dg-error {invalid #pragma push_macro} } */
> > +#pragma pop_macro() /* { dg-error {invalid #pragma pop_macro} } */
> > +_Pragma("push_macro(0)") /* { dg-error {invalid #pragma push_macro} } */
> > +_Pragma("pop_macro(\"oops\"") /* { dg-error {invalid #pragma pop_macro} } */
> > diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.C b/gcc/testsuite/g++.dg/pch/pushpop-2.C
> > new file mode 100644
> > index 00000000000..84886aea985
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/pch/pushpop-2.C
> > @@ -0,0 +1,18 @@
> > +/* { dg-options -std=c++11 } */
> > +#include "pushpop-2.Hs"
> > +
> > +#if π != 4
> > +#error π != 4
> > +#endif
> > +#pragma pop_macro("\u03C0")
> > +#if π != 3
> > +#error π != 3
> > +#endif
> > +
> > +#if \u03B1 != 6
> > +#error α != 6
> > +#endif
> > +_Pragma("pop_macro(\"\\u03B1\")")
> > +#if α != 5
> > +#error α != 5
> > +#endif
> > diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.Hs b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
> > new file mode 100644
> > index 00000000000..797139a3196
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
> > @@ -0,0 +1,9 @@
> > +#define π 3
> > +#pragma push_macro ("π")
> > +#undef π
> > +#define π 4
> > +
> > +#define \u03B1 5
> > +#pragma push_macro ("α")
> > +#undef α
> > +#define α 6
> > diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.c b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
> > new file mode 100644
> > index 00000000000..61b8430c6d2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
> > @@ -0,0 +1,18 @@
> > +/* { dg-options -std=c11 } */
> > +#include "pushpop-2.hs"
> > +
> > +#if π != 4
> > +#error π != 4
> > +#endif
> > +#pragma pop_macro("\u03C0")
> > +#if π != 3
> > +#error π != 3
> > +#endif
> > +
> > +#if \u03B1 != 6
> > +#error α != 6
> > +#endif
> > +_Pragma("pop_macro(\"\\u03B1\")")
> > +#if α != 5
> > +#error α != 5
> > +#endif
> > diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.hs b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
> > new file mode 100644
> > index 00000000000..797139a3196
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
> > @@ -0,0 +1,9 @@
> > +#define π 3
> > +#pragma push_macro ("π")
> > +#undef π
> > +#define π 4
> > +
> > +#define \u03B1 5
> > +#pragma push_macro ("α")
> > +#undef α
> > +#define α 6
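
For readers less familiar with the pragmas touched by the quoted patch, here is a minimal sketch of a push_macro/pop_macro round trip. It uses a plain ASCII macro name, so it should behave the same with or without the patch. The macro LIMIT and the helper functions are made up for illustration. Per the quoted description and its new tests, the patch is meant to make the same pattern work for names such as "πx" or "\u03B1x"; that part is left out here because it depends on the patch being applied.

```cpp
#include <cassert>

#define LIMIT 1
#pragma push_macro("LIMIT")   // save the current definition (1)
#undef LIMIT
#define LIMIT 0               // temporary override

// Expands LIMIT while the override is active.
int limit_while_overridden () { return LIMIT; }

#pragma pop_macro("LIMIT")    // restore the saved definition

// Expands LIMIT after the saved definition has been restored.
int limit_after_pop () { return LIMIT; }

// Small self-check that can be called from main ().
void check_round_trip ()
{
  assert (limit_while_overridden () == 0);
  assert (limit_after_pop () == 1);
}
```

The string passed to both pragmas is the same here. As the quoted description notes, the old implementation required an exact string match, while the patch keys on the lexed identifier instead.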

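The cpp_auto_suppress_diagnostics class in the quoted patch appears to follow a common RAII pattern: save a callback, install a no-op in its place, and restore the original when the scope ends. Below is a standalone sketch of that pattern only. It is not libcpp code: toy_reader, record_diagnostic, toy_auto_suppress, and count_recorded_messages are hypothetical names, and the callback signature is much simpler than the real one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for cpp_reader: it holds a single diagnostic callback.
struct toy_reader
{
  using diagnostic_fn = bool (*) (toy_reader *, const char *);
  diagnostic_fn diagnostic = nullptr;
  std::vector<std::string> log;  // messages seen by the recording handler
};

// Hypothetical default handler: record the message in the reader's log.
bool record_diagnostic (toy_reader *r, const char *msg)
{
  r->log.push_back (msg);
  return true;
}

// RAII guard: installs a no-op handler, restores the saved one on scope exit.
class toy_auto_suppress
{
 public:
  explicit toy_auto_suppress (toy_reader *r)
    : m_reader (r), m_saved (r->diagnostic)
  {
    // A captureless lambda converts to the plain function pointer type.
    m_reader->diagnostic = [] (toy_reader *, const char *) { return true; };
  }
  ~toy_auto_suppress () { m_reader->diagnostic = m_saved; }
  toy_auto_suppress (const toy_auto_suppress &) = delete;
  toy_auto_suppress &operator= (const toy_auto_suppress &) = delete;

 private:
  toy_reader *const m_reader;
  const toy_reader::diagnostic_fn m_saved;
};

// Emits three diagnostics, the middle one inside a suppressed scope, and
// returns how many were actually recorded.
std::size_t count_recorded_messages ()
{
  toy_reader r;
  r.diagnostic = record_diagnostic;
  r.diagnostic (&r, "before");        // recorded
  {
    toy_auto_suppress suppress {&r};
    r.diagnostic (&r, "suppressed");  // swallowed by the no-op handler
  }
  r.diagnostic (&r, "after");         // recorded again after restore
  return r.log.size ();
}
```

Under these assumptions count_recorded_messages () returns 2, since only the call made inside the guarded scope is dropped. The point of the guard is that the handler is restored on every exit path without the explicit save and restore statements that the patch removes from charset.cc.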