This is a serious bug in Clang: it generates incorrect machine code.
The code that Clang generates for the following (gawk/support/dfa.c lines 1141-1143):

    ((dfa->syntax.dfaopts & DFA_CONFUSING_BRACKETS_ERROR
      ? dfaerror : dfawarn)
     (_("character class syntax is [[:space:]], not [:space:]")));

is immediately followed by the code generated for the following (gawk/support/dfa.c line 1015):

    dfaerror (_("invalid character class"));

and this is incorrect because the two source code regions are not connected with each other.
You can see the bug in the attached (compressed) file dfa.s which contains the assembly language output. Here's the dfa.s file starting with line 6975:

    6975        testb   $4, 456(%r12)
    6976        movl    $dfawarn, %eax
    6977        movl    $dfaerror, %ebx
    6978        cmoveq  %rax, %rbx
    6979        movl    $.L.str.26, %esi
    6980        xorl    %edi, %edi
    6981        movl    $5, %edx
    6982        callq   dcgettext
    6983        movq    %rax, %rdi
    6984        callq   *%rbx
    6985 .LBB34_144:
    6986        movl    $.L.str.25, %esi
    6987        xorl    %edi, %edi
    6988        movl    $5, %edx
    6989        callq   dcgettext
    6990        movq    %rax, %rdi
    6991        callq   dfaerror

Line 6984, which is the call to either dfaerror or dfawarn from source lines 1141-1143, is immediately followed by the code for source line 1015. This means that at runtime, when dfawarn returns, the code immediately falls through and calls dfaerror, which is incorrect.
My guess is that Clang got confused because dfaerror is declared _Noreturn, so Clang mistakenly assumed that dfawarn is also _Noreturn, which it is not.
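To make that guess concrete, here is a minimal sketch of the same call shape, reduced to a standalone file. The names fake_error, fake_warn, report and warn_calls are hypothetical stand-ins I made up for illustration; they are not from dfa.c, and I have not confirmed that this reduced form triggers the bug in Clang 15.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for dfaerror and dfawarn; not taken from dfa.c.  */
static int warn_calls = 0;

/* Like dfaerror: declared _Noreturn, and it really never returns.  */
_Noreturn static void
fake_error (char const *msg)
{
  fprintf (stderr, "error: %s\n", msg);
  exit (EXIT_FAILURE);
}

/* Like dfawarn: an ordinary function that returns to its caller.  */
static void
fake_warn (char const *msg)
{
  fprintf (stderr, "warning: %s\n", msg);
  warn_calls++;
}

/* Same shape as the call in parse_bracket_exp: one indirect call whose
   target is either the _Noreturn function or the ordinary one.  When
   FATAL is zero the call must return, so this function must return 1.  */
int
report (int fatal, char const *msg)
{
  (fatal ? fake_error : fake_warn) (msg);
  return 1;
}
```

With a correct compiler, report (0, "x") prints one warning and returns 1. If the compiler wrongly treats the indirect call as never returning, control would instead run off the end of the call into whatever code happens to follow, which is the symptom seen in dfa.s.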
I worked around the Clang bug by installing the attached patch into Gnulib. Please give it a try with Gawk.
Incorrect code generation is a serious bug in Clang; can you please report it to the Clang folks? I am considering using a bigger hammer, and doing this:

    #define _Noreturn /*empty*/

whenever Clang is used, until the bug is fixed.
This is because if the bug occurs here it's likely that similar bugs will occur elsewhere and this sort of thing can be really subtle and hard to catch or work around in general. Clang really needs to get this fixed.
Thanks.
dfa.s.gz
Description: application/gzip
From 8805a44cf04253f63bce160054e2fbf21ab9beb1 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sun, 1 Jan 2023 22:06:10 -0800
Subject: [PATCH] dfa: work around Clang 15 bug

Problem reported by Kenton Groombridge in:
https://lists.gnu.org/archive/html/bug-gawk/2022-12/msg00010.html
On x86-64, Clang 15 gets confused by a call (X ? dfaerror :
dfawarn) (Y) and generates the wrong code, presumably because
dfaerror is _Noreturn and dfawarn is not.
* lib/dfa.c (parse_bracket_exp): Reword to have one call for
dfaerror, the other for dfawarn.
---
 ChangeLog | 11 +++++++++++
 lib/dfa.c | 11 ++++++++---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 59500558e4..ac3d388c2b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2023-01-01 Paul Eggert <egg...@cs.ucla.edu>
+
+ dfa: work around Clang 15 bug
+ Problem reported by Kenton Groombridge in:
+ https://lists.gnu.org/archive/html/bug-gawk/2022-12/msg00010.html
+ On x86-64, Clang 15 gets confused by a call (X ? dfaerror :
+ dfawarn) (Y) and generates the wrong code, presumably because
+ dfaerror is _Noreturn and dfawarn is not.
+ * lib/dfa.c (parse_bracket_exp): Reword to have one call for
+ dfaerror, the other for dfawarn.
+
 2023-01-01 Bruno Haible <br...@clisp.org>
 
 doc: Update regarding stable branches.
diff --git a/lib/dfa.c b/lib/dfa.c
index 57df1e0421..211e1ed18f 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -1138,9 +1138,14 @@ parse_bracket_exp (struct dfa *dfa)
   while ((wc = wc1, (c = c1) != ']'));
 
   if (colon_warning_state == 7)
-    ((dfa->syntax.dfaopts & DFA_CONFUSING_BRACKETS_ERROR
-      ? dfaerror : dfawarn)
-     (_("character class syntax is [[:space:]], not [:space:]")));
+    {
+      char const *msg
+        = _("character class syntax is [[:space:]], not [:space:]");
+      if (dfa->syntax.dfaopts & DFA_CONFUSING_BRACKETS_ERROR)
+        dfaerror (msg);
+      else
+        dfawarn (msg);
+    }
 
   if (! known_bracket_exp)
     return BACKREF;
-- 
2.37.2