Yesterday's change to regex.m4 has the effect that now, gnulib's regex code gets used even on glibc systems. As a consequence, the ASAN+UBSAN build in gnulib's CI now fails:

  FAIL: test-regex
  ../../gllib/regexec.c:188:36: runtime error: variable length array bound evaluates to non-positive value 0

What the clang UBSAN is complaining about is this definition of the
regexec function:

  int
  regexec (const regex_t *__restrict preg, const char *__restrict string,
           size_t nmatch, regmatch_t pmatch[_REGEX_NELTS (nmatch)], int eflags)
  {
    ...
  }

According to ISO C23 § 6.7.6.2.(5) the value of nmatch must be > 0 here.
Quote:

  "If the size is an expression that is not an integer constant
   expression: if it occurs in a declaration at function prototype
   scope, it is treated as if it were replaced by *; otherwise, each
   time it is evaluated it shall have a value greater than zero."

(Here we're in a function definition, not a function prototype.)

But the comments in regexec.c:174..175 indicate that nmatch is allowed
to be 0, and apparently the test suite exercises this case.

So, we can't use the syntax
  size_t nmatch, regmatch_t pmatch[nmatch]
here: it is undefined behaviour.

I tried two patches, attached below. The second one has the advantage
that it leaves the declaration of regexec() intact, which is a plus for
static analyzers. But it introduces a new warning:

  In file included from ../../gllib/regex.c:71:
  ../../gllib/regexec.c:192:29: warning: argument 'pmatch' of type 'regmatch_t[]' with mismatched bound [-Warray-parameter]
    192 |          size_t nmatch, regmatch_t pmatch[/* nmatch */], int eflags)
        |                                    ^
  ../../gllib/regex.h:687:18: note: previously declared as 'regmatch_t[restrict __nmatch]' here
    687 |                     regmatch_t __pmatch[_Restrict_arr_
        |                                ^

So, I'm committing the first one.

Bruno
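
To illustrate the case discussed above, here is a minimal sketch, not taken from gnulib. It assumes a POSIX <regex.h>; the function names matches_without_offsets, count_nonempty and zero_nmatch_demo are hypothetical and exist only for this example. It shows a caller that legitimately passes nmatch == 0, and a function definition whose array parameter carries its bound only as a comment, so no variable-length array bound is evaluated when the count is 0.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: report whether TEXT matches PATTERN without
   requesting any submatch offsets.  Returns 1 on match, 0 on no match,
   -1 if PATTERN does not compile.  */
int
matches_without_offsets (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  /* nmatch == 0: POSIX says regexec ignores pmatch in this case.  */
  int rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Hypothetical function whose array bound is only a comment, in the
   style of the second patch.  The parameter is a plain pointer, so
   calling it with n == 0 evaluates no array bound.  */
size_t
count_nonempty (size_t n, const regmatch_t pmatch[/* n */])
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (pmatch[i].rm_so != -1 && pmatch[i].rm_eo > pmatch[i].rm_so)
      count++;
  return count;
}

/* Exercise both functions with a zero element count.  */
int
zero_nmatch_demo (void)
{
  assert (matches_without_offsets ("a+b", "xaab") == 1);
  assert (matches_without_offsets ("a+b", "xyz") == 0);
  assert (count_nonempty (0, NULL) == 0);
  return 0;
}
```

Writing the parameter as pmatch[n] in the definition of count_nonempty would be the form that the sanitizer report above is about when n is 0.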

From e9e73bdeab431f29bb263b757bc8558796e475f6 Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Mon, 14 Apr 2025 16:00:13 +0200
Subject: [PATCH] regex: Fix undefined behaviour.

* lib/regex.h (_REGEX_NELTS): Define to empty; don't use ISO C99
variable-length arrays.
---
 ChangeLog   | 6 ++++++
 lib/regex.h | 8 ++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4aa2a83c08..0b1d316a24 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2025-04-14  Bruno Haible  <br...@clisp.org>
+
+        regex: Fix undefined behaviour.
+        * lib/regex.h (_REGEX_NELTS): Define to empty; don't use ISO C99
+        variable-length arrays.
+
 2025-04-14  Bruno Haible  <br...@clisp.org>
 
         select tests: Work around a Cygwin bug.
diff --git a/lib/regex.h b/lib/regex.h
index ff7e43b534..0eb72ce908 100644
--- a/lib/regex.h
+++ b/lib/regex.h
@@ -523,8 +523,12 @@ typedef struct
 /* Declarations for routines.  */
 
 #ifndef _REGEX_NELTS
-# if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
-        && !defined __STDC_NO_VLA__)
+/* The macro _REGEX_NELTS denotes the number of elements in a variable-length
+   array passed to a function.
+   It was meant to make use of ISO C99 variable-length arrays, but this does
+   not work: ISO C23 § 6.7.6.2.(5) requires the number of elements to be > 0,
+   but the NMATCH argument to regexec() is allowed to be 0.  */
+# if 0
 #  define _REGEX_NELTS(n) n
 # else
 #  define _REGEX_NELTS(n)
-- 
2.43.0

From 48e8974874bd5fad45904aed9679ee25b5caefbe Mon Sep 17 00:00:00 2001
From: Bruno Haible <br...@clisp.org>
Date: Mon, 14 Apr 2025 16:15:27 +0200
Subject: [PATCH] regex: Fix undefined behaviour.

* lib/regex.h (_REGEX_NELTS): Add comment.
* lib/regexec.c (regexec): Don't use ISO C variable-length array syntax
for the pmatch parameter.
---
 ChangeLog     | 7 +++++++
 lib/regex.h   | 2 ++
 lib/regexec.c | 6 +++++-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 4aa2a83c08..a835a069d6 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2025-04-14  Bruno Haible  <br...@clisp.org>
+
+        regex: Fix undefined behaviour.
+        * lib/regex.h (_REGEX_NELTS): Add comment.
+        * lib/regexec.c (regexec): Don't use ISO C variable-length array syntax
+        for the pmatch parameter.
+
 2025-04-14  Bruno Haible  <br...@clisp.org>
 
         select tests: Work around a Cygwin bug.
diff --git a/lib/regex.h b/lib/regex.h
index ff7e43b534..191bd26836 100644
--- a/lib/regex.h
+++ b/lib/regex.h
@@ -522,6 +522,8 @@ typedef struct
 
 /* Declarations for routines.  */
 
+/* The macro _REGEX_NELTS denotes the number of elements in a variable-length
+   array passed to a function.  */
 #ifndef _REGEX_NELTS
 # if (defined __STDC_VERSION__ && 199901L <= __STDC_VERSION__ \
         && !defined __STDC_NO_VLA__)
diff --git a/lib/regexec.c b/lib/regexec.c
index 6923394a08..1f902b1ef6 100644
--- a/lib/regexec.c
+++ b/lib/regexec.c
@@ -183,9 +183,13 @@ static reg_errcode_t extend_buffers (re_match_context_t *mctx, int min_len);
    Return 0 if a match is found, REG_NOMATCH if not, REG_BADPAT if
    EFLAGS is invalid.  */
 
+/* The declaration of the PMATCH parameter cannot make use of ISO C99
+   variable-length arrays: ISO C23 § 6.7.6.2.(5) requires the number of
+   elements to be > 0, but the NMATCH argument is allowed to be 0.  */
+
 int
 regexec (const regex_t *__restrict preg, const char *__restrict string,
-         size_t nmatch, regmatch_t pmatch[_REGEX_NELTS (nmatch)], int eflags)
+         size_t nmatch, regmatch_t pmatch[/* nmatch */], int eflags)
 {
   reg_errcode_t err;
   Idx start, length;
-- 
2.43.0