On 2023-05-17 04:04, Adhemerval Zanella Netto wrote:
do you think it would be feasible to assume
_POSIX_TZNAME_MAX or 48 (for Factory zone) to avoid the malloc call on
strftime_l?

ALthough it's technically feasible, it'd be some work. Among other things, whatever limit we chose would have to become visible via <limits.h> and sysconf and etc., and we'd need to document that there is now a limit. Better, I think, would be to continue to follow the GNU coding standards and not impose an arbitrary limit on time zone abbreviation length, even for the highly unusual case where code is calling wcsftime or wcsftime_l.

How about avoiding the malloc in the following way? I installed the attached patch into Gnulib lib/nstrftime.c, which has forked from glibc but with some work could be merged back in, and this should also fix the glibc bugs with buffer sizes exceeding INT_MAX.

If you don't want all the hassle of merging, you can cherry-pick this patch. I haven't tested the patch, though, as Gnulib does not use this code.
From 197eec6075bbaf2b97137115a09a6bbce43b4dd4 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Wed, 17 May 2023 10:27:40 -0700
Subject: [PATCH] nstrftime: suggest to glibc how to avoid alloca

* lib/nstrftime.c (widen) [COMPILE_WIDE]: Remove.
(__strftime_internal) [COMPILE_WIDE): Instead of converting the
multibyte time zone abbreviation into a potentially unbounded
alloca buffer, convert it directly into the output buffer.
Although this code is not used in Gnulib, this can help the glibc
developers avoid the problem on the glibc side.
---
 ChangeLog       | 10 ++++++++++
 lib/nstrftime.c | 39 +++++++++++++++++++++++++--------------
 2 files changed, 35 insertions(+), 14 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index ecbc25ef06..36b3c65b81 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,13 @@
+2023-05-17  Paul Eggert  <egg...@cs.ucla.edu>
+
+	nstrftime: suggest to glibc how to avoid alloca
+	* lib/nstrftime.c (widen) [COMPILE_WIDE]: Remove.
+	(__strftime_internal) [COMPILE_WIDE): Instead of converting the
+	multibyte time zone abbreviation into a potentially unbounded
+	alloca buffer, convert it directly into the output buffer.
+	Although this code is not used in Gnulib, this can help the glibc
+	developers avoid the problem on the glibc side.
+
 2023-05-15  Bruno Haible  <br...@clisp.org>
 
 	doc: New chapter "Strings and Characters".
diff --git a/lib/nstrftime.c b/lib/nstrftime.c
index 68bb560910..35a9307e1a 100644
--- a/lib/nstrftime.c
+++ b/lib/nstrftime.c
@@ -226,15 +226,6 @@ extern char *tzname[];
 #  undef __mbsrtowcs_l
 #  define __mbsrtowcs_l(d, s, l, st, loc) __mbsrtowcs (d, s, l, st)
 # endif
-# define widen(os, ws, l) \
-  {                                                                           \
-    mbstate_t __st;                                                           \
-    const char *__s = os;                                                     \
-    memset (&__st, '\0', sizeof (__st));                                      \
-    l = __mbsrtowcs_l (NULL, &__s, 0, &__st, loc);                            \
-    ws = (wchar_t *) alloca ((l + 1) * sizeof (wchar_t));                     \
-    (void) __mbsrtowcs_l (ws, &__s, l, &__st, loc);                           \
-  }
 #endif
 
 
@@ -1374,11 +1365,31 @@ __strftime_internal (STREAM_OR_CHAR_T *s, STRFTIME_ARG (size_t maxsize)
 #ifdef COMPILE_WIDE
           {
             /* The zone string is always given in multibyte form.  We have
-               to transform it first.  */
-            wchar_t *wczone;
-            size_t len;
-            widen (zone, wczone, len);
-            cpy (len, wczone);
+               to convert it to wide character.  */
+            size_t w = pad == L_('-') || width < 0 ? 0 : width;
+            char const *z = zone;
+            mbstate_t st = {0};
+            size_t len = __mbsrtowcs_l (p, &z, maxsize - i, &st, loc);
+            if (len == (size_t) -1)
+              return 0;
+            size_t incr = len < w ? w : len;
+            if (incr >= maxsize - i)
+              {
+                errno = ERANGE;
+                return 0;
+              }
+            if (p)
+              {
+                if (len < w)
+                  {
+                    size_t delta = w - len;
+                    wmemmove (p + delta, p, len);
+                    wchar_t wc = pad == L_('0') || pad == L_('+') ? L'0' : L' ';
+                    wmemset (p, wc, delta);
+                  }
+                p += incr;
+              }
+            i += incr;
           }
 #else
           cpy (strlen (zone), zone);
-- 
2.39.2

Reply via email to