Hi Gerhard and Bryan,

Gerhard Roth wrote on Mon, Aug 02, 2021 at 10:36:05AM +0200:
> Bryan Vyhmeister found a strange behavior in date(1):
>
> # date -f %s -j 1627519989
> Thu Jul 29 01:53:09 PDT 2021
> # date -u -f %s -j 1627519989
> Thu Jul 29 00:53:09 UTC 2021
>
> Looks like PDT is GMT-1, which of course is wrong.
>
> The problem arises from the -f option. The argument of date(1) is passed
> to strptime(3). Normally, this will return a broken down time in the
> local timezone.

This claim confused me somewhat at first. I think a more accurate
statement would be that strptime(3) does not use the TZ at all.
It merely parses the string and fills the fields in struct tm.
The question whether the caller had any particular time zone in
mind does not even arise here.

Since date(1) says:

  ENVIRONMENT
     TZ   The time zone to use when parsing or displaying dates. [...]

the command

   $ date -f %H:%M -j 9:00

is correct to essentially just echo "09:00" back at you because
date(1) is specified to use the same time zone for parsing and
printing.

> But the '%s' format makes an exception and returns a
> date in UTC.

That is indeed true. So for %s the time zone used for parsing is
necessarily different, while the time zone used for printing is
still the $TZ specified by the user (or the /etc/localtime specified
by the sysadmin).

So I think your approach of using timegm(3) for %s and mktime(3)
otherwise is essentially correct.

However, a format string can contain more characters than just a
single conversion specification. For example, somebody might wish
to parse an input file containing a line

   SSE=1627519989

and legitimately say

   $ date -f SSE=%s -j $(grep SSE= input_file)

which still yields the wrong result even with your patch.

Even worse, what are the admittedly weird commands

   $ date -f %s:%H -j 1627519989:15
   $ date -f %H:%s -j 15:1627519989
   $ date -f %H:%s:%m -j 15:1627519989:03

supposed to do? Apparently, data from later conversions is supposed
to override data from earlier ones, so should the last conversion
that is related to the time zone win, i.e. %s:%H use mktime(3)
but %H:%s and %H:%s:%m timegm(3)? I would argue that is excessive
complexity for little benefit, if any.

One might also argue that %s:%H 1627519989:09 is more usefully
interpreted as "9 o'clock UTC on the day containing 1627519989"
than "9 o'clock in the local TZ on the day containing 1627519989".
In addition to using a consistent input time zone, the former also
avoids fun with crossing the date line that the latter might cause.
The former is also easier to implement, see the patch below.

What do people think?

> The patch below isn't very beautiful, but fixes the problem:
>
> # date -f %s -j 1627519989
> Wed Jul 28 17:53:09 PDT 2021

P.S.
Your patch was mangled (one missing blank line and spurious spaces
inserted before tabs on some lines). Lately, it is becoming annoying
that even very experienced developers no longer seem able to reliably
send email containing patches that actually apply... :-(


Index: date.c
===================================================================
RCS file: /cvs/src/bin/date/date.c,v
retrieving revision 1.56
diff -u -p -r1.56 date.c
--- date.c	8 Aug 2019 02:17:51 -0000	1.56
+++ date.c	6 Aug 2021 17:48:01 -0000
@@ -219,7 +219,11 @@ setthetime(char *p, const char *pformat)
 	}
 
 	/* convert broken-down time to UTC clock time */
-	if ((tval = mktime(lt)) == -1)
+	if (pformat != NULL && strstr(pformat, "%s") != NULL)
+		tval = timegm(lt);
+	else
+		tval = mktime(lt);
+	if (tval == -1)
 		errx(1, "specified date is outside allowed range");
 
 	if (jflag)