Hello Ingo,
thanks for looking into this.
On 8/6/21 8:13 PM, Ingo Schwarze wrote:
Hi Gerhard and Bryan,
Gerhard Roth wrote on Mon, Aug 02, 2021 at 10:36:05AM +0200:
Bryan Vyhmeister found a strange behavior in date(1):
# date -f %s -j 1627519989
Thu Jul 29 01:53:09 PDT 2021
# date -u -f %s -j 1627519989
Thu Jul 29 00:53:09 UTC 2021
Looks like PDT is GMT-1, which of course is wrong.
The problem arises from the -f option. The argument of date(1) is passed
to strptime(3). Normally, this will return a broken down time in the
local timezone.
This claim confused me somewhat at first. I think a more accurate
statement would be that strptime(3) does not use the TZ at all.
It merely parses the string and fills the fields in struct tm.
The question whether the caller had any particular time zone in
mind does not even arise here.
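As a minimal sketch of that point (the helper name parse_hour_in_tz is hypothetical and not taken from date.c), the following C fragment parses the same "%H:%M" string under two different TZ settings. Assuming strptime(3) behaves as described above, the hour it stores in struct tm is the same in both cases:

```c
#define _XOPEN_SOURCE 700	/* for strptime(3) and setenv(3) on some systems */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: set TZ, parse "HH:MM" with strptime(3) and
 * return the hour that ended up in struct tm, or -1 on a parse error.
 */
static int
parse_hour_in_tz(const char *tz, const char *input)
{
	struct tm tm;

	assert(tz != NULL && input != NULL);
	setenv("TZ", tz, 1);
	tzset();
	memset(&tm, 0, sizeof(tm));
	if (strptime(input, "%H:%M", &tm) == NULL)
		return -1;
	return tm.tm_hour;	/* plain field value, no TZ conversion applied */
}
```

Calling it with "UTC0" and with "PST8PDT,M3.2.0,M11.1.0" for the input "9:00" should return the same hour both times.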
Since date(1) says:
ENVIRONMENT
TZ The time zone to use when parsing or displaying dates. [...]
the command
$ date -f %H:%M -j 9:00
is correct to essentially just echo "09:00" back at you
because date(1) is specified to use the same time zone
for parsing and printing.
But the '%s' format makes an exception and returns a
date in UTC.
That is indeed true.
So for %s the time zone used for parsing is necessarily different,
while the time zone used for printing is still the $TZ specified
by the user (or the /etc/localtime specified by the sysadmin).
So I think your approach of using timegm(3) for %s and mktime(3)
otherwise is essentially correct.
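To illustrate the difference, here is a rough sketch, not code from date.c. It assumes, as stated above, that %s leaves UTC fields in struct tm, and it models that with gmtime_r(3). The function name roundtrip and its parameters are made up for this example. Converting those fields back with timegm(3) recovers the original value, while mktime(3) reinterprets them in the local time zone:

```c
#define _DEFAULT_SOURCE		/* for timegm(3) on some systems */
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: break t down into UTC fields (standing in for
 * what %s is assumed to produce), then convert the fields back with
 * either timegm(3) or mktime(3) under the given TZ.
 */
static time_t
roundtrip(time_t t, const char *tz, int use_timegm)
{
	struct tm tm;

	assert(tz != NULL);
	setenv("TZ", tz, 1);
	tzset();
	gmtime_r(&t, &tm);	/* UTC fields, tm_isdst is 0 */
	return use_timegm ? timegm(&tm) : mktime(&tm);
}
```

With TZ set to a US Pacific rule such as "PST8PDT,M3.2.0,M11.1.0", the timegm(3) path should return 1627519989 unchanged, whereas the mktime(3) path should not.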
However, a format string can contain more characters than just
a single conversion specification. For example, somebody might
wish to parse an input file containing a line
SSE=1627519989
and legitimately say
$ date -f SSE=%s -j $(grep SSE= input_file)
which still yields the wrong result even with your patch.
You're right. I was thinking about this too, but couldn't come up with a
sensible example of how to combine '%s' with something else.
Doing a strstr(3) instead of strcmp(3) is surely the better solution.
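For comparison, a small sketch of the two checks. Both function names are hypothetical, and the strcmp(3) variant is only my reading of what an exact-match test would look like:

```c
#include <string.h>

/* Exact match: true only if the whole format string is "%s". */
static int
format_is_epoch(const char *pformat)
{
	return pformat != NULL && strcmp(pformat, "%s") == 0;
}

/* Substring match: true if "%s" appears anywhere in the format string. */
static int
format_has_epoch(const char *pformat)
{
	return pformat != NULL && strstr(pformat, "%s") != NULL;
}
```

For the format "SSE=%s" only the strstr(3) variant reports a match, which is the case Ingo's example needs.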
Even worse, what are the admittedly weird commands
$ date -f %s:%H -j 1627519989:15
$ date -f %H:%s -j 15:1627519989
$ date -f %H:%s:%m -j 15:1627519989:03
supposed to do? Apparently, data from later conversions is supposed
to override data from earlier ones, so should the last conversion
that is related to the time zone win, i.e. %s:%H use mktime(3)
but %H:%s and %H:%s:%m timegm(3)? I would argue that is excessive
complexity for little benefit, if any.
One might also argue that %s:%H 1627519989:09 is more usefully
interpreted as "9 o'clock UTC on the day containing 1627519989"
than "9 o'clock in the local TZ on the day containing 1627519989".
In addition to using a consistent input time zone, the former also
avoids fun with crossing the date line that the latter might cause.
The former is also easier to implement, see the patch below.
What do people think?
ok gerhard@
The patch below isn't very beautiful, but fixes the problem:
# date -f %s -j 1627519989
Wed Jul 28 17:53:09 PDT 2021
P.S.
Your patch was mangled (one missing blank line and spurious spaces
inserted before tabs on some lines). Lately, it is becoming annoying
that even very experienced developers no longer seem able to reliably
send email containing patches that actually apply... :-(
Sorry about that. I will write "I shouldn't use thunderbird to submit
patches" a hundred times ;)
Gerhard
Index: date.c
===================================================================
RCS file: /cvs/src/bin/date/date.c,v
retrieving revision 1.56
diff -u -p -r1.56 date.c
--- date.c 8 Aug 2019 02:17:51 -0000 1.56
+++ date.c 6 Aug 2021 17:48:01 -0000
@@ -219,7 +219,11 @@ setthetime(char *p, const char *pformat)
 	}
 
 	/* convert broken-down time to UTC clock time */
-	if ((tval = mktime(lt)) == -1)
+	if (pformat != NULL && strstr(pformat, "%s") != NULL)
+		tval = timegm(lt);
+	else
+		tval = mktime(lt);
+	if (tval == -1)
 		errx(1, "specified date is outside allowed range");
 
 	if (jflag)