On Sun, 11 Feb 2024 13:46:50 +0100, Bruno Haible wrote on bug-gnulib:
I wrote:
For these modules, the next function to provide in an MT-safe way is
localtime_r.
Our gmtime_r and localtime_r are MT-safe on native Windows. I ran the
test-gmtime_r-mt and test-localtime_r-mt tests for 2000 seconds each, and
they did not crash.
But the problem is that localtime() and localtime_r() on native Windows
produce nonsensical results:
- They pretend that in France, in 2007, DST began on 2007-03-11, when in
fact it started on 2007-03-25.
- The hour is wrong.
Witness: The attached program loc.c.
On native Windows, when the 'localtime_s' function [1][2]
is not available, such as on the older Windows versions that Emacs cares
about, the solution is to use GetTimeZoneInformation [3].
None of the GetTimeZoneInformation APIs from the Windows DLLs works either.
They pretend that in German and French time zones, DST starts on March 5,
in all years. Witness: The attached program tzi.c.
So, there is no way around implementing a correct localtime_r, based on
tzdata, in Gnulib.
It will be useful
- for localtime_r on native Windows,
- for nstrftime, c_nstrftime, parse-datetime, which all take a timezone_t
argument.
For reading tzdata: The first question is how to include tzdata in gnulib.
- AFAICS, the main data file (without comments) is tzdata.zi, which is about
100 KB in size. It can be upgraded simply by copying the newest tzdata.zi
from a newer tzdata distribution. Including such a file in gnulib would
be OK (re copyright, number of files, total size), right?
- Whereas including all files from /usr/share/zoneinfo is probably not
acceptable (> 1300 files, ca. 6 MB total size).
- Access pattern: In a running program, very few among the time zones will
be used. Therefore, caching in memory is essential.
I was looking around to see if there was some way to leverage Windows UWP/.NET
tzdata provided in:
https://github.com/microsoft/icu
https://github.com/search?q=repo%3Amicrosoft%2Ficu%20tzdata&type=code
and noticed that some comments under:
https://learn.microsoft.com/en-ca/dotnet/core/extensions/globalization-icu#icu-on-webassembly
suggest that the required data can be shrunk to ~300KB using Brotli!?
Perhaps gnulib could leverage some of the interfaces in:
https://github.com/sillsdev/icu-dotnet
as an alternative interface to access tzdata on native Windows or .NET 5+ Core
platforms, without building in tzdata and requiring updates?
Another option would be to ship a selection of generated data, for example
using zonenow.tab and supporting only the current time onward, for the 449
time zones currently in:
https://github.com/unicode-org/cldr/blob/main/common/supplemental/windowsZones.xml
perhaps using CBOR format and/or Brotli compression to improve on a zip or
tar container format:
470KB -> 221KB zip, 134KB tar.gz, 69KB tar.lz, 68KB tar.xz;
or using the usual built-in POSIX TZ parser and the POSIX TZ rules output by:
tail -1 /usr/share/zoneinfo/**/*
for the selected time zones, including time zone ids and rules:
12.6KB, 4KB zip, 3.8KB gz, 3.5KB xz/lz/bz2.
--
Take care. Thanks, Brian Inglis              Calgary, Alberta, Canada

La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry