On Mon, 29 Jun 2026 at 10:21, Tomasz Kaminski <[email protected]> wrote: > > > > On Mon, Jun 29, 2026 at 10:34 AM Jonathan Wakely <[email protected]> > wrote: >> >> >> >> On Mon, 29 Jun 2026, 07:23 Tomasz Kaminski, <[email protected]> wrote: >>> >>> >>> >>> On Fri, Jun 26, 2026 at 7:02 PM Jonathan Wakely <[email protected]> wrote: >>>> >>>> Although the systemd docs say that /etc/localtime should be a symlink to >>>> one of the zoneinfo files, some systems make it a symlink to another >>>> path, where that second path is a symlink to a zoneinfo file (e.g. if >>>> /etc is mounted read-only then /etc/localtime can be a symlink to >>>> another symlink on a writable disk, so that the system timezone can be >>>> altered by re-pointing the symlink on the writable disk). >>>> >>>> In that case, using readlink would only tell us the location of the >>>> second symlink, not which zoneinfo file it points to. Therefore, we >>>> would not be able to extract a valid time zone name from the path, and >>>> chrono::current_zone() would fail. >>>> >>>> To support multiple symlinks we could recursively keep resolving >>>> symlinks with readlink until we reach a path from which we can extract a >>>> zone name. Alternatively, we can just use realpath to resolve all >>>> symlinks to a physical file, which will be . This means we only need one >>>> system call and >>>> don't need the extra complexity of calling readlink in a loop. >>>> >>>> The realpath system call also removes redunant slashes, so we can remove >>>> the code that did that manually. >>>> >>>> The possible downsides of this approach that I'm aware of are: >>>> >>>> - When /etc/localtime is a symlink to /invalid/Europe/London but that >>>> file doesn't exist. With the previous implementation we would have >>>> resolved that symlink to the zone "Europe/London" as long as that name >>>> is known to the current chrono::tzdb object. With this change, we >>>> won't get a valid zone name and current_zone() will fail. I'm not sure >>>> how realistic this case is. It might be plausible if libstdc++ is >>>> using the embedded static copy of tzdata.zi and there are no zoneinfo >>>> files on disk at all. In that case the system might still use >>>> /etc/localtime to name a zone, even though the symlink is dangling. >>>> Maybe we could fall back to filesystem::weakly_canonical for this >>>> case? >>> >>> I would be concerned about doing that silently, as this could lead to >>> non-local errors in the deployed environment are extremely hard to debug. >>> What I mean is the situation where tzdata should be mounted. >>> If the mounting fails, we will resolve the zone without issue, >> >> >> Will we? We might resolve a name, but it still needs to be looked up in tzdb >> to find a time_zone. If there's no tzdata that will fail (unless it's one of >> the zones that is always present like UTC and GMT). > > From my reading, we fail first to the embeded static tzdb data if opening > tzdb file fails: > tzdata_stream() : istream(nullptr) > { > if (string path = zoneinfo_file(tzdata_file); !path.empty()) > { > filebuf fbuf; > if (fbuf.open(path, std::ios::in)) > { > std::construct_at(&fb, std::move(fbuf)); > this->init(&fb); > return; > } > } > std::construct_at(&sb); > this->init(&sb); > } > We can successfully look up the zone if we know about it, even if we cannot > read the zone database.
Right - but that seems orthogonal to chrono::current_zone(), and I don't think it is an argument against using weakly_canonical to deal with dangling /etc/localtime symlinks. The semantics of current_zone are to decide on a name, then use locate_zone(str) to look that name up (and maybe retry if the lookup fails and we can try a different name). The presence or absence of a valid tzdb object affects the locate_zone(str) step but I don't think it should affect the process for deciding on a name. With the code on trunk today, we can still extract a name from a dangling symlink (but we can't extract a name from a chain of two or more symlinks). With my proposed patch we would be able to extract a name from a chain of symlinks, but would not be able to get anything from dangling symlinks. > Maybe I am missing something here. > >> >> >>> and then only >>> If the data requires updating, the update will fail. From that perspective >>> I much >>> more prefer loud failure. >> >> >> A dangling symlink to the Etc/GMT or UTC zone would always work OK even if >> there is no tzdata, and maybe that's the expected behaviour for some system. > > This system would be better served by having TZDB disabled, if that is the > consistent behavior. That requires a custom build of libstdc++.so whereas a minimal container could use the system libstdc++.so and remove all the tzdata files to save disk space. >> >> >>> >>> As the fact that this works is libstdc++ specific >> >> >> Which part? > > Not using the binary format. >> >> >>> (other libraries and clients >>> may use the content of the file), >> >> >> They should not do that on Linux. Maybe on other systems, and for ancient >> versions of Linux where /etc/localtime is a real file containing a copy of a >> Zif file. But there nothing we can do in that case, we can't resolve it to a >> chrino::time_zone object without comparing the file with every file under >> the zoneinfo directory. > > Why, if there use binary format, and they do not allow lookups for other > zones, you may simply load the content > of the file pointed out by current_zone. At least as far as I understand. > > For example in such case, the ancient version of linux could be supported, by > loading the content > of the /etc/localtime with time zone database, and storing it under distinct > name in tzdb. Yes, that's true. There's no guarantee that current_zone() returns one of the values in get_tzdb()->zones. And even if it does, it could be a special entry that is created by the /etc/localtime file and not one of the files under /usr/share/zoneinfo. But it wouldn't return anything useful for current_zone()->name() in that case, because there's no name in the binary TZif file. You would have to gTive it a fake name like "localtime". But even in implementations which use the binary TZif files (libc++ and HowardHinnant/date) they don't support /etc/localtime being a TZif file, as Mark de Wever said in https://github.com/llvm/llvm-project/issues/105634#issuecomment-2660976683 Libc++ and date both assume a hardcoded path such as /usr/share/zoneinfo and then simply remove that prefix from the target of /etc/localtime to get the zone name. Libc++ uses filesystem::read_symlink optionally followed by filesystem::canonical (I don't know why they don't just use filesystem::canonical in the first place) and then filesystem::relative to remove the prefix. Howard uses realpath, like my proposed patch, and then removes "*zoneinfo/" from the result, and so that also doesn't support dangling symlinks. I'll just add a comment noting that we don't support dangling symlinks, and then as you suggested we can postpone supporting them until somebody complains. > >>> I think this case (if appears) may be better >>> served by a dedicated enviroment variable. >>>> >>>> >>>> - When /etc/localtime is a symlink to /usr/share/zoneinfo/Foo/Bar where >>>> "Foo/Bar" is a valid zone in the chrono::tzdb object, but the Bar file >>>> is another symlink to ./Baz where "Foo/Bar" is also a valid zone. >>>> With the previous implementation current_zone() would have returned >>>> the "Foo/Bar" zone. With this change it would return "Foo/Baz". I >>>> don't think it's realistic to have two zones which are distinct zones >>>> (not a Zone and a Link to it) >>> >>> Yes, and the link name is not accessible to the program: zone->name >>> returns the target zone, so it is not observable if we located Asia/Istanbul >>> (that was symlink to Europe/Istanbul) or Asia/Istanbul. I think that is OK. >>>> >>>> but where one of them is defined on-disk >>>> using a symlink to the other. >>>> >>>> libstdc++-v3/ChangeLog: >>>> >>>> PR libstdc++/125467 >>> >>> This PR is created by you. Does it come from real user complain, or this >>> was found by you? >> >> >> It was reported directly to me by a friend, as this behaviour is preventing >> their codebase from replacing Howard's date library with C++20 chrono. >> >> >>> >>>> >>>> * src/c++20/tzdb.cc (tzdb::current_zone): Use realpath to >>>> resolve the /etc/localtime symlink instead of readlink. >>>> --- >>>> >>>> Tested x86_64-linux. >>>> >>>> Should we handle the dangling symlink case, maybe by using >>>> filesystem::weakly_canonical? >>> >>> I would wait to see if anybody is actually impacted by it in non-error case. >>> I would got with this with only small modification suggested bellow. >>>> >>>> >>>> >>>> >>>> libstdc++-v3/src/c++20/tzdb.cc | 65 +++++++++++----------------------- >>>> 1 file changed, 20 insertions(+), 45 deletions(-) >>>> >>>> diff --git a/libstdc++-v3/src/c++20/tzdb.cc >>>> b/libstdc++-v3/src/c++20/tzdb.cc >>>> index 0158659f79e1..acfc57f76437 100644 >>>> --- a/libstdc++-v3/src/c++20/tzdb.cc >>>> +++ b/libstdc++-v3/src/c++20/tzdb.cc >>>> @@ -41,8 +41,13 @@ >>>> # include <ext/concurrence.h> // __gnu_cxx::__mutex >>>> #endif >>>> >>>> -#if defined(_GLIBCXX_HAVE_READLINK) && defined(_GLIBCXX_HAVE_UNISTD_H) >>>> -# include <unistd.h> // readlink >>>> +#ifdef _GLIBCXX_HAVE_UNISTD_H >>>> +# include <unistd.h> // _XOPEN_VERSION >>>> +#endif >>>> +#if defined _GLIBCXX_USE_REALPATH && _XOPEN_VERSION >= 700 >>>> +# include <stdlib.h> // malloc, free, realpath >>>> +#else >>>> +# include <filesystem> // filesystem::canonicalize >>>> #endif >>>> >>>> #ifdef _AIX >>>> @@ -2094,58 +2099,28 @@ constinit tzdb_list::_Node::NumLeapSeconds >>>> tzdb_list::_Node::num_leap_seconds; >>>> // to have a way to force a re-read. >>>> >>>> #if !defined(_AIX) && !defined(_GLIBCXX_HAVE_WINDOWS_H) >>>> -#if defined(_GLIBCXX_HAVE_READLINK) && defined(_GLIBCXX_HAVE_UNISTD_H) >>>> - string_view str; >>>> - char buf[128]; // strlen("../usr/share/zoneinfo/...") is usually < 55 >>>> - string dynbuf; >>>> // /etc/localtime should be a symlink that ends with a zone name, >>>> // e.g. /etc/localtime -> /usr/share/zoneinfo/Europe/London >>>> // >>>> https://www.freedesktop.org/software/systemd/man/latest/localtime.html >>>> // This should work on GNU/Linux, macOS, NetBSD, and OpenBSD. >>>> // Some FreeBSD systems also use a symlink for /etc/localtime. >>>> - // Use readlink directly to avoid std::filesystem overhead. >>>> - if (auto n = ::readlink("/etc/localtime", buf, sizeof(buf)); n > 0) >>>> + >>>> +#if defined _GLIBCXX_USE_REALPATH && _XOPEN_VERSION >= 700 >>>> + unique_ptr<char[], void(*)(void*)> buf{ nullptr, &::free }; >>>> + string_view str; >>> >>> A slight stylistic note: I would prefer `str` to always be string_view >>> (substr will do >>> different thing otherwise, that may bite us), and declared before the #if. >> >> >> Will do. >> >>> >>>> >>>> + // Use realpath directly to avoid std::filesystem overhead. >>>> + // We use realpath not readlink to resolve multiple levels of >>>> symlinks. >>>> + if (char* p = ::realpath("/etc/localtime", nullptr)) >>>> { >>>> - if (static_cast<size_t>(n) < sizeof(buf)) >>>> - str = string_view(buf, n); >>>> - else [[unlikely]] >>>> - { >>>> - // We read the symlink but it didn't fit in buf[], use dynbuf. >>>> - do >>>> - { >>>> - n *= 2; >>>> - dynbuf.__resize_and_overwrite(n, [](char* p, size_t len) { >>>> - auto n2 = ::readlink("/etc/localtime", p, len); >>>> - if (n2 == -1) // symlink removed or replaced by file?! >>>> - __throw_runtime_error("tzdb: error reading >>>> /etc/localtime"); >>>> - const size_t r = n2; >>>> - return r < len ? r : 0; >>>> - }); >>>> - } >>>> - while (dynbuf.empty()); >>>> - str = dynbuf; >>>> - } >>>> + buf.reset(p); >>>> + str = p; >>>> } >>>> +#else >>>> + string str = std::filesystem::canonical("/etc/localtime").string(); >>> >>> And here we would have: >>> string buf = >>> std::filesystem::canonical("/etc/localtime").string(); >>> str = buf; >>>> >>>> +#endif >>>> >>>> if (!str.empty()) I'm going to also check str != "/etc/localtime" here, so that if it's not a symlink we don't bother trying to extract a name from it. >>>> { >>>> - // Remove any redundant slashes so we can match zone names. >>>> - // e.g. /usr/share/zoneinfo/Europe//London is a valid symlink, >>>> - // but won't match against "Europe/London". >>>> - if (auto pos = str.rfind("//"); pos != str.npos) [[unlikely]] >>>> - { >>>> - if (str.data() != dynbuf.data()) >>>> - dynbuf = str; >>>> - string::size_type spos = pos; >>>> - do >>>> - { >>>> - dynbuf.erase(spos, 1); >>>> - spos = dynbuf.rfind("//", spos); >>>> - } >>>> - while (spos != dynbuf.npos); >>>> - str = dynbuf; >>>> - } >>>> - >>>> // Check the trailing components of the path against known zone >>>> names. >>>> // Valid IANA times zones can have one, two, or three parts, e.g. >>>> // "UTC", "Europe/London", and "America/Indiana/Indianapolis". >>>> @@ -2171,7 +2146,7 @@ constinit tzdb_list::_Node::NumLeapSeconds >>>> tzdb_list::_Node::num_leap_seconds; >>>> str.substr(pos + 1))) >>>> return tz; >>>> } >>>> -#endif >>>> + >>>> // Otherwise, look for a file naming the time zone. >>>> string_view files[] { >>>> "/etc/timezone", // Debian derivates >>>> -- >>>> 2.54.0 >>>>
