Proposed CHOST change for the 64bit time_t transition
Dear all,

in Gentoo Linux we want to change our CHOST triplets for 32-bit glibc systems that use 64-bit time_t, since this is technically an ABI change which breaks binary compatibility [1].

We are thinking of adding a "t64" suffix to the ABI field, resulting in for example i686-pc-linux-gnut64, armv7a-unknown-linux-gnueabihft64, ... [2]

* So far my research indicates that in the GNU toolchain (gcc, glibc, binutils) anything behind -gnu is ignored (as ABI version, which this effectively is too). Is this correct or do you foresee problems here?
  I've had a small chroot rebuild itself (the Gentoo @system set) with i686-pc-linux-gnut64 and only had to add a minor patch to ncurses [3]; everything else worked fine.

* clang at the moment expects one of a list of known suffixes (e.g. *-gnu, *-gnueabi, *-gnueabihf). Could this be fixed to be similarly permissive?

* I could imagine glibc defaulting to the 64bit interface or hard-enabling it if t64 is present in the ABI field. That would certainly help to enforce binary consistency. We would need then either an automated mechanism based on CHOST or a glibc configure option to hard-enable 64bit time_t support [4]. Not hard-required by Gentoo, we can just force the defines into everything, but would-be-neat.

* In an ideal world this change would be synchronized across distributions. Opinions? [5]

Deliberately pushing this e-mail out now so maybe it can be discussed at the cauldron. I won't be there, but Sam James and Arsen Arsenovic will be.

If this proposal fails, the alternative for us is to add a _t64 suffix to the vendor field, resulting in e.g. i686-pc_t64-linux-gnu. The vendor field is pretty much ignored everywhere, so going alone is safe. Still, that's then a purely Gentoo ugly hack...

Cheers,
Andreas

[1] The ABI of glibc does technically NOT change, however, the type definition of, e.g., time_t does.
    And as soon as any other library includes that in its public interfaces and data structures, that library changes its ABI.
    An example for an affected library (found real-world during testing) is gnutls, see
    https://bugs.gentoo.org/828001

[2] We've brought up this issue previously, just somehow it never caught momentum. See, e.g.,
    * https://sourceware.org/pipermail/libc-alpha/2022-November/143386.html
    * https://sourceware.org/pipermail/libc-alpha/2023-January/144963.html
    A more detailed discussion of different possible approaches in Gentoo can be found on a wiki page maintained by Sam,
    https://wiki.gentoo.org/wiki/Project:Toolchain/time64_migration
    Discussions within Gentoo have led to the conclusion that a new CHOST makes most sense, with the old one staying at 32bit time_t for legacy binary support as deprecated option.

[3] https://bpa.st/HV6BS

[4] https://sourceware.org/pipermail/libc-alpha/2022-November/143386.html

[5] Note that this entire issue / proposal only affects 32bit architectures and distributions.
    For Gentoo this would be ix86, arm(32), hppa, mips(32), m68k, ppc(32).
    riscv32 is special since from beginning it only has the 64bit time_t interface.

-- 
Andreas K. Hüttel
dilfri...@gentoo.org
Gentoo Linux developer
(council, comrel, toolchain, base-system, perl, libreoffice)
https://wiki.gentoo.org/wiki/User:Dilfridge
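Footnote [1] above can be made concrete with a small sketch. It is not taken from gnutls or any real library: the struct and function names are hypothetical, and int32_t / int64_t merely stand in for the two widths of time_t (on glibc the wide variant is selected with -D_TIME_BITS=64, which also requires -D_FILE_OFFSET_BITS=64). The point is only that a public struct containing a time stamp gets a different layout, so callers and library built with different settings disagree about where the following members live.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical public struct of some library, as laid out for a caller
// built with 32-bit time_t (int32_t stands in for time_t here).
struct session_time32 {
    std::int32_t expiry;
    std::int32_t flags;
};

// The same declaration as laid out for a caller built with 64-bit time_t.
struct session_time64 {
    std::int64_t expiry;
    std::int32_t flags;
};

// Offset of the member that follows the time stamp in each layout.
constexpr std::size_t flags_offset_time32() { return offsetof(session_time32, flags); }
constexpr std::size_t flags_offset_time64() { return offsetof(session_time64, flags); }

// A signed 32-bit time_t cannot represent anything after
// 2038-01-19T03:14:07Z (INT32_MAX seconds since the epoch).
constexpr bool fits_in_time32(std::int64_t seconds_since_epoch) {
    return seconds_since_epoch >= INT32_MIN && seconds_since_epoch <= INT32_MAX;
}

// The two layouts place `flags` at different offsets: an ABI break for
// every library that exposes such a struct, even though glibc itself
// keeps both sets of symbols.
static_assert(flags_offset_time32() != flags_offset_time64(),
              "struct layout depends on the width of the time stamp");
```

A library built against one layout and a program built against the other would read `flags` from the wrong offset, which is the kind of breakage a distinct CHOST is meant to keep apart.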
Re: value-range.cc:2165: ICE in invert
On 03.09.24 at 18:12, Andrew MacLeod wrote:
On 8/25/24 03:48, Richard Biener wrote:
On Sat, Aug 24, 2024 at 6:19 PM Georg-Johann Lay wrote:

Trying to use the value-range interface and functions I am running into that ICE when using invert(). From what the sources suggest, invert() computes the complement of the current set (the union of finitely many intervals).

For example, when I have a range of [-128, -1] for int8_t, invert() runs into that ICE because that interval is undefined or varying.

Take for example, this code that wants to take the complement of the complete interval (with respect to the complete interval given by uint8_t):

    tree t = unsigned_intQI_type_node;
    int n_bits = TYPE_PRECISION (t);
    int_range<2, false /*resizable*/ > j(t);

This creates j as up to 2 subranges, and initializes it to VARYING... the maximal range

    // j is undefined(?), thus set its bounds to the entire range.
    j.set (t, wi::to_wide (j.lbound()), wi::to_wide (j.ubound()));

If j were UNDEFINED, j.lbound() and j.ubound () would trap because UNDEFINED has no known bounds. It is the empty set. As it was initialized to VARYING, this j.set statement is a no-op because you are setting to the same lower and upper bounds.

    j.invert();

    : internal compiler error: in invert, at value-range.cc:2165

And you cannot use invert () on UNDEFINED or VARYING values, for reasons I will explain below.

This should just return the empty set; no? And vice versa, the complement of the empty set (with respect to the whole interval) should be the whole interval provided by t, no?

If I understand correctly, the int_range<> implementation does not implement a semi-group, and there are sets / intervals that are not allowed, like the empty set or intervals that touch a boundary.

Does this mean that I have to test and special-case all these cases and do them by hand? Are there more invalid / disallowed intervals?
A lot of functions do not allow undefined_p () ranges, but it's odd that invert doesn't handle varying_p (). But yes, the range API (unfortunately) requires you to check for certain exceptional cases beforehand even though they are not really exceptional for range arithmetic in general.

So I guess it is simpler when I write my own interval arithmetic that handles these cases and behaves like a semi group. As there will be at most 3 sub-intervals, that's the way to go...?

Isn't there a knob to tell that complement is with respect to the range as provided by type(), and not w.r.t. the integers?

Andrew would have to answer that but I think that's how ranges behave. They just have those API requirements that are sometimes annoying.

Richard.

Special casing of UNDEFINED dates back to concessions that were made when irange was introduced. UNDEFINED does not have a type, and therefore, we have no way to invert it because we do not know what type the resulting VARYING should be. That's the practical answer.

UNDEFINED can also be interpreted differently in different contexts. From an arithmetic point of view, it is the opposite of VARYING. How we interpret it when we use it varies on a case by case basis. For instance, an uninitialized value is treated as UNDEFINED for some purposes, but VARYING for others. An outgoing edge from a conditional that results in UNDEFINED points to unreachable code, but if that code remains in the IL, we have to interpret it sometimes as VARYING and produce results. There are also times when we take UNDEFINED and the optimizer chooses a convenient value for it. Therefore it is safer for the caller to handle manipulations of UNDEFINED; then we get no surprises.

As for VARYING, we make the caller handle that case for invert () because by definition:

    range.invert ().invert ()

should result in no change to 'range'.
If range was VARYING, producing UNDEFINED would then cause the second call to invert() to fail because there is no type associated with range any more. In addition, given the lack of clarity on how the consumer will treat the resulting UNDEFINED, it was not clear that it would always be the expected value.

There are not many uses of invert() in our code, and all of them are in well defined circumstances that are neither VARYING nor UNDEFINED.

If your code consistently handles things this way, you can create your own functional definition of invert that also takes the type:

    void my_invert (irange &r, tree t)
    {
      if (r.undefined_p ())
        r.set_varying (t);
      else if (r.varying_p ())
        r.set_undefined ();
      else
        r.invert ();
    }

invert() and all other range operations are always relative to the values of range.type (). IIRC there was a lot of pain related to enums and bitranges for this, but for normal types that have well defined endpoints, there shouldn't be any other surprises. There are no knobs.

Many of the other routines do not need special cases for UNDEFINED (like union_ () and intersect ()),
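The semantics that my_invert asks for (complement relative to the type's range, with the empty set and the full set mapping to each other) can be illustrated outside of GCC with a toy model. The following is a minimal sketch, not the irange API: RangeSet, complement, kMin and kMax are names made up for this illustration, and the bounds of an 8-bit unsigned type are hard-coded. That hard-coding is precisely what sidesteps the problem described above: the model's empty set still "knows" its type, whereas GCC's UNDEFINED does not.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy model, NOT GCC's irange: a sorted list of disjoint, non-adjacent
// closed sub-intervals of [kMin, kMax].  The empty list plays the role
// of UNDEFINED, the single interval [kMin, kMax] the role of VARYING.
using Interval = std::pair<int, int>;
using RangeSet = std::vector<Interval>;

constexpr int kMin = 0;    // assumed bounds of an 8-bit unsigned type
constexpr int kMax = 255;

// Complement of `r` with respect to [kMin, kMax].
RangeSet complement(const RangeSet &r) {
    RangeSet out;
    int next = kMin;                    // lowest value not yet accounted for
    for (const Interval &iv : r) {
        if (iv.first > next)            // gap before this sub-interval
            out.emplace_back(next, iv.first - 1);
        next = iv.second + 1;
    }
    if (next <= kMax)                   // gap after the last sub-interval
        out.emplace_back(next, kMax);
    return out;
}
```

In this model, complementing twice always gives back the original set, including for the empty and the full set, and intervals touching a boundary need no special casing. A single interval that touches neither boundary complements to two sub-intervals.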
PIC-text, ABS-data code generation
I've inherited a RISC-V port, where we need to generate code such that text references are PIC, but data references are absolute.

1) Does such a mode have a name? It seems very eXecuteInPlace-like. It'd be a shame to make up a new name for it.

2) The target is bare metal, no VM, no shared objects. The loader places code at an arbitrary address, but static data is at a fixed location. The code is self-contained, so no references to external libraries (say).

3) I'm pretty sure no GOT is needed in this mode -- everything can be resolved by the static linker (the exception being if there are indeed calls to external locations).

4) The only dynamic relocs needed will be for static data that's statically initialized to a function pointer -- or to other text-section resident objects, such as .rodata (maybe).

Am I missing something?

nathan

-- 
Nathan Sidwell
Re: Proposed CHOST change for the 64bit time_t transition
On Wed, Sep 04, 2024 at 17:48:04 +0200, Andreas K. Huettel wrote:
> Dear all,
>
> in Gentoo Linux we want to change our CHOST triplets for 32-bit glibc
> systems that use 64-bit time_t, since this is technically an ABI change
> which breaks binary compatibility [1].
>
> We are thinking of adding a "t64" suffix to the ABI field, resulting in
> for example i686-pc-linux-gnut64, armv7a-unknown-linux-gnueabihft64, ... [2]
>
> * So far my research indicates that in the GNU toolchain (gcc, glibc,
>   binutils) anything behind -gnu is ignored (as ABI version, which this
>   effectively is too). Is this correct or do you foresee problems here?
>   I've had a small chroot rebuild itself (the Gentoo @system set) with
>   i686-pc-linux-gnut64 and only had to add a minor patch to ncurses [3];
>   everything else worked fine.
> ...
> [3] https://bpa.st/HV6BS

FYI, this paste expires in 1 week. Here's the contents:

diff '--color=auto' -ruN ncurses-6.4.orig/aclocal.m4 ncurses-6.4/aclocal.m4
--- ncurses-6.4.orig/aclocal.m4 2024-08-29 20:47:34.978057133 +
+++ ncurses-6.4/aclocal.m4 2024-08-29 20:48:57.809473044 +
@@ -10139,7 +10139,7 @@
 	cf_xopen_source="-D_SGI_SOURCE"
 	cf_XOPEN_SOURCE=
 	;;
-(linux*gnu|linux*gnuabi64|linux*gnuabin32|linux*gnueabi|linux*gnueabihf|linux*gnux32|uclinux*|gnu*|mint*|k*bsd*-gnu|cygwin|msys|mingw*|linux*uclibc)
+(linux*gnu*|uclinux*|gnu*|mint*|k*bsd*-gnu|cygwin|msys|mingw*|linux*uclibc)
 	CF_GNU_SOURCE($cf_XOPEN_SOURCE)
 	;;
 (minix*)
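Why the old pattern list misses the new triplets while the patched one catches them can be checked with POSIX fnmatch(3), which with no flags applies the same wildcard rules as a shell `case` arm. This is only a sketch: glob_matches is a helper made up for the illustration, and it assumes that the ncurses macro matches the patterns against the OS part of the triplet (for example linux-gnut64).

```cpp
#include <cassert>
#include <fnmatch.h>
#include <string>

// Returns true if `host_os` matches the shell-style glob `pattern`.
// fnmatch() with flags == 0 lets `*` match any run of characters,
// as in a shell `case` pattern.
bool glob_matches(const std::string &pattern, const std::string &host_os) {
    return fnmatch(pattern.c_str(), host_os.c_str(), 0) == 0;
}
```

With this helper, "linux*gnu" and "linux*gnueabihf" do not match "linux-gnut64" or "linux-gnueabihft64", because nothing in those patterns can absorb the trailing "t64". The single replacement pattern "linux*gnu*" matches both, and still matches the plain "linux-gnu" and "linux-gnueabihf" cases.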
Re: Proposed CHOST change for the 64bit time_t transition
On 2024-09-04 17:48 +0200, Andreas K. Huettel wrote:
> Dear all,
>
> in Gentoo Linux we want to change our CHOST triplets for 32-bit glibc
> systems that use 64-bit time_t, since this is technically an ABI change
> which breaks binary compatibility [1].

> * In an ideal world this change would be synchronized across distributions.
>   Opinions? [5]

Debian considered this issue over the last 18 months/2 years, and found very little enthusiasm for making new triplets. Every distro that is using 64-bit time (on 32-bit arches) just enabled the flags and changed the ABI without setting a new arch/triplet (or they have dropped 32-bit stuff entirely so sidestepped the issue).

Given this, and because users would like to just be upgraded to 64-bit time, not have to install a new architecture, Debian and Ubuntu decided not to try and push a new triplet and do a library-name ABI update within the architecture(s). That went ahead between March and June this year and is now pretty-much done, modulo a few outstanding bugs
(https://bugs.debian.org/cgi-bin/pkgreport.cgi?users=debian-...@lists.debian.org;tag=time-t).

Debian's thinking and status is summarised here:
https://wiki.debian.org/ReleaseGoals/64bit-time

So it's interesting that in fact Gentoo _does_ want to do this, but it seems to me that this is now a done deal, and 'everyone' has already switched within the existing triplets, even Debian, which is the hardest place to do this because it involved 1100 library transitions, with another 3500-odd rebuilds.

> [1] The ABI of glibc does technically NOT change, however, the type
>     definition of, e.g., time_t does.
>     And as soon as any other library includes that in its public
>     interfaces and data structures, that library changes its ABI.
>     An example for an affected library (found real-world during testing)
>     is gnutls, see
>     https://bugs.gentoo.org/828001

Yes.
We did a big ABI analysis to find out how many libraries were affected (including LFS which glibc ties into this transition) and it was about 700. There were quite a few more where the automated ABI tools failed and it was easier to do a transition than work out why, so we ended up transitioning 1093 source packages
(https://people.canonical.com/~vorlon/armhf-time_t/source-packages).

So yes it's an ABI change, but we don't always change the triplet for this, sometimes we just move the baseline along. This happened in the past for glibc 5->6 and libstdc++ v4 to v5 and the long-double redefinition in s390, alpha, sparc, powerpc. In fact, considering the whole-distro collective ABI, this happens every time there is an ABI change in a library. The arch/triplet remains the same, but the new release has a different ABI in some number of libraries and dependencies.

So it's always a choice whether the triplet should change or it is just treated as an update, and usually the latter is chosen. This was a borderline case that could have gone either way, but people decided to do it as a transition, not a new triplet.

Are you sure Gentoo will gain enough value from going it alone here and defining a new triplet? Because every other distro will have (has already got in fact) the t64 ABI with the old triplet.

> [2] We've brought up this issue previously, just somehow it never caught
>     momentum. See, e.g.,
>     * https://sourceware.org/pipermail/libc-alpha/2022-November/143386.html
>     * https://sourceware.org/pipermail/libc-alpha/2023-January/144963.html
>     A more detailed discussion of different possible approaches in Gentoo
>     can be found on a wiki page maintained by Sam,
>     https://wiki.gentoo.org/wiki/Project:Toolchain/time64_migration
>     Discussions within Gentoo have led to the conclusion that a new CHOST
>     makes most sense, with the old one staying at 32bit time_t for legacy
>     binary support as deprecated option.
>
> [3] https://bpa.st/HV6BS
>
> [4] https://sourceware.org/pipermail/libc-alpha/2022-November/143386.html
>
> [5] Note that this entire issue / proposal only affects 32bit
>     architectures and distributions.
>     For Gentoo this would be ix86, arm(32), hppa, mips(32), m68k, ppc(32).
>     riscv32 is special since from beginning it only has the 64bit time_t
>     interface.
>
> --
> Andreas K. Hüttel
> dilfri...@gentoo.org
> Gentoo Linux developer
> (council, comrel, toolchain, base-system, perl, libreoffice)
> https://wiki.gentoo.org/wiki/User:Dilfridge

Wookey
-- 
Principal hats: Debian, Wookware, ARM
http://wookware.org/
Re: Proposed CHOST change for the 64bit time_t transition
On Wed, Sep 4, 2024 at 7:07 PM Wookey wrote:
>
> On 2024-09-04 17:48 +0200, Andreas K. Huettel wrote:
> > Dear all,
> >
> > in Gentoo Linux we want to change our CHOST triplets for 32-bit glibc
> > systems that use 64-bit time_t, since this is technically an ABI change
> > which breaks binary compatibility [1].
>
> > * In an ideal world this change would be synchronized across
> >   distributions. Opinions? [5]
>
> Debian considered this issue over the last 18 months/2 years, and
> found very little enthusiasm for making new triplets. Every distro
> that is using 64-bit time (on 32-bit arches) just enabled the flags
> and changed the ABI without setting a new arch/triplet (or they have
> dropped 32-bit stuff entirely so sidestepped the issue).
>
> Given this, and because users would like to just be upgraded to 64-bit
> time, not have to install a new architecture, Debian and Ubuntu
> decided not to try and push a new triplet and do a library-name ABI
> update within the architecture(s). That went ahead between March and
> June this year and is now pretty-much done, modulo a few outstanding
> bugs
> (https://bugs.debian.org/cgi-bin/pkgreport.cgi?users=debian-...@lists.debian.org;tag=time-t).
>
> Debian's thinking and status is summarised here:
> https://wiki.debian.org/ReleaseGoals/64bit-time
>

FWIW, yocto/openembedded have also done the same and offered a distro setting to the users to select 32bit time_t if they wished to but defaulted to 64bit time_t.

> So it's interesting that in fact Gentoo _does_ want to do this, but it
> seems to me that this is now a done deal, and 'everyone' has already
> switched within the existing triplets, even Debian, which is the
> hardest place to do this because it involved 1100 library transitions,
> with another 3500-odd rebuilds.
>
> > [1] The ABI of glibc does technically NOT change, however, the type
> >     definition of, e.g., time_t does.