On Mon, Jul 7, 2025 at 10:50 AM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Mon, Jul 7, 2025 at 10:39 AM Florian Weimer via Gcc <gcc@gcc.gnu.org> wrote:
> >
> > H.J. proposed to switch the default for GCC 16 (turning on
> > -mtls-dialect=gnu2 by default).  This is a bit tricky because when we
> > tried to make the switch in Fedora (for eventual implementation), we hit
> > an ABI compatibility problem:
> >
> >   _dl_tlsdesc_dynamic doesn't preserve all caller-saved registers
> >   <https://sourceware.org/bugzilla/show_bug.cgi?id=31372>
> >
> > This means that changing the defaults can have backwards compatibility
> > impact with older distributions.
> >
> > (a) Do nothing special and switch the default.  Maybe try to
> > backport the glibc fix to more release branches and distributions.  I
> > think we implicitly decided to follow this path when we decided this was
> > a glibc bug and not a GCC bug.  The downside is that missing the bug fix
> > can result in unexpected, difficult-to-diagnose behavior.  However, when
> > we rebuilt Fedora, the problem was exceedingly rare (we observed one
> > single failure, if I recall correctly).
> >
> > (b) Introduce binary markup to indicate that binaries may need the glibc
> > fix, and that glibc has the fix.
> >
> >   [PATCH] x86-64: Add GLIBC_ABI_GNU2_TLS [BZ #33129]
> >   <https://inbox.sourceware.org/libc-alpha/20250704205341.155335-1-hjl.to...@gmail.com/>
> >
> > This requires changes to all linkers, GCC and glibc.
> >
> > (c) Introduce a new relocation type with the same behavior as
> > R_X86_64_TLSDESC.  Unpatched glibc will not support it and error out
> > during relocation processing.  Requires linker changes, GCC and glibc
> > changes.  Does not produce a nice error message, unlike the
> > GLIBC_ABI_GNU2_TLS change.  Ideally would need package manager changes
> > to produce the right dependencies (with GLIBC_ABI_GNU2_TLS, this could
> > happen automatically).
> >
> > (d) Make the GCC default conditional on the glibc version used at GCC
> > build time.  Add __memcmpeq support to GCC 16.  Maybe add
> > errno@@GLIBC_2.43 to glibc 2.43.  Even today, it is likely that binaries
> > contain at least one symbol version reference to something that is
> > relatively recent, and the __memcmpeq and errno changes would increase
> > this effect.  Combined with the backport mentioned under (a), that could
> > be enough to force glibc upgrades in pretty much all cases.  We have
> > __libc_start_main@@GLIBC_2.34, so if the glibc backports go back to 2.34
> > (or even 2.31), only shared objects suffer from this issue.  Among the
> > Fedora binaries, the outliers without dependencies on recent glibc are
> > mostly Perl modules, and I expect the errno and __memcmpeq would cover
> > at least some of these.  This is not as clean as (b) and (c), but only
> > needs glibc and GCC changes (for __memcmpeq).  It does not achieve 100%
> > bug prevention, but given that bugs seem to be rare, this may be good
> > enough.
> >
> > (e) Skip over GNU2 TLS altogether and implement inline TLS sequences
> > (GNU3 descriptors?) that do not have the dlopen incompatibility of
> > initial-exec TLS.  This is currently vaporware.  It requires nontrivial
> > glibc changes, GCC changes, linker changes, and x86-64 psABI work to
> > define new relocation types and perhaps relaxations.  This is probably
> > what we want long-term.  User experience is similar to (c), but with
> > more implementation sequences.
> >
> > For comparison with an initial-exec TLS read,
> >
> >         movq    threadvar@gottpoff(%rip), %rax
> >         movl    %fs:(%rax), %eax
> >
> > this could look like this:
> >
> >         movl    threadvar@gottpslot, %eax
> >         movq    %fs:(%rax), %rax
> >         movl    threadvar@gottlsslotoff, %ecx
> >         movl    (%rcx, %rax), %eax
> >
> > Or with the descriptor in one word:
> >
> >         movq    threadvar@gottpslotoff, %rax
> >         movq    %rax, %rdx
> >         movq    %fs:(%eax), %rax
> >         shrq    $32, %rdx
> >         movl    (%rax, %rdx), %eax
> >
> > Or with a bit shorter instruction, using a 32-bit descriptor (which
> > still could cover at least 3 GiB of TLS data per thread):
> >
> >         movl    threadvar@gottpslotoff, %eax
> >         movzbl  %al, %edx
> >         shr     $8, %eax
> >         movq    %fs:64(%edx), %rdx
> >         mov     (%rdx, %rax), %eax
> >
> > And if we want a negative TLS slot index (which glibc would not use, and
> > I think it's incompatible with local-exec TLS anyway):
> >
> >         movq    threadvar@gottpslotoff, %rax
> >         movslq  %eax, %rdx
> >         shrq    $32, %rax
> >         movq    %fs:(%rdx), %rdx
> >         movl    (%rdx, %rax), %eax
> >
> > There might be other variant sequences.
> >
> > Implementing this on the glibc side would require fundamental changes to
> > the TLS allocator, which is why this isn't straightforward.
> >
> > (f) A less ambitious variant of (e): A new TLS descriptor callback that
> > returns the address of the TLS variable, and not the offset from the
> > thread pointer.  This is much easier to implement on the glibc side.
> > The current GNU2 TLS descriptor callback is optimized for static TLS
> > access.  We can avoid a memory access in the static TLS callback if we
> > use the RDFSBASE instruction (if glibc detects run-time support).  It's
> > a new relocation type, so this too needs GCC, linker, ABI changes.
> > However, these changes are largely mechanical (except perhaps for the
> > relaxation support).  Basically, TLS accesses would change from
> >
> >         leaq    threadvar@TLSDESC(%rip), %rax
> >         call    *threadvar@TLSCALL(%rax)
> >         movl    %fs:(%rax), %eax
> >
> > to:
> >
> >         leaq    threadvar@TLSDESC2(%rip), %rax
> >         call    *threadvar@TLSCALL2(%rax)
> >         movl    (%rax), %eax
> >
> > And the implementation of the static TLS case would change from
> >
> >         endbr64
> >         movq    8(%rax), %rax
> >         retq
> >
> > to:
> >
> >         endbr64
> >         rdfsbase %rax
> >         addq    %rsi, %rax
> >         retq
> >
> > But I don't think this detour is worth it if we eventually want to land
> > on (e).
> >
> >
> > I'm personally leaning towards (d) or (a) for GCC 16.  I dislike (b).
> > And (e) is unrealistic in the short term.
>
> I think (a) and (d) are both reasonable, though I am missing a
> configure time flag to override the changed default.  Even with
> glibc fixed we likely do not want to have this change in older
> enterprise code streams given there might be unknown external
> tooling that might be confused.

Oh, and what exactly is the advantage of GNU2 TLS descriptors?

>
> Richard.
>
> > Thanks,
> > Florian
> >

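For anyone who wants to look at the sequences discussed above on their own toolchain, here is a minimal sketch of a translation unit with a thread-local variable. The variable name matches the `threadvar` used in the quoted assembly; the function names are made up for illustration. Assuming an x86-64 GNU/Linux toolchain, compiling it with something like `gcc -O2 -fPIC -mtls-dialect=gnu2 -S` should show a TLS descriptor call similar to the one quoted under (f), while the same command without `-mtls-dialect=gnu2` should show the traditional dialect. The exact output depends on the GCC version and its configured defaults.

```c
/* Minimal sketch: a thread-local variable accessed from position-independent
   code.  The function names are hypothetical and exist only for this
   illustration. */

/* Not static, so that with -fPIC the compiler is expected to use a
   general-dynamic style TLS access rather than a local one. */
__thread int threadvar = 42;

/* Read the calling thread's copy of threadvar. */
__attribute__((noinline)) int read_threadvar(void)
{
    return threadvar;
}

/* Write the calling thread's copy of threadvar. */
__attribute__((noinline)) void write_threadvar(int value)
{
    threadvar = value;
}
```

The accessors are marked noinline only so that each TLS access stays visible as a separate function in the generated assembly.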