On Thu, Oct 11, 2018 at 08:18:37PM -0300, Alexandre Oliva wrote:
> On Oct 11, 2018, Rich Felker <dal...@libc.org> wrote:
> 
> > This is indeed the big risk for glibc right now (with lazy,
> > non-fail-safe allocation of dynamic TLS)
> 
> Yeah, dynamic TLS was a can of worms in that regard even before lazy TLS
> relocations.
> 
> > that it's unlikely for vector-heavy code to be using TLS where the TLS
> > address load can't be hoisted out of the blocks where the
> > call-clobbered vector regs are in use. Generally, if such hoisting is
> > performed, the main/only advantage of avoiding clobbers is for
> > registers which may contain incoming arguments.
> 
> I see.  Well, the more registers are preserved, the better for the ideal
> fast path, but even if some are not, you're still better off than
> explicitly calling tls_get_addr...

Also, it seems gcc currently fails to do this hoisting properly on
x86_64, regardless of which TLS model is used. A bug report should be
coming soon.
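To illustrate what hoisting means here, a minimal sketch (this is not the
test case from the pending bug report, and the names tls_total, sum_naive
and sum_hoisted are made up for illustration). In the first function the
compiler has to move the TLS address computation out of the loop on its
own; the second does the same thing by hand:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread accumulator. In position-independent code for a
   shared library, each access may go through a TLSDESC call or
   __tls_get_addr. */
static _Thread_local long tls_total;

/* Naive form: the TLS variable is named inside the loop, so avoiding a
   repeated address lookup depends on the compiler hoisting it. */
long sum_naive(const long *v, size_t n)
{
    assert(v != NULL || n == 0);
    tls_total = 0;
    for (size_t i = 0; i < n; i++)
        tls_total += v[i];
    return tls_total;
}

/* Manually hoisted form: the address is resolved once, before the loop,
   and only a plain pointer is used inside it. */
long sum_hoisted(const long *v, size_t n)
{
    assert(v != NULL || n == 0);
    long *total = &tls_total;
    *total = 0;
    for (size_t i = 0; i < n; i++)
        *total += v[i];
    return *total;
}
```

Both functions compute the same result; the only intended difference is
where the TLS address lookup happens relative to the loop body.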

> > unless there is some future-proof approach to
> > save-all/restore-all that works on all archs with TLSDESC
> 
> Please don't single-out TLSDESC as if the problem affected it alone.
> Lazy relocations with traditional PLT entries for functions are also
> supposed to save and restore all registers, and the same issues arise,

Right, but lazy relocations are a "feature" you can easily just omit,
and we do. However, the only way to omit this path from TLSDESC is
installing the new TLS to all live threads at dlopen time. That's
actually not a bad idea -- it drops the compare/branch from the
dynamic tlsdesc code path, and likewise in __tls_get_addr, making both
forms of dynamic TLS (possibly considerably) faster. I'm just
concerned about whether it can be done without making thread
creation/exit significantly slower.
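A rough sketch of the two schemes, assuming a heavily simplified model in
which each thread has a fixed-size table of per-module TLS blocks. This is
not musl or glibc code: struct thread, tls_size, MAX_MODULES and all the
function names are hypothetical, and real implementations also have to
deal with generation counters, locking and alignment:

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_MODULES 8   /* hypothetical fixed bound for this sketch */

/* Hypothetical stand-in for a thread's dynamic thread vector (DTV). */
struct thread {
    void *dtv[MAX_MODULES];
};

/* Hypothetical per-module TLS block sizes, recorded at "dlopen" time. */
static size_t tls_size[MAX_MODULES];

/* Lazy scheme: every access pays a compare/branch, and the slow path
   allocates, so failure can only be handled by aborting. */
void *tls_addr_lazy(struct thread *t, size_t mod, size_t off)
{
    assert(mod < MAX_MODULES);
    if (!t->dtv[mod]) {                       /* the compare/branch */
        t->dtv[mod] = calloc(1, tls_size[mod]);
        if (!t->dtv[mod])
            abort();                          /* no caller to report to */
    }
    return (char *)t->dtv[mod] + off;
}

/* Eager scheme, step 1: at "dlopen" time, install the new module's block
   in every live thread. Failure is reported to the caller, after undoing
   the allocations made so far. */
int install_tls_eager(struct thread *threads, size_t nthreads,
                      size_t mod, size_t size)
{
    assert(mod < MAX_MODULES);
    tls_size[mod] = size;
    for (size_t i = 0; i < nthreads; i++) {
        threads[i].dtv[mod] = calloc(1, size);
        if (!threads[i].dtv[mod]) {
            while (i-- > 0) {
                free(threads[i].dtv[mod]);
                threads[i].dtv[mod] = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* Eager scheme, step 2: access is a plain load and add, with no branch. */
void *tls_addr_eager(struct thread *t, size_t mod, size_t off)
{
    return (char *)t->dtv[mod] + off;
}
```

In this model the cost moves out of the access path and into dlopen, and
a new thread would likewise have to allocate blocks for every loaded
module up front, which is where the thread creation/exit concern comes
from.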

> except they're a lot more common.  The only difference is that failures
> to preserve registers are less visible, because most of the time you're
> resolving them to functions that abide by the normal ABI, but once
> specialized calling conventions kick in, the very same issues arise.
> TLS descriptors are just one case of such specialized calling
> conventions.  Indeed, one of the reasons that made me decide this
> arrangement was acceptable was precisely because the problem already
> existed with preexisting lazy PLT resolution.

I see. From that perspective, it's less of a constraint than
constraints that already existed elsewhere. Unfortunately, from our
perspective in musl, those greater constraints don't exist, and the one
imposed by TLSDESC is the only one of its kind.

Rich
