Hello,
On Mon, 10 Jul 2023, Alexander Monakov wrote:
> I think the main question is why you're going with this (weak) form
> instead of the (strong) form "may only clobber the low XMM regs":
I want to provide both. One of them allows more arbitrary function
definitions, the other allows more register (parts) to be preserved. I
feel both have their place.
> as Richi noted, surely for libcalls we'd like to know they preserve
> AVX-512 mask registers as well?
Yeah, mask registers. I'm still pondering this. We would need to split
the 8 maskregs into two parts. Hmm.
> Note this interacts with anything that interposes between the caller
> and the callee, like the Glibc lazy binding stub (which used to
> zero out high halves of 512-bit arguments in ZMM registers).
> Not an immediate problem for the patch, just something to mind perhaps.
Yeah, needs to be kept in mind indeed. Anything coming in between the
caller and a so-marked callee needs to preserve things.
> > I chose to make it possible to write function definitions with that
> > attribute with GCC adding the necessary callee save/restore code in
> > the xlogue itself.
>
> But you can't trivially restore if the callee is sibcalling — what
> happens then (a testcase might be nice)?
I hoped early on that the generic code that prohibits sibcalls between
call sites of too "different" ABIs would deal with this, and then forgot
to check. Turns out you had a good hunch here: it actually does a
sibcall, destroying the guarantees. Thanks! :)
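A testcase for this might have roughly the following shape. This is only
a sketch: NOSSECLOBBER is a placeholder macro that expands to nothing so
the file builds with any compiler, and the function names are made up for
illustration. With the patch applied the macro would presumably be
defined as __attribute__((nosseclobber)).

```c
#include <assert.h>

/* Placeholder (assumption): stands in for the proposed attribute.  */
#ifndef NOSSECLOBBER
#define NOSSECLOBBER
#endif

/* An ordinary function: under the default ABI it may clobber xmm8-15.  */
__attribute__((noinline)) double ordinary_callee(double x)
{
  return x * 2.0 + 1.0;
}

/* A so-marked function ending in a tail call.  If the compiler turns the
   call into a sibcall (a jump), no epilogue of marked_caller runs after
   ordinary_callee returns, so nothing can restore xmm8-15.  */
NOSSECLOBBER double marked_caller(double x)
{
  return ordinary_callee(x + 1.0);
}

/* Functional check only; it does not inspect register contents.  */
void check_sibcall_shape(void)
{
  assert(marked_caller(1.0) == 5.0);
}
```

The check above only confirms the code still computes the right value. A
real test would additionally have to look at the generated assembly, or
at the xmm8-15 contents around the call, to see whether the sibcall was
suppressed.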
> > Carefully note that this is only possible for the SSE2 registers, as
> > other parts of them would need instructions that are only optional.
>
> What is supposed to happen on 32-bit x86 with -msse -mno-sse2?
Hmm. I feel the best answer here is "that should error out". I'll add a
test and adjust the patch if necessary.
> > When a function doesn't contain calls to
> > unknown functions we can be a bit more lenient: we can make it so that
> > GCC simply doesn't touch xmm8-15 at all, then no save/restore is
> > necessary.
>
> What if the source code has a local register variable bound to xmm15,
> i.e. register double x asm("xmm15"); asm("..." : "+x"(x)); ?
Makes a good testcase as well. My take: it's acceptable with the
only-sse2-preserved attribute (xmm15 will in this case be saved/restored),
and should be an error with the everything-preserved attribute (maybe we
can make an exception as here we only specify an XMM reg, instead of
larger parts).
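Spelled out, such a testcase could look like the sketch below. Again
NOSSECLOBBER is a placeholder macro (empty here) for the proposed
attribute and touch_xmm15 is a made-up name. The register variable part
is guarded so the file still builds on targets other than x86-64.

```c
#include <assert.h>

/* Placeholder (assumption): stands in for the proposed attribute.  */
#ifndef NOSSECLOBBER
#define NOSSECLOBBER
#endif

/* Doubles its argument while forcing the value through xmm15.  With the
   only-sse2-preserved attribute, xmm15 would then have to be saved and
   restored by the prologue/epilogue.  */
NOSSECLOBBER double touch_xmm15(double in)
{
#if defined(__x86_64__) && defined(__GNUC__)
  /* Local register variable bound to xmm15, used as an asm operand.  */
  register double x __asm__("xmm15") = in;
  __asm__ ("addsd %0, %0" : "+x"(x));
  return x;
#else
  /* Fallback for other targets: same result without the register.  */
  return in + in;
#endif
}

void check_touch_xmm15(void)
{
  assert(touch_xmm15(1.5) == 3.0);
}
```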
> > To that end I introduce actually two related attributes (for naming
> > see below):
> > * nosseclobber: claims (and ensures) that xmm8-15 aren't clobbered
>
> This is the weak/active form; I'd suggest "preserve_high_sse".
But it preserves only the low parts :-) You swapped the two in your
mind when writing the reply?
> > I would welcome any comments, about the names, the approach, the attempt
> > at documenting the intricacies of these attributes and anything.
>
> I hope the new attributes are supposed to be usable with function
> pointers? From the code it looks that way, but the documentation doesn't
> promise that.
Yes, like all ABI-influencing attributes they _have_ to be part of the
function's type (and hence transfer to function pointers), with
appropriate incompatible-conversion errors and warnings at the
appropriate places. (I know that this isn't always the way we're dealing
with ABI-influencing attributes, and that we often refer to a decl only.
All those are actual bugs.) And yes, I will adjust the documentation to
be explicit about this.
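For illustration, usage through a function pointer might look like the
sketch below. NOSSECLOBBER is once more an empty placeholder macro for
the proposed attribute, and all type and function names are invented for
the example. The commented-out line marks the kind of conversion that
would be expected to get a diagnostic once the attribute is part of the
type.

```c
#include <assert.h>

/* Placeholder (assumption): stands in for the proposed attribute.  */
#ifndef NOSSECLOBBER
#define NOSSECLOBBER
#endif

/* Pointer-to-function type that carries the attribute.  */
typedef double (NOSSECLOBBER *marked_fn)(double);

NOSSECLOBBER double marked_half(double x) { return x * 0.5; }
double plain_half(double x) { return x * 0.5; }

/* The caller only sees the pointer type, so the ABI guarantee has to
   come from the type and not from a particular declaration.  */
double call_through(marked_fn f, double x) { return f(x); }

void check_pointer_use(void)
{
  marked_fn ok = marked_half;   /* same ABI on both sides */
  /* marked_fn bad = plain_half;   expected: incompatible-conversion
                                   diagnostic with the real attribute */
  assert(call_through(ok, 4.0) == 2.0);
}
```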
Ciao,
Michael.