On Wed Sep 10, 2025 at 2:41 AM EDT, LIU Hao wrote:
> On 2025-9-10 13:41, Trevor Gross wrote:
>> Just following up here, how would you like me to proceed LH?
>> 
>
> I agree that `_Float16` should be passed in a vector register, because
> the hardware expects it: in `vcvtph2ps xmm0, xmm1` and its inverse, both
> the destination and source operands are in SSE registers.
>
> However, the commit message should probably be rephrased. This is purely
> a GNU extension, and the Microsoft documentation isn't helpful on this
> aspect.

Sounds good, I will adjust the message and resend. 

>> Related question, do you have any thoughts for the f128 return ABI?
>> Currently it is both passed (effectively required) and returned on the
>> stack. However, i128 is returned in xmm0, so it would be reasonable for
>> f128 to be treated the same. This is what Clang does for both f128 and
>> i128 as of recently, pass on the stack and return in xmm0.
>> 
>> I was planning to submit a patch to return f128 in xmm0, but do you have
>> any feedback before I do so?
> Unlike the above, I think both of these should be passed in memory.
>
> We take `int quadmath_snprintf (char *s, size_t size, const char *format,
> ...)` as reference. In this code:
>
>     quadmath_snprintf(buf, sizeof buf, "%Qg", some_f128);
>
> Arguments are passed as follows:
>
>     buf          => RCX
>     sizeof buf   => RDX
>     "%Qg"        => R8
>     some_f128    => ???
>
> In the prologue of a variadic function, all arguments which correspond
> to the ellipsis are stored into their home slots, so they have
> consecutive addresses, to simplify `va_arg()`. As the callee can't know
> the types and sizes of incoming arguments, the ABI says that
> floating-point arguments must also be passed in integer registers.
>
> In this case, if we passed f128 in XMM3, there would be no room to store
> it into its 8-byte home slot, and the value couldn't be passed in an
> integer register at all.

No objection here; I agree that passing in SSE would not be correct, for
the variadic reasons listed, as well as matching what Microsoft requires
for __m128. So GCC (and Clang) is already handling the argument side
correctly, for both i128 and f128.
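
To make the home-slot constraint concrete, here is a rough C sketch. It
is only a model, not compiler or CRT code, and every name in it
(`value128`, `HOME_SLOT_SIZE`, `fits_in_home_slot`, and the `slot_*`
helpers) is made up for illustration. It assumes the documented Windows
x64 rule that only arguments of 1, 2, 4, or 8 bytes go by value and
everything else goes by reference:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Each argument gets one 8-byte home slot (hypothetical constant name). */
enum { HOME_SLOT_SIZE = 8 };

/* Stand-in for a 16-byte scalar such as f128 or i128. */
typedef struct { unsigned char bytes[16]; } value128;

/* Only 1-, 2-, 4-, and 8-byte arguments are passed by value. */
static int fits_in_home_slot(size_t size) {
    return size == 1 || size == 2 || size == 4 || size == 8;
}

/* A double fits: its bits are copied into the slot, which is why a
 * variadic callee can spill it from an integer register. */
static uint64_t slot_from_double(double d) {
    uint64_t slot;
    assert(sizeof d == (size_t)HOME_SLOT_SIZE);
    memcpy(&slot, &d, sizeof slot);
    return slot;
}

static double double_from_slot(uint64_t slot) {
    double d;
    memcpy(&d, &slot, sizeof d);
    return d;
}

/* A 16-byte value does not fit, so the slot holds its address instead. */
static uint64_t slot_from_value128(const value128 *v) {
    return (uint64_t)(uintptr_t)v;
}

static const value128 *value128_from_slot(uint64_t slot) {
    return (const value128 *)(uintptr_t)slot;
}
```

In this model the `some_f128` argument from the `quadmath_snprintf`
example would occupy the fourth slot as a pointer, which is the
in-memory passing described above.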

> When they are returned from a function, this also makes some sense if we
> consider them as user-defined structs.

Do you think it is worth changing i128 to return on the stack rather
than in xmm0? I agree that this seems more consistent since no other
scalar integers are passed/returned in vector registers. Seems like the
current ABI probably got carried over from __m128.

For f128 I think returning in xmm0 still makes sense, because other
float types return in xmm0.

(Returning i128 indirectly and f128 in xmm0 would also match the SysV
return ABI.)

> I don't think passing them in SSE registers is correct, but I suggest
> you ask the Clang maintainers to be certain.

I have brought this up on their Discord (under #windows).

- Trevor
