On Wed Sep 10, 2025 at 2:41 AM EDT, LIU Hao wrote:
> On 2025-9-10 13:41, Trevor Gross wrote:
>> Just following up here, how would you like me to proceed LH?
>
> I agree that `_Float16` should be passed in a vector register, because the
> hardware expects it: in `vcvtph2ps xmm0, xmm1` and its inverse, both the
> destination and source operands are in SSE registers.
>
> However the commit message should probably be rephrased. This is purely a
> GNU extension, and the Microsoft documentation isn't helpful on this aspect.

Sounds good, I will adjust the message and resend.

>> Related question, do you have any thoughts for the f128 return ABI?
>> Currently it is both passed (effectively required) and returned on the
>> stack. However, i128 is returned in xmm0, so it would be reasonable for
>> f128 to be treated the same. This is what Clang does for both f128 and
>> i128 as of recently: pass on the stack and return in xmm0.
>>
>> I was planning to submit a patch to return f128 in xmm0, but do you have
>> any feedback before I do so?
>
> Unlike the above, I think both of these should be passed in memory.
>
> We take `int quadmath_snprintf (char *s, size_t size, const char *format,
> ...)` as reference. In this code:
>
>    quadmath_snprintf(buf, sizeof buf, "%Qg", some_f128);
>
> Arguments are passed as follows:
>
>    buf         => RCX
>    sizeof buf  => RDX
>    "%Qg"       => R8
>    some_f128   => ???
>
> In the prologue of a variadic function, all arguments which correspond to
> the ellipsis are stored into their home slots, so they have consecutive
> addresses, to simplify `va_arg()`. As the callee can't know the types and
> sizes of incoming arguments, the ABI says that floating-point arguments
> must also be passed in integer registers.
>
> In this case if we passed f128 in XMM3, then there would be no space if it
> should be stored into its 8-byte home slot, and the value couldn't be
> passed in an integer register at all.

No objection here; I agree that passing in SSE would not be correct, for the
variadic reasons listed, as well as matching what Microsoft requires for
__m128. So GCC (and Clang) is already handling the argument side correctly,
for both i128 and f128.

> When they are returned from a function, this also makes some sense if we
> consider them as user-defined structs.

Do you think it is worth changing i128 to return on the stack rather than in
xmm0? I agree that this seems more consistent since no other scalar integers
are passed/returned in vector registers.
Seems like the current ABI probably got carried over from __m128. For f128 I
think returning in xmm0 still makes sense, because other float types return
in xmm0. (Returning i128 indirectly and f128 in xmm0 would also match the
SysV return ABI.)

> I don't think passing them in SSE registers is correct, but I suggest you
> ask clang maintainers for certain.

I have brought this up on their discord (under #windows).

- Trevor
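
As an illustration of the variadic point quoted above, here is a minimal
sketch in portable C. It is not taken from GCC, Clang, or libquadmath, and
`sum_doubles` is a hypothetical name. The callee walks its ellipsis arguments
one at a time and learns their types only from its own `va_arg()` calls,
which is why the Windows x64 ABI wants every variadic argument to fit a
uniform 8-byte home slot:

```c
#include <stdarg.h>

/* Hypothetical helper: sums `count` doubles passed through the ellipsis.
 * Nothing in the call tells the callee what was passed; it simply assumes
 * each variadic argument is a double and reads them in order. */
static double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* Each va_arg() steps to the next argument. On Windows x64 that
         * is the next 8-byte home slot, whatever the argument's type. */
        total += va_arg(ap, double);
    }
    va_end(ap);

    return total;
}
```

Calling `sum_doubles(3, 1.5, 2.0, 0.25)` returns 3.75. A 16-byte f128 cannot
occupy one such 8-byte slot, which is consistent with passing it in memory.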
