[clang] [llvm] [SystemZ] Add support for half (fp16) (PR #109164)

Ulrich Weigand via cfe-commits Wed, 23 Oct 2024 11:45:23 -0700

uweigand wrote:


> My understanding is that in GCC's `__gnu_h2f_ieee`/`__gnu_f2h_ieee` is always 
> `i32`<->`i16` (integer ABI), then `__extendhfsf2`/`__truncsfhf2` uses either 
> `int16_t` or `_Float16` on a per-target basis as controlled by 
> `__LIBGCC_HAS_HF_MODE__` (I don't know where this gets set). In LLVM 
> compiler-rt, `COMPILER_RT_HAS_FLOAT16` is the control to do the same thing 
> but it affects `extend`/`trunc` as well as `h2f`/`f2h`. I think the 
> discrepancy works out here because if a target has `_Float16`, it will never 
> be calling `__gnu_h2f_ieee` `__gnu_f2h_ieee`.

From what I can see in the libgcc sources, `__gnu_h2f_ieee`/`__gnu_f2h_ieee` 
is indeed always `i32`<->`i16`, but it is only present on 32-bit ARM and on no 
other platform. On AArch64, GCC will always use inline instructions to perform 
the conversion. On 32-bit and 64-bit Intel, the compiler will use inline 
instructions if AVX512-FP16 is available; if not, but SSE2 is available, the 
compiler will use `__extendhfsf2`/`__truncsfhf2` with an `HFmode` argument 
(this corresponds to `_Float16`, i.e. it is passed in SSE2 registers, not like 
an integer); if not even SSE2 is available, using the type will result in an 
error.

I do not see `__extendhfsf2`/`__truncsfhf2` being used with `int16_t`, even in 
principle, on any platform in libgcc. There is indeed a setting 
`__LIBGCC_HAS_HF_MODE__` (controlled indirectly by the GCC target back-end's 
`TARGET_LIBGCC_FLOATING_POINT_MODE_SUPPORTED_P` setting), but the only thing 
that appears to be controlled by this flag is whether the routines for complex 
multiplication and division (`__mulhc3` / `__divhc3`) are built. Am I 
missing something here?

> From your first two sentences it sounds like `f16` is getting passed in a FP 
> register but going 
> FP->GPR->__gnu_h2f_ieee->FP->some_math_op->FP->__gnu_f2h_ieee->GPR->FP? I 
> think it makes sense to either always pass `f16` as `i16` and avoid the FP 
> registers, or make `_Float16` available so `COMPILER_RT_HAS_FLOAT16` can be 
> used.
> 
> @uweigand mentioned figuring out an ABI for `_Float16`, is this possible? 
> That seems like the best option.

Yes, we're working on that.  What we're planning to do is to have `_Float16` be 
passed and returned in the same way as `float` and `double`, i.e. using (part 
of) certain floating-point registers.  These registers are available on every 
SystemZ architecture level, so we would not have to guard their use (like Intel 
does with the SSE2 registers).
 
> A quick check seems to show that GCC 13 does not support `_Float16` on s390x, 
> nor does the crossbuild `libgcc.a` provide `__gnu_h2f_ieee`, 
> `__gnu_f2h_ieee`, `__extendhfsf2`, or `__truncsfhf2`. So I think LLVM will be 
> the one to set the precedent here.

Yes, we'd have to add those.  I don't think we want `__gnu_h2f_ieee` or 
`__gnu_f2h_ieee` as those are ARM-only.  We'd be defining and using 
`__extendhfsf2` and `__truncsfhf2`, which would be defined with `_Float16` 
arguments passed in floating-point registers.  Either way, we should define the 
same set of routines (with the same ABI) in libgcc and compiler-rt.
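
For illustration only, here is a minimal sketch in portable C of the value 
computation that an `__extendhfsf2`-style routine performs. The helper name 
`half_bits_to_float` is hypothetical, and it takes the raw binary16 bit pattern 
as a `uint16_t` purely so that it compiles without `_Float16` support; the 
routine discussed above would instead take a `_Float16` argument passed in a 
floating-point register. It is not the libgcc or compiler-rt implementation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a libgcc/compiler-rt symbol): widen an IEEE
 * binary16 bit pattern to a binary32 value. The conversion is exact, so
 * no rounding mode is involved. */
float half_bits_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16; /* move sign to bit 31 */
    uint32_t exp  = (h >> 10) & 0x1Fu;             /* 5-bit biased exponent */
    uint32_t frac = h & 0x3FFu;                    /* 10-bit fraction */
    uint32_t bits;

    if (exp == 0x1Fu) {
        /* Infinity or NaN: all-ones exponent, payload shifted into place. */
        bits = sign | 0x7F800000u | (frac << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127 (+112). */
        bits = sign | ((exp + 112u) << 23) | (frac << 13);
    } else if (frac == 0) {
        /* Signed zero. */
        bits = sign;
    } else {
        /* Subnormal: shift left until the implicit leading bit appears,
         * lowering the exponent by one for each shift. */
        uint32_t e = 113u; /* binary32 biased exponent for 2^-14 */
        while ((frac & 0x400u) == 0) {
            frac <<= 1;
            e--;
        }
        bits = sign | (e << 23) | ((frac & 0x3FFu) << 13);
    }

    float f;
    memcpy(&f, &bits, sizeof f); /* reinterpret the bits without aliasing UB */
    return f;
}
```

The integer signature here is only a portability convenience for the sketch; 
the point of the plan above is that the real routines would not use an integer 
ABI.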

> Note that there are some common issues with these conversions, would probably 
> be good to test against them if possible #97981 #97975.

Thanks for pointing this out!

https://github.com/llvm/llvm-project/pull/109164
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
