https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100638
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|FP16 vector compare missed  |FP16 (vector) compare
                   |optimization on AArch64     |missed optimization on
                   |                            |AArch64

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #0)
> However even the lowered operations are inefficient:
>
> ```
>         fcvt    s23, h23
>         fcmpe   s23, #0.0
> ```

Actually that comes from expand:
```
;; _16 = _15 < 0.0;

(insn 48 47 49 (set (reg:SF 194)
        (float_extend:SF (reg:HF 102 [ _15 ]))) "/app/example.c":8:16 -1
     (nil))

(insn 49 48 50 (set (reg:HF 196)
        (const_double:HF 0.0 [0x0.0p+0])) "/app/example.c":8:16 -1
     (nil))

(insn 50 49 51 (set (reg:SF 195)
        (float_extend:SF (reg:HF 196))) "/app/example.c":8:16 -1
     (nil))

(insn 51 50 52 (set (reg:CCFPE 66 cc)
        (compare:CCFPE (reg:SF 194)
            (reg:SF 195))) "/app/example.c":8:16 -1
     (nil))
```

Which can reproduce with just a simple:
```
void foo(_Float16 *x, unsigned short *out)
{
  *out = -(*x < 0.0f16);
}
```
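
For reference, a minimal sketch of what the reproducer computes, written so the stored value can be checked on hosts without `_Float16` as well. The names `half_t`, `foo_mask` and `mask_of` are hypothetical and not part of the report. The `__FLT16_MAX__` guard assumes that macro is predefined only where `_Float16` is available. This only shows the expected mask value (all ones for a negative input, zero otherwise). It says nothing about the code the AArch64 back end generates.

```c
/* Assumption: __FLT16_MAX__ is predefined only when the compiler
   supports _Float16; otherwise fall back to float so this still builds. */
#ifdef __FLT16_MAX__
typedef _Float16 half_t;
#else
typedef float half_t;
#endif

/* Same shape as the reproducer: compare against zero, then negate the
   0/1 result so that "true" becomes an all-ones 16-bit mask. */
void foo_mask(const half_t *x, unsigned short *out)
{
  *out = -(*x < (half_t)0.0f);
}

/* Hypothetical helper: run foo_mask on a single value and return the mask. */
unsigned short mask_of(float v)
{
  half_t h = (half_t)v;
  unsigned short m;
  foo_mask(&h, &m);
  return m;
}
```

With these definitions, `mask_of(-1.0f)` yields `0xFFFF`, while `mask_of(0.0f)` and `mask_of(2.5f)` yield `0`.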