https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100638
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|FP16 vector compare missed  |FP16 (vector) compare
                   |optimization on AArch64     |missed optimization on
                   |                            |AArch64
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #0)
> However even the lowered operations are inefficient:
>
> ```
> fcvt s23, h23
> fcmpe s23, #0.0
> ```
Actually that comes from expand:
```
;; _16 = _15 < 0.0;
(insn 48 47 49 (set (reg:SF 194)
        (float_extend:SF (reg:HF 102 [ _15 ]))) "/app/example.c":8:16 -1
     (nil))
(insn 49 48 50 (set (reg:HF 196)
        (const_double:HF 0.0 [0x0.0p+0])) "/app/example.c":8:16 -1
     (nil))
(insn 50 49 51 (set (reg:SF 195)
        (float_extend:SF (reg:HF 196))) "/app/example.c":8:16 -1
     (nil))
(insn 51 50 52 (set (reg:CCFPE 66 cc)
        (compare:CCFPE (reg:SF 194)
            (reg:SF 195))) "/app/example.c":8:16 -1
     (nil))
```
This can be reproduced with just a simple test case:
```
void foo(_Float16 *x, unsigned short *out) {
    *out = -(*x < 0.0f16);
}
```
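For illustration only, here is a rough source-level picture of what the expanded RTL above is doing, next to the direct form from the test case. This is a sketch, not what GCC does internally: the names `half_t`, `mask_direct` and `mask_promoted` are made up for this example, and the `__FLT16_MAX__` check (with a `float` fallback) is an assumption so that it still builds where `_Float16` is unavailable.

```c
#include <stdint.h>

/* Hypothetical helper type: use _Float16 where the compiler provides it,
   otherwise fall back to float so the sketch still compiles.
   Assumption: __FLT16_MAX__ being defined signals _Float16 support. */
#ifdef __FLT16_MAX__
typedef _Float16 half_t;
#else
typedef float half_t;
#endif

/* The reduced test case, taking the value directly:
   all-ones mask when x < 0, zero otherwise. */
static uint16_t mask_direct(half_t x) {
    return (uint16_t)-(x < (half_t)0.0f);
}

/* Source-level analogue of the expanded RTL: both operands are widened
   to float (insns 48 and 50) and only then compared (insn 51). */
static uint16_t mask_promoted(half_t x) {
    float xs = (float)x;              /* float_extend of the loaded value */
    float zs = (float)(half_t)0.0f;   /* float_extend of the HF constant 0.0 */
    return (uint16_t)-(xs < zs);
}
```

Widening a half-precision value to single precision is exact, so both functions return the same mask for every input. That is why the extensions are pure overhead when the target can compare HF values directly.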