https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100638

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|FP16 vector compare missed  |FP16 (vector) compare
                   |optimization on AArch64     |missed optimization on
                   |                            |AArch64

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #0)
> However even the lowered operations are inefficient:
> 
> ```
>         fcvt    s23, h23
>         fcmpe   s23, #0.0
> ```
Actually that comes from expand:
```
;; _16 = _15 < 0.0;

(insn 48 47 49 (set (reg:SF 194)
        (float_extend:SF (reg:HF 102 [ _15 ]))) "/app/example.c":8:16 -1
     (nil))

(insn 49 48 50 (set (reg:HF 196)
        (const_double:HF 0.0 [0x0.0p+0])) "/app/example.c":8:16 -1
     (nil))

(insn 50 49 51 (set (reg:SF 195)
        (float_extend:SF (reg:HF 196))) "/app/example.c":8:16 -1
     (nil))

(insn 51 50 52 (set (reg:CCFPE 66 cc)
        (compare:CCFPE (reg:SF 194)
            (reg:SF 195))) "/app/example.c":8:16 -1
     (nil))
```

Which can be reproduced with just a simple testcase:
```
void foo(_Float16 *x, unsigned short *out) {
    *out = -(*x < 0.0f16);
}
```
