[Bug target/122074] Not fusing unaligned load into cmp with mask for avx512 intrinsic

pinskia at gcc dot gnu.org via Gcc-bugs Sat, 27 Sep 2025 04:25:31 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122074


Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|WAITING                     |UNCONFIRMED
     Ever confirmed|1                           |0
            Summary|Wrong code for avx512       |Not fusing unaligned load
                   |intrinsic                   |into cmp with mask for
                   |                            |avx512 intrinsic
           Keywords|                            |missed-optimization

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> Suffix "_u" in __m256i_u emphasizes we are using an unaligned vector which
> should be processed specially

No, it does not mean that. It only means the type is unaligned.
And GCC does use an unaligned load here:
        vmovdqu ymm1, YMMWORD PTR [rdi]

That is also why, at -O0, the loads are done via bytes.
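
As a rough illustration of the point about "_u" (this is not code from the bug
report): the sketch below uses hypothetical stand-in types v4si and v4si_u,
modeled on how GCC's headers declare __m256i and __m256i_u, to show that the
unaligned variant differs only in its alignment requirement. The 16-byte vector
size is an assumption, chosen so the sketch builds without -mavx.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for __m256i / __m256i_u (names are illustrative). */
typedef int32_t v4si   __attribute__((vector_size(16), may_alias));
typedef int32_t v4si_u __attribute__((vector_size(16), may_alias, aligned(1)));

/* The "_u" style type only relaxes the alignment requirement to 1 byte. */
_Static_assert(_Alignof(v4si_u) == 1, "unaligned variant has alignment 1");
_Static_assert(_Alignof(v4si) > _Alignof(v4si_u), "regular variant is stricter");

/* Load four ints from an arbitrarily aligned address and sum them.
   Dereferencing through v4si_u tells the compiler not to assume alignment. */
int32_t sum_unaligned(const void *p)
{
    v4si v = *(const v4si_u *)p;
    return v[0] + v[1] + v[2] + v[3];
}

/* Build a deliberately misaligned buffer and read it back. */
int32_t demo(void)
{
    _Alignas(16) unsigned char buf[32] = {0};
    const int32_t src[4] = {1, 2, 3, 4};

    memcpy(buf + 1, src, sizeof src);        /* data starts at offset 1 */
    assert(((uintptr_t)(buf + 1) % 16) != 0); /* address really is misaligned */
    return sum_unaligned(buf + 1);
}
```

Calling demo() reads the four values back through the misaligned pointer; with
optimization enabled on x86, the dereference would be expected to become a
single unaligned vector load rather than anything "special".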


There is, however, a missed optimization here: the unaligned load is not fused
into the compare.
