https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118057

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |riscv

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
I would expect this to always be slower when vectorized unless the core is
seriously bottlenecked on the frontend.  The loads/stores need to be
decomposed into separate uops, and there's no actual vector operation.  The
vector op introduces an artificial dependence between otherwise independent
lanes, which could execute out of order (OOO) in scalar code.
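
To illustrate the kind of loop this argument applies to, here is a minimal
sketch.  It is NOT the testcase from this PR; the function name copy_strided
and its parameters are hypothetical.  The loop only moves data: a strided
load feeds a store and no arithmetic is done on the values.  In scalar form
each iteration is an independent load/store pair, so an out-of-order core can
overlap them freely.  If the loop is vectorized, the strided access would
likely still be split into per-element uops on many cores, and the store has
to wait for the whole vector register to be filled.

```c
#include <stddef.h>

/* Hypothetical kernel, not taken from the PR: pure data movement with a
   strided load and no vector arithmetic.  */
void copy_strided(double *restrict dst, const double *restrict src,
                  size_t n, size_t stride)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i * stride];  /* iterations are independent of each other */
}
```

Comparing a build with -O3 against one with -O3 -fno-tree-vectorize on the
target core is one way to check whether the vectorized form really is slower.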

I think GCC behaves better here.
