https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104151

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
With -fno-tree-vectorize, GCC also produces optimal code:

        mov     rax, rsi
        mov     rdx, rdi
        bswap   rax
        bswap   rdx
        ret

I guess it's related to the vectorizer cost model.
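
For reference, here is a minimal sketch of the kind of source the listing
above corresponds to. It is an assumption, not the test case from the report,
which this comment does not quote: the struct u128_parts and the function
bswap128_parts are hypothetical names. It assumes the x86-64 System V ABI
(arguments in rdi and rsi, a two-word struct returned in rax:rdx) and uses
the GCC builtin __builtin_bswap64.

```c
#include <stdint.h>

/* Hypothetical two-word view of a 128-bit value (not from the report). */
typedef struct {
    uint64_t lo; /* low 64 bits,  returned in rax under the SysV ABI */
    uint64_t hi; /* high 64 bits, returned in rdx under the SysV ABI */
} u128_parts;

/* Byte-swap a 128-bit value given as two 64-bit halves:
   the halves trade places and each half is byte-swapped. */
u128_parts bswap128_parts(uint64_t lo, uint64_t hi)
{
    u128_parts r;
    r.lo = __builtin_bswap64(hi); /* new low half  = swapped old high half */
    r.hi = __builtin_bswap64(lo); /* new high half = swapped old low half  */
    return r;
}
```

Compiling a sketch like this with and without -fno-tree-vectorize is one way
to compare the scalar and vectorized code paths discussed here.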
