[Bug target/119468] PPCLE: Inefficient implementation of __builtin_parityll

jens.seifert at de dot ibm.com via Gcc-bugs Wed, 09 Apr 2025 01:12:10 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119468


--- Comment #2 from Jens Seifert <jens.seifert at de dot ibm.com> ---
popcntb + prtyd is slower than just doing a
64-bit popcount and extracting the lowest bit.
This missed-optimization opportunity applies to big endian as well.

Optimal code:
        popcntd 3, 3
        clrldi  3, 3, 63
        blr

Current code:
        popcntb 3,3
        prtyd 3,3
        extsw 3,3
        blr

prtyd has longer latency than clrldi.
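
For reference, a minimal C sketch of the popcount-based formulation described above. The helper name parity64_via_popcount is hypothetical and only for illustration. Whether a given GCC version and -mcpu setting actually lowers it to popcntd + clrldi is an assumption, not something verified here.

```c
#include <stdint.h>

/* Hypothetical helper: parity of a 64-bit value computed as the
 * lowest bit of its population count.  On a POWER target with
 * popcntd available, the hoped-for lowering is popcntd followed
 * by clrldi (an assumption, not verified here). */
static inline int parity64_via_popcount(uint64_t x)
{
    /* The popcount is in 0..64; its low bit is 1 exactly when
     * an odd number of bits is set in x. */
    return __builtin_popcountll(x) & 1;
}
```

This should return the same value as __builtin_parityll(x) for every input, so it can serve as a source-level workaround while comparing the generated code.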
