https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119468

--- Comment #2 from Jens Seifert <jens.seifert at de dot ibm.com> ---
popcntb + prtyd is slower than a single 64-bit popcount followed by
extracting the lowest bit.
This "missed-optimization" opportunity applies to big endian as well.

Optimal code:
        popcntd 3, 3
        clrldi  3, 3, 63
        blr

Current code:
        popcntb 3,3
        prtyd 3,3
        extsw 3,3
        blr

prtyd has longer latency than clrldi.
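
For reference, a minimal C sketch of the equivalence the optimal sequence
relies on: the parity of a 64-bit value is the lowest bit of its population
count. The testcase is not quoted in this comment, so this assumes the source
computes 64-bit parity (e.g. via __builtin_parityll); the names
parity_via_popcount and check_parity_equivalence are hypothetical, chosen
only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parity as the lowest bit of the full 64-bit population count.
   This is the form that corresponds to popcntd followed by clrldi
   (assumption: the compiler maps it that way; not verified here). */
static int parity_via_popcount(uint64_t x)
{
    return __builtin_popcountll(x) & 1;
}

/* Compare against GCC's parity builtin on a few sample values. */
static void check_parity_equivalence(void)
{
    static const uint64_t samples[] = {
        0, 1, 3, 0x8000000000000000ULL, UINT64_MAX, 0x0123456789abcdefULL
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(parity_via_popcount(samples[i]) ==
               __builtin_parityll(samples[i]));
}
```

Masking with 1 is what clrldi 3, 3, 63 does in the optimal code: it clears the
upper 63 bits and keeps only the lowest bit of the popcount result.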
