https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50168

Aliaksei Kandratsenka <alkondratenko at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alkondratenko at gmail dot com

--- Comment #10 from Aliaksei Kandratsenka <alkondratenko at gmail dot com> ---
There is a similar issue with bsr and __builtin_clz.

It looks like GCC computes __builtin_clz as 31 - <bsr-result>. The expression
31 - __builtin_clz does get optimized to a plain bsr, but only under
-march=haswell or later AMD CPUs.

For earlier CPUs it generates two redundant "31 - arg" computations.
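
For reference, a minimal sketch of the pattern in question. The function names
are made up for illustration, and it assumes a 32-bit unsigned int. Since clz
is 31 - bsr for nonzero input, 31 - clz should ideally collapse back to a
single bsr:

```c
/* Hypothetical helper: index of the highest set bit.
 * Assumes unsigned int is 32 bits wide. Undefined for x == 0,
 * just like __builtin_clz itself. */
static unsigned highest_set_bit(unsigned x)
{
    /* clz(x) == 31 - bsr(x), so this is 31 - (31 - bsr(x)) == bsr(x). */
    return 31 - __builtin_clz(x);
}

/* Portable reference version, only for checking the result. */
static unsigned highest_set_bit_ref(unsigned x)
{
    unsigned idx = 0;
    while (x >>= 1)
        idx++;
    return idx;
}
```

Compiling highest_set_bit with -O2 and different -march values and comparing
the generated assembly is one way to check whether the two subtractions are
folded away.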

This is easy to play with at: https://godbolt.org/g/o7gNSS

Clang doesn't have the same problem, but it has another: under -march=haswell
it sometimes prefers lzcnt too strongly. lzcnt returns a different result than
bsr and thus requires extra computation.
