https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88626

--- Comment #1 from Marc Glisse <glisse at gcc dot gnu.org> ---
In my application (quite a bit bigger than the testcase...), the optimized
dump shows that the function is inlined when it does not contain the
__builtin_constant_p code. Once I add the __builtin_constant_p code
(__builtin_constant_p should essentially always be false in this case), a lot
of calls remain. Writing __attribute__((always_inline)) on the function "fixes"
the performance issue: it has no measurable impact on the original code, and it
gives the _bcp code the same performance as the code without _bcp. However, the
attribute is not a real solution, since "always" is too strong...
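
For readers without the testcase at hand, the pattern under discussion looks
roughly like the sketch below. It is not the code from the report: the names
scale and sum_scaled are hypothetical, and the fast path is only an assumed
example of what a __builtin_constant_p guard might protect. The attribute is
the workaround described above, shown for illustration rather than as a
recommended fix.

```c
#include <assert.h>

/* Hypothetical helper (not from the bug report). The __builtin_constant_p
   guard selects a fast path only when GCC can prove at compile time that
   'factor' is a constant; otherwise the general path is used. */
static inline __attribute__((always_inline)) /* the workaround discussed above */
unsigned scale(unsigned x, unsigned factor)
{
    if (__builtin_constant_p(factor) && factor == 1)
        return x;          /* fast path: multiplication by a known 1 */
    return x * factor;     /* general path; same result either way */
}

/* Hypothetical caller: 'factor' is a runtime value here, so the guard in
   scale() is expected to fold to false once scale() has been inlined. */
unsigned sum_scaled(const unsigned *v, unsigned n, unsigned factor)
{
    assert(v != 0 || n == 0);
    unsigned total = 0;
    for (unsigned i = 0; i < n; ++i)
        total += scale(v[i], factor);
    return total;
}
```

Both paths compute the same value, so the guard changes only the generated
code, not the result. Compiling with and without the attribute and comparing
the output of -fdump-tree-optimized is one way to check whether calls to the
helper remain.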
