On Mon, Jul 15, 2019 at 02:39:04PM +0000, Jan Beulich wrote:
> This is faster than using the software implementation, and the insn is
> available on all half-way recent hardware. Therefore convert
> generic_hweight<N>() to out-of-line functions (without affecting Arm)
> and use alternatives patching to replace the function calls.
>
> Note that the approach doesn't work for clang, due to it not recognizing
> -ffixed-*.

I've been giving this a look, and I wonder if it would be fine to simply
push and pop the scratch registers in the 'call' path of the alternative,
as that won't require any specific compiler option.

Thanks, Roger.
