Re: [PATCH] xen/smp: Speed up on_selected_cpus()

Andrew Cooper Mon, 07 Feb 2022 09:06:55 -0800

On 07/02/2022 08:11, Jan Beulich wrote:
> On 04.02.2022 21:31, Andrew Cooper wrote:
>> cpumask_weight() is a horribly expensive way to find if no bits are set, made
>> worse by the fact that the calculation is performed with the global call_lock
>> held.
>>
>> Switch to using cpumask_empty() instead, which will short circuit as soon as
>> it finds any set bit in the cpumask.
>>
>> Signed-off-by: Andrew Cooper <[email protected]>
> May I suggest to drop "horribly"? How expensive one is compared to the other
> depends on the number of CPUs actually enumerated in the system.


In absolute terms perhaps, but they both scale as O(nr_cpus).  Hamming
weight has a far larger constant.
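
To illustrate the constant-factor point, here is a minimal sketch. It is
not Xen's actual cpumask code: the sketch_* names, the fixed 64-bit word
type and the portable shift-and-mask popcount are all assumptions made
for the example. Both loops are linear in the number of words, but the
weight loop does several arithmetic operations per word and can never
stop early, while the emptiness check does one compare per word and
returns at the first non-zero word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for illustration only; these are not Xen's
 * cpumask_weight()/cpumask_empty() implementations.
 */

/* Portable Hamming weight of one 64-bit word (shift-and-mask method). */
static unsigned int sketch_hweight64(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    return (unsigned int)((x * 0x0101010101010101ULL) >> 56);
}

/* Counts every set bit: always visits all nwords words. */
static unsigned int sketch_mask_weight(const uint64_t *map, size_t nwords)
{
    unsigned int total = 0;
    size_t i;

    assert(map != NULL || nwords == 0);

    for ( i = 0; i < nwords; i++ )
        total += sketch_hweight64(map[i]);

    return total;
}

/* Emptiness check: one compare per word, stops at the first set bit. */
static bool sketch_mask_empty(const uint64_t *map, size_t nwords)
{
    size_t i;

    assert(map != NULL || nwords == 0);

    for ( i = 0; i < nwords; i++ )
        if ( map[i] )
            return false;

    return true;
}
```

A caller that only needs to know whether the mask is empty would use
sketch_mask_empty(map, n) in place of sketch_mask_weight(map, n) == 0;
the two give the same answer.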

>  (And of
> course I still have that conversion to POPCNT alternatives patching pending,
> where Roger did ask for some re-work in reply to v2, but where it has
> remained unclear whether investing time into that wouldn't be in vain,
> considering some of your replies on v1. This would have further shrunk the
> difference, without me meaning to say the change here isn't a good one.)

There is a perfectly clear and simple way forward.  It's the one which
doesn't fight the optimiser and actively regress the code generation in
the calling functions, and add an unreasonable quantity of technical debt
into the marginal paths.

I will ack a version where you're not adding complexity for negative gains.

~Andrew
