Re: [PATCH 3/9] all: replace bitmap_weigth() with bitmap_{empty,full,eq,gt,le}
On Sat, Nov 27, 2021 at 07:56:58PM -0800, Yury Norov wrote: > bitmap_weight() counts all set bits in the bitmap unconditionally. > However in some cases we can traverse a part of bitmap when we > only need to check if number of set bits is greater, less or equal > to some number. > > This patch replaces bitmap_weight() with one of > bitmap_{empty,full,eq,gt,le), as appropriate. > > In some places driver code has been optimized further, where it's > trivial. > > Signed-off-by: Yury Norov > --- > arch/nds32/kernel/perf_event_cpu.c | 4 +--- > arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ++-- > arch/x86/kvm/hyperv.c | 8 > drivers/crypto/ccp/ccp-dev-v5.c| 5 + > drivers/gpu/drm/msm/disp/mdp5/mdp5_smp.c | 2 +- > drivers/iio/adc/mxs-lradc-adc.c| 3 +-- > drivers/iio/dummy/iio_simple_dummy_buffer.c| 4 ++-- > drivers/iio/industrialio-buffer.c | 2 +- > drivers/iio/industrialio-trigger.c | 2 +- > drivers/memstick/core/ms_block.c | 4 ++-- > drivers/net/dsa/b53/b53_common.c | 2 +- > drivers/net/ethernet/broadcom/bcmsysport.c | 6 +- > drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 4 ++-- > drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 2 +- > .../ethernet/marvell/octeontx2/nic/otx2_ethtool.c | 2 +- > .../ethernet/marvell/octeontx2/nic/otx2_flows.c| 8 > .../net/ethernet/marvell/octeontx2/nic/otx2_pf.c | 2 +- > drivers/net/ethernet/mellanox/mlx4/cmd.c | 10 +++--- > drivers/net/ethernet/mellanox/mlx4/eq.c| 4 ++-- > drivers/net/ethernet/mellanox/mlx4/main.c | 2 +- > .../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 2 +- > drivers/net/ethernet/qlogic/qed/qed_dev.c | 3 +-- > drivers/net/ethernet/qlogic/qed/qed_rdma.c | 4 ++-- > drivers/net/ethernet/qlogic/qed/qed_roce.c | 2 +- > drivers/perf/arm-cci.c | 2 +- > drivers/perf/arm_pmu.c | 4 ++-- > drivers/perf/hisilicon/hisi_uncore_pmu.c | 2 +- > drivers/perf/thunderx2_pmu.c | 3 +-- > drivers/perf/xgene_pmu.c | 2 +- > drivers/pwm/pwm-pca9685.c | 2 +- > drivers/staging/media/tegra-video/vi.c | 2 +- > drivers/thermal/intel/intel_powerclamp.c | 10 -- > fs/ocfs2/cluster/heartbeat.c | 14 +++--- > 33 files changed, 57 insertions(+), 75 deletions(-) After you get the new functions added to the kernel tree, this patch should be broken up into one-patch-per-subsystem and submitted through the various subsystem trees. thanks, greg k-h ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
Excerpts from Yury Norov's message of November 28, 2021 1:56 pm: > In many cases people use bitmap_weight()-based functions like this: > > if (num_present_cpus() > 1) > do_something(); > > This may take considerable amount of time on many-cpus machines because > num_present_cpus() will traverse every word of underlying cpumask > unconditionally. > > We can significantly improve on it for many real cases if stop traversing > the mask as soon as we count present cpus to any number greater than 1: > > if (num_present_cpus_gt(1)) > do_something(); > > To implement this idea, the series adds bitmap_weight_{eq,gt,le} > functions together with corresponding wrappers in cpumask and nodemask. There would be no change to callers if you maintain counters like what is done for num_online_cpus() today. Maybe some fixes to arch code that does not use set_cpu_possible() etc APIs required, but AFAIKS it would be better to fix such cases anyway. Thanks, Nick ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 7/9] lib/cpumask: add num_{possible,present,active}_cpus_{eq,gt,le}
On Sat, 2021-11-27 at 19:57 -0800, Yury Norov wrote: > Add num_{possible,present,active}_cpus_{eq,gt,le} and replace num_*_cpus() > with one of new functions where appropriate. This allows num_*_cpus_*() > to return earlier depending on the condition. [] > diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c [] > @@ -103,7 +103,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus) >* if platform didn't set the present map already, do it now >* boot cpu is set to present already by init/main.c >*/ > - if (num_present_cpus() <= 1) > + if (num_present_cpus_le(2)) > init_cpu_present(cpu_possible_mask); ? is this supposed to be 2 or 1 > diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c [] > @@ -593,7 +593,7 @@ static int __init pcc_cpufreq_init(void) > return ret; > } > > - if (num_present_cpus() > 4) { > + if (num_present_cpus_gt(4)) { > pcc_cpufreq_driver.flags |= CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING; > pr_err("%s: Too many CPUs, dynamic performance scaling > disabled\n", > __func__); It looks as if the present variants should be using the same values so the _le test above with 1 changed to 2 looks odd. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 7/9] lib/cpumask: add num_{possible,present,active}_cpus_{eq,gt,le}
On Sun, Nov 28, 2021 at 09:07:52AM -0800, Joe Perches wrote: > On Sat, 2021-11-27 at 19:57 -0800, Yury Norov wrote: > > Add num_{possible,present,active}_cpus_{eq,gt,le} and replace num_*_cpus() > > with one of new functions where appropriate. This allows num_*_cpus_*() > > to return earlier depending on the condition. > [] > > diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c > [] > > @@ -103,7 +103,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus) > > * if platform didn't set the present map already, do it now > > * boot cpu is set to present already by init/main.c > > */ > > - if (num_present_cpus() <= 1) > > + if (num_present_cpus_le(2)) > > init_cpu_present(cpu_possible_mask); > > ? is this supposed to be 2 or 1 X <= 1 is the equivalent of X < 2. > > diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c > [] > > @@ -593,7 +593,7 @@ static int __init pcc_cpufreq_init(void) > > return ret; > > } > > > > - if (num_present_cpus() > 4) { > > + if (num_present_cpus_gt(4)) { > > pcc_cpufreq_driver.flags |= CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING; > > pr_err("%s: Too many CPUs, dynamic performance scaling > > disabled\n", > >__func__); > > It looks as if the present variants should be using the same values > so the _le test above with 1 changed to 2 looks odd. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 7/9] lib/cpumask: add num_{possible,present,active}_cpus_{eq,gt,le}
Hello, On Sun, Nov 28, 2021 at 09:43:20AM -0800, Yury Norov wrote: > On Sun, Nov 28, 2021 at 09:07:52AM -0800, Joe Perches wrote: > > On Sat, 2021-11-27 at 19:57 -0800, Yury Norov wrote: > > > Add num_{possible,present,active}_cpus_{eq,gt,le} and replace num_*_cpus() > > > with one of new functions where appropriate. This allows num_*_cpus_*() > > > to return earlier depending on the condition. > > [] > > > diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c > > [] > > > @@ -103,7 +103,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus) > > >* if platform didn't set the present map already, do it now > > >* boot cpu is set to present already by init/main.c > > >*/ > > > - if (num_present_cpus() <= 1) > > > + if (num_present_cpus_le(2)) > > > init_cpu_present(cpu_possible_mask); > > > > ? is this supposed to be 2 or 1 > > X <= 1 is the equivalent of X < 2. > > > > diff --git a/drivers/cpufreq/pcc-cpufreq.c b/drivers/cpufreq/pcc-cpufreq.c > > [] > > > @@ -593,7 +593,7 @@ static int __init pcc_cpufreq_init(void) > > > return ret; > > > } > > > > > > - if (num_present_cpus() > 4) { > > > + if (num_present_cpus_gt(4)) { > > > pcc_cpufreq_driver.flags |= CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING; > > > pr_err("%s: Too many CPUs, dynamic performance scaling > > > disabled\n", > > > __func__); > > > > It looks as if the present variants should be using the same values > > so the _le test above with 1 changed to 2 looks odd. > I think the confusion comes from le meaning less than rather than lt. Given the general convention of: lt (<), le (<=), eg (=), ge (>=), gt (>), I'd consider renaming your le to lt. Thanks, Dennis ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 7/9] lib/cpumask: add num_{possible, present, active}_cpus_{eq, gt, le}
On Sun, 28 Nov 2021 at 18:43, Yury Norov wrote: > On Sun, Nov 28, 2021 at 09:07:52AM -0800, Joe Perches wrote: > > On Sat, 2021-11-27 at 19:57 -0800, Yury Norov wrote: > > > Add num_{possible,present,active}_cpus_{eq,gt,le} and replace num_*_cpus() > > > with one of new functions where appropriate. This allows num_*_cpus_*() > > > to return earlier depending on the condition. > > [] > > > diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c > > [] > > > @@ -103,7 +103,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus) > > > * if platform didn't set the present map already, do it now > > > * boot cpu is set to present already by init/main.c > > > */ > > > - if (num_present_cpus() <= 1) > > > + if (num_present_cpus_le(2)) > > > init_cpu_present(cpu_possible_mask); > > > > ? is this supposed to be 2 or 1 > > X <= 1 is the equivalent of X < 2. Ah, then the function is confusing. Usually it's lt = less than and lt = less than or equal. Same idea for gt vs ge. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 7/9] lib/cpumask: add num_{possible,present,active}_cpus_{eq,gt,le}
On Sun, 2021-11-28 at 09:43 -0800, Yury Norov wrote: > On Sun, Nov 28, 2021 at 09:07:52AM -0800, Joe Perches wrote: > > On Sat, 2021-11-27 at 19:57 -0800, Yury Norov wrote: > > > Add num_{possible,present,active}_cpus_{eq,gt,le} and replace num_*_cpus() > > > with one of new functions where appropriate. This allows num_*_cpus_*() > > > to return earlier depending on the condition. > > [] > > > diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c > > [] > > > @@ -103,7 +103,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus) > > >* if platform didn't set the present map already, do it now > > >* boot cpu is set to present already by init/main.c > > >*/ > > > - if (num_present_cpus() <= 1) > > > + if (num_present_cpus_le(2)) > > > init_cpu_present(cpu_possible_mask); > > > > ? is this supposed to be 2 or 1 > > X <= 1 is the equivalent of X < 2. True. The call though is _le not _lt ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 7/9] lib/cpumask: add num_{possible,present,active}_cpus_{eq,gt,le}
On Sun, Nov 28, 2021 at 12:54:00PM -0500, Dennis Zhou wrote: > Hello, > > On Sun, Nov 28, 2021 at 09:43:20AM -0800, Yury Norov wrote: > > On Sun, Nov 28, 2021 at 09:07:52AM -0800, Joe Perches wrote: > > > On Sat, 2021-11-27 at 19:57 -0800, Yury Norov wrote: > > > > Add num_{possible,present,active}_cpus_{eq,gt,le} and replace > > > > num_*_cpus() > > > > with one of new functions where appropriate. This allows num_*_cpus_*() > > > > to return earlier depending on the condition. > > > [] > > > > diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c > > > [] > > > > @@ -103,7 +103,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus) > > > > * if platform didn't set the present map already, do it now > > > > * boot cpu is set to present already by init/main.c > > > > */ > > > > - if (num_present_cpus() <= 1) > > > > + if (num_present_cpus_le(2)) > > > > init_cpu_present(cpu_possible_mask); > > > > > > ? is this supposed to be 2 or 1 > > > > X <= 1 is the equivalent of X < 2. > > > > > > diff --git a/drivers/cpufreq/pcc-cpufreq.c > > > > b/drivers/cpufreq/pcc-cpufreq.c > > > [] > > > > @@ -593,7 +593,7 @@ static int __init pcc_cpufreq_init(void) > > > > return ret; > > > > } > > > > > > > > - if (num_present_cpus() > 4) { > > > > + if (num_present_cpus_gt(4)) { > > > > pcc_cpufreq_driver.flags |= > > > > CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING; > > > > pr_err("%s: Too many CPUs, dynamic performance scaling > > > > disabled\n", > > > >__func__); > > > > > > It looks as if the present variants should be using the same values > > > so the _le test above with 1 changed to 2 looks odd. > > > > I think the confusion comes from le meaning less than rather than lt. > Given the general convention of: lt (<), le (<=), eg (=), ge (>=), > gt (>), I'd consider renaming your le to lt. Ok, makes sense. I'll rename in v2 and add <= and >= versions. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
On Sun, Nov 28, 2021 at 09:08:41PM +1000, Nicholas Piggin wrote: > Excerpts from Yury Norov's message of November 28, 2021 1:56 pm: > > In many cases people use bitmap_weight()-based functions like this: > > > > if (num_present_cpus() > 1) > > do_something(); > > > > This may take considerable amount of time on many-cpus machines because > > num_present_cpus() will traverse every word of underlying cpumask > > unconditionally. > > > > We can significantly improve on it for many real cases if stop traversing > > the mask as soon as we count present cpus to any number greater than 1: > > > > if (num_present_cpus_gt(1)) > > do_something(); > > > > To implement this idea, the series adds bitmap_weight_{eq,gt,le} > > functions together with corresponding wrappers in cpumask and nodemask. > > There would be no change to callers if you maintain counters like what > is done for num_online_cpus() today. Maybe some fixes to arch code that > does not use set_cpu_possible() etc APIs required, but AFAIKS it would > be better to fix such cases anyway. Thanks, Nick. I'll try to do this. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH 0/9] lib/bitmap: optimize bitmap_weight() usage
On Sun, Nov 28, 2021 at 07:03:41PM +0100, mirq-t...@rere.qmqm.pl wrote: > On Sat, Nov 27, 2021 at 07:56:55PM -0800, Yury Norov wrote: > > In many cases people use bitmap_weight()-based functions like this: > > > > if (num_present_cpus() > 1) > > do_something(); > > > > This may take considerable amount of time on many-cpus machines because > > num_present_cpus() will traverse every word of underlying cpumask > > unconditionally. > > > > We can significantly improve on it for many real cases if stop traversing > > the mask as soon as we count present cpus to any number greater than 1: > > > > if (num_present_cpus_gt(1)) > > do_something(); > > > > To implement this idea, the series adds bitmap_weight_{eq,gt,le} > > functions together with corresponding wrappers in cpumask and nodemask. > > Having slept on it I have more structured thoughts: > > First, I like substituting bitmap_empty/full where possible - I think > the change stands on its own, so could be split and sent as is. Ok, I can do it. > I don't like the proposed API very much. One problem is that it hides > the comparison operator and makes call sites less readable: > > bitmap_weight(...) > N > > becomes: > > bitmap_weight_gt(..., N) > > and: > bitmap_weight(...) <= N > > becomes: > > bitmap_weight_lt(..., N+1) > or: > !bitmap_weight_gt(..., N) > > I'd rather see something resembling memcmp() API that's known enough > to be easier to grasp. For above examples: > > bitmap_weight_cmp(..., N) > 0 > bitmap_weight_cmp(..., N) <= 0 > ... bitmap_weight_cmp() cannot be efficient. Consider this example: bitmap_weight_lt(1000 , 1) == false ^ stop here bitmap_weight_cmp(1000 , 1) == 0 ^ stop here I agree that '_gt' is less verbose than '>', but the advantage of '_gt' over '>' is proportional to length of bitmap, and it means that this API should exist. > This would also make the implementation easier in not having to > copy and paste the code three times. Could also use a simple > optimization reducing code size: In the next version I'll reduce code duplication like this: bool bitmap_eq(..., N); bool bitmap_ge(..., N); #define bitmap_weight_gt(..., N) bitmap_weight_ge(..., N + 1) #define bitmap_weight_lt(..., N) !bitmap_weight_ge(..., N) #define bitmap_weight_le(..., N) !bitmap_weight_gt(..., N) Thanks, Yury ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc