Re: [Numpy-discussion] A minor clarification on why count_nonzero is faster for boolean arrays

Benjamin Root Thu, 17 Dec 2015 10:45:51 -0800

Would it make sense at all to bring that optimization to np.sum()? I
know that I have np.sum() all over the place instead of count_nonzero,
partly because it is a MATLAB-ism and partly because it is easier to write.
I had no clue that there was a performance difference.
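
For anyone who wants to check this on their own machine, here is a minimal
sketch of the comparison. It assumes a boolean array built the same way as in
the quoted message below; the helper name time_call and the fixed seed are
only for illustration, and the timings will differ by machine and NumPy
version.

```python
import timeit

import numpy as np

# Same shape and value range as the array in the quoted message;
# the seed is an arbitrary choice to make the run repeatable.
rng = np.random.RandomState(0)
a = rng.randint(0, 2, (100, 5))
a_bool = a.astype(bool)

# On a boolean array both calls count the True entries.
count_sum = int(np.sum(a_bool))
count_nz = int(np.count_nonzero(a_bool))
assert count_sum == count_nz


def time_call(func, arg, number=10000, repeat=3):
    """Best per-call time in seconds (hypothetical helper, not a NumPy API)."""
    best = min(timeit.repeat(lambda: func(arg), number=number, repeat=repeat))
    return best / number


t_sum = time_call(np.sum, a_bool)
t_nz = time_call(np.count_nonzero, a_bool)
print("np.sum:           %.0f ns per call" % (t_sum * 1e9))
print("np.count_nonzero: %.0f ns per call" % (t_nz * 1e9))
```

Since the two calls return the same count for a boolean mask, swapping one for
the other is safe in that case; for integer arrays np.sum adds the values
rather than counting the nonzero entries, so the two only agree when the
array holds 0s and 1s.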


Cheers!
Ben Root


On Thu, Dec 17, 2015 at 1:37 PM, CJ Carey <perimosocord...@gmail.com> wrote:

> I believe this line is the reason:
>
> https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa/numpy/core/src/multiarray/item_selection.c#L2110
>
> On Thu, Dec 17, 2015 at 11:52 AM, Raghav R V <rag...@gmail.com> wrote:
>
>> I was just playing with `count_nonzero` and found it to be significantly
>> faster for boolean arrays than for integer arrays:
>>
>>
>>     >>> a = np.random.randint(0, 2, (100, 5))
>>     >>> a_bool = a.astype(bool)
>>
>>     >>> %timeit np.sum(a)
>>     100000 loops, best of 3: 5.64 µs per loop
>>
>>     >>> %timeit np.count_nonzero(a)
>>     1000000 loops, best of 3: 1.42 us per loop
>>
>>     >>> %timeit np.count_nonzero(a_bool)
>>     1000000 loops, best of 3: 279 ns per loop (but why?)
>>
>> I tried looking into the code and dug my way through to this line
>> <https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa/numpy/core/src/multiarray/item_selection.c#L2172>.
>> I am unable to dig further.
>>
>> I know this is probably a trivial question, but I was wondering if anyone
>> could provide insight into why this is so.
>>
>> Thanks
>>
>> R
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>

