[Numpy-discussion] Re: Feature query: fetch top/bottom k from array

2022-02-23 Thread Joseph Bolton
Morning!

I find myself often requiring the indices and/or values of the top (or
bottom) k items in a numpy array. I am aware of solutions involving
*partition*/*argpartition*, but I find these inelegant, and of solutions
using *sort*, but these are inefficient.

Is this a feature that would benefit the numpy package, or bloat it? I am
happy to code it up.

Here are some examples:

>> import numpy as np

>> a = np.array([[5, 8, 1, 3, 0], [5, 6, 2, 1, 3], [1, 4, 9, 1, 3], [8, 0, 4, 7, 0]])



>> # PROPOSED FEATURE: return (ordered) top 4 values in array:

>> a.top_k(k=4)

array([9, 8, 8, 7])



>> # CURRENT METHOD: return (ordered) top 4 values in array:

>> np.sort(np.partition(a.flatten(), -4)[-4:])[::-1]  # faster method

array([9, 8, 8, 7])

>> np.sort(a.flatten())[::-1][:4] # slower method

array([9, 8, 8, 7])



>> # PROPOSED FEATURE: return INDICES of (ordered) top 4 values in array:

>> a.top_k(k=4, return_indices=True)

array([12,1,15,18])



>> # CURRENT METHOD: return INDICES of (ordered) top 4 values in array:

>> (-a.flatten()).argsort(kind="stable")[:4]  # stable sort keeps ties in first-appearance order

array([12,1,15,18])
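
The argsort call above sorts the whole array. A minimal sketch of a cheaper variant follows, assuming a signed integer or float dtype so that negation is safe: select the k candidates with np.argpartition, then order only those. The names flat, candidates and top_idx are illustrative, not part of the proposal, and np.unravel_index is used to turn flat positions back into (row, column) pairs.

```python
import numpy as np

a = np.array([[5, 8, 1, 3, 0], [5, 6, 2, 1, 3], [1, 4, 9, 1, 3], [8, 0, 4, 7, 0]])
k = 4

flat = a.ravel()

# argpartition leaves the positions of the k largest values in the last k
# slots, in no particular order.
candidates = np.argpartition(flat, -k)[-k:]

# Sort the candidate positions first so that a stable sort on the values
# breaks ties among them by first appearance.
candidates.sort()
order = np.argsort(-flat[candidates], kind="stable")  # assumes negation is safe for the dtype
top_idx = candidates[order]
top_vals = flat[top_idx]

# Convert flat positions back to 2-D coordinates.
rows, cols = np.unravel_index(top_idx, a.shape)

print(top_idx)   # flat positions of the top k values
print(top_vals)  # the values themselves, largest first
print(list(zip(rows.tolist(), cols.tolist())))
```

If several equal values straddle the k-th position, argpartition decides which of them are kept, so this sketch only controls the order of ties inside the selected set.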



>> # PROPOSED FEATURE: multidimensional examples:

>> a.top_k(k=3, axis=0)

array([[8, 5, 5], [8, 6, 4], [9, 4, 2], [7, 3, 1], [3, 3, 0]])

>> a.top_k(k=3, axis=1)

array([[8, 5, 3], [6, 5, 3], [9, 4, 3], [8, 7, 4]])
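
For reference, the axis behaviour can be approximated today with np.partition. This is only a sketch under assumptions: top_k_values is a hypothetical helper name, not an existing NumPy function, and it keeps the selected axis in place, so axis=0 on the 4x5 example gives a (3, 5) result rather than one row per column.

```python
import numpy as np


def top_k_values(arr, k, axis=-1):
    """Hypothetical helper: the k largest values along `axis`, in descending order."""
    arr = np.asarray(arr)
    n = arr.shape[axis]
    if not 1 <= k <= n:
        raise ValueError(f"k must be in [1, {n}], got {k}")
    # After partitioning at position n - k, the last k entries along `axis`
    # are the k largest, in no particular order.
    part = np.partition(arr, n - k, axis=axis)
    largest = np.take(part, np.arange(n - k, n), axis=axis)
    # Sort only those k entries, then reverse to get descending order.
    return np.flip(np.sort(largest, axis=axis), axis=axis)


a = np.array([[5, 8, 1, 3, 0], [5, 6, 2, 1, 3], [1, 4, 9, 1, 3], [8, 0, 4, 7, 0]])

print(top_k_values(a, 3, axis=1))    # top 3 of each row, shape (4, 3)
print(top_k_values(a, 3, axis=0))    # top 3 of each column, shape (3, 5)
print(top_k_values(a.ravel(), 4))    # top 4 of the flattened array
```

Transposing the axis=0 result gives one row per column of the input.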




I'd also consider including functionality for bottom k values, and methods
for returning indices in the case of tied values (e.g. "first appearance",
"random" etc.).

Cheers
Joe
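
The bottom-k and tie-handling ideas mentioned above could look roughly like the sketch below. Everything here is an assumption for illustration: the function name top_k_indices and the parameters largest, ties and rng are hypothetical, the input is treated as 1-D, and a full sort is used for clarity, so this does not have the speed advantage of the partition approach.

```python
import numpy as np


def top_k_indices(values, k, largest=True, ties="first", rng=None):
    """Hypothetical 1-D helper returning positions of the k largest (or smallest) values."""
    values = np.asarray(values).ravel()
    # Negating turns "largest first" into an ascending sort.
    # This assumes a signed integer or float dtype.
    keys = -values if largest else values
    if ties == "first":
        # A stable sort keeps equal keys in their original order.
        order = np.argsort(keys, kind="stable")
    elif ties == "random":
        rng = np.random.default_rng() if rng is None else rng
        # lexsort treats the LAST key as primary, so the random numbers
        # only decide the order among equal values.
        order = np.lexsort((rng.random(values.size), keys))
    else:
        raise ValueError(f"unknown ties policy: {ties!r}")
    return order[:k]


a = np.array([[5, 8, 1, 3, 0], [5, 6, 2, 1, 3], [1, 4, 9, 1, 3], [8, 0, 4, 7, 0]])
flat = a.ravel()

top = top_k_indices(flat, 4)                    # largest, ties by first appearance
bottom = top_k_indices(flat, 4, largest=False)  # smallest, ties by first appearance
shuffled = top_k_indices(flat, 4, ties="random", rng=np.random.default_rng(0))

print(top, bottom, shuffled)
```

With ties="random" the two 8s may come back in either order, while the positions of 9 and 7 stay fixed.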


On Tue, 22 Feb 2022 at 15:30, Joseph Fox-Rabinovitz <
jfoxrabinov...@gmail.com> wrote:

> Joe,
>
> Could you show an example that you find inelegant and elaborate on how you
> intend to improve it? It's hard to discuss without more specific
> information.
>
> - Joe
>
> On Tue, Feb 22, 2022, 07:23 Joseph Bolton wrote:
>
>> Morning,
>>
>> My apologies if this deviates from the vision of numpy:
>>
>> I find myself often requiring the indices and/or values of the top (or
>> bottom) k items in a numpy array.
>>
>> I am aware of solutions involving partition/argpartition but these are
>> inelegant.
>>
>> I am thinking of 1-dimensional arrays, but this concept extends to an
>> arbitrary number of dimensions.
>>
>> Is this a feature that would benefit the numpy package? I am happy to
>> code it up.
>>
>> Thanks for your time!
>>
>> Best regards
>> Joe
>>
>>
>>
>>
>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>> To unsubscribe send an email to numpy-discussion-le...@python.org
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: jfoxrabinov...@gmail.com
>>


[Numpy-discussion] Re: Feature query: fetch top/bottom k from array

2022-02-23 Thread Friedrich Romstedt
On Tue, 22 Feb 2022 at 14:25, Joseph Bolton wrote:
>
> I find myself often requiring the indices and/or values of the top (or 
> bottom) k items in a numpy array.

There was a discussion about this last year:

https://mail.python.org/archives/list/numpy-discussion@python.org/thread/F4P5UVTAKRJJ3OORI6UOWFSUEE5CNTSC/

Mentioned in that thread is the following pull request, which has some
more discussion:

https://github.com/numpy/numpy/pull/19117

Friedrich


[Numpy-discussion] Re: Feature query: fetch top/bottom k from array

2022-02-23 Thread Brock Mendel
pandas.Series has nlargest/nsmallest methods that might be upstreamable.
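
For comparison, a short sketch of those pandas methods on the example array from earlier in the thread. Wrapping the flattened array in a Series is an assumption made here for illustration; Series.nlargest and Series.nsmallest return a Series whose index holds the original positions, and keep="first" is the documented default for ties.

```python
import numpy as np
import pandas as pd

a = np.array([[5, 8, 1, 3, 0], [5, 6, 2, 1, 3], [1, 4, 9, 1, 3], [8, 0, 4, 7, 0]])

# The default RangeIndex of the Series doubles as the flat position.
s = pd.Series(a.ravel())

top = s.nlargest(4)      # four largest values, largest first
bottom = s.nsmallest(4)  # four smallest values, smallest first

print(top.to_numpy(), top.index.to_numpy())
print(bottom.to_numpy(), bottom.index.to_numpy())
```

Both methods are 1-D only, so an axis argument like the one proposed in this thread would still need separate handling.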

On Wed, Feb 23, 2022 at 6:28 AM Friedrich Romstedt <
friedrichromst...@gmail.com> wrote:

> On Tue, 22 Feb 2022 at 14:25, Joseph Bolton wrote:
> >
> > I find myself often requiring the indices and/or values of the top (or
> > bottom) k items in a numpy array.
>
> There was a discussion about this last year:
>
>
> https://mail.python.org/archives/list/numpy-discussion@python.org/thread/F4P5UVTAKRJJ3OORI6UOWFSUEE5CNTSC/
>
> Mentioned in that thread is the following pull request, which has some
> more discussion:
>
> https://github.com/numpy/numpy/pull/19117
>
> Friedrich


[Numpy-discussion] NumPy Newcomers' Hour – Thursday, February 24th

2022-02-23 Thread Inessa Pawson
Our next Newcomers' Hour is tomorrow, February 24th at 4 pm UTC. We have no
agenda this time. Stop by to ask questions or just to say hi.

Join the meeting via Zoom: https://us02web.zoom.us/j/87192457898

Cheers,
Inessa

Inessa Pawson
NumPy Contributor Experience Lead