Re: [Numpy-discussion] copy="never" discussion and no deprecation cycle?

2021-06-24 Thread Ralf Gommers
On Thu, Jun 24, 2021 at 6:12 AM Gagandeep Singh 
wrote:

> To me, adding enums as attributes of the `np.copy` function seems like a
> pretty good idea. This trick might resolve the only relatively important
> issue with Enums. Then the benefits of Enums might outweigh the
> disadvantage that Enums are uncommon in NumPy APIs. As an end user, I
> would prefer Enums to strings, as the former provide a fixed number of
> choices (hence, easy debugging), whereas the latter allow infinitely many
> strings to be passed and the code may fail silently: imagine passing
> `if_neded` instead of `if_needed` and it appearing to work perfectly
> fine. This has happened to me while using another library.
>

Any well-designed function accepting strings should do input validation
though, to raise an error in case of mis-spelling.
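
As a rough sketch of that kind of validation (not an existing NumPy API: the
helper `normalize_copy` and the tuple `_COPY_OPTIONS` are hypothetical names,
and the three option strings are simply the ones floated in this thread):

```python
# Hypothetical set of accepted strings; the names come from this thread.
_COPY_OPTIONS = ("always", "if_needed", "never")


def normalize_copy(copy):
    """Return the option unchanged if it is valid, else raise ValueError."""
    if copy not in _COPY_OPTIONS:
        raise ValueError(
            f"copy must be one of {_COPY_OPTIONS!r}, got {copy!r}"
        )
    return copy


try:
    normalize_copy("if_neded")  # misspelled on purpose
except ValueError as exc:
    print(exc)  # the message names the bad value and the accepted ones
```

With a check like this a misspelled option fails immediately instead of being
silently accepted.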


> On Thu, Jun 24, 2021 at 8:05 AM Benjamin Root 
> wrote:
>
>> Why not both? The definition of the enum might live in a proper namespace
>> location, but I see no reason why `np.copy.IF_NEEDED =
>> np.flags.CopyFlgs.IF_NEEDED` can't be done (I mean, adding the enum members
>> as attributes to the `np.copy()` function). Seems perfectly reasonable to
>> me, and reads pretty nicely, too. It isn't like we are dropping support for
>> the booleans, so those are still around for easy typing.
>>
>> Ben Root
>>
>> On Wed, Jun 23, 2021 at 10:26 PM Stefan van der Walt <
>> stef...@berkeley.edu> wrote:
>>
>>> On Wed, Jun 23, 2021, at 18:01, Juan Nunez-Iglesias wrote:
>>> > Personally I was a fan of the Enum approach. People dislike it because
>>> > it is not “Pythonic”, but imho that is an accident of history because
>>> > Enums only appeared (iirc) in Python 3.4. In fact, they are the right
>>> > data structure for this particular problem, so for my money we should
>>> > *make it* Pythonic by starting to use it everywhere where we have a
>>> > finite list of choices.
>>>
>>> The enum definitely feels like the right abstraction. But the resulting
>>> API is clunky because of naming and top-level scarcity.
>>>
>>
I agree with this. Enums are nice _in theory_, but once you start using
them you quickly figure out they're clunky, plus the all-caps looks bad
(I'd consider ignoring that style recommendation). For API design they
don't make all that much sense compared to "here's a list of strings we
accept, and everything else raises an informative error". The only reasons
I can think of to use them are:

1. Cases like never-copy, when there's a reason to have an object we can
add a method to (`__bool__` here)
2. There's a long list of options and we want to give users a way to
explore or iterate over those, so a public object is useful, i.e. cases where
we'd otherwise use a class (instance) instead of documenting the string
options. I can't think of many examples like this; padding modes for
`scipy.ndimage.convolve` is the only one that comes to mind.
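
To make the two reasons above concrete, here is a minimal sketch. The class
name `CopyMode`, its members, and the member values are assumptions for
illustration only, not something settled in this thread. The `__bool__`
method lets the first two members stand in for the existing booleans while
the third refuses to be treated as one:

```python
import enum


class CopyMode(enum.Enum):  # hypothetical name and values
    ALWAYS = True
    IF_NEEDED = False
    NEVER = 2

    def __bool__(self):
        # Reason 1: an object gives us a place to hang this method.
        if self is CopyMode.ALWAYS:
            return True
        if self is CopyMode.IF_NEEDED:
            return False
        raise ValueError(f"{self} is neither True nor False.")


# Reason 2: users can explore the options by iterating over the enum.
option_names = [member.name for member in CopyMode]
print(option_names)
```

Code that evaluates `if copy:` keeps working for the first two members and
gets a loud error for `NEVER` rather than a silent copy.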

In general I don't expect we'd need (m)any more. Hence I'd suggest adding a
new namespace like `np.flags` is not a good idea. Right now all we need is
a single object, if we end up going the enum route.

For this one, I'd say it kinda looks like we do need one, so then let's
just add one and be done with it, rather than inventing odd patterns like
tacking enum members onto an existing function.
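
For reference, the pattern being discussed (enum members attached as
attributes of a function) would look roughly like the sketch below.
`copy_values` is a hypothetical stand-in for `np.copy` that only handles
lists, and `CopyMode` is an assumed name:

```python
import enum


class CopyMode(enum.Enum):  # hypothetical name
    ALWAYS = "always"
    IF_NEEDED = "if_needed"
    NEVER = "never"


def copy_values(values, mode=CopyMode.ALWAYS):
    """Stand-in for np.copy: copies a list only when mode is ALWAYS."""
    if not isinstance(mode, CopyMode):
        raise TypeError(f"mode must be a CopyMode member, got {mode!r}")
    return list(values) if mode is CopyMode.ALWAYS else values


# The "odd pattern": expose each member as an attribute of the function.
for member in CopyMode:
    setattr(copy_values, member.name, member)

data = [1, 2, 3]
same = copy_values(data, copy_values.IF_NEEDED)  # no copy needed for a list
fresh = copy_values(data, copy_values.ALWAYS)    # always a new list
```

The alternative argued for above is simply to pass `CopyMode.IF_NEEDED`
directly and skip the `setattr` loop.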

Cheers,
Ralf



>
>>> Hence the suggestion to tag it onto np.copy, but there is an argument to
>>> be made for consistency by placing all enums under np.flags or similar.
>>>
>>> Still, np.flags.copy.IF_NEEDED gets long.
>>>
>>> Stéfan
>>> ___
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>
>


Re: [Numpy-discussion] copy="never" discussion and no deprecation cycle?

2021-06-24 Thread Stefan van der Walt
On Thu, Jun 24, 2021, at 01:03, Ralf Gommers wrote:
> For this one, I'd say it kinda looks like we do need one, so then let's just 
> add one and be done with it, rather than inventing odd patterns like tacking 
> enum members onto an existing function.

There are two arguments on the table that resonate with me:

1. Chuck argues that the current `copy=False` behavior (which, in fact, means 
copy-if-needed) is nonsensical and should be fixed.
2. Ralf argues that strings are ultimately the interface we'd like to see.

To achieve (1), we would need a deprecation cycle.  During that deprecation 
cycle, we would need to provide a way to continue providing 'copy-if-needed' 
behavior.  This can be achieved either with an enum or by accepting strings.

Stephan argues that accepting strings will be harmful to new code running on 
old versions of NumPy.  I would still like to get a sense of how often this 
happens, or if that is a hit we are willing to take.  If we decide that the 
concern is a significant one, then we would have to go the enum route, at least 
for a while.  However, I see no compelling reason to have that enum live in the 
top-level namespace: it is for relatively advanced use, and it will be 
temporary.

If we take the enum route, how do we get to (2)?  We add a type check for a few 
releases and raise an error on string arguments (or, alternatively, handle 
'always'/'never'/'if_needed' without advertising that functionality).  Then, 
once we switch to string arguments, users will get an error (for old NumPy) or 
it will work as expected (for new NumPy).
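
A rough sketch of that transitional type check, under stated assumptions:
`CopyMode` and `resolve_copy` are hypothetical names, and the boolean mapping
follows the current behavior described in (1), where `copy=False` means
copy-if-needed.

```python
import enum


class CopyMode(enum.Enum):  # hypothetical name
    ALWAYS = "always"
    IF_NEEDED = "if_needed"
    NEVER = "never"


def resolve_copy(copy):
    """Map a `copy` argument to a CopyMode during the transition period."""
    if isinstance(copy, str):
        # Reject strings for now, so that code written for a later
        # string-based API fails loudly on this version.
        raise TypeError(
            f"string values for copy are not supported yet, got {copy!r}; "
            "use a CopyMode member instead"
        )
    if isinstance(copy, CopyMode):
        return copy
    if isinstance(copy, bool):
        # Existing semantics: True copies, False means copy-if-needed.
        return CopyMode.ALWAYS if copy else CopyMode.IF_NEEDED
    raise TypeError(f"unsupported value for copy: {copy!r}")
```

Once strings become the interface, the first branch would be replaced by a
lookup such as `CopyMode(copy)`, which raises `ValueError` for unknown names.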

I didn't think so originally, but I suppose we are in NEP territory now.

Stéfan


Re: [Numpy-discussion] copy="never" discussion and no deprecation cycle?

2021-06-24 Thread Stephan Hoyer
On Thu, Jun 24, 2021 at 1:03 AM Ralf Gommers  wrote:

> I agree with this. Enums are nice _in theory_, but once you start using
> them you quickly figure out they're clunky, plus the all-caps looks bad
> (I'd consider ignoring that style recommendation). For API design they
> don't make all that much sense compared to "here's a list of strings we
> accept, and everything else raises an informative error". The only reasons
> I can think of to use them are:
>
> 1. Cases like never-copy, when there's a reason to have an object we can
> add a method to (`__bool__` here)
> 2. There's a long list of options and we want to give users a way to
> explore or iterate over those, so a public object is useful, i.e. cases where
> we'd otherwise use a class (instance) instead of documenting the string
> options. I can't think of many examples like this; padding modes for
> `scipy.ndimage.convolve` is the only one that comes to mind.
>

I think Enums are a very clean abstraction for capturing a discrete set of
options in a type-safe way, both at runtime and with static checks. You
also don't have to keep lists of strings in sync, which makes them a little
easier to document.

That said, I agree that in most cases the overall benefits are rather
marginal. I don't think it's worth a mass migration of existing NumPy
functions, which use strings for categorical options.

In this particular case, I think there is a clear advantage to using an
enum, to avoid inadvertent bugs with old versions of NumPy.
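
To illustrate the failure mode, here is a self-contained sketch that does not
use NumPy itself. `old_np` is a stand-in for an older release whose `copy`
argument is treated as a plain boolean; `old_style_array` and the `CopyMode`
attribute name are assumptions for the example:

```python
from types import SimpleNamespace


def old_style_array(values, copy=True):
    """Stand-in for an older API that only checks the truthiness of `copy`."""
    return list(values) if copy else values


# Pretend this is an old release: it has `array` but no `CopyMode`.
old_np = SimpleNamespace(array=old_style_array)

data = [1, 2, 3]

# A non-empty string is truthy, so "never" silently triggers a copy.
result = old_np.array(data, copy="never")
copied_silently = result is not data

# The enum spelling cannot even be written against the old release.
failed_loudly = False
try:
    old_np.array(data, copy=old_np.CopyMode.NEVER)
except AttributeError:
    failed_loudly = True

print(copied_silently, failed_loudly)
```

The string version does the opposite of what was asked without any warning,
while the enum version stops at attribute lookup.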


> In general I don't expect we'd need (m)any more. Hence I'd suggest adding
> a new namespace like `np.flags` is not a good idea. Right now all we need
> is a single object, if we end up going the enum route.
>
> For this one, I'd say it kinda looks like we do need one, so then let's
> just add one and be done with it, rather than inventing odd patterns like
> tacking enum members onto an existing function.
>

I agree with both of these. If we're only going to add a couple of enums,
it's not worth worrying about a couple of extra objects polluting NumPy's
namespace. I would just add np.CopyMode, rather than inventing a new design
pattern.

At some point in the future, we might either:
(1) switch the interface to use strings, in which case we would stop
recommending/documenting CopyMode (like plenty of other top level objects
in the NumPy namespace)
(2) add many more enums, in which case we can consider assigning enums as
function attributes or putting them in a namespace. But so far the only
other enum I've heard suggested is np.ClipMode. Adding two enums to the
NumPy namespace would hardly make a difference at this point, given how
many objects are already there.