[Numpy-discussion] Documentation Team meeting - Monday December 7

2020-12-04 Thread Melissa Mendonça
Hi all!

Our next Documentation Team meeting will be on *Monday, December 7* at ***4PM
UTC***. All are welcome - you don't need to already be a contributor to
join. If you have questions or are curious about what we're doing, we'll be
happy to meet you!

If you wish to join on Zoom, use this link:

https://zoom.us/j/96219574921?pwd=VTRNeGwwOUlrYVNYSENpVVBRRjlkZz09


Here's the permanent hackmd document with the meeting notes (still being
updated in the next few days!):

https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg


Hope to see you around!

** You can click this link to get the correct time in your timezone:
https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20201207T16&p1=1440&ah=1


- Melissa
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Rules for argument parsing/forwarding in __array_function__ and __array_ufunc__

2020-12-04 Thread Sebastian Berg
On Wed, 2020-12-02 at 21:07 -0800, Stefan van der Walt wrote:
> Hi Sebastian,
> 
> Looking at these three rules, they all seem to stem from one simple
> question: do we desire for a single code snippet to be runnable on
> multiple array implementations?
> 
> On Wed, Dec 2, 2020, at 15:34, Sebastian Berg wrote:
> > 1. If an argument is invalid in NumPy it is considered an error.
> >    For example:
> > 
> >        np.log(arr, my_weird_argument=True)
> > 
> >    is always an error even if the `__array_function__`
> >    implementation of `arr` would support it.
> >    NEP 18 explicitly says that allowing forwarding could be done,
> >    but will not be done at this time.
> 
> Relaxing this rule will mean that code working for one array
> implementation (which has this keyword) may not work for another.


Indeed, while NEP 18 mentions it, I personally don't see why we should
relax it. (The NEP 13 implementation does so, but this is an
unintentional, and not optimal, implementation detail.)
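
A minimal sketch of rule 1 in plain Python (not NumPy's actual dispatch code): because the public function has a fixed signature, Python itself rejects an unknown keyword before the `__array_function__` hook is consulted. The names `log` and `WeirdArray` are hypothetical and exist only for this illustration.

```python
import math


class WeirdArray:
    """Hypothetical array type whose override would happily accept extra keywords."""

    def __init__(self, values):
        self.values = list(values)

    def __array_function__(self, func, types, args, kwargs):
        return ("handled by WeirdArray", func.__name__, sorted(kwargs))


def log(x, out=None):
    """Toy public function with a fixed signature (an assumption, not np.log)."""
    hook = getattr(type(x), "__array_function__", None)
    if hook is not None:
        # Only keywords that were explicitly passed are forwarded.
        kwargs = {} if out is None else {"out": out}
        return hook(x, log, (type(x),), (x,), kwargs)
    return [math.log(v) for v in x]


arr = WeirdArray([1.0, 2.0])
print(log(arr))  # the override handles the call

try:
    log(arr, my_weird_argument=True)
except TypeError as exc:
    # The signature check fails before WeirdArray is ever asked.
    print("rejected before dispatch:", exc)
```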


> 
> > 2. Arguments must only be forwarded if they are passed in:
> > 
> >        np.mean(cupy_array)
> > 
> >    ends up as `cupy.mean(cupy_array)` and not:
> > 
> >        cupy.mean(cupy_array, axis=None, dtype=None, out=None,
> >                  keepdims=False, where=True)
> > 
> >    meaning that CuPy does not need to implement all of those kwargs
> >    and NumPy can add new ones without breaking anyone's code.
> 
> This may ultimately make it harder for array implementors (they will
> only see errors once someone tries to pass in an argument that they
> forgot to implement).  Perhaps better to pass all so they know what
> they're dealing with?

True, we do this for `np.mean(obj)`, etc. which end up calling
`obj.mean()`, but compared to protocols which explicitly ask for NumPy
compatibility, those method forwards are not as clearly defined.
So maybe we should actually pass on everything (including the default
values?); that is actually safer if we ever update a default.

The downside would remain that a newer NumPy is likely to cause a break
until the project updates (e.g. if we add a keyword argument).
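
To make the trade-off concrete, here is a small plain-Python sketch of the two forwarding strategies. Everything in it is hypothetical: `_NoValue` is an illustrative "not passed" sentinel, and `MinimalBackendArray` stands in for a project that only implements part of the signature.

```python
_NoValue = object()  # hypothetical sentinel meaning "argument was not passed"


class MinimalBackendArray:
    """Hypothetical backend whose mean() only implements `axis`."""

    def mean(self, axis=None):
        return ("backend mean", axis)


def mean_forward_passed(a, axis=_NoValue, dtype=_NoValue, keepdims=_NoValue):
    """Strategy A: forward only the keywords the caller actually supplied."""
    candidates = {"axis": axis, "dtype": dtype, "keepdims": keepdims}
    kwargs = {k: v for k, v in candidates.items() if v is not _NoValue}
    return a.mean(**kwargs)


def mean_forward_all(a, axis=None, dtype=None, keepdims=False):
    """Strategy B: forward every keyword, defaults included."""
    return a.mean(axis=axis, dtype=dtype, keepdims=keepdims)


arr = MinimalBackendArray()
print(mean_forward_passed(arr))          # works: backend sees no keywords
print(mean_forward_passed(arr, axis=0))  # works: backend implements axis

try:
    mean_forward_all(arr)
except TypeError as exc:
    # The gap in the backend surfaces on the very first call.
    print("backend is missing keywords:", exc)
```

With strategy A the gap only shows up once somebody passes `dtype`; with strategy B it shows up immediately, but so would any keyword added in a later release.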

If we were open to this (plus an insignificant change in subclass
handling), it would be easy to at least halve the overhead of Python
`__array_function__` dispatching.
That is because it would allow us to inline (in Python):

    def function(arg1, arg2, kwarg1=None):
        dispatched = dispatch((arg1,), arg1, arg2, kwarg1=kwarg1)
        if dispatched is not NotImplemented:
            return dispatched

        # normal code here (some argument validation could come first)


This may look strange, but it only has to go through 1-2 function calls
where currently we go through 4.
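
A runnable toy version of that inlined pattern might look like the following. It is only a sketch under stated assumptions: the `dispatch` helper here is hypothetical, it takes the public function as an extra first parameter (the snippet above leaves that detail open), and `total` and `LoggingArray` are made-up names.

```python
def dispatch(public_func, relevant_args, *args, **kwargs):
    """Hypothetical helper: let the first overriding argument handle the call.

    Returns NotImplemented when no relevant argument takes over.
    """
    types = tuple(type(arg) for arg in relevant_args)
    for arg in relevant_args:
        hook = getattr(type(arg), "__array_function__", None)
        if hook is None:
            continue
        result = hook(arg, public_func, types, args, kwargs)
        if result is not NotImplemented:
            return result
    return NotImplemented


def total(arr, scale, offset=None):
    # Inlined dispatch: every argument is forwarded, defaults included.
    dispatched = dispatch(total, (arr,), arr, scale, offset=offset)
    if dispatched is not NotImplemented:
        return dispatched

    # Normal code path for plain sequences.
    result = sum(arr) * scale
    return result if offset is None else result + offset


class LoggingArray:
    """Hypothetical array type that overrides every function it is passed to."""

    def __init__(self, values):
        self.values = list(values)

    def __array_function__(self, func, types, args, kwargs):
        return ("overridden", func.__name__, kwargs)


print(total([1, 2, 3], 2))
print(total(LoggingArray([1, 2, 3]), 2))
```

Note that the override receives `offset=None` even though the caller never passed it, which is exactly the "pass on everything" behaviour discussed above.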

The other change would also allow us to remove *all* overhead for
functions defined in C.


> 
> > 3. NumPy should not check the *validity* of the arguments. For
> >    example: `np.add.reduce(xarray, axis="long")` should probably
> >    work in xarray.
> >    (`xarray.DataArray` does not actually implement the above.)
> >    But a string cannot be used as an axis in NumPy.
> 
> Getting back to the original question: if this code is to be run on
> multiple implementations, we should ensure that no strange values
> pass through.
> 
> Personally, I like the idea of a single API that works on multiple
> backends.  As such, I would 1) not pass through unknown arguments, 2)
> always pass through all arguments, and 3) validate inputs to each
> call.


Thanks for the input!  I think point 2) is in a sense the most
interesting, because the approach `pytorch` takes to remove the
overhead of array-function gets very complicated without it.

In the end, whether arguments are validated should maybe be considered
an implementation detail... I.e. if there is a good reason why
validating is a problem, we can stop doing it and otherwise there is no
need to worry about it. (Although for ufuncs, I would go the
non-validating route for now personally.)
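
As a toy illustration of the validating versus non-validating routes (plain Python, with a hypothetical `LabeledArray` type and a made-up `add_reduce` front end, not xarray or NumPy code):

```python
class LabeledArray:
    """Hypothetical 2-D array type that accepts dimension names as axes."""

    def __init__(self, data, dims):
        self.data = data         # list of rows
        self.dims = tuple(dims)  # e.g. ("lat", "long")

    def sum_over(self, axis):
        index = self.dims.index(axis) if isinstance(axis, str) else axis
        if index == 0:
            return [sum(col) for col in zip(*self.data)]
        return [sum(row) for row in self.data]


def add_reduce(obj, axis=0, validate=True):
    """Toy front end; `validate` toggles the check done before forwarding."""
    if validate and not isinstance(axis, int):
        raise TypeError(f"axis must be an integer, got {axis!r}")
    return obj.sum_over(axis)


arr = LabeledArray([[1, 2], [3, 4]], dims=("lat", "long"))

# Non-validating route: the backend gets to interpret the string axis.
print(add_reduce(arr, axis="long", validate=False))

# Validating route: the same call is rejected before the backend sees it.
try:
    add_reduce(arr, axis="long")
except TypeError as exc:
    print("rejected by front end:", exc)
```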

Cheers,

Sebastian


> 
> Best regards,
> Stéfan
