[Numpy-discussion] Subclassing vs. dispatch

2021-01-15 Thread Israel, Daniel M
I hope this is the right place to post this.

The numpy documentation talks about two methods for making ndarray-like 
objects, subclassing and dispatching, but it is not clear to me which one is 
most appropriate for which purpose.  Can someone provide, or point me to, some
guidance about this?  I’m particularly interested in what happens if there are
multiple layers of subclassing.  Can you subclass from a subclass?  Dispatch 
from a dispatch?  Subclass from a dispatch and vice versa?

My specific application is a pair of classes, SpectralArray and PhysicalArray,
that use numpy.fft to provide to_physical() and to_spectral() methods,
respectively, to simplify writing pseudo-spectral codes.  Initially this will
be serial, but the implementation will eventually use a mechanism similar to 
mpi4py-fft to allow the arrays to be distributed.  Further, it would be nice to 
be able to make the code interoperable with the cupy CUDA numpy implementation, 
so that the sub array on each MPI process could use GPU accelerated FFTs.

Advice?  Thanks.

—
Daniel M. Israel, Ph. D.
XCP-4: Methods & Algorithms
Mailstop F644
Los Alamos National Laboratory
505 665 5664
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Subclassing vs. dispatch

2021-01-15 Thread Sebastian Berg
On Fri, 2021-01-15 at 18:38 +, Israel, Daniel M wrote:
> I hope this is the right place to post this.
> 
> The numpy documentation talks about two methods for making ndarray-
> like objects, subclassing and dispatching, but it is not clear to me
> which one is most appropriate for which purpose.  Can someone
> provide, or point me to, some guidance about this?  I’m particularly
> interested in what happens if there are multiple layers of
> subclassing.  Can you subclass from a subclass?  Dispatch from a
> dispatch?  Subclass from a dispatch and vice versa?

All of those things can be made to work with appropriate use of
`super()`. Subclassing and dispatching are not mutually exclusive (an
example is `astropy.units.Quantity`).

If you want to go well beyond typical NumPy behaviour, I would suggest
focusing on dispatching. If all you want is to add a single method,
subclassing should be a pretty good fit. (Assuming you don't mind that
some operations may give you a normal array, or return your array type
when a normal array would fit better.)

For example, MaskedArray in NumPy is a subclass, but it adds so many
additional things that dispatching without subclassing is likely a
better fit. (Opinions will probably differ; I expect that with
subclassing some things will "just work". However, sometimes the things
that "just work" may also do the wrong thing.)  Ignoring the mask of a
MaskedArray is always a serious issue.

> My specific application is a pair of classes, SpectralArray and
> PhysicalArray, that use numpy.fft to provide to_physical() and
> to_spectral() methods, respectively, to simplify writing
> pseudo-spectral codes.  Initially this will be serial, but the
> implementation will eventually use a mechanism similar to mpi4py-fft
> to allow the arrays to be distributed.  Further, it would be nice to
> be able to make the code interoperable with the cupy CUDA numpy
> implementation, so that the sub array on each MPI process could use
> GPU accelerated FFTs.

It sounds like you mostly want to add a set of methods, so making a
MixIn class and using subclassing may well be a good option.  You can
still add `__array_function__` or `__array_ufunc__` with a fallback to
`super()` to override specific functions.
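
A minimal sketch of that mix-in approach, under some assumptions: the
mix-in names `SpectralMethods` and `PhysicalMethods` are made up for
illustration, the method bodies are just one possible way to use
`numpy.fft`, and overriding `np.fft.fftn` is only an example of
"override one specific function and fall back to `super()` for the
rest".

```python
import numpy as np


class SpectralMethods:
    """Hypothetical mix-in for arrays holding Fourier coefficients."""

    def to_physical(self):
        # Work on a plain ndarray, then re-tag the result as a PhysicalArray.
        return np.fft.ifftn(np.asarray(self)).view(PhysicalArray)


class PhysicalMethods:
    """Hypothetical mix-in for arrays holding physical-space samples."""

    def to_spectral(self):
        return np.fft.fftn(np.asarray(self)).view(SpectralArray)

    def __array_function__(self, func, types, args, kwargs):
        # Let ndarray do the actual work for every NumPy function ...
        result = super().__array_function__(func, types, args, kwargs)
        # ... and override one specific function: the forward FFT of a
        # physical array is re-tagged as a SpectralArray.
        if func is np.fft.fftn and isinstance(result, np.ndarray):
            result = np.asarray(result).view(SpectralArray)
        return result


class SpectralArray(SpectralMethods, np.ndarray):
    pass


class PhysicalArray(PhysicalMethods, np.ndarray):
    pass


u = np.arange(8.0).reshape(2, 4).view(PhysicalArray)
u_hat = np.fft.fftn(u)        # dispatched through __array_function__
u_back = u_hat.to_physical()  # complex-valued PhysicalArray
```

Ordinary arithmetic such as `u + 1.0` keeps the subclass without any
extra code, which is the "just works" part of subclassing.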

If there is more to it (e.g. metadata for frequency scales or similar),
it may be better to skip subclassing altogether. (Just to mention: in
such a case `xarray` may be interesting.)

Since you are also looking for distributed arrays, you should probably
look into Dask (I do not know `mpi4py` though). Dask arrays consist of
distributed NumPy or CuPy arrays and make use of the dispatching in
NumPy.
Note that NumPy arrays cannot be distributed or GPU-backed, and you
cannot add that capability with a subclass. So if that is the aim, do
not subclass ndarray unless you are prepared to create multiple
(sub)classes (ndarray, Dask array, CuPy array).
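
To illustrate the alternative without subclassing, here is a rough
sketch of a container that wraps a backend array and relies on
dispatching. The class name `SpectralField` and its `data` attribute
are hypothetical. Only NumPy is exercised here; whether the same
wrapper behaves well around a CuPy or Dask array is an assumption that
would need testing.

```python
import numpy as np
from numpy.lib.mixins import NDArrayOperatorsMixin


class SpectralField(NDArrayOperatorsMixin):
    """Hypothetical container that wraps a backend array instead of subclassing."""

    def __init__(self, data):
        # `data` is assumed to be any array that supports NumPy ufuncs,
        # e.g. a numpy.ndarray here.
        self.data = data

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Keep the sketch small: only plain calls without `out=` are handled.
        if method != "__call__" or "out" in kwargs:
            return NotImplemented
        # Unwrap our own containers and let the backend array do the work.
        unwrapped = [x.data if isinstance(x, SpectralField) else x for x in inputs]
        result = ufunc(*unwrapped, **kwargs)
        if isinstance(result, tuple):  # ufuncs with several outputs
            return tuple(SpectralField(r) for r in result)
        return SpectralField(result)


a_hat = SpectralField(np.array([1.0, 2.0, 3.0]))
b_hat = 2 * a_hat + 1      # operators come from NDArrayOperatorsMixin
c_hat = np.exp(a_hat)      # ufuncs are routed through __array_ufunc__
```

Because the container never inherits from ndarray, nothing "just
works" by accident: every supported operation has to be routed
explicitly, and the wrapped array decides where the computation runs.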

Cheers,

Sebastian


[Numpy-discussion] Documentation Team meeting - Monday January 18

2021-01-15 Thread Melissa Mendonça
Hi all!

Our next Documentation Team meeting will be on *Monday, January 18* at ***4PM
UTC***. All are welcome - you don't need to already be a contributor to
join. If you have questions or are curious about what we're doing, we'll be
happy to meet you!

If you wish to join on Zoom, use this link:

https://zoom.us/j/96219574921?pwd=VTRNeGwwOUlrYVNYSENpVVBRRjlkZz09


Here's the permanent hackmd document with the meeting notes (still being
updated in the next few days!):

https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg


Hope to see you around!

** You can click this link to get the correct time in your time zone:
https://www.timeanddate.com/worldclock/fixedtime.html?msg=NumPy+Documentation+Team+Meeting&iso=20210118T16&p1=1440&ah=1


- Melissa