[Numpy-discussion] Plans for new sparse compilation backend for PyData/Sparse

2024-01-03 Thread Hameer Abbasi
Hello everyone,

The stated goal for sparse is to provide a NumPy-like API with a sparse
representation of arrays. To this end, Quansight and I have been collaborating
with researchers at MIT CSAIL, in particular Prof. Amarasinghe's group and
the TACO team, to develop a performant and production-ready package for
N-dimensional sparse arrays. Several attempts were made to explore this over
the last couple of years, including an LLVM back-end for TACO, and a pure-C++
template-metaprogramming approach called XSparse.

We at Quansight are happy to announce that we have received funding from
DARPA, together with our partners from MIT, under their Small Business
Innovation Research (SBIR) program, to build out sparse using
state-of-the-art just-in-time compilation strategies to boost performance
for users. Additionally, as an interface, we'll adopt the Array API standard,
which is championed by major libraries like NumPy, PyTorch and CuPy.

More details about the plan are posted on GitHub; please join in the
discussion there, to keep it all in one place.

Best Regards,

Hameer Abbasi
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] [ANN] sparse 0.15.1 released

2024-01-10 Thread Hameer Abbasi
Hi all,

On behalf of the sparse developers, I’m pleased to announce the release of 
sparse 0.15.1. This is a significant release with Array API support, 
pyproject.toml packaging, einsum support and other miscellaneous improvements.
Source Code: https://github.com/pydata/sparse
Documentation: https://sparse.pydata.org/en/stable/
Discussions: https://github.com/pydata/sparse/discussions

Best regards,
Hameer Abbasi


Re: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray

2018-03-08 Thread Hameer Abbasi
Not that I'm against different "levels" of ndarray granularity, but I just
don't want it to introduce complexity for the end-user. For example, it
would be unreasonable to expect the end-user to check separately for each
part of the interface they need support for.

Keeping this in view: different levels make sense if and only if they are
strict sub/supersets of each other, so the user can just check for the
highest level of compatibility they require; but even then they would need
to learn about the different "levels".

PS, thanks for putting this together! I was thinking of doing it this
weekend but you beat me to it and covered aspects I wouldn’t have thought
of.

The name "asarraylike" appeals to me, as does a "custom=" kwarg for
asanyarray.


Sent from Astro for Mac

On Mar 9, 2018 at 02:51, Juan Nunez-Iglesias wrote:


On Fri, Mar 9, 2018, at 5:56 AM, Stephan Hoyer wrote:

Marten's case 1: works exactly like ndarray, but stores data differently:
parallel arrays (e.g., dask.array), sparse arrays (e.g.,
https://github.com/pydata/sparse), hypothetical non-strided arrays (e.g.,
always C ordered).


Two other "hypotheticals" that would fit nicely in this space:
- the Open Connectome folks (https://neurodata.io) proposed linearising
indices using space-filling curves, which minimizes cache misses (or IO
reads) for giant volumes. I believe they implemented this but can't find it
currently.
- the N5 format for chunked arrays on disk:
https://github.com/saalfeldlab/n5

Finally, for the name, what about `asduckarray`? Though perhaps that could
be a source of confusion, given the gradation of duck-array-like types.


I suggest that the name should *not* use programmer lingo, so neither
"abstract" nor "duck" should be in there. My humble proposal is
"arraylike". (I know that this term has included things like "list-of-list"
before but only in text, not code, as far as I know.)

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ENH: softmax

2018-03-14 Thread Hameer Abbasi
If possible, write it as a `gufunc`, so duck arrays can override with
`__array_ufunc__` if necessary. -- Marten

Softmax is a very simple combination of elementary `ufunc`s with two
inputs, the weight vector `w` and the data `x`. Writing it as a `gufunc`
would be going overboard, IMO. Writing it as a combination of `ufunc`s and
avoiding Numpy-specific stuff should be good enough.
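As a rough sketch of what "a combination of elementary ufuncs" might look like (note the thread's version takes a separate weight vector `w`; this is the plain single-input variant, for illustration only):

```python
import numpy as np

def softmax(x, axis=-1):
    # Shift by the max for numerical stability. Every step here is a ufunc
    # or a ufunc reduction (maximum.reduce, subtract, exp, add.reduce,
    # divide), so a duck array implementing __array_ufunc__ can intercept
    # each operation without any gufunc machinery.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)
```

The probabilities sum to one along the reduced axis, and larger inputs map to larger probabilities.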


[Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-16 Thread Hameer Abbasi
Hello, everyone. I’ve submitted a PR to add an initializer kwarg to
ufunc.reduce. This is useful in a few cases: e.g., it allows one to supply
a “default” value for identity-less ufunc reductions, and to specify an
initial value for reductions such as sum (other than zero).

Please feel free to review or leave feedback (although I think Eric and
Marten have picked it apart pretty well).

https://github.com/numpy/numpy/pull/10635
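For reference, the kwarg from this PR eventually landed under the name `initial` (the renaming is discussed later in this thread); a small sketch of the behaviour being proposed, written against NumPy >= 1.15:

```python
import numpy as np

# The seed participates in the reduction, it is not just a fallback:
total = np.add.reduce([1, 2, 3], initial=10)  # 10 + 1 + 2 + 3

# For identity-less reductions over empty arrays, the seed is what makes
# the reduction well-defined at all (without it, this raises):
smallest = np.minimum.reduce(np.array([]), initial=np.inf)
```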

Thanks,

Hameer
Sent from Astro for Mac


Re: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray

2018-03-17 Thread Hameer Abbasi
It would be nice if there was an IntEnum [1] that is taken as an input to
`np.asarrayish` and `np.isarrayish` to require a combination of the groups
of attributes/methods/semantics.


Don’t you mean IntFlag?
I like Marten’s idea of “grouping together” related functionality via ABCs
(for example, in pydata/sparse we use NDArrayOperatorsMixin for exactly
this), but I believe that separate ABCs should be provided for different
parts of the interface.

Then we can either:

   1. Check with isinstance for the ABCs, or
   2. Check with hasattr.

I like the IntFlag idea most (it seems to be designed for use-cases like
these), but a string-based (np.aspyarray(x,
functionality='arithmetic|reductions')) or list-based (np.aspyarray(x,
functionality=['arithmetic', 'reductions'])) approach is also fine.

It might help to have some sort of a “dry-run” interface that (given a run
of code) figures out which parts you need.
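A minimal sketch of the IntFlag idea; the flag names and the single-comparison check are hypothetical, none of this is an actual NumPy API:

```python
import enum

# Hypothetical groups of functionality a duck array might advertise.
class Arrayish(enum.IntFlag):
    ARITHMETIC = enum.auto()
    REDUCTIONS = enum.auto()
    FANCY_INDEXING = enum.auto()

# What some duck array claims to support...
supported = Arrayish.ARITHMETIC | Arrayish.REDUCTIONS | Arrayish.FANCY_INDEXING

# ...and what a caller requires, checked in a single bitwise comparison:
required = Arrayish.ARITHMETIC | Arrayish.REDUCTIONS
compatible = (required & supported) == required
```

This is exactly the use-case IntFlag is designed for: combinations compose with `|` and a requirement check is one `&` plus a comparison.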


Re: [Numpy-discussion] new NEP: np.AbstractArray and np.asabstractarray

2018-03-22 Thread Hameer Abbasi
I think that with your comments in mind, it may just be best to embrace
duck typing, like Matthew suggested. I propose the following workflow:

   - __array_concatenate__ and similar "protocol" functions return
   NotImplemented if they won't work.
   - "Base functions" that can be called directly like __getitem__ raise
   NotImplementedError if they won't work.
   - __arrayish__ = True

Then, something like np.concatenate would do the following:

   - Call __array_concatenate__ following the same order as ufunc arguments.
   - If everything fails, raise NotImplementedError (or convert everything
   to ndarray).

Overloaded functions would do something like this (perhaps a simple
decorator will do for the repetitive work?):

   - Try with np.arrayish
   - Catch NotImplementedError
      - Try with np.array

Then, we use abstract classes just to overload functionality or implement
things in terms of others. If something fails, we have a decent fallback.
We don't need to do anything special in order to "check" functionality.

Feel free to propose changes, but this is the best I could come up with
that would require the smallest incremental changes to Numpy while also
supporting everything right from the start.
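A rough sketch of that workflow under stated assumptions: the "arrayish protocol" and `concatenate` stub below are hypothetical stand-ins, not real NumPy machinery; only the try/fallback shape mirrors the proposal.

```python
import functools
import numpy as np

def with_ndarray_fallback(duck_impl):
    """Hypothetical decorator for the 'try arrayish, fall back to ndarray' flow."""
    @functools.wraps(duck_impl)
    def wrapper(*arrays, **kwargs):
        try:
            # First attempt: let duck arrays handle it via their protocol.
            return duck_impl(*arrays, **kwargs)
        except NotImplementedError:
            # Decent fallback: coerce everything to ndarray and retry.
            return duck_impl(*[np.asarray(a) for a in arrays], **kwargs)
    return wrapper

@with_ndarray_fallback
def concatenate(*arrays, axis=0):
    # Stand-in for a real __array_concatenate__ dispatch: this stub only
    # knows about ndarrays and signals "won't work" for anything else.
    if not all(isinstance(a, np.ndarray) for a in arrays):
        raise NotImplementedError
    return np.concatenate(arrays, axis=axis)
```

Calling `concatenate([1, 2], [3])` with plain lists takes the fallback path and still produces a concatenated ndarray, which is the "smallest incremental change" property argued for above.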

On Thu, Mar 22, 2018 at 9:14 AM, Nathaniel Smith  wrote:

> On Sat, Mar 10, 2018 at 4:27 AM, Matthew Rocklin 
> wrote:
> > I'm very glad to see this discussion.
> >
> > I think that coming up with a single definition of array-like may be
> > difficult, and that we might end up wanting to embrace duck typing
> instead.
> >
> > It seems to me that different array-like classes will implement different
> > mixtures of features.  It may be difficult to pin down a single
> definition
> > that includes anything except for the most basic attributes (shape and
> > dtype?).  Consider two extreme cases of restrictive functionality:
> >
> > LinearOperators (support dot in a numpy-like way)
> > Storage objects like h5py (support getitem in a numpy-like way)
> >
> > I can imagine authors of both groups saying that they should qualify as
> > array-like because downstream projects that consume them should not
> convert
> > them to numpy arrays in important contexts.
>
> I think this is an important point -- there are a lot of subtleties in
> the interfaces that different objects might want to provide. Some
> interesting ones that haven't been mentioned:
>
> - a "duck array" that has everything except fancy indexing
> - xarray's arrays are just like numpy arrays in most ways, but they
> have incompatible broadcasting semantics
> - immutable vs. mutable arrays
>
> When faced with this kind of situation, it's always tempting to try to
> write down some classification system to capture every possible
> configuration of interesting behavior. In fact, this is one of the
> most classic nerd snipes; it's been catching people for literally
> thousands of years [1]. Most of these attempts fail though :-).
>
> So let's back up -- I probably erred in not making this more clear in
> the NEP, but I actually have a fairly concrete use case in mind here.
> What happened is, I started working on a NEP for
> __array_concatenate__, and my thought pattern went as follows:
>
> 1) Cool, this should work for np.concatenate.
> 2) But what about all the other variants, like np.row_stack. We don't
> want __array_row_stack__; we want to express row_stack in terms of
> concatenate.
> 3) Ok, what's row_stack? It's:
>   np.concatenate([np.atleast_2d(arr) for arr in arrs], axis=0)
> 4) So I need to make atleast_2d work on duck arrays. What's
> atleast_2d? It's: asarray + some shape checks and indexing with
> newaxis
> 5) Okay, so I need something atleast_2d can call instead of asarray [2].
>
> And this kind of pattern shows up everywhere inside numpy, e.g. it's
> the first thing inside lots of functions in np.linalg b/c they do some
> futzing with dtypes and shape before delegating to ufuncs, it's the
> first thing the mean() function does b/c it needs to check arr.dtype
> before proceeding, etc. etc.
>
> So, we need something we can use in these functions as a first step
> towards unlocking the use of duck arrays in general. But we can't
> realistically go through each of these functions, make an exact list
> of all the operations/attributes it cares about, and then come up with
> exactly the right type constraint for it to impose at the top. And
> these functions aren't generally going to work on LinearOperators or
> h5py datasets anyway.
>
> We also don't want to go through every function in numpy and add new
> arguments to control this coercion behavior.
>
> What we can do, at least to start, is to have a mechanism that passes
> through objects that aspire to be "complete" duck arrays, like dask
> arrays or sparse arrays or astropy's unit arrays, and then if it turns
> out that in practice people find uses for finer-grained distinctions,
> we can iteratively add those as a second pass. Notice that if a
> function starts out requiring a "comple

Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Hameer Abbasi
 It calls it `initializer` - See
https://docs.python.org/3.5/library/functools.html#functools.reduce

Sent from Astro <https://www.helloastro.com> for Mac

On Mar 26, 2018 at 09:54, Eric Wieser  wrote:


It turns out I misspoke - functools.reduce calls the argument `initial`

On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer  wrote:

> This looks like a very logical addition to the reduce interface. It has my
> support!
>
> I would have preferred the more descriptive name "initial_value", but
> consistency with functools.reduce makes a compelling case for "initializer".
>
> On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser 
> wrote:
>
>> To reiterate my comments in the issue - I'm in favor of this.
>>
>> It seems especially valuable for identity-less functions (`min`,
>> `max`, `lcm`), and the argument name is consistent with `functools.reduce`,
>> too.
>>
>> The only argument I can see against merging this would be `kwarg`-creep
>> of `reduce`, and I think this has enough use cases to justify that.
>>
>> I'd like to merge in a few days, if no one else has any opinions.
>>
>> Eric
>>
>> On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi 
>> wrote:
>>
>>> Hello, everyone. I’ve submitted a PR to add an initializer kwarg to
>>> ufunc.reduce. This is useful in a few cases, e.g., it allows one to supply
>>> a “default” value for identity-less ufunc reductions, and specify an
>>> initial value for reductions such as sum (other than zero).
>>>
>>> Please feel free to review or leave feedback, (although I think Eric and
>>> Marten have picked it apart pretty well).
>>>
>>> https://github.com/numpy/numpy/pull/10635
>>>
>>> Thanks,
>>>
>>> Hameer
>>> Sent from Astro <https://www.helloastro.com> for Mac
>>>


Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Hameer Abbasi
Actually, the behavior right now isn’t that of `default` but that of
`initializer` or `start`.

This was discussed further down in the PR, but to reiterate: `np.sum([10],
initializer=5)` becomes `15`.

Also, `np.min([5], initializer=0)` becomes `0`, so it isn’t really the
default value; it’s the initial value from which the reduction starts.

This was the reason to call it initializer in the first place. I like
`initial` and `initial_value` as well, and `start` also makes sense but
isn’t descriptive enough.
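The two examples in this message, runnable with the `initial` name the kwarg eventually shipped under (NumPy >= 1.15):

```python
import numpy as np

s = np.sum([10], initial=5)  # the seed is folded into the sum
m = np.min([5], initial=0)   # the seed takes part in the minimum
```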

Hameer
Sent from Astro <https://www.helloastro.com> for Mac

On Mar 26, 2018 at 12:06, Sebastian Berg  wrote:


Initializer or this sounds fine to me. As another data point which I
think has been mentioned before, `sum` uses start and min/max use
default. `start` does not work, unless we also change the code to
always use the identity if given (currently that is not the case), in
which case it might be nice. However, "start" seems a bit like solving
a different issue in any case.

Anyway, mostly noise. I really like adding this, the only thing worth
discussing a bit is the name :).

- Sebastian


On Mon, 2018-03-26 at 05:57 -0400, Hameer Abbasi wrote:

It calls it `initializer` - See
https://docs.python.org/3.5/library/functools.html#functools.reduce

Sent from Astro for Mac

On Mar 26, 2018 at 09:54, Eric Wieser 
wrote:

It turns out I misspoke - functools.reduce calls the argument
`initial`

On Mon, 26 Mar 2018 at 00:17 Stephan Hoyer 
wrote:

This looks like a very logical addition to the reduce interface.
It has my support!

I would have preferred the more descriptive name "initial_value",
but consistency with functools.reduce makes a compelling case for
"initializer".

On Sun, Mar 25, 2018 at 1:15 PM Eric Wieser  wrote:

To reiterate my comments in the issue - I'm in favor of this.

It seems especially valuable for identity-less functions
(`min`, `max`, `lcm`), and the argument name is consistent with
`functools.reduce`, too.

The only argument I can see against merging this would be
`kwarg`-creep of `reduce`, and I think this has enough use
cases to justify that.

I'd like to merge in a few days, if no one else has any
opinions.

Eric

On Fri, 16 Mar 2018 at 10:13 Hameer Abbasi  wrote:

Hello, everyone. I’ve submitted a PR to add a initializer
kwarg to ufunc.reduce. This is useful in a few cases, e.g.,
it allows one to supply a “default” value for identity-less
ufunc reductions, and specify an initial value for reductions
such as sum (other than zero.)

Please feel free to review or leave feedback, (although I
think Eric and Marten have picked it apart pretty well).

https://github.com/numpy/numpy/pull/10635

Thanks,

Hameer
Sent from Astro for Mac



Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Hameer Abbasi
That is the idea, but NaN functions are in a separate branch for another PR
to be discussed later. You can see it on my fork, if you're interested.

On 26/03/2018 at 17:35, Benjamin wrote:

> Hmm, this is neat. I imagine it would finally give some people a choice
> on what np.nansum([np.nan]) should return? It caused a huge hullabaloo a
> few years ago when we changed it from returning NaN to returning zero.
>
> Ben Root

On Mon, Mar 26, 2018 at 11:16 AM, Sebastian Berg wrote:

> OK, the new documentation is actually clear:
>
>     initializer : scalar, optional
>         The value with which to start the reduction. Defaults to the
>         `~numpy.ufunc.identity` of the ufunc. If ``None`` is given, the
>         first element of the reduction is used, and an error is thrown
>         if the reduction is empty. If ``a.dtype`` is ``object``, then
>         the initializer is _only_ used if the reduction is empty.
>
> I would actually like to say that I do not like the object special case
> much (and it is probably the reason why I was confused), nor am I quite
> sure this is what helps a lot. Logically, I would argue there are two
> things:
>
> 1. initializer/start (always used)
> 2. default (only used for empty reductions)
>
> For example, I might like to give `np.nan` as the default for some empty
> reductions; this will not work. I understand that this is a minimally
> invasive PR and I am not sure I find the solution bad enough to really
> dislike it, but what do others think? My first expectation was the
> default behaviour (in all cases, not just the object case) for some
> reason.
>
> To be honest, for now I just wonder a bit: how hard would it be to do
> both, or is that too annoying? It would at least get rid of that
> annoying thing with object ufuncs (which currently have a default, but
> not really an identity/initializer).
>
> Best,
> Sebastian

Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Hameer Abbasi
It'll need to be thought out for object arrays and subclasses. But for
regular numeric stuff, NumPy uses fmin and this would have the desired
effect.

On 26/03/2018 at 17:45, Sebastian wrote:

> On Mon, 2018-03-26 at 11:39 -0400, Hameer Abbasi wrote:
> > That is the idea, but NaN functions are in a separate branch for
> > another PR to be discussed later. You can see it on my fork, if
> > you're interested.
>
> Except that, as far as I understand, I am not sure it will help much
> with it, since it is not a default, but an initializer. Initializing to
> NaN would just make all results NaN.
>
> - Sebastian

Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-03-26 Thread Hameer Abbasi
That may be complicated. Currently, the identity isn't used in object
dtype reductions. We may need to change that, which could cause a whole
lot of other backwards-incompatible changes, for example, sum actually
including zero in object reductions. Or we could pass in a flag saying an
initializer was passed in, to change that behaviour. If this is agreed
upon and someone is kind enough to point me to the code, I'd be willing
to make this change.

On 26/03/2018 at 18:54, Sebastian wrote:

> On Mon, 2018-03-26 at 18:48 +0200, Sebastian Berg wrote:
> > On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote:
> > > It'll need to be thought out for object arrays and subclasses. But
> > > for regular numeric stuff, NumPy uses fmin and this would have the
> > > desired effect.
>
> I do not want to block this, but I would like a clearer opinion about
> this issue. `np.nansum`, as Benjamin noted, would require something like:
>
>     np.nansum([np.nan], default=np.nan)
>
> because
>
>     np.sum([1], initializer=np.nan)
>     np.nansum([1], initializer=np.nan)
>
> would both give NaN if the logic is the same as the current `np.sum`.
> And yes, I guess for fmin/fmax NaN happens to work. And then there are
> many nonsense reduces which could make sense with `initializer`.
>
> Now nansum is not implemented in a way that could make use of the new
> kwarg anyway, so maybe it does not matter in some sense. We can in
> principle use `default` in nansum and at some point possibly add
> `default` to the normal ufuncs.
>
> If we argue like that, the only annoying thing is the `object` dtype,
> which confuses the two use cases currently. This confusion IMO is not
> harmless, because I might want to use it (e.g. sum with initializer=5),
> and I would expect things like dropping in `decimal.Decimal` to work
> most of the time, while here it would give silently bad results. In
> other words: I am very, very much in favor if you get rid of that
> object dtype special case. I frankly do not see why not (except that it
> needs a bit more code change). If given explicitly, we might as well
> force its use and not do the funny stuff which is designed to be more
> type-agnostic! If it happens to fail due to not being type-agnostic, it
> will at least fail loudly. If you leave that object special case in, I
> am *very* hesitant about it.
>
> That I think I would like a `default` argument as well is another
> issue, and it can wait for another day.
>
> - Sebastian

Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-04-09 Thread Hameer Abbasi
I've renamed the kwarg to `initial`. I'm willing to make the object dtype
changes as well, if someone points me to the relevant bits of code.

Unfortunately, currently, the identity is only used for object dtypes if
the reduction is empty. I think this is to prevent things like `0` being
passed in the sum of objects (and similar cases), which makes sense.

However, with the kwarg, it makes sense to include it in the reduction. I
think the change will be somewhere along these lines: detect whether
`initial` was passed; if so, include it for object dtypes, otherwise
exclude it.

I personally feel `initial` renders `default` redundant. It can be used for
both purposes. I can't think of a reasonable use case where you would want
the default to be different from the initial value. However, I do agree
that fixing the object case is important, we don't want users to get used
to this behaviour and then rely on it later.

Hameer
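
For reference, here is a quick sketch of the semantics discussed in this
thread, using the `initial` name (assuming NumPy >= 1.15, where the
keyword eventually landed):

```python
import numpy as np

# `initial` is included in the reduction, so it acts as a start value,
# not merely a default for empty reductions.
print(np.sum([10], initial=5))   # 15
print(np.min([5], initial=0))    # 0

# For an empty reduction, `initial` supplies the result.
print(np.sum([], initial=5))     # 5.0
```

This is exactly the behaviour described above: `initial` participates in
the reduction rather than serving only as a `default` for the empty case.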

On Mon, Mar 26, 2018 at 8:09 PM, Sebastian Berg 
wrote:

> On Mon, 2018-03-26 at 17:40 +, Eric Wieser wrote:
> > The difficulty in supporting object arrays is that func.reduce(arr,
> > initial=func.identity) and func.reduce(arr) have different meanings -
> > whereas with the current patch, they are equivalent.
> >
>
> True, but the current meaning is:
>
> func.reduce(arr, initial=, default=func.identity)
>
> in the case for object dtype. Luckily for normal dtypes, func.identity
> is both the correct default "default" and a no-op for initial. Thus the
> name "identity" kinda works there. I am also not really sure that both
> kwargs would make real sense (plus initial probably disallows
> default...), but I got some feeling that the "default" meaning may be
> even more useful to simplify special casing the empty case.
>
> Anyway, still just pointing out that I it gives me some headaches to
> see such a special case for objects :(.
>
> - Sebastian
>
>
> >
> > On Mon, 26 Mar 2018 at 10:10 Sebastian Berg wrote:
> > > On Mon, 2018-03-26 at 12:59 -0400, Hameer Abbasi wrote:
> > > > That may be complicated. Currently, the identity isn't used in
> > > object
> > > > dtype reductions. We may need to change that, which could cause a
> > > > whole lot of other backwards incompatible changes. For example,
> > > sum
> > > > actually including zero in object reductions. Or we could pass in
> > > a
> > > > flag saying an initializer was passed in to change that
> > > behaviour. If
> > > > this is agreed upon and someone is kind enough to point me to the
> > > > code, I'd be willing to make this change.
> > >
> > > I realize the implication, I am not suggesting to change the
> > > default
> > > behaviour (when no initial=... is passed), I would think about
> > > deprecating it, but probably only if we also have the `default`
> > > argument, since otherwise you cannot replicate the old behaviour.
> > >
> > > What I think I would like to see is to change how it works if (and
> > > only
> > > if) the initializer is passed in. Yes, this will require holding on
> > > to
> > > some extra information since you will have to know/remember whether
> > > the
> > > "identity" was passed in or defined otherwise.
> > >
> > > I did not check the code, but I would hope that it is not awfully
> > > tricky to do that.
> > >
> > > - Sebastian
> > >
> > >
> > > PS: A side note, but I see your emails as a single block of text
> > > with
> > > no/broken new-lines.
> > >
> > >
> > > > On 26/03/2018 at 18:54, Sebastian wrote:
> > > > On Mon, 2018-03-26 at 18:48 +0200, Sebastian Berg wrote:
> > > > On Mon, 2018-03-26 at 11:53 -0400, Hameer Abbasi wrote:
> > > > It'll need to be thought out for object arrays and subclasses.
> > > > But for regular numeric stuff, NumPy uses fmin and this would
> > > > have the desired effect.
> > > >
> > > > I do not want to block this, but I would like a clearer opinion
> > > > about this issue. `np.nansum`, as Benjamin noted, would require
> > > > something like: np.nansum([np.nan], default=np.nan) because
> > > > np.sum([1], initializer=np.nan) and np.nansum([1],
> > > > initializer=np.nan) would both give NaN if the logic is the same
> > > > as the current `np.sum`. And yes, I guess for fmin/fmax NaN
> > > > happens to work. And then there ar

Re: [Numpy-discussion] PR to add an initializer kwarg to ufunc.reduce (and similar functions)

2018-04-09 Thread Hameer Abbasi
>
> The reason would be the case of NaN which is not a possible initial
> value for the reduction.
>

Ah, I didn't think of that. However, at least for `min` and `max` this can
be accomplished with `fmin` and `fmax`.
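
To illustrate the point, using the `initial` keyword as it later landed
(assuming NumPy >= 1.15): NaN propagates through `minimum`, but the
NaN-ignoring `fmin`/`fmax` make it usable as an identity-like start value.

```python
import math
import numpy as np

# With np.minimum, NaN as the initial value poisons the whole reduction:
print(np.minimum.reduce([1.0, 2.0], initial=np.nan))  # nan

# fmin ignores NaN, so NaN behaves like an identity for the reduction:
print(np.fmin.reduce([1.0, 2.0], initial=np.nan))     # 1.0

# ... and an empty reduction falls back to the initial value, NaN:
print(np.fmin.reduce([], initial=np.nan))             # nan
```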
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Short-circuiting equivalent of np.any or np.all?

2018-04-26 Thread Hameer Abbasi
Hi Nathan,

np.any and np.all call np.logical_or.reduce and np.logical_and.reduce
respectively, and unfortunately the underlying function (ufunc.reduce)
has no way of detecting that the value isn’t going to change anymore.
It’s also used for (for example) np.sum (np.add.reduce), np.prod
(np.multiply.reduce), np.min (np.minimum.reduce), np.max
(np.maximum.reduce).

You can find more information about this on the ufunc doc page
<https://docs.scipy.org/doc/numpy/reference/ufuncs.html>. I don’t think
it’s worth it to break this machinery for any and all, as it has numerous
other advantages (such as being able to override it in duck arrays, etc.).

Best regards,
Hameer Abbasi
Sent from Astro <https://www.helloastro.com> for Mac

On Apr 26, 2018 at 18:45, Nathan Goldbaum  wrote:


Hi all,

I was surprised recently to discover that both np.any and np.all() do not
have a way to exit early:

In [1]: import numpy as np

In [2]: data = np.arange(1e6)

In [3]: print(data[:10])
[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]

In [4]: %timeit np.any(data)
724 us +- 42.4 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)

In [5]: data = np.zeros(int(1e6))

In [6]: %timeit np.any(data)
732 us +- 52.9 us per loop (mean +- std. dev. of 7 runs, 1000 loops each)

I don't see any discussions about this on the NumPy issue tracker but
perhaps I'm missing something.

I'm curious if there's a way to get a fast early-terminating search in
NumPy? Perhaps there's another package I can depend on that does this? I
guess I could also write a bit of cython code that does this but so far
this project is pure python and I don't want to deal with the packaging
headache of getting wheels built and conda-forge packages set up on all
platforms.

Thanks for your help!

-Nathan

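
A pure-NumPy workaround (my sketch, not from the thread) is to scan the
array in fixed-size chunks and stop at the first chunk containing a truthy
element; when a match occurs near the front, this exits early without
scanning the rest:

```python
import numpy as np

def any_chunked(arr, chunk=4096):
    """Short-circuiting version of np.any over a flattened array."""
    a = np.asarray(arr).ravel()
    for start in range(0, a.size, chunk):
        # np.any on a small chunk is cheap; return as soon as one hits.
        if np.any(a[start:start + chunk]):
            return True
    return False

print(any_chunked(np.arange(1e6)))        # True (found in first chunk)
print(any_chunked(np.zeros(int(1e6))))    # False (full scan needed)
```

The chunk size trades per-chunk Python overhead against wasted work past
the first match; a few thousand elements is usually a reasonable middle
ground.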


Re: [Numpy-discussion] Short-circuiting equivalent of np.any or np.all?

2018-04-26 Thread Hameer Abbasi
Ah, in that case, if exotic platforms aren’t important for you, Numba can
do the trick quite well.

Best regards,
Hameer Abbasi

On Apr 26, 2018 at 18:58, Nathan Goldbaum  wrote:




On Thu, Apr 26, 2018 at 11:52 AM Hameer Abbasi 
wrote:

> Hi Nathan,
>
> np.any and np.all call np.or.reduce and np.and.reduce respectively, and
> unfortunately the underlying function (ufunc.reduce) has no way of
> detecting that the value isn’t going to change anymore. It’s also used for
> (for example) np.sum (np.add.reduce), np.prod (np.multiply.reduce),
> np.min(np.minimum.reduce), np.max(np.maximum.reduce).
>
> You can find more information about this on the ufunc doc page
> <https://docs.scipy.org/doc/numpy/reference/ufuncs.html>. I don’t think
> it’s worth it to break this machinery for any and all, as it has numerous
> other advantages (such as being able to override in duck arrays, etc)
>

Sure, I'm not saying that numpy should change, more trying to see if
there's an alternate way to get what I want in NumPy or some other package.





Re: [Numpy-discussion] Extending ufunc signature syntax for matmul, frozen dimensions

2018-05-01 Thread Hameer Abbasi
I agree with Eric here. As one of the users of __array_ufunc__, I'd much
rather have three separate gufuncs or a single one with axis insertion
and removal.

On 30/04/2018 at 23:38, Eric wrote:

I think I’m -1 on this - this just makes things harder on the
implementers of __array_ufunc__ who now might have to work out which
signature matches. I’d prefer the solution where np.matmul is a wrapper
around one of three gufuncs (or maybe just around one with axis
insertion) - this is similar to how np.linalg already works.

Eric

On Mon, 30 Apr 2018 at 14:34 Stephan Hoyer  wrote:

On Sun, Apr 29, 2018 at 2:48 AM Matti Picus  wrote:

The proposed solution to issue #9029 is to extend the meaning of a
signature so "syntax like (n?,k),(k,m?)->(n?,m?) could mean that n and m
are optional dimensions; if missing in the input, they're treated as 1,
and then dropped from the output".

I agree that this is an elegant fix for matmul, but are there other
use-cases for "optional dimensions" in gufuncs? It feels a little wrong
to add gufunc features if we can only think of one function that can use
them.
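
The optional-dimension behaviour under discussion matches what np.matmul
already does at the Python level; the `?` dimensions are treated as 1 when
missing and dropped from the output:

```python
import numpy as np

A = np.ones((3, 4))
v = np.ones(4)

# (n,k),(k,m)->(n,m): plain matrix-matrix product
print(np.matmul(A, np.ones((4, 2))).shape)  # (3, 2)

# (n,k),(k)->(n): the missing m is treated as 1, then dropped
print(np.matmul(A, v).shape)                # (3,)

# (k),(k,m)->(m): the missing n is treated as 1, then dropped
print(np.matmul(v, np.ones((4, 2))).shape)  # (2,)

# (k),(k)->(): vector-vector product yields a scalar
print(np.matmul(v, v))                      # 4.0
```

The NEP question is whether this shape-dependent behaviour should live in
a single gufunc signature or in separate matmat/matvec/vecmat/vecvec
gufuncs.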


Re: [Numpy-discussion] Extending ufunc signature syntax for matmul, frozen dimensions

2018-05-02 Thread Hameer Abbasi
There is always the option of any downstream object overriding matmul, and
I fail to see which objects won't have a shape. - Hameer


On 01/05/2018 at 21:08, Marten wrote:

Just for completeness: there are *four* gufuncs (matmat, matvec,
vecmat, and vecvec).

I remain torn about the best way forward. The main argument against
using them inside matmul is that in order to decide which of the four
to use, matmul has to have access to the `shape` of the arguments.
This means that `__array_ufunc__` cannot be used to override `matmul`
(or `@`) for any object which does not have a shape. From that
perspective, multiple signatures is definitely a more elegant solution.

An advantage of the separate solution is that they are useful
independently of whether they are used internally in `matmul`; though,
then again, with a multi-signature matmul, these would be trivially
created as convenience functions.

-- Marten


[Numpy-discussion] Casting scalars

2018-05-10 Thread Hameer Abbasi
Hello, everyone!

I might be missing something and this might be a very stupid and redundant
question, but is there a way to cast a scalar to a given dtype?

Hameer


Re: [Numpy-discussion] Casting scalars

2018-05-10 Thread Hameer Abbasi
Yes, that I know. I meant given a dtype string such as 'uint8' or a
dtype object. I know I can possibly do np.array(scalar, dtype=dtype)[()]
but I was looking for a less hacky method.

On 11/05/2018 at 07:50, Stuart wrote:

np.float(scalar)

On Thu, May 10, 2018 at 7:49 PM Hameer Abbasi  wrote:

Hello, everyone! I might be missing something and this might be a very
stupid and redundant question, but is there a way to cast a scalar to
a given dtype?

Hameer


Re: [Numpy-discussion] Casting scalars

2018-05-10 Thread Hameer Abbasi
This is exactly what I needed! Thanks!

On 11/05/2018 at 08:20, Warren wrote:

On Thu, May 10, 2018 at 10:53 PM, Hameer Abbasi  wrote:

Yes, that I know. I meant given a dtype string such as 'uint8' or a
dtype object. I know I can possibly do np.array(scalar, dtype=dtype)[()]
but I was looking for a less hacky method.

Apparently the `dtype` object has the attribute `type` that creates
objects of that dtype. For example,

In [30]: a
Out[30]: array([ 1.,  2.,  3.])

In [31]: dt = a.dtype

In [32]: dt
Out[32]: dtype('float64')

In [33]: x = dt.type(8675309)  # Convert the scalar to a's dtype.

In [34]: x
Out[34]: 8675309.0

In [35]: type(x)
Out[35]: numpy.float64

Warren


Re: [Numpy-discussion] Turn numpy.ones_like into a ufunc

2018-05-18 Thread Hameer Abbasi
You can preserve this with (for example) __array_ufunc__.

On 18/05/2018 at 18:57, Nathan wrote:

I don't particularly need this, although it would be nice to make this
behavior explicit, instead of happening more or less by accident:

In [1]: from yt.units import km

In [2]: import numpy as np

In [3]: data = [1, 2, 3]*km

In [4]: np.ones_like(data)
Out[4]: YTArray([1., 1., 1.]) km

On Fri, May 18, 2018 at 9:51 AM, Marten van Kerkwijk  wrote:

I'm greatly in favour, especially if the same can be done for
`zeros_like` and `empty_like`, but note that a tricky part is that
ufuncs do not deal very graciously with structured (void) and string
dtypes.

-- Marten
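
As a rough illustration of the point above — that __array_ufunc__ lets a
subclass keep its identity through ufunc calls — here is a minimal
hypothetical subclass (my sketch, not yt's implementation):

```python
import numpy as np

class TaggedArray(np.ndarray):
    """Hypothetical ndarray subclass carrying a `unit` tag."""

    def __new__(cls, data, unit=''):
        obj = np.asarray(data).view(cls)
        obj.unit = unit
        return obj

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Strip the subclass, run the ufunc on plain ndarrays,
        # then re-attach the tag to the result.
        raw = [np.asarray(x) for x in inputs]
        result = getattr(ufunc, method)(*raw, **kwargs)
        return TaggedArray(result, unit=self.unit)

data = TaggedArray([1, 2, 3], unit='km')
doubled = np.multiply(data, 2)
print(doubled.unit)  # 'km' — the tag survives the ufunc call
```

If ones_like and friends became ufuncs, the same hook would let
subclasses like this one control (and make explicit) what "ones like
this array" means.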


Re: [Numpy-discussion] Splitting MaskedArray into a separate package

2018-05-24 Thread Hameer Abbasi
I also somewhat like the idea of taking it out (once we have a first
replacement) in the case that we have a plan to do a better/lower level
replacement at a later point within numpy.
Removal generally has its merits, but if a (mid term) replacement will
come in any case, it would be nice to get those started first if
possible.
Otherwise downstream might end up having to fix up things twice.

- Sebastian


I also like the idea of designing a replacement first (using modern array
protocols, perhaps in a separate repository) and then deprecating
MaskedArray second. Deprecating an entire class in NumPy seems
counterproductive, although I will admit I’ve never found a use for it
myself. From this thread, it’s clear that others have, though.



Re: [Numpy-discussion] Allowing broadcasting of code dimensions in generalized ufuncs

2018-05-31 Thread Hameer Abbasi
While "on average" short-circuiting only gives a speedup of 2x, in many
situations you can arrange your algorithm so short circuiting will
happen early, eg usually in the first 10 elements of a 10^6 element
array, giving enormous speedups.

Also, I do not imagine these as free-floating ufuncs, I think we can
arrange them in a logical way in a gufunc ecosystem. There would be some
"core ufuncs", with "associated gufuncs" accessible as attributes. For
instance, any_less_than will be accessible as less.any

binary "comparison" ufuncs would have attributes

less.any
less.all
less.first # returns first matching index
less.count # counts matches without intermediate bool array

This adds on to the existing attributes, for instance
ufuncs already have:

add.reduce
add.accumulate
add.reduceat
add.outer
add.at

It is unfortunate that all ufuncs currently have these attributes even
if they are unimplemented/inappropriate (eg, np.sin.reduce), I would
like to remove the inappropriate ones, so each core ufunc will only
have the appropriate attribute "associated gufuncs".


I’m definitely in favour of all this. It’d be great to have this, and it’d
be an excellent ecosystem. I’ll add that composing ufuncs is something I’ve
wanted, and that has come up from time to time.
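
As a rough illustration, the proposed `less.first` and `less.count`
attributes could behave like this pure-NumPy sketch — though without the
short-circuiting and intermediate-array savings a dedicated gufunc would
bring (names and semantics here are my assumptions from the proposal):

```python
import numpy as np

def less_first(a, b):
    """Index of the first position where a < b, or -1 if none.

    A real `less.first` gufunc would stop scanning at the first hit;
    this sketch materializes the full boolean mask.
    """
    mask = np.less(a, b)
    idx = int(np.argmax(mask))
    return idx if mask[idx] else -1

def less_count(a, b):
    """Number of positions where a < b.

    A real `less.count` would avoid the intermediate bool array.
    """
    return int(np.count_nonzero(np.less(a, b)))

print(less_first([3, 1, 5], [2, 2, 2]))  # 1
print(less_count([3, 1, 5], [2, 2, 2]))  # 1
```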

Incidentally, once we make reduce/accumulate/... into "associated
gufuncs", I propose completely removing the "method" argument of
__array_ufunc__, since it is no longer needed and adds a lot
of complexity which implementors of an __array_ufunc__ are forced to
account for.


While removing ‘method’ is okay in my book, there should at least be a way
to detect if something is e.g., a reduction, or an element-wise ufunc (this
one is obvious, all shapes involved will be ()). We, for example, use this
in pydata/sparse. As you can imagine, for sparse arrays, element-wise
operations behave a certain way and there turns out to be a general way to
do reductions that have certain properties as well. See my paper’s draft[1]
for details. I don’t mind the __array_ufunc__ api changing, but I’d like it
if there was a way to still access the information that was previously
available.

[1] https://github.com/scipy-conference/scipy_proceedings/pull/388

Regards,
Hameer Abbasi


Re: [Numpy-discussion] NEP: Dispatch Mechanism for NumPy’s high level API

2018-06-02 Thread Hameer Abbasi
Perhaps I missed this but I didn’t see: what happens when both
__array_ufunc__ and __array_function__ are defined? I might want to do
this to, for example, add support for functions like concatenate or stack
in a class that already has __array_ufunc__ defined.


This is mentioned in the section “Non-goals”, which says that ufuncs and
their methods should be excluded, along with a few other classes of
functions/methods.



Re: [Numpy-discussion] NEP: Dispatch Mechanism for NumPy’s high level API

2018-06-03 Thread Hameer Abbasi
The rules for dispatch with ``__array_function__`` match those for
``__array_ufunc__`` (see
`NEP-13 `_).
In particular:

-  NumPy will gather implementations of ``__array_function__`` from all
   specified inputs and call them in order: subclasses before
   superclasses, and otherwise left to right. Note that in some edge cases,
   this differs slightly from the
   `current behavior `_ of Python.
-  Implementations of ``__array_function__`` indicate that they can
   handle the operation by returning any value other than
   ``NotImplemented``.
-  If all ``__array_function__`` methods return ``NotImplemented``,
   NumPy will raise ``TypeError``.


I’d like to propose two changes to this:

   - ``np.NotImplementedButCoercible`` be a part of the standard from the
   start.
  - If all implementations return this, only then should it be coerced.
 - In the future, it might be good to mark something as coercible
 to coerce it to ``ndarray`` before passing to another object’s
 ``__array_ufunc__``.
  - This is necessary if libraries want to keep old behaviour for some
  functions, while overriding others.
  - Otherwise they have to implement overloads for all functions. This
  seems rather like an all-or-nothing choice, which I’d like to avoid.
  - It isn’t too hard to implement in practice.
   - Objects that don’t implement ``__array_function__`` should be treated
   as having returned ``np.NotImplementedButCoercible``.
  - This has the effect of coercing ``list``, etc.
  - At a minimum, to maintain compatibility, if all objects don’t
  implement ``__array_function__``, the old behaviour should stay.

Also, I’m +1 on Marten’s suggestion that ``ndarray`` itself should
implement ``__array_function__``.
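
For concreteness, the dispatch rules quoted at the top can be sketched in
plain Python. This is a hypothetical helper, not NumPy's actual
implementation, and it ignores the ``NotImplementedButCoercible``
extension proposed above:

```python
def dispatch(func, args, kwargs):
    """Sketch of the NEP-18 dispatch rules for __array_function__."""
    # Collect one overloaded argument per type: subclasses before
    # superclasses, otherwise left to right.
    overloaded = []
    for arg in args:
        if hasattr(type(arg), '__array_function__'):
            if not any(type(arg) is type(o) for o in overloaded):
                index = len(overloaded)
                for i, o in enumerate(overloaded):
                    if issubclass(type(arg), type(o)):
                        index = i  # subclass goes before its superclass
                        break
                overloaded.insert(index, arg)
    types = tuple(type(o) for o in overloaded)
    # Anything other than NotImplemented means "handled".
    for obj in overloaded:
        result = obj.__array_function__(func, types, args, kwargs)
        if result is not NotImplemented:
            return result
    raise TypeError('no implementation of {!r} found'.format(func))


class Declines:
    def __array_function__(self, func, types, args, kwargs):
        return NotImplemented


class Handles:
    def __array_function__(self, func, types, args, kwargs):
        return 'handled'


print(dispatch('fake_func', (Declines(), Handles()), {}))  # handled
```

If every implementation returns ``NotImplemented``, the helper raises
``TypeError`` — exactly the fallback behaviour described in the quoted
rules.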


Re: [Numpy-discussion] NEP: Dispatch Mechanism for NumPy’s high level API

2018-06-03 Thread Hameer Abbasi
I also am not sure there is an actual problem: In the scheme as proposed,
implementations could just coerce themselves to array and call the routine
again. (Or, in the scheme I proposed, call the routine again but with
`coerce=True`.)


Ah, I didn’t think of the first solution. `coerce=True` may not produce the
desired solution in cases where some arguments can be coerced and some
can’t.

However, such a design may still have some benefits. For example:

   - ``array1.HANDLED_TYPES = [array1]``
   - ``array2.HANDLED_TYPES = [array1, array2]``
   - ``array1`` is coercible.
   - None of these is a sub/super class of the other or of ``ndarray``
   - When calling ``np.func(array1(), array2())``, ``array1`` would be
   coerced with your solution (because of the left-to-right rule and
   ``array1`` choosing to coerce itself) but not with
   ``np.NotImplementedButCoercible``.

I think that in the proposed scheme this is effectively what happens.


Not really, the current scheme is unclear on what happens if none of the
arguments implement ``__array_function__`` (or at least it doesn’t
explicitly state it that I can see).


Re: [Numpy-discussion] NEP: Dispatch Mechanism for NumPy’s high level API

2018-06-03 Thread Hameer Abbasi
Mixed return values of NotImplementedButCoercible and NotImplemented would
still result in TypeError, and there would be no second chances for
overloads.


I would like to differ with you here: It can be quite useful to have second
chances for overloads. Think ``np.func(list, custom_array))``: If second
rounds did not exist, custom_array would need to have a list of coercible
types (which is not nice IMO).

It can also help in cases where performance/feature degradation isn’t an
issue, so coercing all arguments that returned
``NotImplementedButCoercible`` would allow ``__array_function__`` to
succeed where it wouldn’t normally. I mean, that’s one of the major uses of
this sentinel right?

If done in a for loop, it wouldn’t even slow down the nominal cases. It
would have the adverse effect of not allowing for a default implementation
to be as simple as you stated, though.

One thing we could do is manually (inside ``__array_function__``) coerce
anything that didn’t implement ``__array_function__``, and that’s
acceptable to me too.


Re: [Numpy-discussion] NEP: Random Number Generator Policy

2018-06-03 Thread Hameer Abbasi
How about this:

"There will be no concept of a separate RNG version. In order to get
consistent or reproducible results from the RNG, it will be necessary to
specify the NumPy version that was used to generate those results. Results
from the RNG may change across different releases of NumPy."


On 4. Jun 2018 at 10:47, Robert Kern  wrote:




On Sun, Jun 3, 2018 at 10:29 PM Charles R Harris 
wrote:

>
>
> On Sun, Jun 3, 2018 at 11:03 PM, Robert Kern 
> wrote:
>
>> On Sun, Jun 3, 2018 at 9:24 PM Charles R Harris <
>> charlesr.har...@gmail.com> wrote:
>>
>>>
>>> On Sat, Jun 2, 2018 at 1:04 PM, Robert Kern 
>>> wrote:

 This policy was first instated in Nov 2008 (in essence; the full set of
 weasel

>>>
>>> Instituted?
>>>
>>
>> I meant "instated"; c.f. for another usage:
>> https://www.youredm.com/2018/06/01/spotify-new-policy-update/
>>
>> But "instituted" would work just as well. It may be that "instated a
>> policy" is just an idiosyncratic back-formation of "reinstated a policy",
>> which even to me feels more right.
>>
>> Not Versioning
 --

 For a long time, we considered that the way to allow algorithmic
 improvements
 while maintaining the stream was to apply some form of versioning.
 That is,
 every time we make a stream change in one of the distributions, we
 increment
 some version number somewhere.  ``numpy.random`` would keep all past
 versions
 of the code, and there would be a way to get the old versions.
 Proposals of
 how to do this exactly varied widely, but we will not exhaustively list
 them
 here.  We spent years going back and forth on these designs and were
 not able
 to find one that sufficed.  Let that time lost, and more importantly,
 the
 contributors that we lost while we dithered, serve as evidence against
 the
 notion.

 Concretely, adding in versioning makes maintenance of ``numpy.random``
 difficult.  Necessarily, we would be keeping lots of versions of the
 same code
 around.  Adding a new algorithm safely would still be quite hard.

 But most importantly, versioning is fundamentally difficult to *use*
 correctly.
 We want to make it easy and straightforward to get the latest, fastest,
 best
 versions of the distribution algorithms; otherwise, what's the point?
 The way
 to make that easy is to make the latest the default.  But the default
 will
 necessarily change from release to release, so the user’s code would
 need to be
 altered anyway to specify the specific version that one wants to
 replicate.

 Adding in versioning to maintain stream-compatibility would still only
 provide
 the same level of stream-compatibility that we currently do, with all
 of the
 limitations described earlier.  Given that the standard practice for
 such needs
 is to pin the release of ``numpy`` as a whole, versioning
 ``RandomState`` alone
 is superfluous.

>>>
>>> This section is a bit unclear. Would it be correct to say that the rng
>>> version is the numpy version? If so, it might be best to say that up front
>>> before justifying it.
>>>
>>
>> I'm sorry, I'm unclear on what you are asking me to make clearer. There
>> is currently no such thing as "the rng version". The thrust of this section
>> of the NEP is to reject the previously floated idea of introducing the
>> concept at all. So I would certainly not say anything along the lines that
>> "the rng version is the numpy version". I do say, here and earlier, that
>> the way to get the same RNG code is to get the same version of numpy.
>>
>
> Just so, and you could make that clearer, as you do here.
>

I don't understand. All I did was repeat what I already said twice. If
you'd like to provide some text that would have clarified things for you,
I'll see about inserting it, but I'm at a loss for writing that text.

-- 
Robert Kern



Re: [Numpy-discussion] NEP: Dispatch Mechanism for NumPy’s high level API

2018-06-06 Thread Hameer Abbasi
On 6. Jun 2018 at 05:41, Nathan Goldbaum  wrote:


Oh wait, since the decorated version of the ufunc will be the one in the
public numpy API it won't break. It would only break if the callable that
was passed in *wasn't* the decorated version, so it kinda *has* to pass in
the decorated function to preserve backward compatibility. Apologies for
the noise.


On Tue, Jun 5, 2018 at 7:39 PM, Nathan Goldbaum 
wrote:

> Hmm, does this mean the callable that gets passed into __array_ufunc__
> will change? I'm pretty sure that will break the dispatch mechanism I'm
> using in my __array_ufunc__ implementation, which directly checks whether
> the callable is in one of several tuples of functions that have different
> behavior.
>

Section “Non-Goals” states that Ufuncs will not be part of this protocol,
__array_ufunc__ will be used to override those as usual.



Re: [Numpy-discussion] Circle CI moving from 1.0 to 2.0

2018-06-14 Thread Hameer Abbasi
Hi Matti,

It seems the CircleCI config is already on Version 2.0. See here, notice
the 2.0 in front of every successful build.
https://circleci.com/gh/numpy/numpy

I can also see that some failed builds have 1.0 in front of them... But
this shouldn't happen.

Most likely this is a CircleCI issue, not one with our configuration. It
can be safely ignored.

Regards,
Hameer Abbasi



On 14/06/2018 at 23:13, Matti wrote:

I stumbled across this notice (only seems to appear in a failed build)

"This project is currently running on CircleCI 1.0 which will no longer
be supported after August 31, 2018. Please start migrating this project
to CircleCI 2.0 <https://circleci.com/docs/2.0/migration/>."

Here is the original link https://circleci.com/gh/numpy/numpy/2080

Is this an artifact that can be ignored or do we need to migrate, if so
has anyone already done it for their project?
Matti


Re: [Numpy-discussion] Dropping Python 3.4 support for NumPy 1.16

2018-06-14 Thread Hameer Abbasi
It was a small task. I created a PR for it here
<https://github.com/numpy/numpy/pull/11337>. Feel free to merge after CI
passes or close.

Hameer Abbasi

On 14. Jun 2018 at 22:50, Marten van Kerkwijk 
wrote:


It seems everyone is in favour - anybody in for making a PR reducing the
travis testing accordingly? (It seems a bit of overkill more generally -
would be good to reduce the kWhr footprint a little...) -- Marten

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Hameer Abbasi
 I second this design. If we were to consider the general case of a tuple
`idx`, then we’d not be moving forward at all. Design changes would be
impossible. I’d argue that this newer model would be easier for library
maintainers overall (who are the kind of people using this), reducing
maintenance cost in the long run because it’d lead to simpler code.

I would also say that the “internal” classes expressing outer as vectorised
indexing etc. should be exposed, for maintainers of duck arrays to use. God
knows how many utility functions I’ve had to write to avoid relying on
undocumented NumPy internals for pydata/sparse, fearing that I’d have to
rewrite/modify them when behaviour changes or I find other corner cases.

Best Regards,
Hameer Abbasi
Sent from Astro <https://www.helloastro.com> for Mac

On 26. Jun 2018 at 09:46, Robert Kern  wrote:


On Tue, Jun 26, 2018 at 12:13 AM Eric Wieser 
wrote:

> > I don't think it should be relegated to the "officially discouraged"
> ghetto of `.legacy_index`
>
> The way I read it, the new spelling lof that would be the explicit but not
> discouraged `image.vindex[rr, cc]`.
>

Okay, I missed that the first time through. I think having more
self-contained descriptions of the semantics of each of these would be a
good idea. The current description of `.vindex` spends more time talking
about what it doesn't do, compared to the other methods, than what it does.

Some more typical, less-exotic examples would be a good idea.

> I would reserve warnings for the cases where the current behavior is
> something no one really wants, like mixing slices and integer arrays.
>
> These are the cases that would only be available under `legacy_index`.
>

I'm still leaning towards not warning on current, unproblematic common
uses. It's unnecessary churn for currently working, understandable code. I
would still reserve warnings and deprecation for the cases where the
current behavior gives us something that no one wants. Those are the real
traps that people need to be warned away from.

If someone is mixing slices and integer indices, that's a really good sign
that they thought indexing behaved in a different way (e.g. orthogonal
indexing).

If someone is just using multiple index arrays that would currently not
give an error, that's actually a really good sign that they are using it
correctly and are getting the semantics that they desired. If they wanted
orthogonal indexing, it is *really* likely that their index arrays would
*not* broadcast together. And even if they did, the wrong shape of the
result is one of the more easily noticed things. These are not silent
errors that would motivate adding a new warning.
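The broadcasting point above can be made concrete with a small sketch in today's NumPy, where `np.ix_` stands in for the orthogonal indexing the proposal would spell directly:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
rows = np.array([0, 2])
cols = np.array([1, 3])

# Current (vectorized) indexing pairs the index arrays elementwise:
picked = arr[rows, cols]            # elements (0, 1) and (2, 3)
assert picked.shape == (2,)

# Orthogonal indexing takes the cross product of the index arrays instead;
# np.ix_ emulates what a dedicated orthogonal-indexing attribute would do:
block = arr[np.ix_(rows, cols)]
assert block.shape == (2, 2)
```

The shape difference (a 1-D selection vs. a 2-D block) is exactly the "easily noticed" mismatch Robert describes.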

-- 
Robert Kern

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Hameer Abbasi
 > Boolean indices are not supported. All indices must be integers, integer
arrays or slices.

I would hope that there’s at least some way to do boolean indexing. I often
find myself needing it. I realise that
`arr.vindex[np.nonzero(boolean_idx)]` works, but it is slightly too verbose
for my liking. Maybe we can have `arr.bindex[boolean_index]` as an alias to
exactly that?
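For context, the `np.nonzero` spelling mentioned above selects exactly the same elements as boolean indexing does today; a quick sketch:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
mask = arr % 5 == 0

# Boolean indexing and integer indexing via nonzero are equivalent:
assert np.array_equal(arr[mask], arr[np.nonzero(mask)])
assert list(arr[mask]) == [0, 5, 10]
```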

Or is boolean indexing preserved as-is in the newest proposal? If so, great!

Another thing I’d say is `arr.?index` should be replaced with `arr.?idx`. I
personally prefer `arr.?x` for my fingers but I realise that for someone
not super into NumPy indexing, this is kind of opaque to read, so I propose
this less verbose but hopefully equally clear version, for my (and others’)
brains.

Best Regards,
Hameer Abbasi
Sent from Astro <https://www.helloastro.com> for Mac
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Hameer Abbasi
I actually had to think a lot, read docs, use SO and so on to realise what
those meant the first time around; I didn’t understand them on sight.

And I had to keep coming back to the docs from time to time as I wasn’t
exactly using them too much (for exactly this reason, when some problems
could be solved more simply by doing just that).

I’d prefer something that sticks in your head and “underscore” for
“indexing” didn't do that for me.

Of course, this was my experience as a first-timer. I’d prefer not to up
the learning curve for others in the same situation.

An experienced user might disagree. :-)

Best Regards,
Hameer Abbasi
Sent from Astro <https://www.helloastro.com> for Mac

On 26. Jun 2018 at 10:28, Eric Wieser  wrote:


Another thing I’d say is arr.?index should be replaced with arr.?idx.

Or perhaps arr.o_[] and arr.v_[], to match the style of our existing
np.r_, np.c_, np.s_, etc?
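For reference, the existing helpers mentioned here are small objects whose `__getitem__` builds an index or array; the proposed `arr.o_[]`/`arr.v_[]` would follow the same pattern:

```python
import numpy as np

# np.s_ turns subscript syntax into slice objects:
assert np.s_[1:5] == slice(1, 5)

# np.r_ concatenates its arguments along the first axis:
assert np.array_equal(np.r_[1:4, 0], [1, 2, 3, 0])

# np.c_ stacks 1-D arguments as columns:
assert np.c_[[1, 2], [3, 4]].shape == (2, 2)
```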
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NEP 21: Simplified and explicit advanced indexing

2018-06-26 Thread Hameer Abbasi
 I would disagree here. For libraries like Dask, XArray, pydata/sparse,
XND, etc., it would be bad for them if there was continued use of “weird”
indexing behaviour (no warnings means more code written that’s… well… not
exactly the best design). Of course, we could just choose to not support
it. But that means a lot of code won’t support us, or support us later than
we desire.

I agree with your design of “let’s limit the number of
warnings/deprecations to cases that make very little sense” but there
should be warnings.

Specifically, I recommend warnings for mixed slices and fancy indexes, and
warnings followed by errors for cases where the transposing behaviour
occurs.
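The “transposing behaviour” referred to here can be shown with today's NumPy: when fancy indices are separated by a slice, the broadcast subspace is moved to the front of the result.

```python
import numpy as np

arr = np.zeros((5, 6, 7))
idx = np.array([0, 1])

# Adjacent fancy indices: the broadcast subspace stays in place.
assert arr[idx, idx, :].shape == (2, 7)

# Fancy indices separated by a slice: the subspace jumps to the
# front of the result -- the surprising "transpose".
assert arr[idx, :, idx].shape == (2, 6)
```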

Best Regards,
Hameer Abbasi
Sent from Astro <https://www.helloastro.com> for Mac

On 26. Jun 2018 at 10:33, Robert Kern  wrote:


On Tue, Jun 26, 2018 at 1:26 AM Travis Oliphant 
wrote:

> I like the proposal generally.  NumPy could use a good orthogonal indexing
> method and a vectorized-indexing method is fine too.
>
> Robert Kern is spot on with his concerns as well.  Please do not change
> what arr[idx] does except to provide warnings and perhaps point people to
> new .oix and .vix methods.  What indexing does is documented (if hard to
> understand and surprising in a particular sub-case).
>
> There is one specific place in the code where I would make a change to
> raise an error rather than change the order of the axes of the output to
> provide a consistent subspace.  Even then, it should be done as a
> deprecation warning and then raise the error.
>
> Otherwise, just add the new methods and don't make any other changes until
> a major release.
>

I'd suggest that the NEP explicitly disclaim deprecating current behavior.
Let the NEP just be about putting the new features out there. Once we have
some experience with them for a year or three, then let's talk about
deprecating parts of the current behavior and make a new NEP then if we
want to go that route. We're only contemplating *long* deprecation cycles
anyways; we're not in a race. The success of these new features doesn't
really rely on the deprecation of current indexing, so let's separate those
issues.

-- 
Robert Kern

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-26 Thread Hameer Abbasi
On 27. Jun 2018 at 07:48, Stephan Hoyer  wrote:


After much discussion (and the addition of three new co-authors!), I’m
pleased to present a significantly revision of NumPy Enhancement Proposal
18: A dispatch mechanism for NumPy's high level array functions:
http://www.numpy.org/neps/nep-0018-array-function-protocol.html

The full text is also included below.

Best,
Stephan

===========================================================
A dispatch mechanism for NumPy's high level array functions
===========================================================

:Author: Stephan Hoyer 
:Author: Matthew Rocklin 
:Author: Marten van Kerkwijk 
:Author: Hameer Abbasi 
:Author: Eric Wieser 
:Status: Draft
:Type: Standards Track
:Created: 2018-05-29

Abstract
--------

We propose the ``__array_function__`` protocol, to allow arguments of NumPy
functions to define how that function operates on them. This will allow
using NumPy as a high level API for efficient multi-dimensional array
operations, even with array implementations that differ greatly from
``numpy.ndarray``.

Detailed description
--------------------

NumPy's high level ndarray API has been implemented several times
outside of NumPy itself for different architectures, such as for GPU
arrays (CuPy), Sparse arrays (scipy.sparse, pydata/sparse) and parallel
arrays (Dask array) as well as various NumPy-like implementations in the
deep learning frameworks, like TensorFlow and PyTorch.

Similarly there are many projects that build on top of the NumPy API
for labeled and indexed arrays (XArray), automatic differentiation
(Autograd, Tangent), masked arrays (numpy.ma), physical units
(astropy.units,
pint, unyt), etc. that add additional functionality on top of the NumPy API.
Most of these projects also implement a close variation of NumPy's high-level
API.

We would like to be able to use these libraries together, for example we
would like to be able to place a CuPy array within XArray, or perform
automatic differentiation on Dask array code. This would be easier to
accomplish if code written for NumPy ndarrays could also be used by
other NumPy-like projects.

For example, we would like for the following code example to work
equally well with any NumPy-like array object:

.. code:: python

def f(x):
y = np.tensordot(x, x.T)
return np.mean(np.exp(y))

Some of this is possible today with various protocol mechanisms within
NumPy.

-  The ``np.exp`` function checks the ``__array_ufunc__`` protocol
-  The ``.T`` method works using Python's method dispatch
-  The ``np.mean`` function explicitly checks for a ``.mean`` method on
   the argument
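As a minimal sketch of the first mechanism, here is a hypothetical wrapper type (the class name and behaviour are illustrative, not part of the NEP) whose ``__array_ufunc__`` intercepts ``np.exp``:

```python
import numpy as np

class LoggedArray:
    """Hypothetical duck array: wraps an ndarray and intercepts ufunc calls."""
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap any LoggedArray inputs, apply the ufunc, re-wrap the result.
        unwrapped = [x.data if isinstance(x, LoggedArray) else x for x in inputs]
        return LoggedArray(getattr(ufunc, method)(*unwrapped, **kwargs))

x = LoggedArray([0.0, 1.0])
y = np.exp(x)  # NumPy defers to LoggedArray.__array_ufunc__
assert isinstance(y, LoggedArray)
assert np.allclose(y.data, np.exp([0.0, 1.0]))
```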

However other functions, like ``np.tensordot`` do not dispatch, and
instead are likely to coerce to a NumPy array (using the ``__array__``
protocol), or err outright. To achieve enough coverage of the NumPy API
to support downstream projects like XArray and autograd we want to
support *almost all* functions within NumPy, which calls for a more
reaching protocol than just ``__array_ufunc__``. We would like a
protocol that allows arguments of a NumPy function to take control and
divert execution to another function (for example a GPU or parallel
implementation) in a way that is safe and consistent across projects.

Implementation
--

We propose adding support for a new protocol in NumPy,
``__array_function__``.

This protocol is intended to be a catch-all for NumPy functionality that
is not covered by the ``__array_ufunc__`` protocol for universal functions
(like ``np.exp``). The semantics are very similar to ``__array_ufunc__``,
except
the operation is specified by an arbitrary callable object rather than a
ufunc
instance and method.

A prototype implementation can be found in
`this notebook <
https://nbviewer.jupyter.org/gist/shoyer/1f0a308a06cd96df20879a1ddb8f0006
>`_.

The interface
~~~~~~~~~~~~~

We propose the following signature for implementations of
``__array_function__``:

.. code-block:: python

def __array_function__(self, func, types, args, kwargs)

-  ``func`` is an arbitrary callable exposed by NumPy's public API,
   which was called in the form ``func(*args, **kwargs)``.
-  ``types`` is a ``frozenset`` of unique argument types from the original
NumPy
   function call that implement ``__array_function__``.
-  The tuple ``args`` and dict ``kwargs`` are directly passed on from the
   original call.

Unlike ``__array_ufunc__``, there are no high-level guarantees about the
type of ``func``, or about which of ``args`` and ``kwargs`` may contain
objects
implementing the array API.

As a convenience for ``__array_function__`` implementors, ``types``
provides all
argument types with an ``'__array_function__'`` attribute. This
allows downstream implementations to quickly determine if they are likely
able
to support the operation. A ``frozenset`` is used to ensure that
``__array_function__`` implementations cannot rely on the iteration order of
``types``.
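To illustrate the proposed interface, here is a hypothetical minimal implementer. Since the protocol had not shipped at the time of writing, the method is called directly rather than through NumPy's dispatch; only ``np.sum`` is handled in this sketch.

```python
import numpy as np

class DiagonalArray:
    """Hypothetical duck array: an n-by-n array with `value` on the diagonal."""
    def __init__(self, n, value):
        self._n = n
        self._value = value

    def __array_function__(self, func, types, args, kwargs):
        # Decline unless every participating type is one we understand.
        if not all(issubclass(t, (DiagonalArray, np.ndarray)) for t in types):
            return NotImplemented
        if func is np.sum:  # the only function implemented in this sketch
            return self._n * self._value
        return NotImplemented

d = DiagonalArray(5, 2.0)
result = d.__array_function__(np.sum, frozenset([DiagonalArray]), (d,), {})
assert result == 10.0
```

Returning ``NotImplemented`` for unrecognized functions or types lets NumPy fall back to other arguments' implementations, mirroring the ``__array_ufunc__`` convention.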

Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-28 Thread Hameer Abbasi
I think this feature is actually needed. Consider `np.random.RandomState`.
If we were to add what I proposed, the two could
work very nicely to (for example) do things like creating Dask random
arrays, from RandomState objects.

For reproducibility, Dask could generate multiple RandomState objects with
a seed sequential in the job numbers.
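A sketch of that reproducibility idea (the helper name and seeding scheme are illustrative only, not Dask's actual implementation):

```python
import numpy as np

def block_random_states(seed, n_blocks):
    """Hypothetical helper: one RandomState per block, seeded sequentially."""
    return [np.random.RandomState(seed + i) for i in range(n_blocks)]

# Regenerating the states reproduces the same per-block draws,
# since each block's stream depends only on the base seed and its index:
a = [s.uniform(size=3) for s in block_random_states(42, 4)]
b = [s.uniform(size=3) for s in block_random_states(42, 4)]
assert all(np.array_equal(x, y) for x, y in zip(a, b))
```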

Looping in Matt Rocklin for this — He might have some input about the
design.

Best Regards,
Hameer Abbasi
Sent from Astro <https://www.helloastro.com> for Mac

On 28. Jun 2018 at 14:37, Marten van Kerkwijk 
wrote:




On Wed, Jun 27, 2018 at 3:50 PM, Stephan Hoyer  wrote:





> So perhaps it's worth "future proofing" the interface by passing `obj` and
> `method` to __array_function__ rather than only `func`. It is slower to
> call a func via func.__call__ than func, but only very marginally (~100 ns
> in my tests).
>

That would make it more similar yet to `__array_ufunc__`, but I'm not sure
how useful it is, as you cannot generically assume the methods have the
same arguments and hence they need their own dispatcher. Once you're there
you might as well pass them on directly (since any callable can be used as
the function). Indeed, for `__array_ufunc__`, this might not have been a
bad idea either...

-- Marten

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-28 Thread Hameer Abbasi
Hi Marten,

It is. The point of the proposed feature was to handle array generation
mechanisms that don't take an array as input in the standard NumPy API.
Giving them a reference handles both the dispatch and the decision about
which implementation to call.

I'm confused: Isn't your reference array just `self`?
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-30 Thread Hameer Abbasi
Hi Marten,

Sorry, I had clearly misunderstood. It would indeed be nice for overrides
to work on functions like `zeros` or `arange` as well, but it seems strange
to change the signature just for that. As a possible alternative, should we
perhaps generally check for overrides on `dtype`?


While this very clearly makes sense for something like astropy, it has a
few drawbacks:

   - Other duck arrays such as Dask need more information than just the
   dtype. For example, Dask needs chunk sizes, XArray needs axis labels, and
   pydata/sparse needs to know the type of the reference array in order to
   make one of the same type. The information in a reference array is a strict
   superset of information in the dtype.
   - There’s a need for a separate protocol, which might be a lot harder to
   work with for both NumPy and library authors.
   - Some things, like numpy.random.RandomState, don’t accept a dtype
   argument.

As for your concern about changing the signature, it’s easy enough with a
decorator. We’ll need a separate decorator for array generation functions.
Something like:

def array_generation_function(func):
    @functools.wraps(func)
    def wrapped(*args, array_reference=np._NoValue, **kwargs):
        if array_reference is not np._NoValue:
            success, result = try_array_function_override(
                wrapped, [array_reference], args, kwargs)
            if success:
                return result
        return func(*args, **kwargs)

    return wrapped

Hameer Abbasi
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Revised NEP-18, __array_function__ protocol

2018-06-30 Thread Hameer Abbasi
Hi Marten,

Still, I'm not sure whether this should be included in the present NEP or
is best done separately after, with a few concrete examples of where it
would be useful.


There already are concrete examples from Dask and CuPy, and this is
currently a blocker for them, which is part of the reason I’m pushing so
hard for it. See #11074 <https://github.com/numpy/numpy/issues/11074> for a
context, and I think it was part of the reason that inspired Matt and
Stephan to write this protocol in the first place.

Best Regards,
Hameer Abbasi
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] backwards compatibility and deprecation policy NEP

2018-07-21 Thread Hameer Abbasi
 Hello,

Very well written article! It takes a lot of important things into account.
I think a number of things should be mentioned, if only in the alternatives:

   - One major version number change, with lots of “major version change”
     deprecations grouped into it, along with an LTS release.
   - The possibility of another major version change (possibly the same
     one) where we re-write all portions that were agreed upon (via NEPs) to
     be re-written, with a longer LTS release (3 years? 5?).
      - I’m thinking this one could be similar to the Python 2 -> Python 3
        transition. Note that this is different from having constant
        breakages; this would be a mostly one-time effort and one-time
        breakage.
      - We break the ABI, but not most of the C API.
      - We port at least bug fixes, and possibly oft-requested
        functionality, to the old version for a long time.
      - But we fix all of the little things that are agreed upon by the
        community to be “missing” or “wrong” in the current release. It may
        be a while before this is adopted, but it’ll be really beneficial in
        the long run.
      - We ping the dev discussions of most major downstream users (SciPy,
        all the scikits, Matplotlib, etc.) for their “pain points”, and also
        ask whether they think this is a good idea. This way, the users
        consulted aren’t just those on the NumPy mailing list.
      - We enforce good practices in our code. For example, we explicitly
        disallow subclassing from ndarray, get rid of scalars, and fix the
        type system.

This may sound radical (I myself think so), but consider that if we get rid
of a large amount of technical debt on the onset, have a reputation for a
clean code-base (rather than one that’s decades old), then we could onboard
a lot more active developers and existing developers can also get a lot
more work done. I may be getting ahead of myself on this, but feel free to
leave your thoughts and opinions.

Best regards,
Hameer Abbasi
Sent from Astro <https://www.helloastro.com> for Mac

On 22. Jul 2018 at 01:48, Ralf Gommers  wrote:


Hi all,

Here is a first draft of a NEP on backwards compatibility and deprecation
policy. This I think mostly formalized what we've done for the last couple
of years, however I'm sure opinions and wish lists will differ here.

Pull request: https://github.com/numpy/numpy/pull/11596

Rendered version:
https://github.com/rgommers/numpy/blob/nep-backcompat/doc/neps/nep-0023-backwards-compatibility.rst

Full text below (ducks).

Cheers,
Ralf


=======================================================
NEP 23 - Backwards compatibility and deprecation policy
=======================================================

:Author: Ralf Gommers 
:Status: Draft
:Type: Process
:Created: 2018-07-14
:Resolution:  (required for Accepted | Rejected | Withdrawn)

Abstract
--------

In this NEP we describe NumPy's approach to backwards compatibility,
its deprecation and removal policy, and the trade-offs and decision
processes for individual cases where breaking backwards compatibility
is considered.


Detailed description
--------------------

NumPy has a very large user base.  Those users rely on NumPy being stable
and the code they write that uses NumPy functionality to keep working.
NumPy is also actively maintained and improved -- and sometimes improvements
require, or are made much easier, by breaking backwards compatibility.
Finally, there are trade-offs in stability for existing users vs. avoiding
errors or having a better user experience for new users.  These competing
needs often give rise to heated debates and delays in accepting or rejecting
contributions.  This NEP tries to address that by providing a policy as well
as examples and rationales for when it is or isn't a good idea to break
backwards compatibility.

General principles:

- Aim not to break users' code unnecessarily.
- Aim never to change code in ways that can result in users silently getting
  incorrect results from their previously working code.
- Backwards incompatible changes can be made, provided the benefits outweigh
  the costs.
- When assessing the costs, keep in mind that most users do not read the
  mailing list, do not look at deprecation warnings, and sometimes wait more
  than one or two years before upgrading from their old version.  And that
  NumPy has many hundreds of thousands or even a couple of million users, so
  "no one will do or use this" is very likely incorrect.
- Benefits include improved functionality, usability and performance (in
  order of importance), as well as lower maintenance cost and improved
  future extensibility.
- Bug fixes are exempt from the backwards compatibility policy.  However in
  case of serious impact on users (e.g. a downstream library doesn't build
  anymore), even bug fixes may have to be delayed for one or more releases.
- The Python API and the C API will be treated in the same way.

Re: [Numpy-discussion] backwards compatibility and deprecation policy NEP

2018-07-24 Thread Hameer Abbasi
On 23. Jul 2018 at 19:46, Stephan Hoyer  wrote:


On Sat, Jul 21, 2018 at 6:40 PM Marten van Kerkwijk <
m.h.vankerkw...@gmail.com> wrote:

> But I think the subclassing section is somewhat misleading in suggesting
> `ndarray` is not well designed to be subclassed. At least, for neither my
> work on Quantity nor that on MaskedArray, I've found that the design of
> `ndarray` itself was a problem. Instead, it was the functions that were, as
> most were not written with subclassing or duck typing in mind, but rather
> with the assumption that all input should be an array, and that somehow it
> is useful to pass anything users pass in through `asarray`. With then
> layers on top to avoid this in specific circumstances... But perhaps this
> is what you meant?
>

I can't speak for Ralf, but yes, this is part of what I had in mind. I
don't think you can separate "core" objects/methods from functions that act
on them. Either the entire system is designed to handle subclassing through
some well-defined interface or is it not.

If you don't design a system for subclassing but allow it anyways (



and it's impossible to prohibit programmatically in Python


This isn’t really true. Metaprogramming to the rescue I guess.
https://stackoverflow.com/questions/16564198/pythons-equivalent-of-nets-sealed-class#16564232
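A sketch of the metaclass approach from that answer, which makes subclassing fail at class-creation time:

```python
class Final(type):
    """Metaclass that 'seals' its instances against subclassing."""
    def __new__(mcls, name, bases, namespace):
        for base in bases:
            if isinstance(base, Final):
                raise TypeError(f"type {base.__name__} is sealed; cannot subclass it")
        return super().__new__(mcls, name, bases, namespace)

class Sealed(metaclass=Final):
    pass

subclass_blocked = False
try:
    class Broken(Sealed):  # raises TypeError at definition time
        pass
except TypeError:
    subclass_blocked = True

assert subclass_blocked
```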

Best regards,
Hameer Abbasi
Sent from Astro <https://www.helloastro.com> for Mac

), then you can easily end up with very fragile systems that are difficult
to modify or extend. As Ralf noted in the NEP, "Some of them change the
behavior of ndarray methods, making it difficult to write code that accepts
array duck-types." These changes end up having implications for apparently
unrelated functions (e.g., np.median needing to call np.mean internally to
handle units properly). I don't think anyone really wants that sort of
behavior or lock-in in NumPy itself, but of course that is the price we pay
for not having well-defined interfaces :). Hopefully NEP-18 will change
that, and eventually we will be able to remove hacks from NumPy that we
added only because there weren't any better alternatives available.

For the NEP itself, i would not mention "A future change in NumPy to not
support subclassing," because it's not as if subclassing is suddenly not
going to work as of a certain NumPy release.  Certain types of subclasses
(e.g., those that only add extra methods and/or metadata and do not modify
any existing functionality) have never been a problem and will be fine to
support indefinitely.

Rather, we might state that "At some point in the future, the NumPy
development team may no longer interested in maintaining workarounds for
specific subclasses, because other interfaces for extending NumPy are
believed to be more maintainable/preferred."

Overall, it seems to me that these days in the python eco-system
> subclassing is simply expected to work.
>

I don't think this is true. You can use subclassing on builtin types like
dict, but just because you can do it doesn't mean it's a good idea. If you
change built-in methods to work in different ways other things will break
in unexpected ways (or simply not change, also in unexpected ways).
Probably the only really safe way to subclass a dictionary is to define the
__missing__() method and not change any other aspects of the public
interface directly.
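For illustration, the `__missing__`-only subclass described here looks like this (the class name is made up):

```python
class DefaultingDict(dict):
    """Safe dict subclass: adds a fallback via __missing__, changes nothing else."""
    def __missing__(self, key):
        # Called only by dict.__getitem__ when `key` is absent.
        return 0

d = DefaultingDict(apples=3)
assert d["apples"] == 3
assert d["pears"] == 0      # __missing__ supplies the fallback
assert "pears" not in d     # ...without inserting the key
```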

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Roadmap proposal, v3

2018-07-24 Thread Hameer Abbasi
Hey Stefan/Ralf/Stephan,

This looks nice, generally what the community agrees on. Great work, and
thanks for putting this together.

Best regards,
Hameer Abbasi
Sent from Astro <https://www.helloastro.com> for Mac

On 24. Jul 2018 at 21:04, Stefan van der Walt  wrote:


Hi everyone,

Please take a look at the latest roadmap proposal:

https://github.com/numpy/numpy/pull/11611

This is a living document, so can easily be modified in the future, but
we'd like to get in place a document that corresponds fairly closely
with current community priorities.

Best regards,
Stéfan
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adoption of a Code of Conduct

2018-08-02 Thread Hameer Abbasi
> On 28. Jul 2018, at 00:02, Stefan van der Walt  wrote:
> 
> Hi everyone,
> 
> A while ago, SciPy (the library) adopted its Code of Conduct:
> https://docs.scipy.org/doc/scipy/reference/dev/conduct/code_of_conduct.html
> 
> We worked hard to make that document friendly, while at the same time
> stating clearly the kinds of behavior that would and would not be
> tolerated.
> 
> I propose that we adopt the SciPy code of conduct for NumPy as well.  It
> is a good way to signal to newcomers that this is a community that cares
> about how people are treated.  And I think we should do anything in our
> power to make NumPy as attractive as possible!
> 
> If we adopt this document as policy, we will need to select a Code of
> Conduct committee, to whom potential transgressions can be reported.
> The individuals doing this for SciPy may very well be happy to do the
> same for NumPy, but the community should decide whom will best serve
> those roles.
> 
> Let me know your thoughts.
> 
> Thanks!
> Stéfan
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

Hi all!

I’ve been following this thread mainly from the sidelines and thought I’d
give a few of my thoughts.

I like the idea that the “protected classes”, such as people of a certain
race, gender, political affiliation etc., can’t use their “class status” to
break any of the other rules. I believe we can make that clear in not so
many words.

Nathaniel’s WeAllJS CoC seems a bit too conservative, and might promote an
overly uptight and formal atmosphere, judging by some of the examples.
People should be allowed to joke and express themselves, so long as it isn’t
derogatory towards others. Use of the word “crazy” should be allowed if it
isn’t directed towards a person/group/work, or if it expresses extremes
rather than a mental condition.

However, I do agree that some people do like to insult people/groups/work
out of habit and then just call it “jokes” or “shitposting”. No version of
this should be allowed, even in humour.

Best Regards,
Hameer Abbasi
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adoption of a Code of Conduct

2018-08-03 Thread Hameer Abbasi
Hi!

I feel we should make it clear that being in any specific category does not 
give you a free pass on any of the other rules. We should make that explicit, 
otherwise this looks fine to me.

Best Regards,
Hameer Abbasi

> On 3. Aug 2018, at 10:57, Stefan van der Walt  wrote:
> 
> 
> 
> On August 3, 2018 10:35:57 Stefan van der Walt  wrote:
> 
>> On August 3, 2018 09:50:38 Robert Kern  wrote:
>>> On Thu, Aug 2, 2018 at 11:01 PM Robert Kern  wrote:
>>>  Nope, concision is definitely not my strength. But I hope I 
>>> made the argument clear, at least.
>>> 
>>> No, wait. I got it:
>>> 
>>> Bad actors use "diversity of political beliefs" in bad faith as cover for 
>>> undermining the goals of the diversity statement. Marginalized groups want 
>>> more assurance that our community (1) isn't one of those bad actors and (2) 
>>> is willing and capable of resisting those bad actors when they come.
>> 
>> 
>> That's a very useful summary; thank you. 
>> 
>> I think we can fairly easily add a sentence that encourages participation 
>> from a wide diversity of people, while making it clear that including 
someone in the conversation does not give them free rein in contradiction 
>> with the rest of the guidelines. 
>> 
>> Ralf, if you agree, shall we do this for SciPy, and use the new version for 
>> NumPy too?
> 
> 
> Although, perhaps, a better question to answer is how many people feel that 
> the current document is deficient, and does not go far enough in stating 
> explicitly what we want from our community interactions. 
> 
> It is always hard to tell the opinion of the sometimes silent majority? 
> 
> Best regards, 
> Stéfan 
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adoption of a Code of Conduct

2018-08-03 Thread Hameer Abbasi

> On 3. Aug 2018, at 17:44, Ralf Gommers  wrote:
> 
> 
> 
> On Fri, Aug 3, 2018 at 2:04 AM, Matthew Brett  wrote:
> Hi,
> 
> On Fri, Aug 3, 2018 at 9:35 AM, Stefan van der Walt  wrote:
> > On August 3, 2018 09:50:38 Robert Kern  wrote:
> >>
> >> On Thu, Aug 2, 2018 at 11:01 PM Robert Kern  wrote:
> >>>
> >>>  Nope, concision is definitely not my strength. But I hope I
> >>> made the argument clear, at least.
> >>
> >>
> >> No, wait. I got it:
> >>
> >> Bad actors use "diversity of political beliefs" in bad faith as cover for
> >> undermining the goals of the diversity statement. Marginalized groups want
> >> more assurance that our community (1) isn't one of those bad actors and (2)
> >> is willing and capable of resisting those bad actors when they come.
> >
> >
> > That's a very useful summary; thank you.
> >
> > I think we can fairly easily add a sentence that encourages participation
> > from a wide diversity of people, while making it clear that including
> > someone in the conversation does not give them free rein in contradiction
> > with the rest of the guidelines.
> >
> > Ralf, if you agree, shall we do this for SciPy, and use the new version for
> > NumPy too?
> 
> If someone with good wordsmithing skills could draft 1-2 sentences and send a 
> PR to the SciPy repo, so we have something concrete to discuss/approve, that 
> would be great. If not, I can take a stab at it early next week.
> 
> 
> I must say, I disagree.  I think we're already treading close to the
> edge with the current document, and it's more likely we'd get closer
> still with virtually any addition on this line.   I'm in favor of
> keeping the political beliefs in there, on the basis 
> 
> There's a much more straightforward basis one can think of. There are many 
> countries in the world that have dictatorships or one-party rule. This 
> includes countries that we get regular contributions from. Expressing support 
> for, e.g., democratic elections, can land you in all sorts of trouble there.
> 
> For a US conference it may be okay to take a purely US perspective, and even 
> then the inclusion/removal of "political beliefs" can be argued (as evidenced 
> by this thread). For a project with a global reach like NumPy it's really not 
> very good to take into account only US/Western voices.
> 
> it's really not
> too hard to distinguish good-faith political beliefs, and the current
> atmosphere is so repellent to people who would not identify as
> progressive, that I would like them to feel they have some protection.
> If you will not allow me "no change"
> 
> I think "not allow" is too strong. Your opinion matters as well, so I'm happy 
> to have/facilitate a higher bandwidth discussion on this if you want (after 
> Monday). 
>  
> and you offered me a) a paragraph
> written by the not-discriminated group, trying to imagine something
> comforting for an imagined, extremely sensitive progressive (name your
> other group here) person, or b) no stated defense for not-progressive persons,
> I'd take b).
> 
> Imho Robert made a very compelling argument here, so I don't completely 
> understand the choice.
> 
> Cheers,
> Ralf
> 

I’ve created a PR, and I’ve kept the language “not too stern”. 
https://github.com/scipy/scipy/pull/9109 


Hameer Abbasi


Re: [Numpy-discussion] pytest, fixture and parametrize

2018-08-08 Thread Hameer Abbasi
> On 8. Aug 2018, at 17:23, Charles R Harris  wrote:
> 
> Hi All,
> 
> I'm writing some tests and would like to come to a common understanding as to 
> whether we should use pytest fixtures and parametrized classes and 
> functions. The immediate problem is testing over all types that should be 
> supported, which is something we should do much more of and is most easily 
> achieved using one of those pytest tools. The advantage is code clarity and 
> more informative error messages; the disadvantage is being dependent on 
> pytest, as unittest does not support that functionality.
> 
> Thoughts?
> 
> Chuck

Hi all!

I think it mainly depends on how likely we are to move away from Pytest. And 
I’m assuming not very likely. Anything likely to replace it (which won’t happen 
short-term in any case) will already have something similar, IMO.

So my vote is we go ahead and use them.

Hameer Abbasi


Re: [Numpy-discussion] Proposal to accept NEP-18, __array_function__ protocol

2018-08-13 Thread Hameer Abbasi
 numpy point release. Or any numpy release.
> Nor should we. If we're serious about keeping this experimental – and
> I think that's an excellent idea for now! – then IMO we need to do
> something more to avoid getting trapped by backwards compatibility.
> 
> My suggestion: at numpy import time, check for an envvar, like say
> NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1. If it's not set, then all the
> __array_function__ dispatches turn into no-ops. This lets interested
> downstream libraries and users try this out, but makes sure that we
> won't have a hundred thousand end users depending on it without
> realizing. Other advantages:
> 
> - makes it easy for end-users to check how much overhead this adds (by
> running their code with it enabled vs disabled)
> - if/when we decide to commit to supporting it for real, we just
> remove the envvar.

We also have to consider that this might hinder adoption, but I’m fine with 
that: properly beats quickly, as long as it doesn’t take too long. I’m +0 on this 
until we properly hammer out this stuff; then we remove the envvar and make this 
the default.

However, I also realise that pydata/sparse is in the early stages, and can 
probably wait. Other duck array implementations such as Dask and XArray might 
need this soon-ish.
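For concreteness, the envvar gate Nathaniel describes could be sketched roughly like this (purely illustrative — the function names and exact dispatch mechanics here are assumptions, not NumPy's actual implementation):

```python
import os

# Illustrative: read the opt-in flag once, at import time.
ARRAY_FUNCTION_ENABLED = (
    os.environ.get("NUMPY_EXPERIMENTAL_ARRAY_FUNCTION") == "1"
)

def maybe_dispatch(func, args, kwargs):
    """Call func normally unless experimental dispatch is enabled.

    With the envvar unset, every __array_function__ dispatch is a no-op,
    so plain NumPy behaviour is preserved for unsuspecting end users.
    """
    if ARRAY_FUNCTION_ENABLED:
        for arg in args:
            handler = getattr(type(arg), "__array_function__", None)
            if handler is not None:
                result = handler(arg, func, frozenset({type(arg)}), args, kwargs)
                if result is not NotImplemented:
                    return result
    return func(*args, **kwargs)
```

With the variable unset (or set to anything other than "1"), `maybe_dispatch(sum, ([1, 2, 3],), {})` behaves exactly like `sum([1, 2, 3])`, which is the no-op behaviour being proposed.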

> 
> With this change, I'm overall +1 on the proposal. Without it, I...
> would like more convincing, at least :-).
> 
> # Minor quibbles
> 
> I don't really understand the 'types' frozenset. The NEP says "it will
> be used by most __array_function__ methods, which otherwise would need
> to extract this information themselves"... but they still need to
> extract the information themselves, because they still have to examine
> each object and figure out what type it is. And, simply creating a
> frozenset costs ~0.2 µs on my laptop, which is overhead that we can't
> possibly optimize later…

The rationale here is that most implementations would check whether the types in 
the call are actually supported by their implementation. If not, they’d return 
NotImplemented. If this weren’t done centrally, every implementation would need 
to do that check individually, and this may take a lot of time.

I do agree that it violates DRY a bit though… the types are already present in 
the passed-in arguments, and could be inferred from those.
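To illustrate that rationale, here is a toy duck array (purely illustrative — not pydata/sparse or any real library) that uses the pre-built `types` argument to reject unsupported mixes in one factored-out place:

```python
class DuckArray:
    """Toy duck array sketching the NEP 18 `types` pre-check."""

    _HANDLED_TYPES = (list,)  # stand-in for (np.ndarray,) in a real library

    def __init__(self, data):
        self.data = list(data)

    def __array_function__(self, func, types, args, kwargs):
        # One factored-out check: bail out early if any participating type
        # is unknown, instead of inspecting arguments inside every function.
        if not all(issubclass(t, (DuckArray,) + self._HANDLED_TYPES)
                   for t in types):
            return NotImplemented
        # Unwrap our own arrays and delegate, just for the example's sake.
        unwrapped = [a.data if isinstance(a, DuckArray) else a for a in args]
        return func(*unwrapped, **kwargs)

d = DuckArray([1, 2, 3])
# Dispatching `sum` by hand (NumPy itself would normally perform this call):
result = d.__array_function__(sum, frozenset({DuckArray}), (d,), {})
```

With an unknown type in the set, the same method returns NotImplemented without ever touching the arguments.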

> 
> -n
> 
> On Wed, Aug 1, 2018 at 5:27 PM, Stephan Hoyer  wrote:
>> I propose to accept NEP-18, "A dispatch mechanism for NumPy’s high level
>> array functions":
>> http://www.numpy.org/neps/nep-0018-array-function-protocol.html
>> 
>> Since the last round of discussion, we added a new section on "Callable
>> objects generated at runtime" clarifying that to handle such objects is out
>> of scope for the initial proposal in the NEP.
>> 
>> If there are no substantive objections within 7 days from this email, then
>> the NEP will be accepted; see NEP 0 for more details.
>> 
>> Cheers,
>> Stephan
>> 
> 
> 
> 
> -- 
> Nathaniel J. Smith -- https://vorpus.org

Hope that clarifies things!

Best regards,
Hameer Abbasi



Re: [Numpy-discussion] Proposal to accept NEP-18, __array_function__ protocol

2018-08-15 Thread Hameer Abbasi
> On 15. Aug 2018, at 18:25, Matthew Brett  wrote:
> 
> Hi,
> 
> Thanks Nathaniel for this thoughtful response.
> 
> On Mon, Aug 13, 2018 at 10:44 AM, Nathaniel Smith  wrote:
> ...
>> The other approach would be to incrementally add clean, well-defined
>> dunder methods like __array_ufunc__, __array_concatenate__, etc. This
>> way we end up putting some thought into each interface, making sure
>> that it's something we can support, protecting downstream libraries
>> from unnecessary complexity (e.g. they can implement
>> __array_concatenate__ instead of hstack, vstack, row_stack,
>> column_stack, ...), or avoiding adding new APIs entirely (e.g., by
>> converting existing functions into ufuncs so __array_ufunc__ starts
>> automagically working). And in the end we get a clean list of dunder
>> methods that new array container implementations have to define. It's
>> plausible to imagine a generic test suite for array containers. (I
>> suspect that every library that tries to implement __array_function__
>> will end up with accidental behavioral differences, just because the
>> numpy API is so vast and contains so many corner cases.) So the
>> clean-well-defined-dunders approach has lots of upsides. The big
>> downside is that this is a much longer road to go down.
> 
> Does everyone agree that, if we had infinite time and resources, this
> would be the better solution?
> 

More resources means (given NumPy’s consensus system) that more people have to 
agree on the overall design, so in my mind, it might even be slower.

> If we devoted all the resources of the current Numpy grant to taking
> this track, could we complete it in a reasonable time?

I somehow think just the design of all these different protocols, heck, even 
ironing all these different protocols out and ignoring implementation; would 
take an unreasonably long amount of time, as evidenced by this one NEP.

I’m more in favour of using this one rather conservatively: Perhaps a mailing 
list consensus before actually adding a function to __array_function__, making 
sure it won’t hinder too much progress.

I also differ with Nathaniel on one minor thing with his comparisons to 
Firefox, CPython, pytest and Sphinx: We’re not talking about monkey-patching 
NumPy internals, we’re just talking about monkey-patching the public API. Of 
course, this is still a cost and can still hinder development, but it’s 
definitely better than exposing all internals.

> 
> Cheers,
> 
> Matthew


Best Regards,
Hameer Abbasi





Re: [Numpy-discussion] Proposal to accept NEP-18, __array_function__ protocol

2018-08-22 Thread Hameer Abbasi
Hi Nathaniel and Stephan,

Since this conversation is getting a bit lengthy and I see a lot of repeated 
stuff, I’ll summarise the arguments for everyone’s benefit and then present my 
own viewpoints:

Nathaniel:
- Undue maintenance burden on NumPy, since semantics have to match exactly.
- Implementations of functions may change, which may break downstream library 
  compatibility.
- There may be time taken in merging this everywhere, so why not take time to 
  define proper protocols?
- Hide this entire interface behind an environment variable, possibly to be 
  removed later.

Stephan:
- Semantics don’t have to match exactly; that isn’t the intent of most 
  duck arrays.
- This won’t happen, given NumPy’s conservativeness.
- The protocols would just be copies of __array_function__, but less capable.
- Provide an interface that only end-users may turn on.

My viewpoints:
- I don’t think any duck array implementers intend to copy semantics on that 
  level. Dask, which is the most complete one, doesn’t have views, only copies. 
  Many other semantics simply don’t match. The intent is to allow for code that 
  expresses, well, intent (no pun intended) instead of relying heavily on 
  semantics, but that can use arbitrary duck-array implementations instead of 
  just ndarray.
- Most of the implementations in NumPy are pretty stable, and the only thing 
  that’s likely to happen here is bug fixes. And we are free to fix those; I 
  doubt implementation-specific bugs will be copied. However, these first two 
  points are for/against duck arrays in general, and not specific to this 
  protocol, so IMO that discussion is orthogonal to this one.
- I agree with Stephan here: defining a minimum API for NumPy that will complete 
  duck arrays will, in every case, leave a lot of functions that cannot be 
  overridden, as they simply cannot be expressed in terms of the protocols 
  added so far. This will lead to more protocols being produced, and so on 
  ad infinitum. We have to consider the burden that such a design would place on 
  the maintainers of NumPy as well… I personally feel that the number of such 
  protocols we’d need is large enough that this line of action is more 
  burdensome, rather than less. I prefer an approach with __array_function__ 
  plus a mailing list ping before adding a function.
- May I propose an alternative that was already discussed, and one that I think 
  everyone will be okay with: we put all overridable functions inside a new 
  submodule, numpy.api, that will initially be a shallow-ish copy of the numpy 
  module. I say “ish” because all modules inside NumPy will need to be 
  shallow-copied as well. If we need to add __array_function__, we can always do 
  that there. Normal users are using “regular” NumPy unless they know they’re 
  using the API submodule, but it is separately accessible.
- As for hiding it completely: we have to realise that the Python computation 
  landscape is fragmenting. The slower we are, the more fragmented it will 
  become. NumPy already isn’t “the standard” for machine learning.
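The shallow-ish module copy behind the numpy.api idea could be sketched like this (hypothetical — numpy.api was never created; `math` stands in for numpy, and `api_copy` is an invented name):

```python
import math
import types

def api_copy(mod, name):
    """Build a shallow copy of a module's public namespace.

    Illustrative of the hypothetical numpy.api proposal: overridable
    behaviour such as __array_function__ dispatch could then be added to
    the copy without touching the original module.
    """
    new = types.ModuleType(name)
    for attr in dir(mod):
        if not attr.startswith("_"):
            setattr(new, attr, getattr(mod, attr))
    return new

# The copy exposes the same public functions, but is a distinct module
# object whose attributes can be rebound independently.
math_api = api_copy(math, "math.api")
```

A real numpy.api would also need to recurse into submodules (numpy.linalg, numpy.fft, …), which is the "ish" part above.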

Regards,
Hameer Abbasi

> On 22. Aug 2018, at 03:46, Nathaniel Smith  wrote:
> 
> On Tue, Aug 21, 2018 at 9:39 AM, Stephan Hoyer wrote:
>> On Tue, Aug 21, 2018 at 12:21 AM Nathaniel Smith  wrote:
>>> 
>>> On Wed, Aug 15, 2018 at 9:45 AM, Stephan Hoyer  wrote:
>>>> This avoids a classic subclassing problem that has plagued NumPy for
>>>> years,
>>>> where overriding the behavior of method A causes apparently unrelated
>>>> method
>>>> B to break, because it relied on method A internally. In NumPy, this
>>>> constrained our implementation of np.median(), because it needed to call
>>>> np.mean() in order for subclasses implementing units to work properly.
>>> 
>>> I don't think I follow... if B uses A internally, then overriding A
>>> shouldn't cause B to break, unless the overridden A is buggy.
>> 
>> 
>> Let me try another example with arrays with units. My understanding of the
>> contract provided by unit implementations is their behavior should never
>> deviate from NumPy unless an operation raises an error. (This is more
>> explicit for arrays with units because they raise errors for operations with
>> incompatible units, but practically speaking almost all duck arrays will
>> have at least some unsupported operations in NumPy's giant API.)
>> 
>> It is quite possible that NumPy functions could be (re)written in a way that
>> is incompatible with some unit implementations but is perfectly valid for
>> "full" duck arrays. We actually see this even within NumPy already -- for
>> example, see this recent PR adding support for the datetime64 dtype to
>> percentile:
>> https://github.com/numpy/numpy/pull/11627
> 
> I clicked the link, but I do

Re: [Numpy-discussion] Proposal to accept NEP-18, __array_function__ protocol

2018-08-23 Thread Hameer Abbasi


> On 23. Aug 2018, at 18:37, Stephan Hoyer  wrote:
> 
> RE: the types argument
> 
> On Tue, Aug 21, 2018 at 12:21 AM Nathaniel Smith wrote:
> This is much more of a detail as compared to the rest of the
> discussion, so I don't want to quibble too much about it. (Especially
> since if we keep things really-provisional, we can change our mind
> about the argument later :-).) Mostly I'm just confused, because there
> are lots of __dunder__ functions in Python (and NumPy), and none of
> them take a special 'types' argument... so what's special about
> __array_function__ that makes it necessary/worthwhile?
> 
> What's special about __array_function__ is that it's a hook that lets you 
> override an entire API through a single interface. Unlike protocols like 
> __add__, implementers of __array_function__ don't know exactly which 
> arguments could have implemented the operation. 
>  
> Any implementation of, say, concatenate-via-array_function is going to
> involve iterating through all the arguments and looking at each of
> them to figure out what kind of object it is and how to handle it,
> right? That's true whether or not they've done a "pre-check" using the
> types set, so in theory it's just as easy to return NotImplemented at
> that point. But I guess your point in the last paragraph is that this
> means there will be lots of chances to mess up the
> NotImplemented-returning code in particular, especially since it's
> less likely to be tested than the happy path, which seems plausible.
> So basically the point of the types set is to let people factor out
> that little bit of lots of functions into one common place?
> 
> It's also a pragmatic choice: libraries like dask.array and autograd.numpy 
> have already implemented NumPy's API without overrides. These projects follow 
> the current numpy convention: non-native array objects are coerced into 
> native arrays (i.e., dask or autograd arrays). They don't do any type 
> checking.
> 
> I doubt there would be much appetite for writing alternative versions of 
> these APIs that return NotImplemented instead -- especially while this 
> feature remains experimental.
>  
> I guess some careful devs might be unhappy with paying extra so that other
> lazier devs can get away with being lazy, but maybe it's a good
> tradeoff for us (esp. since as numpy devs, we'll be getting the bug
> reports regardless :-)).
> 
> The only extra amount we pay extra is the price of converting these types 
> into a Python data structure and passing them into the __array_function__ 
> method call. We already had to collect them for __array_function__ itself to 
> identify unique types to call -- so this is a pretty minimal extra cost.
>  
> If that's the goal, then it does make me wonder if there might be a
> more direct way to accomplish it -- like, should we let classes define
> an __array_function_types__ attribute that numpy would check before
> even trying to dispatch to __array_function__?
> 
> This could potentially work, but now the __array_function__ protocol itself 
> becomes more complex and out of sync with __array_ufunc__. It's a much 
> smaller amount of additional complexity to add an additional passed argument.

I might add that if it’s a mandatory part of the protocol, then not all things 
will work. For example, if XArray and Dask want to support sparse arrays, 
they’ll need to add an explicit dependency.




Re: [Numpy-discussion] Proposal to accept NEP-18, __array_function__ protocol

2018-08-24 Thread Hameer Abbasi
Hi everyone,

On Fri, Aug 24, 2018 at 9:38 AM Nathaniel Smith  wrote:

> On Thu, Aug 23, 2018 at 9:02 AM,   wrote:
> > I might add that most duck array authors are highly unlikely to be
> > newcomers to the Python space. We should just put a big warning there
> > while enabling, and that’ll be enough to scare away most devs from
> > doing it by default.
>
> That's a reasonable idea... a Big Obnoxious Warning(tm) when it's
> enabled, or on first use, would achieve a lot of the same purpose.
> E.g.
>
> if this_is_the_first_array_function_usage():
>     sys.stderr.write(
>         "WARNING: this program uses NumPy's experimental "
>         "'__array_function__' feature.\n"
>         "It may change or be removed without warning, which might "
>         "break this program.\n"
>         "For details see "
>         "http://www.numpy.org/neps/nep-0018-array-function-protocol.html\n"
>     )
>
> -n
>
>
I was thinking of a FutureWarning... That's essentially what it's for.
Writing to stderr looks un-pythonic to me.
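A one-shot FutureWarning along those lines might look like this (a sketch only — the function name is invented and NumPy's real warning machinery may differ):

```python
import warnings

_warned = False

def warn_experimental_array_function():
    """Emit a single FutureWarning on first use, instead of writing to stderr."""
    global _warned
    if not _warned:
        warnings.warn(
            "NumPy's experimental '__array_function__' feature is enabled; "
            "it may change or be removed without notice. See NEP 18.",
            FutureWarning,
            stacklevel=2,
        )
        _warned = True
```

Unlike a bare stderr write, this respects warning filters, so users and test suites can silence it or turn it into an error.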


> --
> Nathaniel J. Smith -- https://vorpus.org


Best Regards,
Hameer Abbasi


Re: [Numpy-discussion] Proposal to accept NEP-18, __array_function__ protocol

2018-08-24 Thread Hameer Abbasi
On Fri, Aug 24, 2018 at 5:55 PM Stephan Hoyer  wrote:

> On Thu, Aug 23, 2018 at 1:06 PM Hameer Abbasi wrote:
>
>> I might add that if it’s a mandatory part of the protocol, then not all
>> things will work. For example, if XArray and Dask want to support sparse
>> arrays, they’ll need to add an explicit dependency.
>>
>
> I don't follow -- can you please elaborate?
>

If we make specifying __array_function_types__ a mandatory part of the protocol
-- and such that it is a whitelist -- then XArray or Dask would need to import
sparse in order to specify that they accept mixing sparse arrays with native
arrays (i.e. for adding sparse.SparseArray to __array_function_types__). That
is what I mean: it might be a 'soft' dependency, but it would be a dependency
nonetheless.


>
> If you don't want to do anything with the 'types' argument, you can simply
> ignore it.
>
> The problem of identifying whether arguments have valid types or not
> remains unchanged from the situation with __add__ or __array_ufunc__.
> 'types' just gives you another optional tool to help solve it.
>


Re: [Numpy-discussion] Proposal to accept NEP-18, __array_function__ protocol

2018-08-29 Thread Hameer Abbasi
> On 29. Aug 2018, at 11:44, Matti Picus  wrote:
> 
> On 29/08/18 10:37, Nathaniel Smith wrote:
>> it's easy to imagine scenarios where the
>> people being broken aren't the ones who had a chance to read the docs
>> – e.g. if a major package starts relying on __array_function__, then
>> it's all *their* users who we'd be breaking, even though they had
>> nothing to do with it.
> This is a packaging problem. This proposal is intended for use by other 
> "major packages", not so much for end-users. We would have much more trouble 
> if we were proposing a broad change to something like indexing or the random 
> number module (see those NEPs). If we break one of those major packages, it 
> is on them to pin the version of NumPy they can work with. In my opinion very 
> few end users will be implementing their own ndarray classes with 
> `__array_function__`. While we will get issue reports, we can handle them 
> much as we do the MKL or OpenBLAS ones - pinpoint the problem and urge users 
> to complain to those packages.

One thing that might help here is nightly or continuous builds of the NumPy 
wheels. Downstream projects could then test against them in CI and fix breakage 
as it comes up. But I guess that’s another discussion.

Personally, for as long as this protocol is experimental, I’ll add a warning in 
the docs of sparse as well; saying this might disappear anytime.

> 
> Other than adding a warning, I am not sure what the concrete proposal is 
> here. To not accept the NEP?
> Matti



[Numpy-discussion] ANN: PyData/Sparse 0.4.1

2018-09-12 Thread Hameer Abbasi
=
Announcing PyData/Sparse 0.4.1
=

(Apologies for the cross-posting)

Hi everyone,

This is a performance, bug-fix and feature release. The changelog can be seen 
at https://sparse.pydata.org/en/latest/changelog.html

Highlights include:
- Faux in-place operations
- Mixed ndarray–sparse operations
- Fill values other than zero
- Misc support for different functions

What’s PyData/Sparse?
———

PyData/Sparse is an N-dimensional sparse array library. It’s compatible with 
NumPy and follows the ndarray interface as closely as possible. It depends on 
NumPy, SciPy and Numba.

Where can I find PyData/Sparse?
———
The project is available on GitHub at https://github.com/pydata/sparse and is 
available on PyPI and conda-forge as “sparse”. Documentation is hosted at 
https://sparse.pydata.org.

Best Regards,
Hameer Abbasi



Re: [Numpy-discussion] Proposal to accept NEP-18, __array_function__ protocol

2018-09-13 Thread Hameer Abbasi
> On Thursday, Sep 13, 2018 at 7:30 PM, Stephan Hoyer wrote:
>
> On Sat, Sep 8, 2018 at 7:12 PM Charles R Harris wrote:
>
> > On Wed, Aug 1, 2018 at 6:27 PM Stephan Hoyer wrote:
> >
> > > I propose to accept NEP-18, "A dispatch mechanism for NumPy’s high level 
> > > array functions":
> > > http://www.numpy.org/neps/nep-0018-array-function-protocol.html
> > >
> > > Since the last round of discussion, we added a new section on "Callable 
> > > objects generated at runtime" clarifying that to handle such objects is 
> > > out of scope for the initial proposal in the NEP.
> > >
> > > If there are no substantive objections within 7 days from this email, 
> > > then the NEP will be accepted; see NEP 0 for more details.
> >
> > I've merged the PR. What next?
> >
> > Chuck
>
> I rolled back Chuck's merge of the "Acceptance" PR for this NEP because (1) 
> it was not clear that it had reached consensus and (2) I still wanted to make 
> a few changes based on this discussion.
>
> I have now drafted these revisions to the NEP to clarify its stance around 
> backwards compatibility, and the type of the "types" argument: 
> https://github.com/numpy/numpy/pull/11943
>
>
>
>
> I have *not* included any form of the explicit "end-user opt-in" requested by 
> Nathaniel. I don't think it is warranted based on the scope of anticipated 
> future changes; see my earlier post [1] for details. Nobody has seriously 
> suggested that we would release this functionality and later eliminate the 
> ability to override NumPy functions entirely in the future.
>
>
>
>
> Of course, requiring an onerous explicit opt-in would be fine while this 
> feature is initially under development, and until we are sure that checks for 
> overriding arbitrary NumPy functions are fast enough to be generally viable. 
> In particular, we should verify that __array_function__ does not meaningfully 
> impact performance for major downstream components of the SciPy stack. But I 
> think running standard benchmark suites (e.g., with ASV) and user testing of 
> release candidates would be enough.
>
> If you still have major objections to this version of the proposal, please 
> raise them unambiguously -- preferably with a formal veto [2].
>
> Best,
> Stephan
>
>
>
>
> [1] https://mail.python.org/pipermail/numpy-discussion/2018-August/078669.html
> [2] https://docs.scipy.org/doc/numpy/dev/governance/governance.html
>
>
>

+1 from me. The edits are readable and clean, and a good compromise.

Best Regards,
Hameer Abbasi



[Numpy-discussion] Exact semantics of ufunc.reduce

2018-10-12 Thread Hameer Abbasi
Hello!

I’m trying to investigate the exact way ufunc.reduce works when given a custom 
dtype. Does it cast before or after the operation, or somewhere in between? How 
does this differ from ufunc.reduceat, for example?

We ran into this issue in pydata/sparse#191 
(https://github.com/pydata/sparse/issues/191) while trying to match the two in 
a case where the only difference is the number of zeros in a sum, which 
shouldn’t change the result.
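For what it’s worth, a quick probe (mirroring the example in the np.sum docs, and assuming NumPy is installed) suggests the cast happens on the inputs, before the operation — though whether reduceat matches is exactly the open question:

```python
import numpy as np

x = np.array([0.5, 0.7, 0.2, 1.5])

# With dtype=, each element is cast to the accumulator type *before* the
# add, so the fractional parts are truncated first: 0 + 0 + 0 + 1.
cast_first = np.add.reduce(x, dtype=np.int32)

# Casting only the final result would instead truncate the float sum 2.9.
cast_last = np.int32(np.add.reduce(x))
```

The two results differ (1 vs 2), which is what makes the casting order observable at all.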

Best Regards,
Hameer Abbasi



Re: [Numpy-discussion] random.choice(replace=False) very slow

2018-10-17 Thread Hameer Abbasi
Hi!

The standard algorithm for sampling without replacement runs in ``O(N)`` expected 
time for ``N < 0.5 * M``, where ``M`` is the length of the original set, but is 
``O(N^2)`` in the worst case. When that condition doesn't hold, a simple 
Durstenfeld-Fisher-Yates shuffle [1] (``O(M)``) can be used on the original set, 
after which the first ``N`` items are selected. Although this is fast, it uses a 
large amount of memory (``O(M)`` extra rather than ``O(N)``), and I’m not sure 
where the best trade-off is. It also can’t be used with an arbitrary probability 
distribution.

One way to handle this would be to sample a maximum of ``N // 2`` samples and 
then select the “unselected” samples instead. Although this has a faster 
expected run-time than the standard algorithm in all cases, it would break 
backwards-compatibility guarantees.
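A minimal sketch of the shuffle-then-take idea (pure Python, not NumPy's implementation, and without the arbitrary-distribution support mentioned above):

```python
import random

def sample_without_replacement(population, n, rng=random):
    """Sample n distinct items via a partial Durstenfeld-Fisher-Yates shuffle.

    Copies the population (O(M) extra memory), then performs only the
    first n swaps of the shuffle, so the sampling step itself is O(n).
    """
    items = list(population)
    m = len(items)
    if n > m:
        raise ValueError("cannot sample more items than the population has")
    for i in range(n):
        # Pick a random element from the not-yet-fixed tail and swap it in.
        j = rng.randrange(i, m)
        items[i], items[j] = items[j], items[i]
    return items[:n]
```

Stopping after n swaps is what makes the per-sample cost constant; a full shuffle would do all M-1 swaps for the same result on the first n items.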

Best Regards,
Hameer Abbasi

[1] https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle

> On Wednesday, Oct 17, 2018 at 7:48 PM, Matthew Brett wrote:
> Hi,
>
> I noticed that numpy.random.choice was very slow, with the
> replace=False option, and then I noticed it can (for most cases) be
> made many hundreds of times faster in Python code:
>
> In [18]: sample = np.random.uniform(size=100)
> In [19]: timeit np.random.choice(sample, 500, replace=False)
> 42.1 ms ± 214 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
> In [22]: def rc(x, size):
> ...: n = np.prod(size)
> ...: n_plus = n * 2
> ...: inds = np.unique(np.random.randint(0, n_plus+1, size=n_plus))[:n]
> ...: return x[inds].reshape(size)
> In [23]: timeit rc(sample, 500)
> 86.5 µs ± 421 ns per loop (mean ± std. dev. of 7 runs, 1 loops each)
>
> Is there a reason why it's so slow in C? Could something more
> intelligent than the above be used to speed it up?
>
> Cheers,
>
> Matthew


Re: [Numpy-discussion] Approving NEP 27 - Historical discussion of 0-D arrays

2018-10-17 Thread Hameer Abbasi
Hi everyone,

Ah, I neglected to see that the PR was already merged. In any case, I’ll repeat 
my comment here (referring to the indexing section):

I would suggest that this section be removed entirely or updated. For example, 
if x is either an array scalar or a rank-zero array, x[...] is guaranteed to be 
an array and x[()] is guaranteed to be a scalar. The difference arises because 
x[{anything here}, ...] is guaranteed to be an array. In words: if the last 
index is an ellipsis, the result of indexing is guaranteed to be an array.

I came across this weird behaviour when implementing the equivalent of 
np.where for PyData/Sparse.
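The distinction is easy to demonstrate on a rank-zero array (assuming NumPy is installed):

```python
import numpy as np

x = np.array(3.0)   # rank-zero array

a = x[...]          # last index is an ellipsis -> always an array
s = x[()]           # empty tuple -> always a scalar
```

Both expressions "index with nothing", yet one yields a 0-d ndarray and the other a np.float64 scalar, which is the inconsistency the comment refers to.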

Best Regards,
Hameer Abbasi

> On Wednesday, Oct 17, 2018 at 7:59 PM, Matti Picus wrote:
> In PR 12166 https://github.com/numpy/numpy/pull/12166 we revived an old
> wiki document discussing the implementation of 0-dimensional arrays.
> This became informational NEP-27
> http://www.numpy.org/neps/nep-0027-zero-rank-arrarys.html. There was
> fruitful discussion of the NEP and the need for both 0-D arrays and
> scalars on the PR comments. The NEP itself is informational and freezes
> the information to the 2006 discussion, noting that "some of the
information here is dated, for instance indexing of 0-D arrays is now 
implemented and does not error."
>
>
> I would like to submit the NEP for discussion and approval.
>
> Matti
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] BIDS/NumPy dev meetings, Wednesdays 12pm Pacific

2018-10-17 Thread Hameer Abbasi
Dial in: 
https://berkeley.zoom.us/zoomconference?m=ta2dUMqcdK219Ov78Sj7CMIzzoX2CHGZ

Join in via PC: https://berkeley.zoom.us/j/400054438

Best Regards,
Hameer Abbasi

> On Wednesday, Oct 17, 2018 at 8:59 PM, Mark Harfouche 
> mailto:mark.harfou...@gmail.com)> wrote:
> Stefan, I would like to simply listen in, but I can't seem to find the meeting 
> ID that we need to call in with.
>
> On Tue, Oct 16, 2018 at 6:42 PM Allan Haldane  (mailto:allanhald...@gmail.com)> wrote:
> > I'll try to make it, especially as it looks like you want to discuss two
> > of my PRs! :)
> >
> > I have a different meeting a bit before then which might run over
> > though, so sorry ahead of time if I'm not there.
> >
> > Cheers,
> > Allan
> >
> >
> > On 10/16/18 5:26 PM, Stefan van der Walt wrote:
> > > Hi everyone,
> > >
> > > This is a friendly reminder of the BIDS/NumPy dev meetings, kicking off
> > > tomorrow at 12pm Pacific time.
> > >
> > > Please add any topics you wish to discuss to the agenda linked below.
> > >
> > > Best regards,
> > > Stéfan
> > >
> > >
> > > On Thu, 11 Oct 2018 22:43:58 -0700, Stefan van der Walt wrote:
> > >> The team at BIDS meets once a week to discuss progress, priorities, and
> > >> roadblocks. While our priorities are broadly determined by the project
> > >> roadmap [0], we would like to provide an opportunity for the community
> > >> to give more regular and detailed feedback on our work.
> > >>
> > >> We therefore invite you to join us for our weekly calls,
> > >> each **Wednesday from 12:00 to 13:00 Pacific Time**.
> > >>
> > >> Detail of the next meeting (2018-10-17) is given in the agenda [1],
> > >> which is a growing document—feel free to add topics you wish to discuss.
> > >>
> > >> We hope to see you there! I will send another reminder next week.
> > >>
> > >>
> > >> [0] https://www.numpy.org/neps/index.html
> > >> [1] https://hackmd.io/YZfpGn5BSu6acAFLBaRjtw#
> > > ___
> > > NumPy-Discussion mailing list
> > > NumPy-Discussion@python.org (mailto:NumPy-Discussion@python.org)
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > >
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org (mailto:NumPy-Discussion@python.org)
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] asanyarray vs. asarray

2018-10-19 Thread Hameer Abbasi
Hi all

> On Friday, Oct 19, 2018 at 10:28 AM, Matti Picus  (mailto:matti.pi...@gmail.com)> wrote:
>
> Was there discussion around which of `asarray` or asanyarray` to prefer? PR 
> 11162, https://github.com/numpy/numpy/pull/11162, proposes `asanyarray` in 
> place of `asarray` at the entrance to `_quantile_ureduce_func` to preserve 
> ndarray subclasses. Should we be looking into changing all the `asarray` 
> calls into `asanyarray`?
>

I suspect that this will cause a large number of problems around np.matrix, so 
unless we deprecate that first, such a change is risky. The problem with 
np.matrix is that it’s a subclass, but it’s not substitutable for the base 
class, and so violates the Liskov substitution principle (the L in SOLID).

There are efforts to remove np.matrix, with the largest consumer being 
scipy.sparse, so unless that’s revamped, deprecating np.matrix is kind of hard 
to do.
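A short illustration of the substitutability problem (np.matrix changes the rank of results, so code written for ndarray breaks when a matrix passes through asanyarray):

```python
import numpy as np

m = np.matrix([[1, 2], [3, 4]])
a = np.asarray(m)        # drops the subclass
b = np.asanyarray(m)     # passes np.matrix through

assert type(a) is np.ndarray and a.ravel().ndim == 1   # ndarray semantics
assert isinstance(b, np.matrix) and b.ravel().ndim == 2  # matrix stays 2-D
```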
>
> Matti
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

Best Regards,
Hameer Abbasi

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] asanyarray vs. asarray

2018-10-19 Thread Hameer Abbasi
Hi!

> On Friday, Oct 19, 2018 at 6:09 PM, Stephan Hoyer  (mailto:sho...@gmail.com)> wrote:
> I don't think it makes much sense to change NumPy's existing usage of 
> asarray() to asanyarray() unless we add subok=True arguments (which default 
> to False). But this ends up cluttering NumPy's public API, which is also 
> undesirable.
>
Agreed so far.
>
> The preferred way to override NumPy functions going forward should be 
> __array_function__.
>

I think we should “soft support” (i.e. allow, but consider unsupported) the 
case where one of NumPy’s functions is implemented in terms of others and 
“passing through” an array results in the correct behaviour for that array.

>
> On Fri, Oct 19, 2018 at 8:13 AM Marten van Kerkwijk 
> mailto:m.h.vankerkw...@gmail.com)> wrote:
> > There are exceptions for `matrix` in quite a few places, and there now is 
> > warning for `matrix` - it might not be bad to use `asanyarray` and add an 
> > exception for `matrix`. Indeed, I quite like the suggestion by Eric Wieser 
> > to just add the exception to `asanyarray` itself - that way when matrix is 
> > truly deprecated, it will be a very easy change.
> >
> > -- Marten
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org (mailto:NumPy-Discussion@python.org)
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

Best Regards,
Hameer Abbasi

___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reminder: weekly status meeting

2018-10-25 Thread Hameer Abbasi
Hi!

Sorry to miss this week’s meeting.

If I may point out an inaccuracy in the notes: in PyData/Sparse most things are 
implemented from the ground up without relying on scipy.sparse. The only parts 
that do rely on it are `sparse.matmul`, `sparse.dot` and `sparse.tensordot`, as 
well as a few conversions to/from SciPy. If these could depend on Cython 
wrappers instead, that’d be nice.

I should probably update the docs on that. If anyone is willing to discuss 
pydata/sparse with me, I’ll be available for a meeting anytime.

Best Regards,
Hameer Abbasi

> On Thursday, Oct 25, 2018 at 12:08 AM, Stefan van der Walt 
> mailto:stef...@berkeley.edu)> wrote:
> Hi all,
>
> On Mon, 22 Oct 2018 09:56:37 +0300, Matti Picus wrote:
> > We therefore invite you to join us for our weekly calls,
> > each **Wednesday from 12:00 to 13:00 Pacific Time**.
> >
> > Detail of the next meeting (2018-10-24) is given in the agenda
>
> This week's meeting notes are at:
>
> https://github.com/BIDS-numpy/docs/blob/master/status_meetings/status-2018-10-24.md
>
> Stéfan
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reminder: weekly status meeting

2018-10-26 Thread Hameer Abbasi
Hi everyone,

Like I said, we just use those to coerce SciPy arrays to native ones for 
compatibility. You could remove all those and the package would work fine, as 
long as you were using native PyData/Sparse arrays.

The only core functionality dependent on scipy.sparse is matrix multiplication 
and the like. Everything else is for inter-operability.

Best Regards,
Hameer Abbasi

> On Friday, Oct 26, 2018 at 1:19 AM, Tyler Reddy  (mailto:tyler.je.re...@gmail.com)> wrote:
> What exactly would you like Cython wrappers for? Some of the C++ code in 
> scipy/sparse/sparsetools?
>
> I see you have COO.from_scipy_sparse(x) in some pydata/sparse code paths, 
> which presumably you'd like to avoid or improve?
> On Thu, 25 Oct 2018 at 03:41, Hameer Abbasi  (mailto:einstein.edi...@gmail.com)> wrote:
> > Hi!
> >
> > Sorry to miss this week’s meeting.
> >
> > If I may point out an inaccuracy in the notes: in PyData/Sparse most things 
> > are implemented from the ground up without relying on scipy.sparse. The 
> > only part that does rely on it is `sparse.matmul`, `sparse.dot` and 
> > `sparse.tensordot`, as well as a few conversions to/from SciPy, if these 
> > could depend on Cython wrappers instead that’d be nice.
> >
> > I should probably update the docs on that. If anyone is willing to discuss 
> > pydata/sparse with me, I’ll be available for a meeting anytime.
> >
> > Best Regards,
> > Hameer Abbasi
> >
> >
> > > On Thursday, Oct 25, 2018 at 12:08 AM, Stefan van der Walt 
> > > mailto:stef...@berkeley.edu)> wrote:
> > > Hi all,
> > >
> > > On Mon, 22 Oct 2018 09:56:37 +0300, Matti Picus wrote:
> > > > We therefore invite you to join us for our weekly calls,
> > > > each **Wednesday from 12:00 to 13:00 Pacific Time**.
> > > >
> > > > Detail of the next meeting (2018-10-24) is given in the agenda
> > >
> > > This week's meeting notes are at:
> > >
> > > https://github.com/BIDS-numpy/docs/blob/master/status_meetings/status-2018-10-24.md
> > >
> > > Stéfan
> > > ___
> > > NumPy-Discussion mailing list
> > > NumPy-Discussion@python.org (mailto:NumPy-Discussion@python.org)
> > > https://mail.python.org/mailman/listinfo/numpy-discussion
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org (mailto:NumPy-Discussion@python.org)
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reminder: weekly status meeting

2018-10-26 Thread Hameer Abbasi
Hi Stefan!

PyData/Sparse is pretty far along; by January or so we should have a CSR/CSC 
replacement that is N-D. It needs optimisation in a lot of cases, but the API is 
compatible with NumPy and already works pretty well, in my opinion.

PyData/Sparse is pretty much independent of any changes to scipy.sparse at this 
point. We build on top of NumPy, not scipy.sparse.

Feel free to use any or all of my code for sparray, although I think Ralf 
Gommers, Matthew Rocklin and others were of the opinion that the data structure 
should stay in PyData/Sparse and linear algebra and csgraph etc should go into 
SciPy.

Best Regards,
Hameer Abbasi

> On Friday, Oct 26, 2018 at 7:03 PM, Stefan van der Walt  (mailto:stef...@berkeley.edu)> wrote:
> Hi Hameer,
>
> On Fri, 26 Oct 2018 10:47:09 +0200, Hameer Abbasi wrote:
> > The only core functionality dependent on scipy.sparse is matrix
> > multiplication and the like. Everything else is for inter-operability.
>
> Thank you for commenting here.
>
> As you know, I am enthusiastic about seeing an `sparray` equivalent to
> `spmatrix`. When we last spoke, my recollection was that it would be
> beneficial to `pydata/sparse`. Is this still correct?
>
> If not, are we now in a situation where it would be more helpful to
> build `sparray` based on `pydata/sparse`.
>
> If we can have a good sparse array API in place in SciPy, it may
> significantly simplify code in various other libraries (I'm thinking of
> scikit-learn, e.g.).
>
> Best regards,
> Stéfan
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reminder: weekly status meeting

2018-10-27 Thread Hameer Abbasi

> On Saturday, Oct 27, 2018 at 12:10 AM, Stefan van der Walt 
> mailto:stef...@berkeley.edu)> wrote:
> On Sat, 27 Oct 2018 10:27:49 +1300, Ralf Gommers wrote:
> > Just to make sure we're talking about the same things here: Stefan, I think
> > with "sparray" you mean "an n-D sparse array implementation that lives in
> > SciPy", nothing more specific? In that case pydata/sparse is the one
> > implementation, and including it in scipy.sparse would make it "sparray".
> > I'm currently indeed leaning towards depending on pydata/sparse rather than
> > including it in scipy.
>
> I want to double check: when we last spoke, it seemed as though certain
> refactorings inside of SciPy (specifically, sparray was mentioned) would
> simplify the life of pydata/sparse devs. That no longer seems to be the
> case?

Hi! I can’t recall having said this; perhaps you inferred it from the docs 
(it’s on the front page, so that isn’t unreasonable). We should update that 
sometime.

That said, we use very little of scipy.sparse in PyData/Sparse. When Matt 
Rocklin was maintaining the project, that was the case, but even in the later 
days he shifted much of his code to pure NumPy. I followed that path further, 
not out of unwillingness to depend on it, but out of desire for generality.

In its current state, the only things in PyData/Sparse that depend on 
scipy.sparse are:
- conversion to/from the scipy.sparse spmatrix classes, and
- a bit of linear algebra, i.e. dot, tensordot and matmul.

Best Regards,
Hameer Abbasi

>
> If our recommended route is to tell users to use pydata/sparse instead
> of SciPy (for the sparse array object), we probably want to get rid of
> our own internal implementation, and deprecate spmatrix (or, build
> spmatrix on top of pydata/sparse)?
>
> Once we can define a clear API for sparse arrays, we can include some
> algorithms that ingest those objects in SciPy. But, I'm not sure we
> have an API in place that will allow handover of such objects to the
> existing C/FORTRAN-level code.
>
> Stéfan
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reminder: weekly status meeting

2018-10-27 Thread Hameer Abbasi


> On Saturday, Oct 27, 2018 at 6:11 AM, Ralf Gommers  (mailto:ralf.gomm...@gmail.com)> wrote:
>
>
> On Sat, Oct 27, 2018 at 11:10 AM Stefan van der Walt  (mailto:stef...@berkeley.edu)> wrote:
> > On Sat, 27 Oct 2018 10:27:49 +1300, Ralf Gommers wrote:
> > > Just to make sure we're talking about the same things here: Stefan, I 
> > > think
> > > with "sparray" you mean "an n-D sparse array implementation that lives in
> > > SciPy", nothing more specific? In that case pydata/sparse is the one
> > > implementation, and including it in scipy.sparse would make it "sparray".
> > > I'm currently indeed leaning towards depending on pydata/sparse rather 
> > > than
> > > including it in scipy.
> >
> > I want to double check: when we last spoke, it seemed as though certain
> > refactorings inside of SciPy (specifically, sparray was mentioned) would
> > simplify the life of pydata/sparse devs. That no longer seems to be the
> > case?
>
> There's no such thing as `sparray` anywhere in SciPy. There's two inactive 
> projects to create an n-D sparse array implementation, one of which is called 
> sparray (https://github.com/perimosocordiae/sparray). And there's one very 
> active project to do that same thing which is https://github.com/pydata/sparse
>
> >
> > If our recommended route is to tell users to use pydata/sparse instead
> > of SciPy (for the sparse array object), we probably want to get rid of
> > our own internal implementation, and deprecate spmatrix
>
> Doc-deprecate I think; the sparse matrix classes in SciPy are very heavily 
> used, so it doesn't make sense to start emitting deprecation warnings for 
> them. But at some point we'll want to point users to pydata/sparse for new 
> code.
>
> > (or, build
> > spmatrix on top of pydata/sparse)?
>
> It's the matrix vs. array semantics that are the issue, so not sure that 
> building one on top of the other would be useful.
>
> >
> > Once we can define a clear API for sparse arrays, we can include some
> > algorithms that ingest those objects in SciPy. But, I'm not sure we
> > have an API in place that will allow handover of such objects to the
> > existing C/FORTRAN-level code.
>
> I don't think the constructors for sparse matrix/array care about C/F order. 
> pydata/sparse is pure Python (and uses Numba). For reusing 
> scipy.sparse.linalg and scipy.sparse.csgraph you're right I think that that 
> will need some careful design work. Not sure anyone has thought about that in 
> a lot of detail yet.
>

They don’t yet. That is a planned feature, allowing an arbitrary permutation of 
input coordinates.
>
> There are interesting API questions probably, such as how to treat explicit 
> zeros (that debate still isn't settled for the matrix classes IIRC).
>

Explicit zeros are easier now: just use a fill_value of NaN and work with zeros 
as usual.

Best Regards,
Hameer Abbasi

>
> And there's an interesting transition puzzle to figure out (which also 
> includes np.matrix). At the moment the discussion on that is spread out over 
> many mailing list threads and Github issues, at some point we'll need to 
> summarize that. Probably around the time that the CSR/CSC replacement that 
> Hameer mentioned is finished.
>
> Cheers,
> Ralf
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reminder: weekly status meeting 31.10 at 12:00 pacific time

2018-10-30 Thread Hameer Abbasi
Hello!

If I may make a suggestion, it might be nice to create a separate calendar and 
add people to it as needed for better management.

Best Regards,
Hameer Abbasi

> On Tuesday, Oct 30, 2018 at 8:16 PM, Matti Picus  (mailto:matti.pi...@gmail.com)> wrote:
>
> The draft agenda is at https://hackmd.io/D3I3CdO2T9ipZ2g5uAChcA?both.
>
>
> Everyone is invited to join.
>
> Matti, Tyler and Stefan
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Reminder: weekly status meeting 31.10 at 12:00 pacific time

2018-10-30 Thread Hameer Abbasi
Hi,

I meant we should have a calendar that’s possible to subscribe to, and in 
addition announce the agenda here, and that the calendar could contain a link 
to the meeting agenda.

Best Regards,
Hameer Abbasi

> On Tuesday, Oct 30, 2018 at 11:52 PM, Stefan van der Walt 
> mailto:stef...@berkeley.edu)> wrote:
> Hi Hameer,
>
> On Tue, 30 Oct 2018 21:24:36 +0100, Hameer Abbasi wrote:
> > If I may make a suggestion, it might be nice to create a separate
> > calendar and add people to it as needed for better management.
>
> Can you clarify what you want? Do you mean we should not announce the
> meeting agenda here, and instead only use a calendar? Or would you like
> a calendar link to subcribe to, that also contains a link to the meeting
> notes?
>
> Best regards,
> Stéfan
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] asarray/anyarray; matrix/subclass

2018-11-09 Thread Hameer Abbasi

> Begin forwarded message:
>
> From: Stephan Hoyer
> Date: Friday, Nov 09, 2018 at 3:19 PM
> To: Hameer Abbasi
> Cc: Stefan van der Walt , Marten van Kerkwijk
> Subject: asarray/anyarray; matrix/subclass
>
> This is a great discussion, but let's try to have it in public (e.g., on the 
> NumPy mailing list).
> On Fri, Nov 9, 2018 at 8:42 AM Hameer Abbasi  (mailto:einstein.edi...@gmail.com)> wrote:
> > Hi Stephan,
> >
> > The issue I have with writing another function is that asarray/asanyarray 
> > are so widely used that it’d be a huge maintenance task to update them 
> > throughout NumPy, not to mention other codebases, not to mention other 
> > codebases having to rely on newer NumPy versions for this. In short, it 
> > would dramatically reduce adaptability of this function.
> >
> > One path we can take is to allow asarray/asanyarray to be overridable via 
> > __array_function__ (the former is debatable). This solves most of our 
> > duck-array related issues without introducing another protocol.
> >
> > Regardless of what path we choose, I would recommend changing asanyarray to 
> > not pass through np.matrix regardless, instead passing through 
> > mat.view(type=np.ndarray) instead, which has O(1) cost and memory. In the 
> > vast majority of contexts, it’s used to ensure an array-ish structure for 
> > another operation, and usually there’s no guarantee that what comes out 
> > will be a matrix anyway. I suggest we raise a FutureWarning and then change 
> > this behaviour.
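The O(1) pass-through suggested above can be sketched as follows (an illustration only; this is not what asanyarray currently does):

```python
import numpy as np

m = np.matrix([[1, 2], [3, 4]])
v = m.view(type=np.ndarray)    # O(1): same buffer, ndarray semantics

assert type(v) is np.ndarray
assert np.shares_memory(v, m)  # no copy was made
assert v.ravel().ndim == 1     # ndarray behaviour restored
```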
> >
> > There have been a number of discussions about deprecating np.matrix (and a 
> > few about MaskedArray as well, though there are less compelling reasons for 
> > that one). I suggest we start down that path as soon as possible. The 
> > biggest (only?) user I know of blocking that is scipy.sparse, and we’re on 
> > our way to replacing that with PyData/Sparse.
> >
> > Best Regards,
> > Hameer Abbasi
> >
> >
> > > On Friday, Nov 09, 2018 at 1:26 AM, Stephan Hoyer  > > (mailto:sho...@gmail.com)> wrote:
> > > Hi Hameer,
> > >
> > > I'd love to talk about this in more detail. I agree that something like 
> > > this is needed.
> > >
> > > The challenge with reusing an existing function like asanyarray() is that 
> > > there is at least one (somewhat?) widely used ndarray subclass that badly 
> > > violates the Liskov Substitution Principle: np.matrix.
> > >
> > > NumPy can't really use np.asanyarray() widely for internal purposes until 
> > > we don't have to worry about np matrix. We might special case np.matrix 
> > > in some way, but then asanyarray() would do totally opposite things on 
> > > different versions of NumPy. It's almost certainly a better idea to just 
> > > write a new function with the desired semantics, and "soft deprecate" 
> > > asanyarray(). The new function can explicitly black list np.matrix, as 
> > > well as any other subclasses we know of that badly violate LSP.
> > >
> > > Cheers,
> > > Stephan
> > > On Thu, Nov 8, 2018 at 5:06 PM Hameer Abbasi  > > (mailto:einstein.edi...@gmail.com)> wrote:
> > > > No, Stefan, I’ll do that now. Putting you in the cc.
> > > >
> > > > It slipped my mind among the million other things I had in mind — 
> > > > Namely: My job visa. It was only done this Monday.
> > > >
> > > > Hi, Marten, Stephan:
> > > >
> > > > Stefan wants me to write up a NEP that allows a given object to specify 
> > > > that it is a duck array — Namely, that it follows duck-array semantics.
> > > >
> > > > We were thinking of switching asanyarray to switch to passing through 
> > > > anything that implements the duck-array protocol along with ndarray 
> > > > subclasses. I’m sure this would help XArray and Quantity work better 
> > > > with existing codebases, along with PyData/Sparse arrays.
> > > >
> > > > Would you be interested?
> > > >
> > > > Best Regards,
> > > > Hameer Abbasi
> > > >
> > > >
> > > > > On Thursday, Nov 08, 2018 at 9:09 PM, Stefan van der Walt 
> > > > > mailto:stef...@berkeley.edu)> wrote:
> > > > > Hi Hameer,
> > > > >
> > > > > In last week's meeting, we had the following in the notes:
> > > > >
> > > > > > Hameer is contacting Marten & Stephan and write up 

Re: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__?

2018-11-10 Thread Hameer Abbasi

> On Saturday, Nov 10, 2018 at 6:59 PM, Marten van Kerkwijk 
> mailto:m.h.vankerkw...@gmail.com)> wrote:
>
> > More broadly, it is only necessary to reject an argument type at the 
> > __array_function__ level if it defines __array_function__ itself, because 
> > that’s the only case where it would make a difference to return 
> > NotImplemented rather than trying (and failing) to call the overriden 
> > function implementation.
>
> Yes, this makes sense -- these are the only types that could possibly change 
> the outcome if the class now called fails to produce a result. Indeed, that 
> reasoning makes it logical that `ndarray` itself is not present even though 
> it defines `__array_ufunc__` - we know it cannot handle anything with a 
> `__array_ufunc__` implementation.
>
> Hameer, is Stephan's argument convincing to you too? If so, I'll close the PR.
I agree with Stephan here, other than the fact that ndarray should be in the 
list of types. I can think of many cases in PyData/Sparse where I don’t want to 
allow mixed inputs, but maybe that’s a tangential discussion.
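For context, rejecting unknown types by returning NotImplemented from __array_function__ looks roughly like this (a minimal duck-array sketch under NEP 18; the class and helper names are illustrative, and dispatch requires NumPy >= 1.17):

```python
import numpy as np

class Wrapped:
    """Minimal duck array participating in __array_function__ dispatch."""
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # Reject the call if any participating type is unknown to us;
        # NumPy then raises TypeError instead of silently coercing.
        if not all(issubclass(t, (Wrapped, np.ndarray)) for t in types):
            return NotImplemented
        unwrapped = [a.data if isinstance(a, Wrapped) else a for a in args]
        return Wrapped(func(*unwrapped, **kwargs))

w = Wrapped([1.0, 2.0, 3.0])
res = np.sum(w)   # dispatched to Wrapped.__array_function__
```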

Best Regards,
Hameer Abbasi

>
> -- Marten ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] asarray/anyarray; matrix/subclass

2018-11-10 Thread Hameer Abbasi

> On Saturday, Nov 10, 2018 at 9:16 PM, Stephan Hoyer  (mailto:sho...@gmail.com)> wrote:
> On Sat, Nov 10, 2018 at 9:49 AM Marten van Kerkwijk 
> mailto:m.h.vankerkw...@gmail.com)> wrote:
> > Hi Hameer,
> >
> > I do not think we should change `asanyarray` itself to special-case matrix; 
> > rather, we could start converting `asarray` to `asanyarray` and solve the 
> > problems that produces for matrices in `matrix` itself (e.g., by overriding 
> > the relevant function with `__array_function__`).
> >
> > I think the idea of providing an `__anyarray__` method (in analogy with 
> > `__array__`) might work. Indeed, the default in `ndarray` (and thus all its 
> > subclasses) could be to let it return `self` and to override it for 
> > `matrix` to return an ndarray view.
>
> Yes, we certainly would rather implement a matrix.__anyarray__ method (if 
> we're already doing a new protocol) rather than special case np.matrix 
> explicitly.
>
> Unfortunately, per Nathaniel's comments about NA skipping behavior, it seems 
> like we will also need MaskedArray.__anyarray__ to return something other 
> than itself. In principle, we should probably write new version of 
> MaskedArray that doesn't deviate from ndarray semantics, but that's a rather 
> large project (we'd also probably want to stop subclassing ndarray).
>
> Changing the default aggregation behavior for the existing MaskedArray is 
> also an option but that would be a serious annoyance to users and backwards 
> compatibility break. If the only way MaskedArray violates Liskov is in terms 
> of NA skipping aggregations by default, then this might be viable. In 
> practice, this would require adding an explicit skipna argument so 
> FutureWarnings could be silenced. The plus side of this option is that it 
> would make it easier to use np.anyarray() or any new coercion function 
> throughout the internal NumPy code base.
>
> To summarize, I think these are our options:
> 1. Change the behavior of np.anyarray() to check for an __anyarray__() 
> protocol. Change np.matrix.__anyarray__() to return a base numpy array (this 
> is a minor backwards compatibility break, but probably for the best). Start 
> issuing a FutureWarning for any MaskedArray operations that violate Liskov 
> and add a skipna argument that in the future will default to skipna=False.
>

> 2. Introduce a new coercion function, e.g., np.duckarray(). This is the 
> easiest option because we don't need to cleanup NumPy's existing ndarray 
> subclasses.
>

My vote is still for 1. I don’t have an issue with PyData/Sparse depending on 
recent-ish NumPy versions (it’ll need a lot of the recent protocols anyway), 
although I could be convinced otherwise if major package devs (scikits, SciPy, 
Dask) were to weigh in and say they’d jump on it (which seems unlikely given 
SciPy’s policy of supporting old NumPy versions).
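For concreteness, a sketch of what option 2's np.duckarray() could look like (hypothetical: neither this function nor a __duckarray__ protocol exists in NumPy; the idea was later written up as the NEP 30 proposal):

```python
import numpy as np

def duckarray(obj):
    # Hypothetical coercion: pass duck arrays through, coerce the rest.
    if hasattr(obj, "__duckarray__"):
        return obj.__duckarray__()
    return np.asarray(obj)

class Duck:
    """Stand-in duck array that opts in to the hypothetical protocol."""
    def __duckarray__(self):
        return self

d = Duck()
assert duckarray(d) is d                      # duck arrays pass through
assert type(duckarray([1, 2])) is np.ndarray  # everything else is coerced
```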

>
>
> P.S. I'm just glad pandas stopped subclassing ndarray a while ago -- there's 
> no way pandas.Series() could be fixed up to not violate Liskov :). 
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should unique types of all arguments be passed on in __array_function__?

2018-11-10 Thread Hameer Abbasi
In that case, ignore my comment. :)

Best Regards,
Hameer Abbasi

> On Saturday, Nov 10, 2018 at 11:52 PM, Stephan Hoyer  (mailto:sho...@gmail.com)> wrote:
> On Sat, Nov 10, 2018 at 2:08 PM Hameer Abbasi  (mailto:einstein.edi...@gmail.com)> wrote:
> > I agree with Stephan here, other than the fact that ndarray should be in 
> > the list of types. I can think of many cases in PyData/Sparse where I dont 
> > want to allow mixed inputs, but maybe that’s a tangential discussion.
> >
> >
>
>
> To be clear: ndarray *is* currently preserved in the list of types passed to 
> __array_function__ (because ndarray.__array_function__ is defined). 
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Vectorized version of numpy.linspace

2018-11-14 Thread Hameer Abbasi

> On Wednesday, Nov 14, 2018 at 11:46 PM, Stephan Hoyer  (mailto:sho...@gmail.com)> wrote:
>
>
> On Wed, Nov 14, 2018 at 2:35 PM Sebastian Berg  (mailto:sebast...@sipsolutions.net)> wrote:
> > On Wed, 2018-11-14 at 14:32 -0500, Marten van Kerkwijk wrote:
> > > Code being better than words: see
> > > https://github.com/numpy/numpy/pull/12388 for an implementation. The
> > > change in the code proper is very small, though it is worrying that
> > > it causes two rather unrelated tests too fail (even if arguably both
> > > tests were wrong).
> > >
> > > Note that this does not give flexibility to put the axis where one
> > > wants; as written, the code made putting it at the start the obvious
> > > solution, as it avoids doing anything with the shapes of start and
> > > stop.
> >
> > Hehe, my first gut feeling was the last axis to be the obvious one ;).
> > This has been discussed before (but what hasn't) I believe, probably
> > some old issue or even PR somewhere.
> > I am mildly in favor, just because there is probably not much reason
> > against an easy vectorization. Doesn't need to be advertised much in
> > the docs anyway.
> > Although it might be good to settle the "obvious" part in case I am not
> > alone in first thinking of -1 being the obvious default. I would
> > probably skip the axis argument for now, unless someone actually has a
> > use case.
>
> Indeed -- I think the best argument for adding an "axis" argument is that it 
> allows people to be explicit about where the axis ends up, e.g., both 
> np.linspace(start, stop, num=5, axis=0) and np.linspace(start, stop, num=5, 
> axis=-1) make their intent quite clear.
>
> To me, axis=0 feels like the right default, matching np.concatenate and 
> np.stack. But NumPy already has split conventions for this sort of thing 
> (e.g., gufuncs add axes at the end), so I like the explicit option.

I’d like to have another vote for axis=-1 by default. Stack and concatenate are 
different because we are concatenating/stacking complete arrays, so it makes 
sense to “compose” them along the first axis to maintain C-contiguous-ness.

I actually think of this as the reverse, we are “composing/combining” lots of 
1D arrays over all the other dimensions, so to preserve C-contiguous-ness, it’s 
better to have axis=-1.
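For what it’s worth, the two conventions can be compared directly once the axis argument exists — a small sketch, assuming a NumPy release where the PR above has landed (1.16 or later):

```python
import numpy as np

# Vectorized linspace over array-valued start/stop.
start = np.array([0.0, 10.0])
stop = np.array([1.0, 20.0])

a = np.linspace(start, stop, num=5, axis=0)   # new axis first: shape (5, 2)
b = np.linspace(start, stop, num=5, axis=-1)  # new axis last: shape (2, 5)

assert a.shape == (5, 2)
assert b.shape == (2, 5)
assert np.allclose(a.T, b)
# With axis=-1, each interpolated ramp is one contiguous row:
assert np.allclose(b[1], [10.0, 12.5, 15.0, 17.5, 20.0])
```

Either spelling makes the placement of the interpolation axis explicit, which is the main argument for having the argument at all.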


Best Regards,
Hameer Abbasi



Re: [Numpy-discussion] Reminder: Numpy dev meeting Fri-Sat Nov 30-Dec 1

2018-11-28 Thread Hameer Abbasi
Hi everyone!

Just want to add that I’ll be available remotely for some of the time. You can 
reach me via email and we can set up a call.

Best Regards,
Hameer Abbasi

> On Wednesday, Nov 28, 2018 at 10:15 PM, Matti Picus <matti.pi...@gmail.com> wrote:
> We will be meeting at BIDS 9:00AM Friday for a two-day NumPy developer
> meeting. All are welcome, if you haven't already please let Stefan know
> you are coming so we can plan for space. A tentative schedule is here
>
>
> https://hackmd.io/gFqjPUSvSmm-0gmBDbrTBw?both
>
>
> Feel free to add content (just tag it with your name).
>
>
> Tyler, Matti, Stefan.
>


Re: [Numpy-discussion] Warn or immidiately change readonly flag on broadcast_arrays return value?

2018-12-25 Thread Hameer Abbasi
Hi!

Broadcasting almost always returns a repeated output
(except when all arrays are the same shape), that’s the entire point. I suspect 
this function is in fairly widespread use and will therefore cause a lot of 
downstream issues when repeating, so I’m -0.5 on a DeprecationWarning. A 
FutureWarning might be more appropriate, in which case I’m +0.2.

As for making the output read-only, that might break code, but most likely the 
code was erroneous anyway. But breaking backward-compatibility without a grace 
period is unheard of in this community. I’m +0.5 on it anyway. 🤷🏻‍♂️

Overall, a kind of hairy problem with no clear solution.
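To make the aliasing concrete, here is a minimal sketch of why writing to the result is dangerous — the zero stride is the point (later NumPy releases additionally made such views read-only):

```python
import numpy as np

a = np.arange(3)
rows = np.broadcast_arrays(a, np.empty((4, 3)))[0]

# The broadcast result is a view with stride 0 along the repeated axis,
# so all four "rows" alias the same three elements of `a` -- a single
# element write would show up in every row at once.
assert rows.shape == (4, 3)
assert rows.strides[0] == 0
```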

Best Regards,
Hameer Abbasi

> On Tuesday, Dec 25, 2018 at 12:13 PM, Matti Picus <matti.pi...@gmail.com> wrote:
> In PR 12609 https://github.com/numpy/numpy/pull/12609 I added code to
> emit a DeprecationWarning when broadcast_arrays returns an array where
> the output is repeated. While this is a minimal fix to the problem,
> perhaps we should consider making the output readonly immediately instead?
>
>
> - A deprecation cycle requires two changes to downstream user's code:
> one to filter the deprecation warning, and another when we actually make
> the change
>
> - Writing to the repeated data will cause errors now.
>
>
> What do you think, should we change the behaviour at all, and if so
> should we deprecate it over two releases or change it immediately?
>
>
> The original issue is here https://github.com/numpy/numpy/issues/2705
>
>
> Matti
>


[Numpy-discussion] Testing radix sort on big-endian architectures

2019-01-03 Thread Hameer Abbasi
Hello.

I’m writing a PR (#12586 (https://github.com/numpy/numpy/pull/12586)) for radix 
sort, and since some operations involve bit-shifts, the concern came up that 
big-endian architectures might not support it. If someone has access to a 
machine with a big-endian architecture, it would be nice if they could run the 
test suite on that PR.
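For context, the bit-shift concern can be illustrated with a toy LSD radix sort (illustrative only, not the code from the PR): the shifts operate on integer values, which are byte-order independent, so endianness should only matter where raw bytes get reinterpreted.

```python
def lsd_radix_sort(keys, key_bits=32, radix_bits=8):
    """Stable least-significant-digit radix sort of non-negative ints."""
    mask = (1 << radix_bits) - 1
    for shift in range(0, key_bits, radix_bits):
        # Extract the current digit with a shift-and-mask; this acts on
        # values, so it behaves the same on little- and big-endian CPUs.
        buckets = [[] for _ in range(1 << radix_bits)]
        for k in keys:
            buckets[(k >> shift) & mask].append(k)
        keys = [k for bucket in buckets for k in bucket]
    return keys

assert lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66]) == [2, 24, 45, 66, 75, 90, 170, 802]
```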

Best Regards,
Hameer Abbasi



Re: [Numpy-discussion] Debugging NumPy development build in conda environment

2019-01-04 Thread Hameer Abbasi
> On Friday, Jan 04, 2019 at 2:44 PM, Lars Grueter <la...@mailbox.org> wrote:
> Dear devs,
>
> If this is the wrong place for this kind of question I apologize and
> feel free to ignore this or point me to the right place.
>
> I'm using the conda package manager for my development environment which
> used to work just fine with these steps:
>
> > conda create -n dev-numpy python=3.7 numpy numpydoc cython pytest ipython 
> > coverage
> > source activate dev-numpy
> > conda uninstall --force numpy numpy-base
> > pip install -e .
>
> This would yield me a nice isolated playground and I could use
>
> > python setup.py build_ext --inplace -j 4
>
> at anytime to rebuild the binaries. Unfortunately the last two commands
> stopped working out of the blue with my existing environment and fail with
>
> > RuntimeError: Broken toolchain: cannot link a simple C program
>
> (full output in [1]) on a clean repository. No recent changes to my tool
> chain come to mind which could help me pinpoint the problem. Because it
> used to work just fine I have the annoying feeling that the fix is
> something small and trivial and have the hope that it's immediately
> obvious to someone here or that you can suggest a way to debug this. In
> any case your help would be appreciated.
>
> I'm aware that this may be an exotic dev setup. Using a virtual
> environment instead of the conda one works just fine. The build script
> seems to use a different linear algebra library in that case: blas from
> my OS as opposed to mkl in the conda environment. This is not my
> preferred solution but it would do if nothing comes of this.
>
> Some additional facts and research I've done:
>
> - Using conda environments with asv has stopped working as well due to
> not finding any LA library at all.
> - If I'm understanding [1] correctly the build script finds the mkl
> library provided by the conda environment.
> - Trying earlier versions of NumPy shows the same behavior. I tested the
> tags v1.14.6, v1.15.4.
> - However creating a development environment for SciPy and scikit-image
> using this approach still works fine.
> - Searching the internet or looking at NumPy's build guide didn't yield
> me a solution.
> - I'm on linux (Manjaro) if that is of any relevance.
> - Miniconda was reinstalled in my home directory with the installation
> script provided through Anaconda's website.
> - Haven't yet tried this approach in a fresh virtual machine.
>
> I hope that I haven't overlooked something obvious. In any case thank
> you for your time and attention.
>
> Cheers, Lars
>
> [1] https://gist.github.com/lagru/c6ac16b9984492850fe1174d6418b7b5

Hi Lars,

What I’m used to doing is simply the following (inside the NumPy root):

> conda create -n numpy-dev python=3[.x] pytest
> conda activate numpy-dev
> python setup.py build_ext --inplace -j 4
> pip install -e .

Often, when I run into issues with the build, I simply do this:

> [backup site.cfg]
> git clean -xfd
> [restore site.cfg]
> python setup.py build_ext --inplace -j 4
> pip install -e .

In your case, it might be the second that helps. If it’s not finding any LA 
library, you can try compiling with the site.cfg that points to MKL.
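For reference, a hypothetical site.cfg fragment for that last suggestion — the section name and keys follow NumPy's site.cfg.example, but the paths are placeholders you would point at your own conda environment:

```ini
[mkl]
library_dirs = /home/you/miniconda3/envs/numpy-dev/lib
include_dirs = /home/you/miniconda3/envs/numpy-dev/include
libraries = mkl_rt
```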
Cheers and Best Regards,
Hameer Abbasi



Re: [Numpy-discussion] timedelta64 remainder behavior with div by 0

2019-01-08 Thread Hameer Abbasi
I would say this is desirable behaviour, but I’m still +0.8 on this for 
backward compatibility reasons.

I doubt anyone would build code that relies on this though… They would almost 
certainly check for the zero in the denominator rather than the return value.
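For reference, the two behaviours being compared can be sketched as follows — this assumes a NumPy release from after this thread was resolved, where the NaT outcome was adopted:

```python
import warnings

import numpy as np

with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)  # divide-by-zero warnings
    td = np.timedelta64(5) % np.timedelta64(0)  # zero-denominator remainder
    fl = np.float64(1) % np.float64(0)

# NaT mirrors the float behaviour (nan) for a zero denominator.
assert np.isnat(td)
assert np.isnan(fl)
```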

Best Regards,
Hameer Abbasi

> On Tuesday, Jan 08, 2019 at 6:57 PM, Tyler Reddy <tyler.je.re...@gmail.com> wrote:
> We are now at the stage of implementing the timedelta64 divmod inner loop 
> given very recent additions of floordiv and remainder inner loops for this 
> data type. However, there is some contention about a previous decision 
> regarding modulus behavior that we'd like to resolve before we bake it in to 
> divmod.
>
> Currently, a modulus operation with two timedelta64 operands with a 0 
> denominator returns 0. For example:
>
> np.timedelta64(5) % np.timedelta64(0) -> numpy.timedelta64(0)
>
> In contrast, np.float64(1) % np.float64(0) -> nan
>
> There's a suggestion that we should switch to returning NaT for the 
> timedelta64 case for consistency, and that this probably isn't too harmful 
> given how recent these additions are.
>
> Do we have consensus on this?
>
> Ref: https://github.com/numpy/numpy/pull/12683
>
> Thanks!
> Tyler


Re: [Numpy-discussion] Removing undocumented __buffer__ attribute lookup

2019-02-27 Thread Hameer Abbasi
Cc-ing in Travis, because he was the original author of the buffer protocol, 
and this is most definitely related.

Best Regards,
Hameer Abbasi

> On Wednesday, Feb 27, 2019 at 9:20 AM, Matti Picus <matti.pi...@gmail.com> wrote:
> In digging around the code, I found a gem in PyArray_FromBuffer (exposed
> to python as numpy.frombuffer). If a PyObject* does not have a
> tp_as_buffer->bf_getbuffer function, we check if the python object has a
> __buffer__ attribute. If so we use that as buf in
> PyObject_GetBuffer(buf, ...).
>
>
> This seems to stem back to the original numerics code, where getBuffer
> would look up the attribute and call it as a method. PyArray_FromBuffer
> does not call the attribute as a method, it simply passes it on to
> PyObject_GetBuffer, which will then raise an error saying it cannot
> convert a method. You can try this out by creating a class with a
> __buffer__ method and calling numpy.frombuffer on it.
>
>
> I submitted a pull request to remove the code. Since it is undocumented
> and (as far as I can tell) broken, I do not think we need a deprecation
> cycle.
>
>
> More details, including links to the original numeric code from 2005, in
> the PR https://github.com/numpy/numpy/pull/13049
>
>
> Any thoughts or objections?
>
>
> Matti
>
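Trying that out against a NumPy where the removal has landed — only the real (C-level) buffer protocol is honoured, so an object with a mere __buffer__ method is rejected (a sketch, not the PR's actual test):

```python
import numpy as np

# Objects exposing the C-level buffer protocol still work:
assert np.frombuffer(b"\x01\x02\x03\x04", dtype=np.uint8).tolist() == [1, 2, 3, 4]

# An object that merely defines a __buffer__ method is rejected once
# the undocumented attribute lookup is gone:
class FakeBuffer:
    def __buffer__(self):
        return b"\x01\x02\x03\x04"

try:
    np.frombuffer(FakeBuffer(), dtype=np.uint8)
    raised = False
except TypeError:
    raised = True
assert raised
```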


Re: [Numpy-discussion] Complex Normal Generator in NEP-19 extension

2019-03-29 Thread Hameer Abbasi
> On Friday, Mar 29, 2019 at 6:01 PM, Kevin Sheppard <kevin.k.shepp...@gmail.com> wrote:
> One part of moving randomgen closer to fulfilling NEP-19 is rationalizing the 
> API, especially new features not in RandomState. Matti Picus has made a lot 
> of progress in getting it integrated, especially the part replacing
> RandomState with a shimmed version of the new generator.
>
> There is only one new method in the generator, a scalar generator for complex 
> normals. It is scalar in the sense that it is the complex version of 
> np.random.normal, and so supports broadcasting.
>
> This was written based on some GH comments. This would be a new API and so it 
> needs to come here first to see if there is any support.
>
> If there is support, then it will appear in the new RandomGenerator, but not 
> the RandomState replacement. If not, then we can just delete it.
>
> Kevin
>

+1

Best Regards,
Hameer Abbasi





Re: [Numpy-discussion] Complex Normal Generator in NEP-19 extension

2019-03-30 Thread Hameer Abbasi
> On Friday, Mar 29, 2019 at 6:03 PM, Hameer Abbasi <einstein.edi...@gmail.com> wrote:
> > On Friday, Mar 29, 2019 at 6:01 PM, Kevin Sheppard <kevin.k.shepp...@gmail.com> wrote:
> > One part of moving randomgen closer to fulfilling NEP-19 is rationalizing 
> > the API, especially new features not in RandomState. Matti Picus has made a 
> > lot of progress in getting it integrated, especially the part replacing
> > RandomState with a shimmed version of the new generator.
> >
> > There is only one new method in the generator, a scalar generator for 
> > complex normals. It is scalar in the sense that it is the complex version 
> > of np.random.normal, and so supports broadcasting.
> >
> > This was written based on some GH comments. This would be a new API and so 
> > it needs to come here first to see if there is any support.
> >
> > If there is support, then it will appear in the new RandomGenerator, but 
> > not the RandomState replacement. If not, then we can just delete it.
> >
> > Kevin
> >
>
> +1
>
> Best Regards,
> Hameer Abbasi
>
>
>
>

To expand on this, the complex normal distribution is pretty common in
communications, control, signals and systems, and so on. :) It’d be a great add.
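In the meantime, a circularly-symmetric complex normal is easy to compose from two real normals — a sketch using the modern Generator API, with invented names rather than the proposed method's actual signature:

```python
import numpy as np

def complex_normal(loc=0.0, scale=1.0, size=None, rng=None):
    """Circularly-symmetric complex normal: independent real and imaginary
    parts each carry half the variance, so E|z - loc|**2 == scale**2."""
    rng = np.random.default_rng() if rng is None else rng
    s = scale / np.sqrt(2.0)
    return loc + rng.normal(0.0, s, size) + 1j * rng.normal(0.0, s, size)

z = complex_normal(size=200_000, rng=np.random.default_rng(42))
assert z.dtype == np.complex128
assert abs(z.mean()) < 0.01                       # zero mean
assert abs(np.mean(np.abs(z) ** 2) - 1.0) < 0.01  # unit power
```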

Best Regards,
Hameer Abbasi





Re: [Numpy-discussion] Beginner Help: Generating HTML using Sphinx

2019-04-02 Thread Hameer Abbasi
Hi Mbeng!

What is the output of python --version and python3 --version? It seems to me 
that you’re still using Python 2.7.

You may need to use the command python3 rather than python.

Best Regards,
Hameer Abbasi

> On Tuesday, Apr 02, 2019 at 10:34 AM, Mbeng Tanyi <valentinate...@gmail.com> wrote:
> Hello
>
> I still have a problem with this. I am using sphinx1.8.5 and changed the 
> default python version on my computer to be 3.6 but I get the following error 
> on $ make html ;
>
> > Traceback (most recent call last):
> > File "/home/valentina-t/.local/bin/sphinx-build", line 7, in 
> > from sphinx.cmd.build import main
> > File 
> > "/home/valentina-t/.local/lib/python2.7/site-packages/sphinx/cmd/build.py", 
> > line 39
> > file=stderr)
> > ^
> > SyntaxError: invalid syntax
> > Makefile:123: recipe for target 'html' failed
> > make: *** [html] Error 1
>
> I googled the error but didn't find anything really useful. I feel it has 
> something to do with python2.7 as part of the path. Suppose I am right, I 
> need help changing it to python3.6 please.
>
> Regards
> Mbeng Tanyi
>
>
>
> On Sun, Mar 31, 2019 at 9:20 PM Matti Picus <matti.pi...@gmail.com> wrote:
> >
> > On 31/3/19 10:56 pm, Mbeng Tanyi wrote:
> > > Hello
> > >
> > > I also got an error the first time I tried $ make file as follows:
> > >
> > > mkdir -p build/html build/doctrees
> > > LANG=C sphinx-build -b html -WT --keep-going -d build/doctrees
> > > source build/html
> > > /bin/sh: 1: sphinx-build: not found
> > > Makefile:123: recipe for target 'html' failed
> > > make: *** [html] Error 127
> > >
> > >
> > > After upgrading to sphinx2 as was suggested here, I still get errors
> > > after $ make html :
> > >
> > > mkdir -p build/html build/doctrees
> > > LANG=C sphinx-build -b html -WT --keep-going -d build/doctrees
> > > source build/html
> > > Traceback (most recent call last):
> > > File "/home/valentina-t/.local/bin/sphinx-build", line 7, in
> > > 
> > > from sphinx.cmd.build import main
> > > File
> > > "/home/valentina-t/.local/lib/python2.7/site-packages/sphinx/cmd/build.py",
> > > line 39
> > > file=stderr)
> > > ^
> > > SyntaxError: invalid syntax
> > > Makefile:123: recipe for target 'html' failed
> > > make: *** [html] Error
> > >
> > >
> > >
> > >
> >
> > You need to use sphinx version 1.8.5, and python3.6.
> >
> > Matti
> >




Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-25 Thread Hameer Abbasi
> On Thursday, Apr 25, 2019 at 9:45 PM, Marten van Kerkwijk <m.h.vankerkw...@gmail.com> wrote:
> It seems we are adding to the wishlist! I see four so far:
> 1. Exposed in API, can be overridden with __array_ufunc__
> 2. One that converts everything to ndarray (or subclass); essentially the 
> current implementation;
> 3. One that does asduckarray
> 4. One that assumes all arguments are arrays.
>
> Maybe handiest would be if there is a method to coerce all relevant arguments 
> with a function of one's choice? I.e., in the example of Stephan, one would 
> have
> ```
> if function in JUST_COERCE:
> coerced_args, coerced_kwargs = function.__coerce__(np.asanyarray, *args, 
> **kwargs)
> return function.__implementation__(*coerced_args, **coerced_kwargs)
> ```
> Actually, this might in fact work with the plan proposed here, if we allow 
> for an extra, optional kwarg that contains the coercion function, that is
> ```
> return function.__implementation__(*args, coercion_function=np.asanyarray, 
> **kwargs)
> ```
>
> The possible advantage of this over yet more dunder methods is that one can 
> fine-tune the extent to which something has to mimic an array properly (e.g., 
> run `asanyarray` only if `shape` is not present).
>
> It would be nice, though, if we could end up with also option 4 being 
> available, if only because code that just can assume ndarray will be easiest 
> to read.
>
> All the best,
>
> Marten
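A runnable sketch of the coercion-function idea quoted above — the helper name and structure are invented for illustration, not an actual NumPy API:

```python
import numpy as np

def with_coercion(implementation, coercion_function=np.asanyarray):
    """Coerce each positional argument with a caller-chosen function
    before calling the underlying implementation."""
    def wrapper(*args, **kwargs):
        coerced = tuple(coercion_function(a) for a in args)
        return implementation(*coerced, **kwargs)
    return wrapper

# Option 2 from the list above (coerce to ndarray); swapping in a
# different coercion_function would give options 3 or 4.
add = with_coercion(np.add, np.asarray)
assert add([1, 2], [3, 4]).tolist() == [4, 6]
```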

Hi everyone,

Although, in general, I agree with Stephan’s design goals, I agree with Marten 
that the number of protocols is getting larger and may get out of hand if not
handled properly. There’s even one Marten forgot to mention: __array_dtype__. I 
have been working on a project that I consider to have all the essential 
features that Marten proposes, mostly within one framework. It’s called uarray 
(for universal array) and can be found at:

Source: https://github.com/Quansight-Labs/uarray
Documentation: https://uarray.readthedocs.io/en/latest/
It adopts the “separation of implementation from interface” principles from the 
beginning. Here’s how it works: There are MultiMethods and Backends. A Backend 
registers implementations for a given MultiMethod. A MultiMethod defines the 
signature, the elements that can be dispatched over, and their types. To it,
NumPy is (and I realise this is going to be controversial,
since this is the NumPy mailing list), just another backend.

Here’s how it addresses Marten’s concerns:
- Everything is made into a MultiMethod. The multimethod then marks the objects it’d like to dispatch over. For the status quo, this is arrays; but thinking long-term, we could dispatch over abstract ufuncs and dtypes as well. For ufuncs, ufunc.__call__ and ufunc.reduce are also MultiMethods.
- Coercion works by extracting the marked dispatchables, converting them into native library equivalents, and passing them back into the function. For example, it would convert lists (or anything marked as an array) to arrays. It could also convert dtype='int64' to an actual dtype, and so on.
- __asduckarray__ is rendered unnecessary: coercion handles that.

You can check out the usage examples in the tests:

Core backend infrastructure: 
https://github.com/Quansight-Labs/uarray/blob/master/uarray/tests/test_backend.py
Backend infrastructure: 
https://github.com/Quansight-Labs/uarray/blob/master/unumpy/tests/test_numpy.py

Examples of how to write NumPy MultiMethods are here: 
https://github.com/Quansight-Labs/uarray/blob/master/unumpy/multimethods.py, 
along with the accompanying Backends in 
https://github.com/Quansight-Labs/uarray/tree/master/unumpy.

Best Regards,
Hameer Abbasi





Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-25 Thread Hameer Abbasi
> What's __array_dtype__? That string doesn't seem to appear in the
> numpy source, and google has no hits…

This was a proposed protocol for dispatching over user-defined dtype objects, I 
think Stephan and a few others wrote up a mock at SciPy 2018.

Best Regards,
Hameer Abbasi





Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-26 Thread Hameer Abbasi
Here’s my take on it: The goal is basically “separation of interface from 
implementation”, NumPy reference becomes just one (reference) implementation 
(kind of like CPython is today). The idea is that unumpy/NumPy drive the 
interface, while there can be many implementations.

The goals are:
- Make duck arrays work with the same code. This is achieved by `__array_function__`, other than for cases where we’re creating an array.
- Composability, and traversing backend boundaries.
- Coercion to native library objects: This requires the “reverse dispatcher” I kept mentioning, to take the args/kwargs and “put back” the coerced arrays into them. This is impossible in the current framework, but can be made possible using the proposals by Stephan and Marten.
- Dispatch over arbitrary objects, such as dtypes or ufuncs, from other libraries. We are far from this goal, and it will require repetitions of protocols already available for arrays.

Here’s how `uarray` solves each of these issues:
- There is no default implementation; everything is a backend.
- Switching backends is handled by (thread-safe) context managers, which make it easy.
- There’s one coercion function per type of object. Libraries are only asked to dispatch over objects they know how to convert, so there’s no backwards-incompatible break when we add dtypes or ufuncs. Conversion can be as simple as lambda x: x.
- There’s a generic dispatcher and reverse dispatcher per function, with “marks” to indicate the type of object. Arrays are just one “type” of object you can dispatch over, so there’s no repetition by definition.
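For comparison, the NEP-18 protocol that uarray generalises looks like this from a duck array's side — a minimal sketch assuming NumPy 1.17 or later, where __array_function__ is enabled by default:

```python
import numpy as np

class WrappedArray:
    """Minimal duck array implementing the NEP-18 protocol."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        def unwrap(obj):
            if isinstance(obj, WrappedArray):
                return obj.data
            if isinstance(obj, (list, tuple)):
                return type(obj)(unwrap(o) for o in obj)
            return obj

        # Delegate to NumPy's implementation on the unwrapped arguments,
        # then re-wrap the result.
        return WrappedArray(func(*unwrap(args), **kwargs))

x = WrappedArray([1.0, 2.0])
y = np.concatenate([x, x])  # dispatches to WrappedArray.__array_function__
assert isinstance(y, WrappedArray)
assert y.data.tolist() == [1.0, 2.0, 1.0, 2.0]
```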

Best Regards,
Hameer Abbasi

> On Friday, Apr 26, 2019 at 10:31 AM, Ralf Gommers <ralf.gomm...@gmail.com> wrote:
>
>
> On Fri, Apr 26, 2019 at 1:02 AM Stephan Hoyer <sho...@gmail.com> wrote:
> > On Thu, Apr 25, 2019 at 3:39 PM Ralf Gommers <ralf.gomm...@gmail.com> wrote:
> > >
> > > On Fri, Apr 26, 2019 at 12:04 AM Stephan Hoyer <sho...@gmail.com> wrote:
> > > > I do like the look of this, but keep in mind that there is a downside 
> > > > to exposing the implementation of NumPy functions -- now the 
> > > > implementation details become part of NumPy's API. I suspect we do not 
> > > > want to commit ourselves to never changing the implementation of NumPy 
> > > > functions, so at the least this will need careful disclaimers about 
> > > > non-guarantees of backwards compatibility.
> > >
> > > I honestly still am missing the point of claiming this. There is no 
> > > change either way to what we've done for the last decade. If we change 
> > > anything in the numpy implementation of any function, we use deprecation 
> > > warnings etc. What am I missing here?
> >
> > Hypothetically, suppose we rewrite np.stack() in terms of np.block()
> > instead of np.concatenate(), because it turns out it is faster.
> >
> > As long as we've coercing with np.asarray(), users don't notice any 
> > material difference -- their code just gets a little faster.
> >
> > But this could be problematic if we support duck typing. For example, I 
> > suppose dask arrays rely on NumPy's definition of np.stack in terms of
> > np.concatenate, but they never bothered to implement np.block. Now 
> > upgrading NumPy breaks dask.
>
> Thanks, this helped clarify what's going on here. This example is clear. The 
> problem seems to be that there's two separate discussions in this thread:
> 1. your original proposal, __numpy_implementation__. it does not have the 
> problem of your np.concatenate example, as the "numpy implementation" is 
> exactly the same as it is today.
> 2. splitting up the current numpy implementation into *multiple* entry 
> points. this can be with and without coercion, with and without checking for 
> invalid values etc.
>
> So far NEP 18 does (1). Your proposed __numpy_implementation__ addition to 
> NEP 18 is still (1). Claiming that this affects the situation with respect to 
> backwards compatibility is incorrect.
>
> (2) is actually a much more invasive change, and one that does much more to 
> increase the size of the NumPy API surface. And yes, affects our backwards 
> compatibility situation as well.
>
> Also note that these have very different purposes:
> (1) was to (quoting from the NEP) "allow using NumPy as a high level API for 
> efficient multi-dimensional array operations, even with array implementations 
> that differ greatly from numpy.ndarray."
> (2) is for making duck arrays work with numpy implementations of functions 
> (not just with the NumPy API)
>
> I think (1) is mostly achieved, and I'm +1 on your NEP addition for that. (2) 
> is quickly becoming

Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-26 Thread Hameer Abbasi
Hi Stephan,

> Hameer, it's great that you are exploring these problems with a fresh 
> approach! I'm excited to see how dispatching problems could be solved without 
> the constraint of compatibility with NumPy's legacy approaches.
>
> When you have a prototype and/or design documents ready for review, please do 
> share them with the numpy-discussion list. I would be very glad to review 
> them and share my perspective.

That’s a great idea! I’ll get those ready, perhaps a NEP.
> That said, please save it a separate discussion thread, given that the design 
> of uarray is (wisely) orthogonal to NEP-18.
I disagree, I don’t consider it orthogonal: I’m presenting a way to avoid the 
very protocols being discussed, and I’d like to avoid duplicate work, or making 
NumPy itself un-maintainable. Please note the text of NEP-18:

> The __array_function__ protocol, and its use on particular functions, is
> experimental. We plan to retain an interface that makes it possible to 
> override NumPy functions, but the way to do so for particular functions can 
> and will change with little warning. If such reduced backwards compatibility 
> guarantees are not accepted to you, do not rely upon overrides of NumPy 
> functions for non-NumPy arrays. See “Non-goals” below for more details.

What I’m presenting is within scope, as it’s an alternative method.

Best Regards,
Hameer Abbasi





Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-27 Thread Hameer Abbasi
Hi Ralf,

[snip]

> If I'm missing something that's actually in NEP 18, can you please point out 
> the actual text?

NEP-22 is the high-level overview of the goals, but NEP-18 is the concrete 
proposal for __array_function__. Quoting that NEP, right under “Implementation":

> The __array_function__ protocol, and its use on particular functions, is
> experimental. We plan to retain an interface that makes it possible to 
> override NumPy functions, but the way to do so for particular functions can
> and will change with little warning. If such reduced backwards compatibility
> guarantees are not accepted to you, do not rely upon overrides of NumPy 
> functions for non-NumPy arrays. See “Non-goals” below for more details.

[snip]

Best Regards,
Hameer Abbasi





Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-27 Thread Hameer Abbasi
Hi Stephan,

> On Saturday, Apr 27, 2019 at 6:21 PM, Stephan Hoyer <sho...@gmail.com> wrote:
> On Fri, Apr 26, 2019 at 9:16 AM Hameer Abbasi <einstein.edi...@gmail.com> wrote:
> > > That said, please save it a separate discussion thread, given that the 
> > > design of uarray is (wisely) orthogonal to NEP-18.
> > I disagree, I don’t consider it orthogonal: I’m presenting a way to avoid 
> > the very protocols being discussed, and I’d like to avoid duplicate work, 
> > or making NumPy itself un-maintainable. Please note the text of NEP-18:
> >
> > > The __array_function__ protocol, and its use on particular functions, is
> > > experimental. We plan to retain an interface that makes it possible to 
> > > override NumPy functions, but the way to do so for particular functions 
> > > can and will change with little warning. If such reduced backwards 
> > > compatibility guarantees are not accepted to you, do not rely upon 
> > > overrides of NumPy functions for non-NumPy arrays. See “Non-goals” below 
> > > for more details.
> >
> > What I’m presenting is within scope, as it’s an alternative method.
> >
> > Best Regards,
> > Hameer Abbasi
> >
> >
> >
> >
>
>
> Are there aspects of your uarray proposal that are relevant to the current 
> proposed revisions to NEP 18? If so, please restate them :).
>

Of course, here’s my proposal:

We leave NEP-18 as-is for now, and instead of writing separate protocols for 
coercion, dtypes and ufuncs (which will be needed somewhere down the line), we 
have a discussion about uarray and see if it can help there. :-)

Ralf and I discussed internally the possibility of a dedicated call, with
all important participants.
>
> Thanks,
> Stephan
>
>

Best Regards,
Hameer Abbasi





Re: [Numpy-discussion] Adding to the non-dispatched implementation of NumPy methods

2019-04-28 Thread Hameer Abbasi
Hey Stephan,

After some discussion with Ralf, I feel that the best way forward would be to 
add the __numpy_implementation__ (which is my preferred name for it). While I 
consider the interface final (or at least to the point where we would only add 
functionality and not remove it), I would prefer to keep the experimental tag, 
for this very reason: Avoiding having to write a new NEP for adding 
functionality.

However, I’m open to declaring it non-experimental as well.

Best Regards,
Hameer Abbasi

> On Sunday, Apr 28, 2019 at 5:50 AM, Stephan Hoyer <sho...@gmail.com> wrote:
> On Sat, Apr 27, 2019 at 4:39 PM Hameer Abbasi  (mailto:einstein.edi...@gmail.com)> wrote:
> > > On Saturday, Apr 27, 2019 at 6:21 PM, Stephan Hoyer  > > (mailto:sho...@gmail.com)> wrote:
> > > Are there aspects of your uarray proposal that are relevant to the 
> > > current proposed revisions to NEP 18? If so, please restate them :).
> > >
> > >
> > >
> > >
> >
> >
> > Of course, here’s my proposal:
> >
> > We leave NEP-18 as-is for now, and instead of writing separate protocols 
> > for coercion, dtypes and ufuncs (which will be needed somewhere down the 
> > line), we have a discussion about uarray and see if it can help there. :-)
>
> I don't want to add separate protocols for coercion, dtypes or ufuncs as part 
> of NEP-18. Whatever form these should take, they should definitely be 
> separate proposals.
>
> __array_function__ is not the end of the story about duck array support in 
> NumPy, but I think it's a valuable incremental step, as evidenced by the 
> projects that are already eager to adopt it. I would really, really like to 
> try to get a usable and near-final version of it released in NumPy 1.17. That 
> doesn't leave us much time.
>
> I'm very interested in your work on uarray, but as far as I can tell, it 
> would not directly interact with NumPy's implementation of 
> __array_function__, so discussing it doesn't feel immediately urgent to me. 
> Rather, it's an alternative and possibly more complete solution for some of 
> the same problems. That's fantastic -- but please, let us finish 
> __array_function__ first.


[Numpy-discussion] New Proposed Time for the NumPy Community Meeting

2019-05-15 Thread Hameer Abbasi
Hello everyone!

I’d like to propose that we shift the time of the NumPy community meeting to 
one hour earlier starting next week.

The reason is twofold: One, we have an Indian student who wishes to join the 
meetings, and so it’d be nice to have a more reasonable time for her timezone.

Second, my fast for the month of Ramadan breaks at 9:07 PM, and will only get 
later. I’d hate to eat during the meeting.

Best Regards,
Hameer Abbasi



Re: [Numpy-discussion] Keep __array_function__ unexposed by default for 1.17?

2019-05-23 Thread Hameer Abbasi
On Thu, 2019-05-23 at 10:19 -0700, Stephan Hoyer wrote:
> On Thu, May 23, 2019 at 2:43 AM Ralf Gommers 
> wrote:
> > On Thu, May 23, 2019 at 3:02 AM Marten van Kerkwijk <
> > m.h.vankerkw...@gmail.com> wrote:
> > > > If we want to keep an "off" switch we might want to add some
> > > > sort of API for exposing whether NumPy is using
> > > > __array_function__ or not. Maybe
> > > > numpy.__experimental_array_function_enabled__ = True, so you
> > > > can just test `hasattr(numpy,
> > > > '__experimental_array_function_enabled__')`? This is assuming
> > > > that we are OK with adding an underscore attribute to NumPy's
> > > > namespace semi-indefinitely.
> > 
> > I don't think we want to add or document anything publicly. That
> > only adds to the configuration problem, and indeed makes it harder
> > to rely on the issue. All I was suggesting was keeping some
> > (private) safety switch in the code base for a while in case of
> > real issues as a workaround. 
> > 
> 
> I was concerned that libraries like dask might have different behavior
> internally depending upon whether or not __array_function__ is
> enabled, but looking more carefully dask only does this detection for
> tests. So maybe this is not needed.
> 
> Still, I'm concerned about the potential broader implications of
> making it possible to turn this off. In general, I don't think NumPy
> should have configurable global state -- it opens up the possibility
> of a whole class of issues. Stefan van der Walt raised this point
> when this "off switch" was suggested a few months ago:
> https://mail.python.org/pipermail/numpy-discussion/2019-March/079207.html

I agree -- Global mutable state is bad in general, but keeping around
the environment variable is okay.
> That said, I'd be OK with keeping around an environment variable as
> an emergency opt-out for now, especially to support benchmarking the
> impact of __array_function__ checks.

+1 for keeping the env var for now.
> But I would definitely be opposed to keeping around this switch
> around long term, for more than a major version or two. If there will
> be an outcry when we remove checks for
> NUMPY_EXPERIMENTAL_ARRAY_FUNCTION, then we should reconsider the
> entire __array_function__ approach.
> 
> > > Might this be overthinking it? I might use this myself on
> > > supercomputer runs where I know that I'm using arrays only. Though
> > > one should not extrapolate from oneself!
> > > 
> > > That said, it is not difficult as is. For instance, we could
> > > explain in the docs that one can tell from:
> > > ```
> > > enabled = hasattr(np.core, 'overrides') and
> > > np.core.overrides.ENABLE_ARRAY_FUNCTION
> > > ```
> > > One could even allow for eventual removal by explaining it should
> > > be,
> > > ``` 
> > > enabled = hasattr(np.core, 'overrides') and
> > > getattr(np.core.overrides, 'ENABLE_ARRAY_FUNCTION', True)
> > > ```
> > > (If I understand correctly, one cannot tell from the presence of
> > > `ndarray.__array_function__`, correct?)
> > 
> > I think a hasattr check for __array_function__ is right.
> 
> We define ndarray.__array_function__ (even on NumPy 1.16) regardless
> of whether __array_function__ is enabled or not.
> 
> In principle we could have checked the environment variable from C
> before defining the method, but it's too late for that now.

I disagree here: In principle the only people relying on this would be
the same ones relying on the functionality of this protocol, so this
would be an easy change to undo, if at all needed. I do not know of any
libraries that actually use/call the __array_function__ attribute other
than NumPy, when it isn't enabled.


Re: [Numpy-discussion] Keep __array_function__ unexposed by default for 1.17?

2019-05-23 Thread Hameer Abbasi
On Wed, 2019-05-22 at 08:52 -0700, Stephan Hoyer wrote:
> Thanks for raising these concerns.
> The full implications of my recent __skip_array_function__ proposal
> are only now becoming evident to me, looking at its use in GH-
> 13585. Guaranteeing that it does not expand NumPy's API surface seems
> hard to achieve without pervasive use of __skip_array_function__
> internally.
> 
> Taking a step back, the sort of minor hacks [1] that motivated
> __skip_array_function__ for me are annoying, but really not too bad
> -- they are a small amount of additional code duplication in a
> proposal that already requires a large amount of code duplication.
> 
> So let's roll back the recent NEP change adding
> __skip_array_function__ to the public interface [2]. Inside the few
> NumPy functions where __array_function__ causes a measurable
> performance impact due to repeated calls (most notably np.block, for
> which some benchmarks are 25% slower), we can make use of the private
> __wrapped__ attribute.
> 
> I would still like to turn on __array_function__ in NumPy 1.17. At
> least, let's try that for the release candidate and see how it goes.
> The "all in" nature of __array_function__ without
> __skip_array_function__ will both limit its use to cases where it is
> strongly motivated, and also limits the API implications for NumPy.
> There is still plenty of room for expanding the protocol, but it's
> really hard to see what is necessary (and prudent!) without actual
> use.

Agreed that we should turn it on for the 1.17 RC, and see if there are any
complaints.
> [1] e.g., see 
> https://github.com/google/jax/blob/62473351643cecb6c248a50601af163646ba7be6/jax/numpy/lax_numpy.py#L2440-L2459
> [2] https://github.com/numpy/numpy/pull/13305
> 
> 
> 
> 
> 
> On Tue, May 21, 2019 at 11:44 PM Juan Nunez-Iglesias <
> jni.s...@gmail.com> wrote:
> > I just want to express my general support for Marten's concerns. As
> > an "interested observer", I've been meaning to give
> > `__array_function__` a try but haven't had the chance yet. So from
> > my anecdotal experience I expect that more people need to play with
> > this before setting the API in stone.
> > 
> > At scikit-image we place a very strong emphasis on code simplicity
> > and readability, so I also share Marten's concerns about code
> > getting too complex. My impression reading the NEP was "whoa, this
> > is hard, I'm glad smarter people than me are working on this, I'm
> > sure it'll get simpler in time". But I haven't seen the simplicity
> > materialise...
> > 
> > On Wed, 22 May 2019, at 11:31 AM, Marten van Kerkwijk wrote:
> > > Hi All,
> > > 
> > > For 1.17, there has been a big effort, especially by Stephan, to
> > > make __array_function__ sufficiently usable that it can be
> > > exposed. I think this is great, and still like the idea very
> > > much, but its impact on the numpy code base has gotten so big in
> > > the most recent PR (gh-13585) that I wonder if we shouldn't
> > > reconsider the approach, and at least for 1.17 stick with the
> > > status quo. Since that seems to be a bigger question than can be
> > > usefully addressed in the PR, I thought I would raise it here.
> > > 
> > > Specifically, now not only does every numpy function have its
> > > dispatcher function, but also internally all numpy function calls
> > > are being done via the new `__skip_array_function__` attribute,
> > > to avoid further overrides. I think both changes make the code
> > > significantly less readable, thus, e.g., making it even harder
> > > than it is already to attract new contributors.
> > > 
> > > I think with this it is probably time to step back and check
> > > whether the implementation is in fact the right one. For
> > > instance, among the alternatives we originally considered was one
> > > that had the overridable versions of functions in the regular
> > > `numpy` namespace, and the ones that would not themselves check
> > > in a different one. Alternatively, for some of the benefits
> > > provided by `__skip_array_function__`, there was a different
> > > suggestion to have a special return value, of
> > > `NotImplementedButCoercible`. Might these be better after all?
> > > 
> > > More generally, I think we're suffering from the fact that
> > > several of us seem to have rather different final goals in mind.
> > > In particular, I'd like to move to a state where as much of the
> > > code as possible makes use of the simplest possible
> > > implementation, with only a few true base functions, so that all
> > > but those simplest functions will generally work on any type of
> > > array. Others, however, worry much more about making
> > > implementations (even more) part of the API.
> > > 
> > > All the best,
> > > 
> > > Marten
> > > 
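The dispatch-plus-`__wrapped__` mechanism under discussion can be illustrated with a pure-Python sketch. This is a simplified model, not NumPy's actual implementation: the names `concat`, `_concat_dispatcher` and `Tagged` are invented, and the real protocol also passes a `types` argument to `__array_function__`.

```python
import functools

def array_function_dispatch(dispatcher):
    """Simplified sketch of an overridable-function decorator."""
    def decorator(implementation):
        @functools.wraps(implementation)
        def public_api(*args, **kwargs):
            # Ask each relevant argument whether it wants to override.
            for arg in dispatcher(*args, **kwargs):
                handler = getattr(type(arg), "__array_function__", None)
                if handler is not None:
                    result = handler(arg, public_api, args, kwargs)
                    if result is not NotImplemented:
                        return result
            # No override found: run the plain implementation.
            return implementation(*args, **kwargs)
        # functools.wraps already stored the implementation on __wrapped__,
        # so internal callers can skip dispatch via public_api.__wrapped__.
        return public_api
    return decorator

def _concat_dispatcher(seqs):
    # The dispatcher only names which arguments may override.
    return seqs

@array_function_dispatch(_concat_dispatcher)
def concat(seqs):
    """Plain-list stand-in for an overridable NumPy function."""
    out = []
    for s in seqs:
        out.extend(s)
    return out

class Tagged:
    """A duck type that overrides concat via the protocol."""
    def __init__(self, data):
        self.data = data
    def __array_function__(self, func, args, kwargs):
        return "handled by Tagged"
```

Internal code can then call `concat.__wrapped__(...)` to bypass dispatch entirely, which is the performance escape hatch being debated in this thread.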

[Numpy-discussion] defining a NumPy API standard?

2019-06-01 Thread Hameer Abbasi
I think this hits the crux of the issue... There is a huge coordination 
problem. Users want to move their code from NumPy to Sparse or Dask all the 
time, but it’s not trivial to do. And libraries like sparse and Dask wanted to 
follow a standard (or at least hoped there was one) when they were created.

Maybe I think the issue is bigger than it really is, but there’s definitely a 
coordination problem.

See the section in the original email on “who cares and why”...

Best Regards,
Hameer Abbasi

> On Saturday, Jun 01, 2019 at 11:32 AM, Nathaniel Smith  (mailto:n...@pobox.com)> wrote:
> [snip]
>
> That's not a problem at all for us, because numpy
> already exists.
>
> [snip]


Re: [Numpy-discussion] defining a NumPy API standard?

2019-06-02 Thread Hameer Abbasi
Re: Successful specifications (I’ll avoid using the word standard):

Moving: HTML5/CSS3, C++, Rust, Python, Java.

Static: C

I’d really like this to be a moving spec... A static one is never much use, and 
is doomed to miss use cases, either today or some from the future.

Best Regards,
Hameer Abbasi

> On Sunday, Jun 02, 2019 at 9:46 AM, Nathaniel Smith  (mailto:n...@pobox.com)> wrote:
> On Sat, Jun 1, 2019 at 11:59 PM Ralf Gommers  wrote:
> > On Sun, Jun 2, 2019 at 12:35 AM Nathaniel Smith  wrote:
> > >
> > > On Sat, Jun 1, 2019 at 1:05 PM Ralf Gommers  
> > > wrote:
> > > > I think this is potentially useful, but *far* more prescriptive and 
> > > > detailed than I had in mind. Both you and Nathaniel seem to have not 
> > > > understood what I mean by "out of scope", so I think that's my fault in 
> > > > not being explicit enough. I *do not* want to prescribe behavior. 
> > > > Instead, a simple yes/no for each function in numpy and method on 
> > > > ndarray.
> > >
> > > So yes/no are the answers. But what's the question?
> > >
> > > "If we were redesigning numpy in a fantasy world without external
> > > constraints or compatibility issues, would we include this function?"
> > > "Is this function well designed?"
> > > "Do we think that supporting this function is necessary to achieve
> > > practical duck-array compatibility?"
> > > "If someone implements this function, should we give them a 'numpy
> > > core compliant!' logo to put on their website?"
> > > "Do we recommend that people use this function in new code?"
> > > "If we were trying to design a minimal set of primitives and implement
> > > the rest of numpy in terms of them, then is this function a good
> > > candidate for a primitive?"
> > >
> > > These are all really different things, and useful for solving
> > > different problems... I feel like you might be lumping them together
> > > some?
> >
> >
> > No, I feel like you just want to see a real proposal. At this point I've 
> > gotten some really useful feedback, in particular from Marten (thanks!), 
> > and I have a better idea of what to do. So I'll answer a few of your 
> > questions, and propose to leave the rest till I actually have some more 
> > solid to discuss. That will likely answer many of your questions.
>
> Okay, that's fine. You scared me a bit with the initial email, but I
> really am trying to be helpful :-). I'm not looking for a detailed
> proposal; I'm just super confused right now about what you're trying
> to accomplish or how this table of yes/no values will help do it. I
> look forward to hearing more!
>
> > > I'm seeing this as a living document (a NEP?)
> >
> > NEP would work. Although I'd prefer a way to be able to reference some 
> > fixed version of it rather than it being always in flux.
>
> When I say "living" I mean: it would be seen as documenting our
> consensus and necessarily fuzzy rather than normative and precise like
> most NEPs. Maybe this is obvious and not worth mentioning. But I
> wouldn't expect it to change rapidly. Unless our collective opinions
> change rapidly I guess, but that seems unlikely.
>
> (And of course NEPs are in git so we always have the ability to link
> to a point-in-time snapshot if we need to reference something.)
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org


Re: [Numpy-discussion] defining a NumPy API standard?

2019-06-02 Thread Hameer Abbasi
I would agree that the set should be minimal at first, but would comment that 
we should still have a better taxonomy of functions that should be supported, 
in terms of the functionality they provide and functionality that is required 
for them to work. E.g. __setitem__ requires mutability.

Best Regards,
Hameer Abbasi

> On Sunday, Jun 02, 2019 at 10:16 PM, Stephan Hoyer  (mailto:sho...@gmail.com)> wrote:
> On Sun, Jun 2, 2019 at 1:08 PM Marten van Kerkwijk  (mailto:m.h.vankerkw...@gmail.com)> wrote:
> >
> >
> > On Sun, Jun 2, 2019 at 2:21 PM Eric Wieser  > (mailto:wieser.eric%2bnu...@gmail.com)> wrote:
> > > Some of your categories here sound like they might be suitable for ABCs 
> > > that provide mixin methods, which is something I think Hameer suggested 
> > > in the past. Perhaps it's worth re-exploring that avenue.
> > >
> > > Eric
> > >
> >
> > Indeed, and of course for __array_ufunc__ we moved there a bit already, 
> > with `NDArrayOperatorsMixin` [1].
> > One could certainly similarly have NDShapingMixin that, e.g., relied on 
> > `shape`, `reshape`, and `transpose` to implement `ravel`, `swapaxes`, etc. 
> > And indeed use those mixins in `ndarray` itself.
> >
> > For this also having a summary of base functions/methods would be very 
> > helpful.
> > -- Marten
>
>
> I would definitely support writing more mixins and helper functions (either 
> in NumPy, or externally) to make it easier to re-implement NumPy's public 
> API. Certainly there is plenty of room to make it easier to leverage 
> __array_ufunc__ and __array_function__.
>
> For some recent examples of what these helpers functions could look like, see 
> JAX's implementation of NumPy, which is written in terms of a much smaller 
> array library called LAX:
> https://github.com/google/jax/blob/9dfe27880517d5583048e7a3384b504681968fb4/jax/numpy/lax_numpy.py
>
> Hypothetically, JAX could be written on top of a "restricted NumPy" instead, 
> which in turn could have an implementation written in LAX. This would 
> facilitate reusing JAX's higher level functions for automatic differentiation 
> and vectorization on top of different array backends.
>
> I would also be happy to see guidance for NumPy API re-implementers, both for 
> those scratching from scratch (e.g., in a new language) or who plan to copy 
> NumPy's Python API (e.g., with __array_function__).
>
> I would focus on:
> 1. Describing the tradeoffs of challenging design decisions that NumPy may 
> have gotten wrong, e.g., scalars and indexing.
> 2. Describing common "gotchas" where it's easy to deviate from NumPy's 
> semantics unintentionally, e.g., with scalar arithmetic dtypes or indexing 
> edge cases.
>
> I would *not* try to identify a "core" list of methods/functionality to 
> implement. Everyone uses their own slice of NumPy's API, so the rational 
> approach for anyone trying to reimplement exactly (i.e., with 
> __array_function__) is to start with a minimal subset and add functionality 
> on demand to meet user's needs. Also, many of the choices involved in making 
> an array library don't really have objectively right or wrong answers, and 
> authors are going to make intentional deviations from NumPy's semantics when 
> it makes sense for them.
>
> Cheers,
> Stephan
>


Re: [Numpy-discussion] Extent to which to work around matrix and other duck/subclass limitations

2019-06-10 Thread Hameer Abbasi
On Mon, 2019-06-10 at 13:47 -0400, Marten van Kerkwijk wrote:
> Hi All,
> 
> In https://github.com/numpy/numpy/pull/12801, Tyler has been trying
> to use the new `where` argument for reductions to implement `nansum`,
> etc., using simplifications that boil down to `np.sum(...,
> where=~isnan(...))`.
> 
> A problem that occurs is that `np.sum` will use a `.sum` method if
> that is present, and for matrix, the `.sum` method does not take a
> `where` argument. Since the `where` argument has been introduced only
> recently, similar problems may happen for array mimics that implement
> their own `.sum` method.

Hi Marten! I ran into a similar issue with the initial kwarg when I
implemented it, except at that time I just used np._NoValue.
> The question now is what to do, with options being:
> 1. Let's stick with the existing implementation; the speed-up is not
> that great anyway.
> 2. Use try/except around the new implementation and use the old one
> if it fails.
> 3. As (2), but emit a deprecation warning. This will help array
> mimics, but not matrix (unless someone makes a PR; would we even
> accept it?);
> 4. Use the new implementation. `matrix` should be gone anyway and
> array mimics can either update their `.sum()` method or override
> `np.nansum` with `__array_function__`.
> 
> Personally, I'd prefer (4), but realize that (3) is probably the
> safer approach, even if it is really annoying to still be taking into
> account matrix deficiencies.

If nansum does any other kind of dispatch, that should be kept around in
any case. Otherwise, it failed then and it would fail now. We can catch
and raise the right type of exception for backwards compatibility if
needed.
> All the best,
> 
> Marten
> 
> p.s. One could also avoid the `.sum()` override altogether by doing
> `np.add.reduce(..., where=...)`, but this would probably break just
> as much.
> 
> 
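The `where`-based simplification Marten describes can be sketched as below. This is an illustration of the idea, not the code from the PR, and it assumes a NumPy recent enough to support the `where` argument to reductions.

```python
import numpy as np

def nansum_sketch(a, axis=None):
    """Sum `a`, treating NaNs as zero by masking them out of the reduction."""
    a = np.asarray(a)
    if a.dtype.kind not in "fc":
        # Integer and boolean arrays cannot contain NaN; a plain sum suffices.
        return np.sum(a, axis=axis)
    return np.sum(a, axis=axis, where=~np.isnan(a))

# nansum_sketch(np.array([1.0, np.nan, 2.0])) -> 3.0
```

The matrix problem arises precisely here: `np.sum` delegates to a `.sum` method when one exists, and a `.sum` signature without `where` (as on `matrix`, or on older array mimics) makes this call fail.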


Re: [Numpy-discussion] (Value Based Promotion) Current Behaviour

2019-06-12 Thread Hameer Abbasi
Hi Sebastian,

One way to avoid an ugly lookup table and special cases is to store the number 
of sign bits, the number of integer/mantissa bits and the number of exponent 
bits for each numeric type. A safe cast can only happen if all three are 
exceeded or equal. Just a thought.
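As a sketch of that rule, assuming the standard IEEE-754 and two's-complement bit counts (the table and function below are illustrative, not an actual NumPy API):

```python
# (sign bits, integer/mantissa bits, exponent bits) per numeric type.
TYPE_BITS = {
    "uint8":   (0, 8, 0),
    "int8":    (1, 7, 0),
    "int16":   (1, 15, 0),
    "int32":   (1, 31, 0),
    "float16": (1, 10, 5),
    "float32": (1, 23, 8),
    "float64": (1, 52, 11),
}

def can_cast_safely(src, dst):
    """Safe only if the target meets or exceeds all three counts."""
    return all(a <= b for a, b in zip(TYPE_BITS[src], TYPE_BITS[dst]))
```

Under this rule int16 -> float32 is safe (15 <= 23 mantissa bits) while int16 -> float16 is not (15 > 10), and uint8 fits in int16 but not in int8.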

Best Regards,
Hameer Abbasi

> On Wednesday, Jun 12, 2019 at 9:50 PM, Sebastian Berg 
> mailto:sebast...@sipsolutions.net)> wrote:
> On Wed, 2019-06-12 at 12:03 -0500, Sebastian Berg wrote:
> > On Tue, 2019-06-11 at 22:08 -0400, Marten van Kerkwijk wrote:
> > > HI Sebastian,
> > >
> > > Thanks for the overview! In the value-based casting, what perhaps
> > > surprises me most is that it is done within a kind; it would seem
> > > an
> > > improvement to check whether a given integer scalar is exactly
> > > representable in a given float (your example of 1024 in `float16`).
> > > If we switch to the python-only scalar values idea, I would suggest
> > > to abandon this. That might make dealing with things like `Decimal`
> > > or `Fraction` easier as well.
> > >
> >
> > Yeah, one can argue that since we have this "safe casting" based
> > approach, we should go all the way for the value based logic. I think
> > I
> > tend to agree, but I am not quite sure right now to be honest.
>
> Just realized, one issue with this is that you get much more "special
> cases" if you think of it in terms of "minimal dtype". Because
> suddenly, not just the unsigned/signed integers such as "< 128" are
> special, but even more values require special handling. An int16
> "minimal dtype" may or may not be castable to float16.
>
> For `can_cast` that does not matter much, but if we use the same logic
> for promotion things may get uglier. Although, maybe it just gets
> uglier implementation-wise and is fairly logical on the user side...
>
> - Sebastian
>
>
> >
> > Fractions and Decimals are very interesting in that they raise the
> > question what happens to user dtypes [0]. Although, you would still
> > need a "no lower category" rule, since you do not want 1024. or 12/3
> > to be
> > demoted to an integer.
> >
> > For me right now, what is most interesting is what we should do with
> > ufunc calls, and if we can simplify them. I feel right now we have two
> > types of ufuncs:
> >
> > 1. Ufuncs which use a "common type", where we can find the minimal
> > type
> > before dispatching.
> >
> > 2. More complex ufuncs, for which finding the minimal type is
> > trickier
> > [1]. And while I could not find any weird enough ufunc, I am not sure
> > that blind promotion is a good idea for general ufuncs.
> >
> > Best,
> >
> > Sebastian
> >
> >
> > [0] A python fraction could be converted to int64/int64 or
> > int32/int32,
> > etc. depending on the value, in principle. If we want such things to
> > work in principle, we need machinery (although I expect one could tag
> > that on later).
> > [1] It is not impossible, but we need to insert non-existing types
> > into
> > the type hierarchy.
> >
> >
> >
> > PS: Another interesting issue is that if we try to move away from
> > value
> > based casting for numpy scalars, that initial `np.asarray(...)` call
> > may lose the information that a python integer was passed in. So to
> > support such things, we might need a whole new machinery.
> >
> >
> >
> >
> > > All the best,
> > >
> > > Marten
> > >
> > > On Tue, Jun 11, 2019 at 8:46 PM Sebastian Berg <
> > > sebast...@sipsolutions.net> wrote:
> > > > Hi all,
> > > >
> > > > strange, something went wrong sending that email, but in any
> > > > case...
> > > >
> > > > I tried to "summarize" the current behaviour of promotion and
> > > > value
> > > > based promotion in numpy (correcting a small error in what I
> > > > wrote
> > > > earlier). Since it got a bit long, you can find it here (also
> > > > copy
> > > > pasted at the end):
> > > >
> > > > https://hackmd.io/NF7Jz3ngRVCIQLU6IZrufA
> > > >
> > > > Allan's document which I link in there is also very interesting.
> > > > One
> > > > thing I had not really thought about before was the problem of
> > > > commutativity.
> > > >
> > > > I do not have any specific points I want to 

Re: [Numpy-discussion] Moving forward with value based casting

2019-06-17 Thread Hameer Abbasi
On Wed, 2019-06-12 at 12:55 -0500, Sebastian Berg wrote:
> On Wed, 2019-06-05 at 15:41 -0500, Sebastian Berg wrote:
> > Hi all,
> > 
> > TL;DR:
> > 
> > Value based promotion seems complex both for users and ufunc-
> > dispatching/promotion logic. Is there any way we can move forward
> > here,
> > and if we do, could we just risk some possible (maybe not-existing)
> > corner cases to break early to get on the way?
> > 
> 
> Hi all,
> 
> just to note. I think I will go forward trying to fill the hole in
> the
> hierarchy with a non-existing uint7 dtype. That seemed like it may be
> ugly, but if it does not escalate too much, it is probably fairly
> straight forward. And it would allow to simplify dispatching without
> any logic change at all. After that we could still decide to change
> the
> logic.

Hi Sebastian!

This seems like the right approach to me as well, I would just add one
additional comment. Earlier on, you mentioned that a lot of "strange"
dtypes will pop up when dealing with floats/ints. E.g. int15, int31,
int63, int52 (for checking double-compat), int23 (single compat), int10
(half compat) and so on and so forth. The lookup table would get tricky
to populate by hand; it might be worth it to use the logic I
suggested to autogenerate it in some way, or to "determine" the
temporary underspecified type, as Nathaniel proposed in his email to
the list. That is, we store the number of:

* flag (0 for numeric, 1 for non-numeric)
* sign bits (0 for unsigned ints, 1 else)
* integer/fraction bits (self-explanatory)
* exponent bits (self-explanatory)
* Log-number of items (0 for real, 1 for complex, 2 for quaternion,
etc.) (I propose log because the Cayley-Dickson algebras [1] require a
power of two)

A type is safely castable to another if all of these numbers are
exceeded or met.
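A sketch of this five-number schema follows; the registry entries, including the "bfloat16" registration, are illustrative assumptions rather than an existing NumPy facility.

```python
# (flag, sign bits, integer/fraction bits, exponent bits, log2 item count).
TYPE_SCHEMA = {
    "uint64":     (0, 0, 64, 0, 0),
    "int64":      (0, 1, 63, 0, 0),
    "float64":    (0, 1, 52, 11, 0),
    "complex128": (0, 1, 52, 11, 1),
    "bfloat16":   (0, 1, 7, 8, 0),  # hypothetical user-registered type
}

def can_cast(src, dst):
    """A cast is safe when every number is met or exceeded."""
    return all(a <= b for a, b in zip(TYPE_SCHEMA[src], TYPE_SCHEMA[dst]))
```

This also makes the int64 + uint64 dilemma explicit: neither type safely holds the other (a sign bit versus a 64th magnitude bit), and uint64 does not fit float64's 52-bit mantissa either, so every candidate result type loses something.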

This would give us a clean way for registering new numeric types, while
also cleanly hooking into the type system, and solving the casting
scenario. Of course, I'm not proposing we generate the loops for or
provide all these types ourselves, but simply that we allow people to
define dtypes using such a schema. I do worry that we're special-casing 
numbers here, but it is "Num"Py, so I'm also not too worried.

This flexibility would, for example, allow us to easily define a
bfloat16/bcomplex32 type with all the "can_cast" logic in place, even
if people have to register their own casts or loops (and just to be
clear, we error if they are not). It also makes it easy to define loops
for int128 and so on if they come along.

The only open question left here is: What to do with a case like int64
+ uint64. And what I propose is we abandon purity for pragmatism here
and tell ourselves that losing one sign bit is tolerable 90% of the
time, and going to floating-point is probably worse. It's more of a
range-versus-accuracy question, and I would argue that people using
integers expect exactness. Of course, I doubt anyone is actually
relying on the fact that adding two integers produces floating-point
results, and it has been the cause of at least one bug, which
highlights that integers can be used in places where floats cannot. [0]

Hameer Abbasi

[0] https://github.com/numpy/numpy/issues/9982
[1] https://en.wikipedia.org/wiki/Cayley%E2%80%93Dickson_construction

> 
> Best,
> 
> Sebastian
> 
> 
> > ---
> > 
> > Currently when you write code such as:
> > 
> > arr = np.array([1, 43, 23], dtype=np.uint16)
> > res = arr + 1
> > 
> > Numpy uses fairly sophisticated logic to decide that `1` can be
> > represented as a uint16, and thus for all unary functions (and most
> > others as well), the output will have a `res.dtype` of uint16.
> > 
> > Similar logic also exists for floating point types, where a lower
> > precision floating point can be used:
> > 
> > arr = np.array([1, 43, 23], dtype=np.float32)
> > (arr + np.float64(2.)).dtype  # will be float32
> > 
> > Currently, this value based logic is enforced by checking whether
> > the
> > cast is possible: "4" can be cast to int8, uint8. So the first call
> > above will at some point check if "uint16 + uint16 -> uint16" is a
> > valid operation, find that it is, and thus stop searching. (There
> > is
> > the additional logic, that when both/all operands are scalars, it
> > is
> > not applied).
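The "minimal dtype" lookup described above can be sketched in plain Python. The dtype names and thresholds are illustrative, not NumPy's actual implementation (NumPy, for instance, treats a small nonnegative value as fitting both signed and unsigned kinds, whereas this sketch returns a single name):

```python
# Magnitude bits available in each candidate integer dtype.
_UNSIGNED = [("uint8", 8), ("uint16", 16), ("uint32", 32), ("uint64", 64)]
_SIGNED = [("int8", 7), ("int16", 15), ("int32", 31), ("int64", 63)]

def minimal_int_dtype(value):
    """Name of the smallest integer dtype whose range contains `value`."""
    if value >= 0:
        for name, bits in _UNSIGNED:
            if value < 2 ** bits:
                return name
    else:
        for name, bits in _SIGNED:
            if value >= -(2 ** bits):
                return name
    raise OverflowError(f"{value} fits no supported integer dtype")
```

So `1` participates in promotion as a uint8-sized value, which is why `uint16_array + 1` stays uint16 in the example above.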
> > 
> > Note that while it is defined in terms of casting "1" to uint8
> > safely
> > being possible even though 1 may be typed as int64. This logic thus
> > affects all promotion rules as well (i.e. what should the output
> > dtype
> > be).
> > 
> > 
> > There 2 main disc
