[Numpy-discussion] Re: Add to NumPy a function to compute cumulative sums from 0.

2023-08-19 Thread Ilhan Polat
Note that this is independent of the memory waste. There are far worse
memory ops in NumPy than this one, so I don't think that argument applies
here even if it did hold.

And like I mentioned, this is a very common operation, hence internals are
secondary. But it is not an unnecessary copy of the array anyway, because a
new array is the very definition of concatenation. And it is, relatively
speaking, laborious to do in NumPy. If it were really easy, people would
probably just slap a 0 at the beginning and move on.

But instead we are now entering into a keyword commitment. I'm not sure I
agree that this strategy is better. I'm not against it, clearly there is
demand, but inconvenience alone should probably not become the reason for
keyword arguments elsewhere.
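
For reference, the workaround under discussion is a one-liner but allocates
a new array (a minimal sketch; the exact name of any proposed keyword was
still being debated at this point):

import numpy as np

arr = np.array([1, 2, 3])

# Today: prepend the 0 by hand, which concatenates into a new array
np.concatenate(([0], np.cumsum(arr)))   # -> array([0, 1, 3, 6])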



On Fri, Aug 18, 2023 at 9:13 AM Ronald van Elburg <
r.a.j.van.elb...@hetnet.nl> wrote:

> Ilhan Polat wrote:
>
> > I think all these point to the missing convenient functionality that
> > extends arrays. In matlab "[0 arr 10]" nicely extends the array to a new
> > one but in NumPy you need to punch quite some code and some courage to
> > remember whether it is hstack or vstack or concat or block as the correct
> > naming which decreases the "code morale".
>
> Not having a convenient workaround is not the only problem. The workaround
> is wasteful with memory and involves unnecessary copying of an array.
> Having a keyword implemented with these concerns in mind might avoid this.


[Numpy-discussion] Re: welcome Andrew Nelson to the NumPy maintainers team

2023-08-22 Thread Ilhan Polat
Congratulations Andrew!

On Tue, Aug 22, 2023 at 9:44 PM Daniela Cialfi 
wrote:

>
> Welcome on board
>
> Daniela
>
>
> On Tue, 22 Aug 2023 at 16:06, Charles R Harris 
> wrote:
>
>>
>>
>> On Mon, Aug 21, 2023 at 10:09 PM Andrew Nelson 
>> wrote:
>>
>>> On Mon, 21 Aug 2023 at 18:39, Ralf Gommers 
>>> wrote:
>>>
 Hi all,

 On behalf of the steering council, I am very happy to announce that
 Andrew is joining the Maintainers team. Andrew has been contributing to our
 CI setup in particular for the past year, and has contributed for example
 the Cirrus CI setup and the musllinux builds:
 https://github.com/numpy/numpy/pulls/andyfaff.

 Welcome Andrew, I'm looking forward to working with you more!

>>>
>>> Thanks Ralf, it's good to join the team.
>>>
>>
>> Welcome aboard.
>>
>> Chuck


[Numpy-discussion] Re: Function that searches arrays for the first element that satisfies a condition

2023-10-26 Thread Ilhan Polat
It's typically called short-circuiting or quick exit when the target
condition is met.

If you have an array a = np.array([-1, 2, 3, 4, ..., 1]) and you are
looking for a true/false answer to whether anything is negative, (a <
0).any() will generate a bool array of the same size and then check all
entries of that bool array just to reach the conclusion True, which was
already decided at the first entry. Instead of one unit of time, it spends
as many units as there are entries.

We did similar things on the SciPy side at the Cython level, but they are
not really competitive; they are more of a convenience. The more general
discussion I opened is at https://github.com/data-apis/array-api/issues/675
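
For illustration, a pure-Python sketch of the short-circuiting semantics
(the hypothetical first_true mirrors the proposal quoted below; a real
version would live at the C level):

import numpy as np

def first_true(arr, cond, start_idx=0):
    # Bail out at the first hit instead of evaluating cond over the
    # whole array; return -1 as the "no match" signal value.
    for i in range(start_idx, arr.shape[0]):
        if cond(arr[i]):
            return i
    return -1

a = np.array([3, 1, -4, 1, 5])
first_true(a, lambda x: x < 0)   # -> 2, after inspecting only 3 elements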





On Thu, Oct 26, 2023 at 2:52 PM Dom Grigonis  wrote:

> Could you please give a concise example? I know you have provided one, but
> it is ingrained deep in verbose text and has some typos in it, which makes
> it hard to understand exactly what inputs should result in what output.
>
> Regards,
> DG
>
> > On 25 Oct 2023, at 22:59, rosko37  wrote:
> >
> > I know this question has been asked before, both on this list as well as
> several threads on Stack Overflow, etc. It's a common issue. I'm NOT asking
> for how to do this using existing Numpy functions (as that information can
> be found in any of those sources)--what I'm asking is whether Numpy would
> accept inclusion of a function that does this, or whether (possibly more
> likely) such a proposal has already been considered and rejected for some
> reason.
> >
> > The task is this--there's a large array and you want to find the next
> element after some index that satisfies some condition. Such elements are
> common, and the typical number of elements to be searched through is small
> relative to the size of the array. Therefore, it would greatly improve
> performance to avoid testing ALL elements against the conditional once one
> is found that returns True. However, all built-in functions that I know of
> test the entire array.
> >
> > One can obviously jury-rig some ways, like for instance create a "for"
> loop over non-overlapping slices of length slice_length and call something
> like np.where(cond) on each--that outer "for" loop is much faster than a
> loop over individual elements, and the inner loop at most will go
> slice_length-1 elements past the first "hit". However, needing to use such
> a convoluted piece of code for such a simple task seems to go against the
> Numpy spirit of having one operation be one function of the form
> func(arr).
> >
> > A proposed function for this, let's call it "np.first_true(arr,
> start_idx, [stop_idx])" would be best implemented at the C code level,
> possibly in the same code file that defines np.where. I'm wondering if I,
> or someone else, were to write such a function, if the Numpy developers
> would consider merging it as a standard part of the codebase. It's possible
> that the idea of such a function is bad because it would violate some
> existing broadcasting or fancy indexing rules. Clearly one could make it
> possible to pass an "axis" argument to np.first_true() that would select an
> axis to search over in the case of multi-dimensional arrays, and then the
> result would be an array of indices of one fewer dimension than the
> original array. So np.first_true(np.array([[1,5],[2,7],[9,10]]), cond) would
> return [1, 1, 0] for cond(x): x > 4. The case where no elements satisfy the
> condition would need to return a "signal value" like -1. But maybe there
> are some weird cases where there isn't a sensible return value, hence why
> such a function has not been added.
> >
> > -Andrew Rosko


[Numpy-discussion] Re: New matvec and vecmat functions

2024-01-23 Thread Ilhan Polat
For the dot product I can convince myself this is a math-definition thing
and accept the conjugation. But for "vecmat", why the complex conjugate of
the vector? Are we assuming that 1D things are always columns? I am also a
bit lost on the difference between dot, vdot and vecdot.

Also if __matmul__ and np.matmul give different results, I think you will
enjoy many fun tickets. Personally I would agree with them no matter what
the reasoning was at the time of divergence.
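
To make the question concrete (np.vecdot exists as of NumPy 2.0; the einsum
line merely spells out the proposed x†A definition of vecmat and is not the
actual implementation):

import numpy as np

x = np.array([1 + 2j, 3 - 1j])
A = (np.arange(4.0) + 0j).reshape(2, 2)

np.einsum('i,ij->j', x.conj(), A)  # proposed vecmat: the vector is conjugated
x @ A                              # matmul with a 1-D left operand: no conjugation
np.vecdot(x, x)                    # x†x, i.e. np.sum(x.conj() * x)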

On Tue, Jan 23, 2024 at 11:17 PM Marten van Kerkwijk 
wrote:

> Hi All,
>
> I have a PR [1] that adds `np.matvec` and `np.vecmat` gufuncs for
> matrix-vector and vector-matrix calculations, to add to plain
> matrix-matrix multiplication with `np.matmul` and the inner vector
> product with `np.vecdot`.  They call BLAS where possible for speed.
> I'd like to hear whether these are good additions.
>
> I also note that for complex numbers, `vecmat` is defined as `x†A`,
> i.e., the complex conjugate of the vector is taken. This seems to be the
> standard and is what we used for `vecdot` too (`x†x`). However, it is
> *not* what `matmul` does for vector-matrix or indeed vector-vector
> products (remember that those are possible only if the vector is
> one-dimensional, i.e., not with a stack of vectors). I think this is a
> bug in matmul, which I'm happy to fix. But I'm posting here in part to
> get feedback on that.
>
> Thanks!
>
> Marten
>
> [1] https://github.com/numpy/numpy/pull/25675
>
> p.s. Separately, with these functions available, in principle these
> could be used in `__matmul__` (and thus for `@`) and the specializations
> in `np.matmul` removed. But that can be a separate PR (if it is wanted
> at all).


[Numpy-discussion] Re: numpy 2.0.x has been branched.

2024-03-09 Thread Ilhan Polat
Maybe someone could compose another preemptive blog post about this? I
think this needs some banging on the pans to soften its landing, since
NumPy 2.0 is a big deal. The community will take care of its circulation in
the typical places. We always get the grilling no matter the outcome :)

But I think it is important that at least package maintainers see this and
react to it.

On Sat, Mar 9, 2024 at 11:12 AM Ralf Gommers  wrote:

>
>
> On Sat, Mar 9, 2024 at 2:03 AM Oscar Benjamin 
> wrote:
>
>> On Sat, 9 Mar 2024 at 00:44, Charles R Harris 
>> wrote:
>> >
>> > About a month from now.
>>
>> What will happen about a month from now? It might seem obvious to you
>> but I can interpret this in different ways.
>>
>> To be clear numpy 2.0 is expected to be released in full to the public
>> in about one month's time from today?
>>
>
> Let me give the optimistic and pessimistic timelines. Optimistic:
>
> - 2.0.0b1 later today
> - 2.0.0rc1 (ABI stable) in 7-10 days
> - 2.0.0 final release in 1 month
>
> Pessimistic:
>
> - 2.0.0b1 within a few days
> - 2.0.0rc1 (ABI stable) in 2 weeks
> - 2.0.0rc2 in 4 weeks
> - 2.0.0rc3 in 6 weeks
> - 2.0.0 final release in 8 weeks
>
> For projects which have nontrivial usage of the NumPy API (and especially
> if they also use the C API), I'd recommend:
> 1. Check whether things work with 2.0.0b1, ideally asap, so that if there is
> anything we missed we can catch it before rc1. Perhaps do a pre-release of
> your own package.
> 2. Do a final release after 2.0.0rc1 - ideally as soon as possible after,
> and definitely before the final 2.0.0 release.
>
> For (2), note that there are a ton of packages that do not have correct
> upper bounds, so if you haven't done your own new release that is
> compatible with both 2.0.0 and 1.26.x *before* 2.0.0 comes out, the users
> of your project are likely to have a hard time.
>
> Cheers,
> Ralf
>


[Numpy-discussion] Re: Please consider dropping Python 3.9 support for Numpy 2.0

2024-05-07 Thread Ilhan Polat
I guess this is also a mandatory read after Henry's blog post appeared; we
had an extensive discussion with the Python devs there:
https://discuss.python.org/t/requires-python-upper-limits/12663
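
For context, the kind of dependency pins being debated look like this
(requirements-file style; the version numbers are illustrative only):

numpy>=1.22,<2.0   # capped: protects against API/ABI breaks, but a wrong cap
                   # cannot be fixed on PyPI (metadata is immutable) and can
                   # make otherwise-solvable installs unsolvable
numpy>=1.22        # uncapped: always installable, but may break at runtime
                   # when an incompatible major release lands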

On Tue, May 7, 2024 at 11:35 AM Sebastian Berg 
wrote:

> On Tue, 2024-05-07 at 15:46 +1000, Juan Nunez-Iglesias wrote:
> > On Tue, 7 May 2024, at 7:04 AM, Ralf Gommers wrote:
> > > This problem could have been avoided by proper use of upper bounds.
> > > Scikit-image 0.22 not including a `numpy<2.0` upper bound is a bug
> > > in scikit-image (definitely for ABI reasons, and arguably also for
> > > API reasons). It would really be useful if downstream packages
> > > started to take adding upper bounds correctly more seriously. E.g.,
> > > SciPy has always done it right, so the failure mode that this
> > > thread is about doesn't exist for SciPy. That said, this ship has
> > > sailed for 2.0 - most packages don't have upper bounds in some or
> > > all of their recent releases.
> >
> > I don't think this is a downstream problem, I think this is a "PyPI
> > has immutable metadata" problem. I'm a big fan of Henry Schreiner's
> > "Should You Use Upper Bound Version Constraints" <
> > https://iscinumpy.dev/post/bound-version-constraints/>, where he
> > argues convincingly that the answer is almost always no. This
> > highlighted bit contains the gist:
>
>
>
> Yes, that is all because of `pip` limitations, but those limitations
> are a given. And I think it is unfortunate/odd that it effectively
> argues that the lower in the stack you are, the fewer versions you
> should support.
>
> But, with the clarification, it seems there may be a lot of packages
> that never support both Python 3.9 and NumPy 2.
> That means not publishing for 3.9 may end up helping quite a lot of
> users who would otherwise have to downgrade NumPy explicitly.
>
> If that seems the case, that is an unfortunate, but good, argument for
> dropping 3.9.
>
> I don't have an idea for how many users we'll effectively help, or if
> we do the opposite because an application (more than library) wants to
> just use NumPy 2 always but still support Python 3.9.
> But it seems to me that is what the decision comes down to, and I can
> believe that it'll be a lot of hassle saved for `pip` installing users.
> (Note that skimage users will hit cython, so should get a relatively
> clear printout that includes a "please downgrade NumPy" suggestion.)
>
> - Sebastian
>
>
>
> >
> > > A library that requires a manual version intervention is not
> > > broken, it’s just irritating. A library that can’t be installed due
> > > to a version conflict is broken. If that version conflict is fake,
> > > then you’ve created an unsolvable problem where one didn’t exist.
> >
> > Dropping Py 3.9 will fix the issue for a subset of users, but
> > certainly not all users. Someone who pip installs scikit-image==0.22
> > on Py 3.10 will have a broken install. But importantly, they will be
> > able to fix it in user space.
> >
> > At any rate, it's not like NumPy (or SciPy, or scikit-image) don't
> > change APIs over several minor versions. Quoting Henry again:
> >
> > > Quite ironically, the better a package follows SemVer, the smaller
> > > the change will trigger a major version, and therefore the less
> > > likely a major version will break a particular downstream code.
> >
> > In short, and independent of the Py3.9 issue, I don't think we should
> > advocate for upper caps in general, because in general it is
> > impossible to know whether an update is going to break your library,
> > regardless of their SemVer practices, and a fake upper pin is worse
> > than no upper pin.
> >
> > Juan.


[Numpy-discussion] Controlling NumPy __mul__ method or forcing it to use __rmul__ of the "other"

2017-06-19 Thread Ilhan Polat
I will assume some simple linear systems knowledge but the question can be
generalized to any operator that implements __mul__ and __rmul__ methods.

Motivation:

I am trying to implement a gain matrix, say a 3x3 identity matrix for the
time being, together with a single-input single-output (SISO) system that I
have implemented as a class modeling a transfer function or a state-space
representation.

In the typical use case, suppose you would like to create n parallel
connections with the same LTI system sitting at each branch. MATLAB
implements this as an elementwise multiplication and returns a multi-input
multi-output (MIMO) system.

G = tf(1,[1,1]);
eye(3)*G

produces (manually compactified)

ans =

  From input 1 to output...
   [ 1/(s+1)      0           0      ]
   [    0       1/(s+1)       0      ]
   [    0          0        1/(s+1)  ]

Notice that the result is of LTI-system type and, in our context, not a
NumPy array with "object" dtype.

In order to achieve similar behavior, I would like to let the __rmul__ of G
take care of the multiplication. In fact, when I do G.__rmul__(np.eye(3)) I
can control what the behavior should be, and I receive the exception/result
I've put in. However, the array never looks for this method and carries out
the default array __mul__ behavior.

If we instead go about it as left multiplication, G*eye(3) poses no problem
since it directly uses the __mul__ of G. Therefore we get a different
result depending on the direction of multiplication.

Is there anything I can do about this without forcing users to subclass, or
short of just letting them know about this particular quirk in the
documentation?

What I have in mind is to force the users to create static LTI objects and
then multiply, rejecting this possibility otherwise. But then I still need
to stop NumPy from returning an "object"-dtyped array to be able to let the
user know about this.
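
A condensed illustration of the asymmetry (a hypothetical stand-in class;
the real Transfer class lives in the harold library linked below):

import numpy as np

class G:
    def __mul__(self, other):
        return "G.__mul__"
    def __rmul__(self, other):
        return "G.__rmul__"

g = G()
g * np.eye(3)   # -> 'G.__mul__'; left multiplication is under our control
np.eye(3) * g   # -> 3x3 object-dtype array; ndarray.__mul__ runs first and
                #    applies g.__rmul__ elementwise instead of once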


Relevant links just in case

the library : https://github.com/ilayn/harold/

the issue discussion (monologue actually) :
https://github.com/ilayn/harold/issues/7

The question I've asked on SO (but with a rather offtopic answer):
https://stackoverflow.com/q/40694380/4950339


ilhan


Re: [Numpy-discussion] Controlling NumPy __mul__ method or forcing it to use __rmul__ of the "other"

2017-06-19 Thread Ilhan Polat
Thank you. I didn't know that it existed. Is there any place where I can
get a feeling for a sane priority number compared to what's being done in
production? Just to make sure I'm not stepping on any toes.
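
For later readers, a minimal sketch of the second mechanism Stephan
mentions below (requires NumPy >= 1.13; the class body is hypothetical):

import numpy as np

class Transfer:
    # Opting out of the ufunc machinery makes ndarray's binary ops return
    # NotImplemented, so Python falls back to Transfer.__rmul__ as a whole.
    __array_ufunc__ = None

    def __rmul__(self, other):
        return "Transfer.__rmul__"

np.eye(3) * Transfer()   # -> 'Transfer.__rmul__', not an object-dtype array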

On Mon, Jun 19, 2017 at 5:36 PM, Stephan Hoyer  wrote:

> I answered your question on StackOverflow:
> https://stackoverflow.com/questions/40694380/forcing-
> multiplication-to-use-rmul-instead-of-numpy-array-mul-or-
> byp/44634634#44634634
>
> In brief, you need to set __array_priority__ or __array_ufunc__ on your
> object.
>
> On Mon, Jun 19, 2017 at 5:27 AM, Ilhan Polat  wrote:
>
>> I will assume some simple linear systems knowledge but the question can
>> be generalized to any operator that implements __mul__ and __rmul__
>> methods.
>>
>> Motivation:
>>
>> I am trying to implement a gain matrix, say 3x3 identity matrix, for time
>> being with a single input single output (SISO) system that I have
>> implemented as a class modeling a Transfer or a state space representation.
>>
>> In the typical usecase, suppose you would like to create an n-many
>> parallel connections with the same LTI system sitting at each branch.
>> MATLAB implements this as an elementwise multiplication and returning a
>> multi input multi output(MIMO) system.
>>
>> G = tf(1,[1,1]);
>> eye(3)*G
>>
>> produces (manually compactified)
>>
>> ans =
>>
>>   From input 1 to output...
>>    [ 1/(s+1)      0           0      ]
>>    [    0       1/(s+1)       0      ]
>>    [    0          0        1/(s+1)  ]
>>
>> Notice that the result type is of LTI system but, in our context, not a
>> NumPy array with "object" dtype.
>>
>> In order to achieve a similar behavior, I would like to let the __rmul__
>> of G take care of the multiplication. In fact, when I do
>> G.__rmul__(np.eye(3)) I can control what the behavior should be and I
>> receive the exception/result I've put in. However the array never looks for
>> this method and carries out the default array __mul__ behavior.
>>
>> The situation is similar if we go about it as left multiplication
>> G*eye(3) has no problems since this uses directly the __mul__ of G.
>> Therefore we get a different result depending on the direction of
>> multiplication.
>>
>> Is there anything I can do about this without forcing users subclassing
>> or just letting them know about this particular quirk in the documentation?
>>
>> What I have in mind is to force the users to create static LTI objects
>> and then multiply and reject this possibility. But then I still need to
>> stop NumPy returning "object" dtyped array to be able to let the user know
>> about this.
>>
>>
>> Relevant links just in case
>>
>> the library : https://github.com/ilayn/harold/
>>
>> the issue discussion (monologue actually) :
>> https://github.com/ilayn/harold/issues/7
>>
>> The question I've asked on SO (but with a rather offtopic answer):
>> https://stackoverflow.com/q/40694380/4950339
>>
>>
>> ilhan
>>


Re: [Numpy-discussion] Controlling NumPy __mul__ method or forcing it to use __rmul__ of the "other"

2017-06-19 Thread Ilhan Polat
Ah OK. I was just wondering if there are recommended values to start with,
in case some values below a threshold are reserved for NumPy/SciPy
internals. I'll just go with the ufunc path to be safe.

This really looks like TeX overfull/underfull badness value adjustment. As
long as the journal accepts, don't mention it. :)

On Mon, Jun 19, 2017 at 6:58 PM, Stephan Hoyer  wrote:

> Coming up with a single number for a sane "array priority" is basically an
> impossible task :). If you only need compatibility with the latest version
> of NumPy, this is one good reason to set __array_ufunc__ instead, even if
> only to write __array_ufunc__ = None.
>
> On Mon, Jun 19, 2017 at 9:14 AM, Nathan Goldbaum 
> wrote:
>
>> I don't think there's any real standard here. Just doing a github search
>> reveals many different choices people have used:
>>
>> https://github.com/search?l=Python&q=__array_priority__&type
>> =Code&utf8=%E2%9C%93
>>
>> On Mon, Jun 19, 2017 at 11:07 AM, Ilhan Polat 
>> wrote:
>>
>>> Thank you. I didn't know that it existed. Is there any place where I can
>>> get a feeling for a sane priority number compared to what's being done in
>>> production? Just to make sure I'm not stepping on any toes.
>>>
>>> On Mon, Jun 19, 2017 at 5:36 PM, Stephan Hoyer  wrote:
>>>
>>>> I answered your question on StackOverflow:
>>>> https://stackoverflow.com/questions/40694380/forcing-multipl
>>>> ication-to-use-rmul-instead-of-numpy-array-mul-or-byp/44634634#44634634
>>>>
>>>> In brief, you need to set __array_priority__ or __array_ufunc__ on your
>>>> object.
>>>>
>>>> On Mon, Jun 19, 2017 at 5:27 AM, Ilhan Polat 
>>>> wrote:
>>>>
>>>>> I will assume some simple linear systems knowledge but the question
>>>>> can be generalized to any operator that implements __mul__ and __rmul__
>>>>> methods.
>>>>>
>>>>> Motivation:
>>>>>
>>>>> I am trying to implement a gain matrix, say 3x3 identity matrix, for
>>>>> time being with a single input single output (SISO) system that I have
>>>>> implemented as a class modeling a Transfer or a state space 
>>>>> representation.
>>>>>
>>>>> In the typical usecase, suppose you would like to create an n-many
>>>>> parallel connections with the same LTI system sitting at each branch.
>>>>> MATLAB implements this as an elementwise multiplication and returning a
>>>>> multi input multi output(MIMO) system.
>>>>>
>>>>> G = tf(1,[1,1]);
>>>>> eye(3)*G
>>>>>
>>>>> produces (manually compactified)
>>>>>
>>>>> ans =
>>>>>
>>>>>   From input 1 to output...
>>>>>    [ 1/(s+1)      0           0      ]
>>>>>    [    0       1/(s+1)       0      ]
>>>>>    [    0          0        1/(s+1)  ]
>>>>>
>>>>> Notice that the result type is of LTI system but, in our context, not
>>>>> a NumPy array with "object" dtype.
>>>>>
>>>>> In order to achieve a similar behavior, I would like to let the
>>>>> __rmul__ of G take care of the multiplication. In fact, when I do
>>>>> G.__rmul__(np.eye(3)) I can control what the behavior should be and I
>>>>> receive the exception/result I've put in. However the array never looks 
>>>>> for
>>>>> this method and carries out the default array __mul__ behavior.
>>>>>
>>>>> The situation is similar if we go about it as left multiplication
>>>>> G*eye(3) has no problems since this uses directly the __mul__ of G.
>>>>> Therefore we get a different result depending on the direction of
>>>>> multiplication.
>>>>>
>>>>> Is there anything I can do about this without forcing users
>>>>> subclassing or just letting them know about this particular quirk in the
>>>>> documentation?
>>>>>
>>>>> What I have in mind is to force the us

[Numpy-discussion] Dropping support for Accelerate

2017-07-22 Thread Ilhan Polat
A few months ago I had the innocent intention of wrapping the LDLt
decomposition routines of LAPACK for SciPy, but then I was made aware that
the minimum required LAPACK/BLAS version was held back by the Accelerate
framework. Since then I've been following the discussion of the core SciPy
team and others on this issue.

We have been exchanging opinions for quite a while now, within various
SciPy issues and PRs, about the ever-increasing Accelerate-related
problems, and I've compiled a brief summary of the ongoing discussions to
reduce the clutter.

First, I would like to kindly invite everyone to contribute and sharpen the
cases presented here

https://github.com/scipy/scipy/wiki/Dropping-support-for-Accelerate

The reason I specifically wanted to post this also to the NumPy mailing
list is to probe the situation from the NumPy-Accelerate perspective. Is
there any NumPy-specific problem that would indirectly affect SciPy should
support for Accelerate be dropped?


Re: [Numpy-discussion] Dropping support for Accelerate

2017-07-23 Thread Ilhan Polat
That's probably because I know nothing about the issue; is there any
reference I can read up on?

But in general, please feel free to populate the wiki page with new items.

On Sun, Jul 23, 2017 at 11:15 AM, Nathaniel Smith  wrote:

> I've been wishing we'd stop shipping Accelerate for years, because of
> how it breaks multiprocessing – that doesn't seem to be on your list
> yet.
>
> On Sat, Jul 22, 2017 at 3:50 AM, Ilhan Polat  wrote:
> > A few months ago, I had the innocent intention to wrap LDLt decomposition
> > routines of LAPACK into SciPy but then I am made aware that the minimum
> > required version of LAPACK/BLAS was due to Accelerate framework. Since
> then
> > I've been following the core SciPy team and others' discussion on this
> > issue.
> >
> > We have been exchanging opinions for quite a while now within various
> SciPy
> > issues and PRs about the ever-increasing Accelerate-related issues and
> I've
> > compiled a brief summary about the ongoing discussions to reduce the
> > clutter.
> >
> > First, I would like to kindly invite everyone to contribute and sharpen
> the
> > cases presented here
> >
> > https://github.com/scipy/scipy/wiki/Dropping-support-for-Accelerate
> >
> > The reason I specifically wanted to post this also in NumPy mailing list
> is
> > to probe for the situation from the NumPy-Accelerate perspective. Is
> there
> > any NumPy specific problem that would indirectly effect SciPy should the
> > support for Accelerate is dropped?
> >
> >
> >
> >
>
>
>
> --
> Nathaniel J. Smith -- https://vorpus.org


Re: [Numpy-discussion] Dropping support for Accelerate

2017-07-23 Thread Ilhan Polat
Ouch, that's from 2012 :(  I'll add this thread as a reference to the wiki
list.


On Sun, Jul 23, 2017 at 5:22 PM, Nathan Goldbaum 
wrote:

> See https://mail.scipy.org/pipermail/numpy-discussion/
> 2012-August/063589.html and replies in that thread.
>
> Quote from an Apple engineer in that thread:
>
> "For API outside of POSIX, including GCD and technologies like Accelerate,
> we do not support usage on both sides of a fork(). For this reason among
> others, use of fork() without exec is discouraged in general in processes
> that use layers above POSIX."
>
> On Sun, Jul 23, 2017 at 10:16 AM, Ilhan Polat 
> wrote:
>
>> That's probably because I know nothing about the issue, is there any
>> reference I can read about?
>>
>> But in general, please feel free populate new items in the wiki page.
>>
>> On Sun, Jul 23, 2017 at 11:15 AM, Nathaniel Smith  wrote:
>>
>>> I've been wishing we'd stop shipping Accelerate for years, because of
>>> how it breaks multiprocessing – that doesn't seem to be on your list
>>> yet.
>>>
>>> On Sat, Jul 22, 2017 at 3:50 AM, Ilhan Polat 
>>> wrote:
>>> > A few months ago, I had the innocent intention to wrap LDLt
>>> decomposition
>>> > routines of LAPACK into SciPy but then I am made aware that the minimum
>>> > required version of LAPACK/BLAS was due to Accelerate framework. Since
>>> then
>>> > I've been following the core SciPy team and others' discussion on this
>>> > issue.
>>> >
>>> > We have been exchanging opinions for quite a while now within various
>>> SciPy
>>> > issues and PRs about the ever-increasing Accelerate-related issues and
>>> I've
>>> > compiled a brief summary about the ongoing discussions to reduce the
>>> > clutter.
>>> >
>>> > First, I would like to kindly invite everyone to contribute and
>>> sharpen the
>>> > cases presented here
>>> >
>>> > https://github.com/scipy/scipy/wiki/Dropping-support-for-Accelerate
>>> >
>>> > The reason I specifically wanted to post this also in NumPy mailing
>>> list is
>>> > to probe for the situation from the NumPy-Accelerate perspective. Is
>>> there
>>> > any NumPy specific problem that would indirectly effect SciPy should
>>> the
>>> > support for Accelerate is dropped?
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Nathaniel J. Smith -- https://vorpus.org


Re: [Numpy-discussion] Dropping support for Accelerate

2017-07-29 Thread Ilhan Polat
Yet another twirl to the existing spaghetti

https://www.continuum.io/blog/developer-blog/open-sourcing-anaconda-accelerate





On Tue, Jul 25, 2017 at 4:23 PM, Matthew Brett 
wrote:

> On Tue, Jul 25, 2017 at 3:14 PM, Nathaniel Smith  wrote:
> > On Tue, Jul 25, 2017 at 7:05 AM, Matthew Brett 
> wrote:
> >> On Tue, Jul 25, 2017 at 3:00 PM, Nathaniel Smith  wrote:
> >>> On Tue, Jul 25, 2017 at 6:48 AM, Matthew Brett <
> matthew.br...@gmail.com> wrote:
>  On Tue, Jul 25, 2017 at 2:19 PM, Nathaniel Smith 
> wrote:
> > I updated the bit about OpenBLAS wheel with some more information on
> > the status of that work. It's not super important, but FYI.
> 
>  Maybe remove the bit (of my text) that you crossed out, or removed the
>  strikethrough and qualify?  At the moment it's confusing, because I
>  believe what I wrote is correct, so leaving in there and crossed out
>  looks kinda weird.
> >>>
> >>> Eh, it's a little weird because there's no specification needed
> >>> really, we can implement it any time we want to. It was stalled for a
> >>> long time because I ran into arcane technical problems dealing with
> >>> the MacOS linker, but that's solved and now it's just stalled due to
> >>> lack of attention.
> >>>
> >>> I deleted the text but feel free to qualify further if you think it's
> useful.
> >>
> >> Are you saying that we should consider this specification approved
> >> already?  Or that we should go ahead without waiting for approval?  I
> >> guess the latter.  I guess you're saying you think there would be no
> >> bad consequences for doing this if the spec subsequently changed
> >> before being approved?  It might be worth adding something like that
> >> to the text, in case there's somebody who wants to do some work on
> >> that.
> >
> > It's not a PEP. It will never be approved because there is no-one to
> > approve it :-).
>
> Sure, but it is a pull-request, it hasn't been merged - so I assume
> that someone is expecting to make or receive more feedback on it.
>
> > The only reason for writing it as a spec is to
> > potentially help coordinate with others who want to get in on making
> > these kinds of packages themselves, and the main motivator for that
> > will be if one of us starts doing it and proves it works...
>
> If I had to guess, I'd guess that you are saying Yes to "no bad
> consequences" (above)?  Would you mind adding something about that in
> the text to make it clear?
>
> Cheers,
>
> Matthew


Re: [Numpy-discussion] Dropping support for Accelerate

2017-07-29 Thread Ilhan Polat
If it can confuse you, imagine what it does to regular users like me.
That's why I wanted to mention in advance that this may also need some sort
of a "No, this is not related to Anaconda Accelerate" disclaimer somewhere,
if need be.


Re: [Numpy-discussion] ANN: SciPy 1.0 beta release

2017-09-17 Thread Ilhan Polat
Well, thank you also, Ralf, for going through all those issues one by one
across all kinds of topics. It must be painstakingly tedious.


On Sun, Sep 17, 2017 at 12:48 PM, Ralf Gommers 
wrote:

> Hi all,
>
> I'm excited to be able to announce the availability of the first beta
> release of Scipy 1.0. This is a big release, and a version number that
> has been 16 years in the making. It contains a few more deprecations and
> backwards incompatible changes than an average release. Therefore please do
> test it on your own code, and report any issues on the Github issue tracker
> or on the scipy-dev mailing list.
>
> Sources: https://github.com/scipy/scipy/releases/tag/v1.0.0b1
> Binary wheels: will follow tomorrow, I'll announce those when ready
> (TravisCI is under maintenance right now)
>
> Thanks to everyone who contributed to this release!
>
> Ralf
>
>
>
>
> Release notes (full notes including authors, closed issued and merged PRs
> at the Github Releases link above):
>
> [snip]
>
>


Re: [Numpy-discussion] Sustainability

2017-10-04 Thread Ilhan Polat
I have two points that, I know first-hand, people (including myself) wonder
about:

1. Clear distinction between NumPy/SciPy development and respective
roadmaps.

In addition to Johann's summary: I am an occasional contributor to SciPy
(mostly linalg), and again occasionally I wonder whether certain stuff can
be done on the NumPy side, or how to sync linalg issues lingering due to,
say, legacy reasons. So I start reading the source code. However, for NumPy
in particular, it is extremely difficult for me to find an entry point into
how things actually work, or what the core team has in mind about the
SciPy/NumPy separation. The story gets really complicated once backwards
compatibility is invoked, as in the recent discussion about dropping
Accelerate support. There are so many details to take care of; I can only
say how impressed I am with the work you have pulled off over the years.
If sustainability means widening the spectrum of contributors, some care is
needed to get newcomers like us started, even in the form of a contribution
guide or a map of which files live where. This would also pay off as easier
reviewing and less weight on the core team.

2. Feature completeness of basic modules.

I have been in contact with a few companies, probing the opportunities for
open-source usage in my domain of expertise. Many of them mentioned the
feature incompleteness of the basics. One person used the analogy of
potholes and a bumpy ride in the linalg module: "How come <...> is there
but <...> is not?". So it creates a maintenance obligation for a code base
that not so many people use. Another person used the term "a bit of this, a
bit of that". The same applies on the NumPy side too.


I hope these won't be taken as complaints; I just want to share the
perspective I've gained in the last few months. But as with other "huge"
projects in the open-source domain, it seems to me that if there is a plan
to attract the interest of commercial third parties for funding, or simply
donations, it would really help if they could see some clear planning or a
better structure.

Best,
ilhan





On Wed, Oct 4, 2017 at 5:31 PM, John T. Goetz 
wrote:

> Hello Chuck,
> Sustainability is indeed a broad topic and I think it's all too easy to
> think broadly about it. Please do discuss the big picture, but I am far
> more interested in the practical day-to-day action items that result
> from such a meeting. Here are my concerns with regards NumPy
> specifically:
>
> * How to handle the backlog of pull requests.
>
> * How to advertise outstanding issues that could be tackled by
> developers that are new to NumPy (like myself). This may just mean being
> more aggressive with the "Difficulty" tag.
>
> * Coding style has changed within the code-base over time, and it would
> be good to have a handful of functions one can point to as examples to
> follow.
>
> Notice these are all on the "ease of contributing" side of
> sustainability. I can't address the perhaps larger issues of ecosystem
> integration but I suspect NumPy doesn't suffer from being ignored. As
> to sponsored work or financial support, I'll look forward to the report
> that comes out of these meetings.
>
> Thanks for bringing this up here on the mailing list,
> Johann
>
> On Tue, 2017-10-03 at 17:04 -0600, Charles R Harris wrote:
> > Hi All,
> >
> > I and a number of others representing various open source projects
> > under the NumFocus umbrella will be attending as meeting next Tuesday
> > do discuss the problem of sustainability. In preparation for that
> > meeting I would be interested in any ideas that the folks who follow
> > this list may have on the subject.
> >
> > Chuck


Re: [Numpy-discussion] Proposal of timeline for dropping Python 2.7 support

2017-11-08 Thread Ilhan Polat
I was about to send the same thing. I think this matter has become a
vim/emacs issue and Py2 supporters won't take any arguments anymore. But if
Instagram can do it, it means that the legacy-code argument is a matter of
will, not a technicality.
https://thenewstack.io/instagram-makes-smooth-move-python-3/

Also, people are really going out of their way, with projects such as
Tauthon https://github.com/naftaliharris/tauthon, to stay with Python 2. To
be honest, after seeing this fork I'm convinced that this is a sentimental
debate.







On Wed, Nov 8, 2017 at 5:50 PM, Peter Cock 
wrote:

> On Tue, Nov 7, 2017 at 11:40 PM, Nathaniel Smith  wrote:
> >
> > 
> >
> > Right now, the decision in front of us is what to tell people who ask
> about
> > numpy's py2 support plans, so that they can make their own plans. Given
> what
> > we know right now, I don't think we should promise to keep support past
> > 2018. If we get there and the situation's changed, and there's both
> desire
> > and means to extend support we can revisit that. But's better to
> > under-promise and possibly over-deliver, instead of promising to support
> py2
> > until after it becomes a millstone around our necks and then realizing we
> > haven't warned anyone and are stuck supporting it another year beyond
> > that...
> >
> > -n
>
> NumPy (and to a lesser extent SciPy) is in a tough position being at the
> bottom of many scientific Python programming stacks. Whenever you
> drop Python 2 support is going to upset someone.
>
> Is it too ambitious to pledge to drop support for Python 2.7 no later than
> 2020, coinciding with the Python development team’s timeline for dropping
> support for Python 2.7?
>
> If that looks doable, NumPy could sign up to
> http://www.python3statement.org/
>
> Regards,
>
> Peter


Re: [Numpy-discussion] Proposal of timeline for dropping Python 2.7 support

2017-11-17 Thread Ilhan Polat
I've actually engaged with him on Twitter too, but just to repeat one part
here: scarce academic resources for maintaining code is not an argument. Of
all places, it is academia that should have come up with, or contributed
greatly to, open source, instead of the paper-writing frenzy among each
other. As many people have already written in blog posts, tweets, etc.,
academia does not value software as a scientific product yet demands
software continuously. As an ex-academic I can safely ignore that argument.
Scientific code is expected to be maintained properly. I understand the
sentiment, but blocking progress because of legacy code is a burden on
posterity and a luxury for the past.





On Fri, Nov 17, 2017 at 1:35 PM, Peter Cock 
wrote:

> Since Konrad Hinsen no longer follows the NumPy discussion list
> for lack of time, he has not posted here - but he has commented
> about this on Twitter and written up a good blog post:
>
> http://blog.khinsen.net/posts/2017/11/16/a-plea-for-
> stability-in-the-scipy-ecosystem/
>
> In a field where scientific code is expected to last and be developed
> on a timescale of decades, the change of pace with Python 2 and 3
> is harder to handle.
>
> Regards,
>
> Peter
>
> On Wed, Nov 15, 2017 at 2:19 AM, Nathaniel Smith  wrote:
> > Apparently this is actually uncontroversial, the discussion's died
> > down (see also the comments on Chuck's PR [1]), and anyone who wanted
> > to object has had more than a week to do so, so... I guess we can say
> > this is what's happening and start publicizing it to our users!
> >
> > A direct link to the rendered NEP in the repo is:
> > https://github.com/numpy/numpy/blob/master/doc/neps/
> dropping-python2.7-proposal.rst
> >
> > (I guess that at some point it will also show up on docs.scipy.org.)
> >
> > -n
> >
> > [1] https://github.com/numpy/numpy/pull/10006
> >
> > On Thu, Nov 9, 2017 at 5:52 PM, Nathaniel Smith  wrote:
> >> Fortunately we can wait until we're a bit closer before we have to
> >> make any final decision on the version numbering :-)
> >>
> >> Right now though it would be good to start communicating to
> >> users/downstreams about whatever our plans our though, so they can
> >> make plans. Here's a first attempt at some text we can put in the
> >> documentation and point people to -- any thoughts, on either the plan
> >> or the wording?
> >>
> >>  DRAFT TEXT - NOT FINAL - DO NOT POST THIS TO HACKERNEWS OK? OK 
> >>
> >> The Python core team plans to stop supporting Python 2 in 2020. The
> >> NumPy project has supported both Python 2 and Python 3 in parallel
> >> since 2010, and has found that supporting Python 2 is an increasing
> >> burden on our limited resources; thus, we plan to eventually drop
> >> Python 2 support as well. Now that we're entering the final years of
> >> community-supported Python 2, the NumPy project wants to clarify our
> >> plans, with the goal of helping our downstream ecosystem make plans
> >> and accomplish the transition with as little disruption as possible.
> >>
> >> Our current plan is as follows:
> >>
> >> Until **December 31, 2018**, all NumPy releases will fully support
> >> both Python 2 and Python 3.
> >>
> >> Starting on **January 1, 2019**, any new feature releases will support
> >> only Python 3.
> >>
> >> The last Python-2-supporting release will be designated as a long-term
> >> support (LTS) release, meaning that we will continue to merge
> >> bug-fixes and make bug-fix releases for a longer period than usual.
> >> Specifically, it will be supported by the community until **December
> >> 31, 2019**.
> >>
> >> On **January 1, 2020** we will raise a toast to Python 2, and
> >> community support for the last Python-2-supporting release will come
> >> to an end. However, it will continue to be available on PyPI
> >> indefinitely, and if any commercial vendors wish to extend the LTS
> >> support past this point then we are open to letting them use the LTS
> >> branch in the official NumPy repository to coordinate that.
> >>
> >> If you are a NumPy user who requires ongoing Python 2 support in 2020
> >> or later, then please contact your vendor. If you are a vendor who
> >> wishes to continue to support NumPy on Python 2 in 2020+, please get
> >> in touch; ideally we'd like you to get involved in maintaining the LTS
> >> before it actually hits end-of-life, so we can make a clean handoff.
> >>
> >> To minimize disruption, running 'pip install numpy' on Python 2 will
> >> continue to give the last working release in perpetuity; but after
> >> January 1, 2019 it may not contain the latest features, and after
> >> January 1, 2020 it may not contain the latest bug fixes.
> >>
> >> For more information on the scientific Python ecosystem's transition
> >> to Python-3-only, see: http://www.python3statement.org/
> >>
> >> For more information on porting your code to run on Python 3, see:
> >> https://docs.python.org/3/howto/pyporting.html
> >>
> >> 
> >>
> >> Thoughts?
> 

Re: [Numpy-discussion] Deprecate matrices in 1.15 and remove in 1.17?

2017-11-30 Thread Ilhan Polat
This would be really good to remove the apparent confusion. Moreover, I
think cleanly explaining why using "np.matrix" is not a good idea *before*
announcing the news would encourage people to accept this decision along
the way. That would greatly reduce the sporadic "the devs are deprecating
stuff as they see fit without asking us" sentiment.

On Thu, Nov 30, 2017 at 6:00 PM, Marten van Kerkwijk <
m.h.vankerkw...@gmail.com> wrote:

> Moving to a subpackage may indeed make more sense, though it might not
> help as much with getting rid of the hacks inside other parts of numpy
> to keep matrix working. In that respect it seems a bit different at
> least from weave.
>
> Then again, independently of whether we remove or release a separate
> package, it is probably best to start by moving all tests involving
> matrix to matrixlib/tests, so we can at least get a sense of what
> hacks are actually present.
>
> -- Marten


Re: [Numpy-discussion] [SciPy-Dev] RFC: comments to BLAS committee from numpy/scipy devs

2018-01-09 Thread Ilhan Polat
I couldn't find an item to place this under, but I think ilaenv, and also
the convention of calling a function twice (once with lwork=-1 to read off
the optimal workspace size, then again properly with lwork set to that
value), need to be gotten rid of in LAPACK.

That is a major annoyance during the wrapping of LAPACK routines for SciPy.

I don't know if this is realistic, but the values ilaenv needs could be
computed once at install time (or again if the hardware changes) and read
off by the routines.
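
For readers unfamiliar with the convention, the two-call pattern looks like
this through SciPy's low-level wrappers (dgesdd is chosen only as an
example; exact wrapper signatures vary per routine):

import numpy as np
from scipy.linalg import lapack

a = np.random.rand(5, 3)

# Call 1: workspace query (wraps LAPACK's lwork=-1 convention)
lwork, info = lapack.dgesdd_lwork(5, 3)

# Call 2: the actual computation, fed the queried workspace size
u, s, vt, info = lapack.dgesdd(a, lwork=int(lwork))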



On Jan 9, 2018 09:25, "Nathaniel Smith"  wrote:

> Hi all,
>
> As mentioned earlier [1][2], there's work underway to revise and
> update the BLAS standard -- e.g. we might get support for strided
> arrays and lose xerbla! There's a draft at [3]. They're interested in
> feedback from users, so I've written up a first draft of comments
> about what we would like as NumPy/SciPy developers. This is very much
> a first attempt -- I know we have lots of people who are more expert
> on BLAS than me on these lists :-). Please let me know what you think.
>
> -n
>
> [1] https://mail.python.org/pipermail/numpy-discussion/
> 2017-November/077420.html
> [2] https://mail.python.org/pipermail/scipy-dev/2017-November/022267.html
> [3] https://docs.google.com/document/d/1DY4ImZT1coqri2382GusXgBTTTVdB
> DvtD5I14QHp9OE/edit
>
> -
>
> # Comments from NumPy / SciPy developers on "A Proposal for a
> Next-Generation BLAS"
>
> These are comments on [A Proposal for a Next-Generation
> BLAS](https://docs.google.com/document/d/1DY4ImZT1coqri2382GusXgBTTTVdB
> DvtD5I14QHp9OE/edit#)
> (version as of 2017-12-13), from the perspective of the developers of
> the NumPy and SciPy libraries. We hope this feedback is useful, and
> welcome further discussion.
>
> ## Who are we?
>
> NumPy and SciPy are the two foundational libraries of the Python
> numerical ecosystem, and one of their duties is to wrap BLAS and
> expose it for the use of other Python libraries. (NumPy primarily
> provides a GEMM wrapper, while SciPy exposes more specialized
> operations.) It's unclear how many users we have exactly, but we
> certainly ship multiple million copies of BLAS every month, and
> provide one of the most popular numerical toolkits for both novice and
> expert users.
>
> Looking at the original BLAS and LAPACK interfaces, it often seems
> that their imagined user is something like a classic supercomputer
> consumer, who writes code directly in Fortran or C against the BLAS
> API, and where the person writing the code and running the code are
> the same. NumPy/SciPy are coming from a very different perspective:
> our users generally know nothing about the details of the underlying
> BLAS; they just want to describe their problem in some high-level way,
> and the library is responsible for making it happen as efficiently as
> possible, and is often integrated into some larger system (e.g. a
> real-time analytics platform embedded in a web server).
>
> When it comes to our BLAS usage, we mostly use only a small subset of
> the routines. However, as "consumer software" used by a wide variety
> of users with differing degrees of technical expertise, we're expected
> to Just Work on a wide variety of systems, and with as many different
> vendor BLAS libraries as possible. On the other hand, the fact that
> we're working with Python means we don't tend to worry about small
> inefficiencies that will be lost in the noise in any case, and are
> willing to sacrifice some performance to get more reliable operation
> across our diverse userbase.
>
> ## Comments on specific aspects of the proposal
>
> ### Data Layout
>
> We are **strongly in favor** of the proposal to support arbitrary
> strided data layouts. Ideally, this would support strides *specified
> in bytes* (allowing for unaligned data layouts), and allow for truly
> arbitrary strides, including *zero or negative* values. However, we
> think it's fine if some of the weirder cases suffer a performance
> penalty.
>
> Rationale: NumPy – and thus, most of the scientific Python ecosystem –
> only has one way of representing an array: the `numpy.ndarray` type,
> which is an arbitrary dimensional tensor with arbitrary strides. It is
> common to encounter matrices with non-trivial strides. For example::
>
> # Make a 3-dimensional tensor, 10 x 9 x 8
> t = np.zeros((10, 9, 8))
> # Considering this as a stack of eight 10x9 matrices, extract the
> first:
> mat = t[:, :, 0]
>
> Now `mat` has non-trivial strides on both axes. (If running this in a
> Python interpreter, you can see this by looking at the value of
> `mat.strides`.) Another case where interesting strides arise is when
> performing ["broadcasting"](https://docs.scipy.org/doc/numpy-1.13.0/
> user/basics.broadcasting.html),
> which is the name for NumPy's rules for stretching arrays to make
> their shapes match. For example, in an expression like::
>
> np.array([1, 2, 3]) + 1
>
> the scalar `1` is "broadcast" to create a vector `[1, 1, 1]`. This is
> acco

Re: [Numpy-discussion] Splitting MaskedArray into a separate package

2018-05-23 Thread Ilhan Polat
As far as I understand from the discussion above, I think the opposite
would be a better strategy for the sanity of our scarce but brave
maintainers. I would argue that if there is a maintenance burden, then the
real ballast is indeed linalg and random. Similar pain points exist in
SciPy too. There are a lot of issues that were thought through years ago
but never materialized (be it backwards compatibility, lack of champions,
and so on) because they are not the priority of the maintaining team. It is
very common that a discussion ends with "yes, we should probably make it a
ufunc" and then fades away. I feel that if there were fewer things to worry
about, more people would step up and "do it".

I would also argue that highest expectancy from NumPy would be having a
really sound data structure basis with more ufuncs, more array manipulation
tricks and so on. Masked arrays, imho, fall into that category. Hence, if
the codebase gets more refined in that respect and less stuff to maintain,
less moving parts, I think there would be a more coherent overall picture
and more focused action plan. Now the attention of maintainers seem to be
divided into a lot of orthogonal issues which is not a bad thing per se but
tedious at times. Currently NumPy carries a lot of code that it really
doesn't need to bother with and could delegate to higher-level packages like
SciPy or any other subpackage. It sounds like NumPy 2.0 but is actually more
of a gradual thinning out.




On Wed, May 23, 2018 at 10:51 PM, Stefan van der Walt 
wrote:

> Hi Eric,
>
> On May 23, 2018 13:25:44 Eric Firing  wrote:
>
> On 2018/05/23 9:06 AM, Matti Picus wrote:
>> I understand at least some of the motivation and potential advantages,
>> but as it stands, I find this NEP highly alarming.
>>
>
> I am not at my computer right now, so I will respond in more detail later.
> But I wanted to address your statement above:
>
> I see a NEP as an opportunity to discuss and flesh out an idea, and I
> certainly hope that there's no reason for alarm.
>
> I do not expect to know whether this is a good idea before discussions
> conclude, so I appreciate your feedback. If we cannot find good support for
> the idea, with very specific benefits, it should simply be dropped.
>
> But, I think there's a lot to learn from the conversation in the meantime
> w.r.t. exactly how streamlined people want NumPy to be, how core
> functionality can perhaps be strengthened by becoming a customer of our own
> API, how to optimally maintain sub-components, etc.
>
> Best regards,
> Stéfan
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matmul as a ufunc

2018-05-29 Thread Ilhan Polat
Apart from the math-validity discussion, in my experience errors are used a
bit too generously in the not-allowed ops. No-ops are fine once you learn
more about them, such as transpose on 1D arrays (good or bad is another
discussion). But raising errors bloats the computational code too much. "Is
it a scalar? Oh, then do this. Is it 1D? Oh, make this one. Is it 2D? Then
do something else." This type of coding really makes life difficult.

Most of my time in the numerical code is spent on trying to catch scalars
and 1D arrays and writing exceptions because I can't predict what the user
would do or what the result should be after certain operations. Quite
unwillingly, I've started making everything 2D whether it is required or
not because then I can just avoid the following

np.eye(4)[:, 1]    # 1D
np.eye(4)[:, 1:2]  # 2D
np.eye(4)[:, [1]]  # 2D
np.eye(4)[:, [1]] @ 5                # Error
np.eye(4)[:, [1]] @ np.array(5)      # Error
np.eye(4)[:, [1]] @ np.array([5])    # Result is 1D
np.eye(4)[:, [1]] @ np.array([[5]])  # Result is 2D

So imagine I'm trying to write a simple multiply_these function; I already
have quite a few cases to consider so that the function is "Pythonic". If
the second argument is an int or float, do *-mult; if it is a NumPy array
with no dimensions, again *-mult; if it is 1D, keep dims and then multiply;
and if it is 2D, do @-mult. Add broadcasting rules on top of this and it
gets a pretty wordy function (see the sketch below). Hence, what I would
suggest is to also include the use cases while deciding the behavior of a
single functionality.
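
As an illustration only, here is a sketch of the dispatch such a function
ends up needing; the name multiply_these and the exact rules here are
hypothetical, not an existing API:

import numpy as np

def multiply_these(A, x):
    # hypothetical sketch mirroring the case analysis above
    if isinstance(x, (int, float, complex)):
        return A * x               # plain Python scalar: *-mult
    x = np.asarray(x)
    if x.ndim == 0:
        return A * x               # 0d array: still *-mult
    if x.ndim == 1:
        return A @ x[:, None]      # 1d: keep dims as a column, then @-mult
    return A @ x                   # 2d: @-mult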

So indeed it doesn't make sense to transpose a 0D array, but as an array
object it would now start to have a lot of Wat! moments.
https://www.destroyallsoftware.com/talks/wat



On Tue, May 29, 2018 at 12:51 PM, Andras Deak  wrote:

> On Tue, May 29, 2018 at 12:16 PM, Daπid  wrote:
> > Right now, np.int(8).T throws an error, but np.transpose(np.int(8)) gives
> > a 0-d array. On one hand, it is nice to be able to use the same code for
>
> `np.int` is just python `int`! What you mean is `np.int64(8).T` which
> works fine, so does `np.array(8).T`.
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Polynomial evaluation inconsistencies

2018-06-30 Thread Ilhan Polat
I think restricting polynomial conventions to time series is quite specific
rather than generic.

Apart from series expansions and certain filter designs, actual usage of
polynomials is always presented with decreasing order (control and signal
processing included, because they use powers of s and inverse powers of z
if needed). So if that is the use case, then it should probably go under a
namespace like `TimeSeries`, or at least require an option to present it in
reverse. In my opinion polynomials are way more general than that domain,
and for everyone else it seems to me that "the intuitive way" is decreasing
powers.

For the design

> This isn't a great design, because they represent:
>p(x) = c[0] * x^2 + c[1] * x^1 + c[2] * x^0

I don't see the problem actually. If I ask someone to write down the
coefficients of a polynomial I don't think anyone would start from c[2].
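
For reference, the two coefficient conventions side by side:

import numpy as np

# np.polyval: highest power first, p(x) = 1*x**2 + 2*x + 3
np.polyval([1, 2, 3], 2)                        # 11

# np.polynomial.polynomial.polyval: lowest power first,
# p(x) = 1 + 2*x + 3*x**2
np.polynomial.polynomial.polyval(2, [1, 2, 3])  # 17.0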





On Sat, Jun 30, 2018 at 8:30 PM, Charles R Harris  wrote:

>
>
> On Sat, Jun 30, 2018 at 12:09 PM, Eric Wieser  > wrote:
>
>> >  if a single program uses both np.polyval() and
>> np.polynomail.Polynomial, it seems bound to cause unnecessary confusion.
>>
>> Yes, I would recommend definitely not doing that!
>>
>> > I still think it would make more sense for np.polyval() to use
>> conventional indexing
>>
>> Unfortunately, it's too late for "making sense" to factor into the
>> design. `polyval` is being used in the wild, so we're stuck with it
>> behaving the way it does. At best, we can deprecate it and start telling
>> people to move from `np.polyval` over to `np.polynomial.polynomial.polyval`.
>> Perhaps we need to make this namespace less cumbersome in order for that to
>> be a reasonable option.
>>
>> I also wonder if we want a more lightweight polynomial object without the
>> extra domain and range information, which seem like they make `Polynomial`
>> a more questionable drop-in replacement for `poly1d`.
>>
>
> The defaults for domain and window make it like a regular polynomial. For
> fitting, it does adjust the range, but the usual form can be recovered with
> `p.convert()` and will usually have more accurate coefficients due to using
> a better conditioned matrix during the fit.
>
> In [1]: from numpy.polynomial import Polynomial as P
>
> In [2]: p = P([1, 2, 3], domain=(0,2))
>
> In [3]: p(0)
> Out[3]: 2.0
>
> In [4]: p.convert()
> Out[4]: Polynomial([ 2., -4.,  3.], domain=[-1.,  1.], window=[-1.,  1.])
>
> In [5]: p.convert()(0)
> Out[5]: 2.0
>
> Chuck
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Adoption of a Code of Conduct

2018-08-01 Thread Ilhan Polat
I agree with Ralf. That thread is geared more towards a US-based divide.
Actually we briefly touched upon these points on the SciPy side, but indeed
there was no real discussion.

Political beliefs (especially communism in the US, for a practical example)
can offend some people, and that's OK, because being offended, by itself,
doesn't have merit. There will always be someone getting offended by
anything. But racism, sexism etc. are not "political" stances and deserve to
be united against, regardless of whether someone is offended or not.

Actually, calling these discriminating doctrines "political beliefs" is what
makes me quite nervous instead.







On Wed, Aug 1, 2018 at 5:37 PM, Ralf Gommers  wrote:

>
>
> On Wed, Aug 1, 2018 at 8:12 AM, Nathan Goldbaum 
> wrote:
>
>>
>>
>> On Wed, Aug 1, 2018 at 9:49 AM, Ralf Gommers 
>> wrote:
>>
>>>
>>>
>>> On Wed, Aug 1, 2018 at 12:20 AM, Nathan Goldbaum 
>>> wrote:
>>>
 I realize this was probably brought up in the discussions about the
 scipy code of conduct which I have not looked at, but I’m troubled by the
 inclusion of “political beliefs” in the document.

>>>
>>> It was not brought up explicitly as far as I remember.
>>>
>>>
 See e.g.
 https://github.com/jupyter/governance/pull/5

>>>
>>> That's about moving names around. I don't see any mention of political
>>> beliefs?
>>>
>>
>> Sorry about that, I elided the 6. This is the correct link:
>>
>> https://github.com/jupyter/governance/pull/56
>>
>
> Thanks, that's useful context for your question.
>
> I'm personally not too attached to "political belief", but I think the
> discussion in that PR and in the OSCON context is very US-centric and
> reflective of the polarized atmosphere there.
>
> If everyone is fine with removing political beliefs then I'm fine with
> that, but I don't think that the argument itself (from a non-US
> perspective) has much merit.
>
>
>>
>>
>>>
>>>
 As a thought experiment, what if someone’s political beliefs imply that
 other contributors are not deserving of human rights? Increasingly ideas
 like this are coming into the mainstream worldwide and I think this is a
 real concern that should be considered.

>>>
>>> There is a difference between having beliefs, and expressing those
>>> beliefs in ways that offends others. I don't see any problem with saying
>>> that we welcome anyone, irrespective of political belief. However, if
>>> someone starts expressing things that are intolerant (like someone else not
>>> deserving human rights) on any of our communication forums or in an
>>> in-person meeting, that would be a clear violation of the CoC. Which can be
>>> dealt with via the reporting and enforcement mechanism in the CoC.
>>>
>>> I don't see a problem here, but I would see a real problem with removing
>>> the "political beliefs" phrase.
>>>
>>
>> For another perspective on this issue see https://where.coraline.codes/b
>> log/oscon/, where Coraline Ada describes her reasons for not speaking at
>> OSCON this year due to a similar clause in the code of conduct.
>>
>
> There's a lot of very unrealistic examples in that post. Plus retracting a
> week in advance of a conference is, to put it mildly, questionable. So not
> sure what to think of the rest of that post. There may be good points in
> there, but they're obscured by the obvious flaws in thinking and choice of
> examples.
>
> Cheers,
> Ralf
>
>
>>
>>
>> Cheers,
>>> Ralf
>>>
>>>
>>>

 On Mon, Jul 30, 2018 at 8:25 PM Charles R Harris <
 charlesr.har...@gmail.com> wrote:

> On Fri, Jul 27, 2018 at 4:02 PM, Stefan van der Walt <
> stef...@berkeley.edu> wrote:
>
>> Hi everyone,
>>
>> A while ago, SciPy (the library) adopted its Code of Conduct:
>> https://docs.scipy.org/doc/scipy/reference/dev/conduct/code_
>> of_conduct.html
>>
>> We worked hard to make that document friendly, while at the same time
>> stating clearly the kinds of behavior that would and would not be
>> tolerated.
>>
>> I propose that we adopt the SciPy code of conduct for NumPy as well.
>> It
>> is a good way to signal to newcomers that this is a community that
>> cares
>> about how people are treated.  And I think we should do anything in
>> our
>> power to make NumPy as attractive as possible!
>>
>> If we adopt this document as policy, we will need to select a Code of
>> Conduct committee, to whom potential transgressions can be reported.
>> The individuals doing this for SciPy may very well be happy to do the
>> same for NumPy, but the community should decide whom will best serve
>> those roles.
>>
>> Let me know your thoughts.
>>
>
> +1 from me.
>
> Chuck
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>


Re: [Numpy-discussion] Possible Deprecation of np.ediff1d

2018-08-28 Thread Ilhan Polat
In the meantime I'll make a PR to get rid of it from SciPy. We can also
signal other libraries to do so. Anything that frees up the already very
crowded `np.` namespace is worth it in my opinion.
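
For context, a short comparison; note that np.diff only gained the
prepend/append keywords later (NumPy 1.16), and their semantics differ from
ediff1d's to_begin/to_end, which pad the result rather than the input:

import numpy as np

a = np.array([1, 3, 6, 10])
np.ediff1d(a)              # array([2, 3, 4]), same as np.diff(a)
np.ediff1d(a, to_begin=0)  # array([0, 2, 3, 4]): 0 prepended to the result
np.diff(a, prepend=0)      # array([1, 2, 3, 4]): 0 prepended to the input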

On Tue, Aug 28, 2018 at 7:40 PM Stephan Hoyer  wrote:

>
>
> On Tue, Aug 28, 2018 at 9:03 AM Ralf Gommers 
> wrote:
>
>> Maybe we need a "NumpyObsoleteWarning" :) At the least, we should
>>> probably have a list of obsolete functions in the documentation somewhere.
>>> My main concern is that as we go forward we might end up supporting a bunch
>>> of functions that are seldom used and have better replacements. We need
>>> some method of pruning.
>>>
>>
>> Given the list of uses Stephan turned up and Robert saying it's a useful
>> function, I'm -1 on any warning. If np.diff gets the same padding behavior,
>> we can document ediff1d in its docs as being superseded with a
>> recommendation to use np.diff instead.
>>
>
> To be clear, I don't think np.ediff1d is particularly useful or necessary,
> despite these uses. Most of these uses don't even use the optional
> arguments, so the author was probably simply ignorant of np.diff. This is
> more or less inevitable for most corners of NumPy's API, given how many
> users we have.
>
> "PendingDeprecationWarning" is Python's built-in warning for signaling
> that something is obsolete but not deprecated yet. It might be appropriate
> to use in these cases. The default warning filters silence it for users, so
> it doesn't show up unless you're very aggressive about enabling all
> warnings.
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] how much does binary size matter?

2019-04-26 Thread Ilhan Polat
Here is a baseline:
https://en.wikipedia.org/wiki/List_of_countries_by_Internet_connection_speeds
Throttling those values at, say, 60% of the listed bandwidth gives a crude
estimate of the average delay each extra MB would cause worldwide. For
example, at an effective 6 Mbit/s, one extra MB (8 Mbit) costs roughly 1.3
seconds, so a 50% increase on a 16 MB wheel adds about 10 seconds per
download.



On Fri, Apr 26, 2019 at 11:49 AM Éric Depagne  wrote:

> Le vendredi 26 avril 2019, 11:10:56 SAST Ralf Gommers a écrit :
> Hi Ralf,
>
> >
> > Right now a wheel is 16 MB. If we increase that by 10%/50%/100% - are we
> > causing a real problem for someone?
> Access to large bandwidth is not universal at all, and in many countries
> (I'd even say in most of the countries around the world), 16 MB is a
> significant amount of data, so increasing it is a burden.
>
> Cheers,
> Éric.
>
> >
> > Thanks,
> > Ralf
>
>
> --
> Un clavier azerty en vaut deux
> --
> Éric Depagne
>
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-24 Thread Ilhan Polat
Please don't introduce more errors for 1D arrays. They are already very
counter-intuitive for transposition and for other details not relevant to
this issue. Emitting errors for such a basic operation is very bad for user
experience. This is already the case with the wildly changing slicing
syntax. It would have made sense if 2D arrays were the default objects and
1D required extra effort to create, but it is the other way around. Hence a
transpose operation is "expected" from it. This would kind of force all
NumPy users to shift their code one tab further, to accommodate the extra
try/except blocks for "Oh wait, what if a 1D array comes in?" checks for
the existence of transposability every time `.T` appears in the code.

Code example: I am continuously writing code involving lots of matrix
products with inverses and transposes/hermitians (say, the 2nd eq. of
https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.solve_continuous_are.html).
That means I have to check at least 4-6 matrices for whether any of them
are 1D before transposing, to make that equation go through.
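
For concreteness, a sketch of what that second equation (the
continuous-time algebraic Riccati residual) looks like with today's idioms;
the helper below is illustrative only, not an existing API:

import numpy as np

def care_residual(X, A, B, Q, R):
    # X A + A^H X - X B R^{-1} B^H X + Q, with .conj().T spelled out
    return (X @ A + A.conj().T @ X
            - X @ B @ np.linalg.solve(R, B.conj().T) @ X + Q)

Every .conj().T above is a spot that needs a guard if 1D inputs are
possible.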

The dot-H solution is actually my ideal choice but I get the point that the
base namespace is already crowded. I am even OK with having
`x.conj(T=True)` having a keyword for extra transposition so that I can get
away with `x.conj(1)`; it doesn't solve the fundamental issue but at least
gives some convenience.

Best,
ilhan









On Mon, Jun 24, 2019 at 3:11 AM Marten van Kerkwijk <
m.h.vankerkw...@gmail.com> wrote:

> I had not looked at any implementation (only remembered the nice idea of
> "importing from the future"), and looking at the links Eric shared, it
> seems that the only way this would work is, effectively, pre-compilation
> doing a `.replace('.T', '._T_from_the_future')`, where you'd be
> hoping that there never is any other meaning for a `.T` attribute, for any
> class, since it is impossible to be sure a given variable is an ndarray.
> (Actually, a lot less implausible than for the case of numpy indexing
> discussed in the link...)
>
> Anyway, what I had in mind was something along the lines of inside the
> `.T` code there being be a check on whether a particular future item was
> present in the environment. But thinking more, I can see that it is not
> trivial to get to know something about the environment in which the code
> that called you was written.
>
> So, it seems there is no (simple) way to tell numpy that inside a given
> module you want `.T` to have the new behaviour, but still to warn if
> outside the module it is used in the old way (when risky)?
>
> -- Marten
>
> p.s. I'm somewhat loath to add new properties to ndarray, but `.T` and
> `.H` have such obvious and clear meaning to anyone dealing with (complex)
> matrices that I think it is worth it. See
> https://mail.python.org/pipermail/numpy-discussion/2019-June/079584.html
> for a list of options of attributes that we might deprecate "in exchange"...
>
> All the best,
>
> Marten
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-24 Thread Ilhan Polat
I think enumerating the cases along the way makes it a bit more tangible
for the discussion


import numpy as np
z = 1+1j
z.conjugate()  # 1-1j

zz = np.array(z)
zz  # array(1+1j)
zz.T  # array(1+1j)  # OK expected.
zz.conj()  # 1-1j ?? what happened; no arrays?
zz.conjugate()  # 1-1j ?? same

zz1d = np.array([z]*3)
zz1d.T  # no change so this is not the regular 2D array
zz1d.conj()  # array([1.-1.j, 1.-1.j, 1.-1.j])
zz1d.conj().T  # array([1.-1.j, 1.-1.j, 1.-1.j])
zz1d.T.conj()  # array([1.-1.j, 1.-1.j, 1.-1.j])
zz1d[:, None].conj()  # 2D column vector - no surprises if [:, None] is
known

zz2d = zz1d[:, None]  # 2D column vector - no surprises if [:, None] is
known
zz2d.conj()  # 2D col vec conjugated
zz2d.conj().T  # 2D col vec conjugated transposed

zz3d = np.arange(24.).reshape(2,3,4).view(complex)
zz3d.conj()  # no surprises, conjugated
zz3d.conj().T  # ?? Why not the last two dims swapped like other stacked ops

# For scalar arrays conjugation strips the number
# For 1D arrays transpose is a no-op but conjugation works
# For 2D arrays conjugate it is the matlab's elementwise conjugation op .'
# and transpose is acting like expected
# For 3D arrays conjugate it is the matlab's elementwise conjugation op .'
# but transpose is the reversing all dims just like matlab's permute()
# with static dimorder.

and so on. Maybe we can try to identify all the use cases and the quirks
before we design the solution, because these are a bit more involved and I
don't even know if this list is exhaustive.


On Mon, Jun 24, 2019 at 8:21 PM Marten van Kerkwijk <
m.h.vankerkw...@gmail.com> wrote:

> Hi Stephan,
>
> Yes, the complex conjugate dtype would make things a lot faster, but I
> don't quite see why we would wait for that with introducing the `.H`
> property.
>
> I do agree that `.H` is the correct name, giving most immediate clarity
> (i.e., people who know what conjugate transpose is, will recognize it,
> while likely having to look up `.CT`, while people who do not know will
> have to look up regardless). But at the same time agree that the docstring
> and other documentation should start with "Conjugate tranpose" - good to
> try to avoid using names of people where you have to be in the "in crowd"
> to know what it means.
>
> The above said, if we were going with the initial suggestion of `.MT` for
> matrix transpose, then I'd prefer `.CT` over `.HT` as its conjugate version.
>
> But it seems there is little interest in that suggestion, although sadly a
> clear path forward has not yet emerged either.
>
> All the best,
>
> Marten
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-25 Thread Ilhan Polat
I have to disagree; I have hardly ever seen such bugs, and moreover
conjugation alone is not compatible if you don't also transpose it, while
the conjugate transpose is expected in almost all contexts of matrices,
vectors and scalars. Elementwise conjugation is well in line with other
elementwise operations starting with a dot in matlab, hence still
consistent.

I would still expect conjugation+transposition to be the default, since
just transposing a complex array is way more special and rare than its
ubiquitous regular usage.

ilhan


On Tue, Jun 25, 2019 at 10:57 AM Andras Deak  wrote:

> On Tue, Jun 25, 2019 at 4:29 AM Cameron Blocker
>  wrote:
> >
> > In my opinion, the matrix transpose operator and the conjugate transpose
> operator should be one and the same. Something nice about both Julia and
> MATLAB is that it takes more keystrokes to do a regular transpose instead
> of a conjugate transpose. Then people who work exclusively with real
> numbers can just forget that it's a conjugate transpose, and for relatively
> simple algorithms, their code will just work with complex numbers with
> little modification.
> >
>
> I'd argue that MATLAB's feature of `'` meaning adjoint (conjugate
> transpose etc.) and `.'` meaning regular transpose causes a lot of
> confusion and probably a lot of subtle bugs. Most people are unaware
> that `'` does a conjugate transpose and use it habitually, and when
> for once they have a complex array they don't understand why the
> values are off (assuming they even notice). Even the MATLAB docs
> conflate the two operations occasionally, which doesn't help at all.
> Transpose should _not_ incur conjugation automatically. I'm already a
> bit wary of special-casing matrix dynamics this much, when ndarrays
> are naturally multidimensional objects. Making transposes be more than
> transposes would be a huge mistake in my opinion, already for matrices
> (2d arrays) and especially for everything else.
>
> András
>
>
>
> > Ideally, I'd like to see a .H that was the defacto Matrix/Linear
> Algebra/Conjugate transpose that for 2 or more dimensions, conjugate
> transposes the last two dimensions and for 1 dimension just conjugates (if
> necessary). And then .T can stay the Array/Tensor transpose for general
> axis manipulation. I'd be okay with .T raising an error/warning on 1D
> arrays if .H did not. I commonly write things like u.conj().T@v even if I
> know both u and v are 1D just so it looks more like an inner product.
> >
> > -Cameron
> >
> > On Mon, Jun 24, 2019 at 6:43 PM Ilhan Polat 
> wrote:
> >>
> >> I think enumerating the cases along the way makes it a bit more
> tangible for the discussion
> >>
> >>
> >> import numpy as np
> >> z = 1+1j
> >> z.conjugate()  # 1-1j
> >>
> >> zz = np.array(z)
> >> zz  # array(1+1j)
> >> zz.T  # array(1+1j)  # OK expected.
> >> zz.conj()  # 1-1j ?? what happened; no arrays?
> >> zz.conjugate()  # 1-1j ?? same
> >>
> >> zz1d = np.array([z]*3)
> >> zz1d.T  # no change so this is not the regular 2D array
> >> zz1d.conj()  # array([1.-1.j, 1.-1.j, 1.-1.j])
> >> zz1d.conj().T  # array([1.-1.j, 1.-1.j, 1.-1.j])
> >> zz1d.T.conj()  # array([1.-1.j, 1.-1.j, 1.-1.j])
> >> zz1d[:, None].conj()  # 2D column vector - no surprises if [:, None] is
> known
> >>
> >> zz2d = zz1d[:, None]  # 2D column vector - no surprises if [:, None] is
> known
> >> zz2d.conj()  # 2D col vec conjugated
> >> zz2d.conj().T  # 2D col vec conjugated transposed
> >>
> >> zz3d = np.arange(24.).reshape(2,3,4).view(complex)
> >> zz3d.conj()  # no surprises, conjugated
> >> zz3d.conj().T  # ?? Why not the last two dims swapped like other
> stacked ops
> >>
> >> # For scalar arrays conjugation strips the number
> >> # For 1D arrays transpose is a no-op but conjugation works
> >> # For 2D arrays conjugate it is the matlab's elementwise conjugation op
> .'
> >> # and transpose is acting like expected
> >> # For 3D arrays conjugate it is the matlab's elementwise conjugation op
> .'
> >> # but transpose is the reversing all dims just like matlab's
> permute()
> >> # with static dimorder.
> >>
> >> and so on. Maybe we can try to identify all the use cases and the
> quirks before we can make design the solution. Because these are a bit more
> involved and I don't even know if this is exhaustive.
> >>
> >>
> >> On Mon, Jun 24, 2019 at 8:21 PM Marten van Kerkwijk <
> m.h.vankerkw...@gmail.com> wrote:

Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-25 Thread Ilhan Polat
I think we would have seen a lot of evidence in the last four decades if
this was that problematic.

You are the second person to mention these bugs. Care to show me some
examples of these bugs?

Maybe I am missing the point here. I haven't seen any bugs caused by
somebody thinking they were just transposing.

Using transpose to reshape an array is a different story. That we can
discuss.

On Tue, Jun 25, 2019, 16:10 Todd  wrote:

> That is how it is in your field, but not mine.  For us we only use the
> conventional transpose, even for complex numbers.  And I routinely see bugs
> in MATLAB because of its choice of defaults, and there are probably many
> more that don't get caught because they happen silently.
>
> I think the principle of least surprise should apply here.  For people who
> need the conjugate transform the know to make sure they use the right
> operation.  But a lot of people aren't even aware that there conjugate
> transpose exists, they are just going to copy what they see in the examples
> without realizing it does the completely wrong thing in certain cases.
> They wouldn't bother to check because they don't even know there is a
> second transpose operation they need to look out for.  So it would hurt a
> lot of people without helping anyone.
>
> On Tue, Jun 25, 2019, 07:03 Ilhan Polat  wrote:
>
>> I have to disagree; I have hardly ever seen such bugs, and moreover
>> conjugation alone is not compatible if you don't also transpose it, while
>> the conjugate transpose is expected in almost all contexts of matrices,
>> vectors and scalars. Elementwise conjugation is well in line with other
>> elementwise operations starting with a dot in matlab, hence still
>> consistent.
>>
>> I would still expect conjugation+transposition to be the default, since
>> just transposing a complex array is way more special and rare than its
>> ubiquitous regular usage.
>>
>> ilhan
>>
>>
>> On Tue, Jun 25, 2019 at 10:57 AM Andras Deak 
>> wrote:
>>
>>> On Tue, Jun 25, 2019 at 4:29 AM Cameron Blocker
>>>  wrote:
>>> >
>>> > In my opinion, the matrix transpose operator and the conjugate
>>> transpose operator should be one and the same. Something nice about both
>>> Julia and MATLAB is that it takes more keystrokes to do a regular transpose
>>> instead of a conjugate transpose. Then people who work exclusively with
>>> real numbers can just forget that it's a conjugate transpose, and for
>>> relatively simple algorithms, their code will just work with complex
>>> numbers with little modification.
>>> >
>>>
>>> I'd argue that MATLAB's feature of `'` meaning adjoint (conjugate
>>> transpose etc.) and `.'` meaning regular transpose causes a lot of
>>> confusion and probably a lot of subtle bugs. Most people are unaware
>>> that `'` does a conjugate transpose and use it habitually, and when
>>> for once they have a complex array they don't understand why the
>>> values are off (assuming they even notice). Even the MATLAB docs
>>> conflate the two operations occasionally, which doesn't help at all.
>>> Transpose should _not_ incur conjugation automatically. I'm already a
>>> bit wary of special-casing matrix dynamics this much, when ndarrays
>>> are naturally multidimensional objects. Making transposes be more than
>>> transposes would be a huge mistake in my opinion, already for matrices
>>> (2d arrays) and especially for everything else.
>>>
>>> András
>>>
>>>
>>>
>>> > Ideally, I'd like to see a .H that was the defacto Matrix/Linear
>>> Algebra/Conjugate transpose that for 2 or more dimensions, conjugate
>>> transposes the last two dimensions and for 1 dimension just conjugates (if
>>> necessary). And then .T can stay the Array/Tensor transpose for general
>>> axis manipulation. I'd be okay with .T raising an error/warning on 1D
>>> arrays if .H did not. I commonly write things like u.conj().T@v even if
>>> I know both u and v are 1D just so it looks more like an inner product.
>>> >
>>> > -Cameron
>>> >
>>> > On Mon, Jun 24, 2019 at 6:43 PM Ilhan Polat 
>>> wrote:
>>> >>
>>> >> I think enumerating the cases along the way makes it a bit more
>>> tangible for the discussion
>>> >>
>>> >>
>>> >> import numpy as np
>>> >> z = 1+1j
>>> >> z.conjugate()  # 1-1j
>>> >>
>>> >> zz = np.array(z)
>>> >> zz  # array(1

Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-26 Thread Ilhan Polat
Maybe a bit of grouping would help, because I am also losing track here.
Let's see if I can manage to get something sensible, because, just like
Marten mentioned, I am confusing myself even when I am thinking about this.

1- Transpose operation on 1D arrays:
This is a well-known confusion point for anyone who arrives at NumPy
usage from, say, a matlab background or any linear-algebra-based usage.
Andras mentioned already that these are a subset of NumPy users, so we have
to be careful about the user assumptions. 1D arrays are computational
constructs and mathematically they don't exist; this is the basis that
matlab has enforced since day 1. Any numerical object is at least a 2D
array, including scalars, hence transposition flips the dimensions even for
a col vector or row vector. That doesn't mean we cannot change it or need
to follow matlab, but this is kind of what anybody kinda sorta woulda
expect. For some historical reason, on the NumPy side transposition on 1D
arrays does nothing, since they have a single dimension. Hence you have to
create a 2D vector from the get-go for transpose to match the linear
algebra intuition.
Points that have been discussed so far are about whether we should go
further and even intercept this behavior such that 1D transpose gives
errors or warnings, as opposed to the current behavior of a silent no-op.
As far as I can tell, we have a consensus that this behavior is here to
stay for the foreseeable future.
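
A minimal demonstration of the no-op, and of the explicit 2D workaround:

import numpy as np

v = np.array([1.0, 2.0, 3.0])
v.T.shape           # (3,): transpose is a silent no-op on 1D arrays
v[:, None].shape    # (3, 1): an explicit column vector
v[:, None].T.shape  # (1, 3): now transposition behaves as expected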

2- Using transpose to reshape the (complex) array or flip its dimensions
This is a usage mentioned above that I don't know much about. I usually
go the "reshape() et al." way for this, but apparently folks use it to flip
dimensions, and they don't want the automatic conjugation, which is exactly
the opposite of what a linear-algebra-oriented user is used to having as an
adjoint operator. The point discussed here is therefore whether or not to
inject conjugation into the .T behavior of complex arrays. If not, can we
have an extra .H or something that specifically does .conj().T in one go
(or .T.conj(); the order doesn't matter)? The main feeling (that I got so
far) is that we shouldn't touch the current way and hopefully bring in
another attribute.

3- Having a shorthand notation such as .H or .mH etc.
If the previous assertion is true, then the issue becomes what the new
attribute should be named and whether it can have the nice properties of a
transpose, such as returning a view. This has been proposed and rejected
before, e.g., GH-8882 and GH-13797. There is a catch here though: if the
alternative is .conj().T, then it doesn't matter whether the shorthand
copies or not, because .conj().T doesn't return a view either, and
therefore the user receives a new array anyway. Hence no benefits are lost.
Since the idea is to have a shorthand notation, it seems to me that this
point is artificial in that sense and not necessarily a valid argument for
rejection. But from the reluctance of Ralf I feel like there is a
historical wear-out on this subject.
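
To make the copy argument concrete, a small check; .base points to the
array whose memory is shared, if any:

import numpy as np

a = np.arange(4.0).reshape(2, 2) + 1j
a.T.base is a         # True: .T is a view of a
a.conj().T.base is a  # False: .conj() already produced a copy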

4- transpose of 3+D arrays
I think we missed the bus on this one for changing the default behavior
now and there are glimpses of confirmation of this above in the previous
mails. I would suggest discussing this separately.

So if you are not already worn out and not feeling sour about it, I would
like to propose that the discussion of item 3 be opened once again, because
the need is real and we don't need to get choked up on the implementation
details right away.

Disclaimer: I do applied math, so I have a natural bias towards the
linalg-y way of doing things. Sorry if that showed above; typing quickly
sometimes loses the intention.


Best,
ilhan


On Wed, Jun 26, 2019 at 4:39 AM Ralf Gommers  wrote:

>
>
> On Wed, Jun 26, 2019 at 3:56 AM Marten van Kerkwijk <
> m.h.vankerkw...@gmail.com> wrote:
>
>> Hi Ralf,
>>
>> On Tue, Jun 25, 2019 at 6:31 PM Ralf Gommers 
>> wrote:
>>
>>>
>>>
>>> On Tue, Jun 25, 2019 at 11:02 PM Marten van Kerkwijk <
>>> m.h.vankerkw...@gmail.com> wrote:
>>>

 For the names, my suggestion of lower-casing the M in the initial one,
 i.e., `.mT` and `.mH`, so far seemed most supported (and I think we should
 discuss *assuming* those would eventually involve not copying data; let's
 not worry about implementation details).

>>>
>>> For the record, this is not an implementation detail. It was the
>>> consensus before that `H` is a bad idea unless it returns a view just like
>>> `T`: https://github.com/numpy/numpy/issues/8882
>>>
>>
>> Is there more than an issue in which Nathaniel rejecting it mentioning
>> some previous consensus?
>>
>
> Yes, this has been discussed in lots of detail before, also on this list
> (as Nathaniel mentioned in the issue). I spent 10 minutes to try and find
> it but that wasn't enough. I do think it's not necessarily my
> responsibility though to dig up all the history here - that should be on
> the proposers of a new feature 
>
> I was part 

Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-26 Thread Ilhan Polat
I've finally gone through the old discussion and got the counter-argument
in one of Dag Sverre's replies:
http://numpy-discussion.10968.n7.nabble.com/add-H-attribute-tp34474p34668.html

TL; DR

I disagree with [...adding the .H attribute...] being forward looking, as
> it explicitly creates a situation where code will break if .H becomes a
> view
>

This actually makes perfect sense and is a valid concern that I had not
considered before.

The remaining question is why we treat returning a view as a requirement.
We have been using .conj().T and receiving copies of the arrays all along,
with equally inefficient code, for many years. The discussion then diverges
to other things, hence I am not sure where this requirement comes from.

But I guess this part should be rehashed clearer until next time :)




On Thu, Jun 27, 2019 at 12:03 AM Charles R Harris 
wrote:

>
>
> On Wed, Jun 26, 2019 at 2:18 PM Ralf Gommers 
> wrote:
>
>>
>>
>> On Wed, Jun 26, 2019 at 10:04 PM Kirill Balunov 
>> wrote:
>>
>>> Only concerns #4 from Ilhan's list.
>>>
>>> ср, 26 июн. 2019 г. в 00:01, Ralf Gommers :
>>>

 []

 Perhaps not full consensus between the many people with different
 opinions and interests. But for the first one, arr.T change: it's clear
 that this won't happen.

>>>
>>> To begin with, I must admit that I am not familiar with the accepted
>>> policy of introducing changes to NumPy. But I find it quite
>>> nonconstructive just to say - it will not happen. What then is the
>>> point in the discussion?
>>>
>>
>> There has been a *very* long discussion already, and several others on
>> the same topic before. There are also long-standing ways of dealing with
>> backwards compatibility - e.g. what Matthew said is not new, it's an agreed
>> upon way of working.
>> http://www.numpy.org/neps/nep-0023-backwards-compatibility.html lists
>> some principles. That NEP is not yet accepted (it needs rework), but it
>> gives a good idea of what does and does not go.
>>
>>
>>>
>>>
 Between Juan's examples of valid use, and what Stephan and Matthew
 said, there's not much more to add. We're not going to change correct code
 for minor benefits.

>>>
>>> I fully agree that any feature can find its use, valid or not is another
>>> question. Juan did not present these examples, but I will allow myself
>>> to assume that it is more correct to describe what is being done there as a
>>> permutation, and not a transpose. In addition, in the very next
>>> sentence, Juan adds that "These could be easily changed to .transpose()
>>> (honestly they probably should!)"
>>>
>>> We're not going to change correct code for minor benefits.

>>>
>>> It's fair, I personally have no preferences in both cases, the most
>>> important thing for me is that in the 2d case it works correctly. To be
>>> honest, until today, I thought that `.T` will raise for` ndim > 2`. At
>>> least that's what my experience told me. For example in
>>>
>>> Matlab - Error using  .' Transpose on ND array is not defined. Use
>>> PERMUTE instead.
>>>
>>> Julia - transpose not defined for Array(Float64, 3). Consider using
>>> permutedims for higher-dimensional arrays.
>>>
>>> Sympy - raise ValueError("array rank not 2")
>>>
>>> Here, I agree with the authors that, to begin with, `transpose` is not
>>> the best name, since in general it doesn’t fit as an any mathematical
>>> definition (of course it will depend on what we take as an element) or a
>>> definition from linear algebra. Thus the name `transpose` only leads to
>>> confusion.
>>>
>>> For a note about another suggestion - `.T` to mean a transpose of the
>>> last two dimensions, in Mathematica authors for some reason did the
>>> opposite (personally, I could not understand why they made such a
>>> choice :) ):
>>>
>>> Transpose[list]
>>> transposes the first two levels in list.
>>>
>>> I feel strongly that we should have the following policy:

 * Under no circumstances should we make changes that mean that
 correct
 old code will give different results with new Numpy.

>>>
>>> I find this overly strict rules that do not allow to evolve. I
>>> completely agree that a silent change in behavior is a disaster, that
>>> changing behavior (if it is not an error) in the same minor version (1.X.Y)
>>> is not acceptable, but I see no reason to extend this rule for a major
>>> version bump (2.A.B.),  especially if it allows something to improve.
>>>
>>
>> I'm sorry, you'll have to live with this rule. We've had lots of
>> discussion about this rule in many concrete cases. When existing code is
>> buggy or is consistently confusing many users, we can discuss. But in
>> general changing old code to do something else is a terrible idea.
>>
>>
>>> I would see such a rough version of a roadmap of change (I foresee my
>>> loneliness in this :)) Also considering this comment
>>>
>>> Personally I would find a

Re: [Numpy-discussion] [SciPy-Dev] Season of Docs - welcome Anne, Maja, Brandon

2019-08-06 Thread Ilhan Polat
Great news, welcome all!

On Wed, Aug 7, 2019 at 1:47 AM Ralf Gommers  wrote:

> Hi all,
>
> Google has announced the Season of Docs participants for this year [1]. We
> had a lot of excellent candidates and had to make some hard choices. We
> applied for extra slots, but unfortunately didn't win the lottery for
> those; we got one slot for NumPy and one for SciPy. We chose the projects
> of Anne for NumPy and Maja for SciPy:
>
> Anne Bonner, "Making "The Basics" a Little More Basic: Improving the
> Introductory NumPy Sections" [2]
>
> Maja Gwozdz, "User-oriented documentation and thorough restructuring" [3]
>
> That's not all though. There was some space left in the budget of the
> NumPy BIDS grant, and Stéfan has reserved that so we can accept more
> writers and provide them the same mentoring and funding as they would have
> gotten through GSoD. We could only start the conversations about that once
> Google made its decisions, so a further announcement will follow. However,
> we already have one extra project confirmed, from Brandon:
>
> Brandon David, "Improve the documentation of scipy.stats" (project details
> to be published).
>
> I will send out a poll to find a good time for everyone for a kickoff
> call. Our intent is to build a documentation team with multiple writers and
> mentors interacting and able to help each other out. And all of this will
> also interact with the numpy.org website redesign and the people putting
> energy into that:)
>
> I'm very happy to welcome Anne, Maja and Brandon!
>
> Cheers,
> Ralf
>
>
> [1] https://developers.google.com/season-of-docs/docs/participants/
> [2]
> https://developers.google.com/season-of-docs/docs/participants/project-numpy
> [3]
> https://developers.google.com/season-of-docs/docs/participants/project-scipy
> ___
> SciPy-Dev mailing list
> scipy-...@python.org
> https://mail.python.org/mailman/listinfo/scipy-dev
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Fwd: Re: Calling BLAS functions from Python

2019-08-27 Thread Ilhan Polat
The in-place overwriting is done if f2py can forward the original array
down to the low level.

When it is not contiguous, f2py has to marshal the view into a compatible
array somehow, and that is when the in-between array is formed. That array
can also be overwritten, but it would not be the original view you started
with. Hence it is kind of a convenience cost you pay.

Cython might be a better option for you, such that you can pass things
around via memoryviews and the Cython wrappers of BLAS.
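
To make the contiguity point concrete, a small sketch using the same dgemm
wrapper as in the quoted message (overwrite_c is the keyword form of the
trailing 1 in the original positional call):

import numpy as np
from scipy.linalg.blas import dgemm

a = np.ones((2, 3), order='F')
b = np.ones((3, 4), order='F')

c = np.zeros((2, 4), order='F')          # Fortran-contiguous: forwarded as-is
dgemm(1.0, a, b, 1.0, c, overwrite_c=1)  # c is updated in place

cv = np.zeros((7, 4), order='F')[:2, :]  # a strided view
cv.flags.f_contiguous                    # False: f2py works on a copy instead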

On Tue, Aug 27, 2019, 16:40 Jens Jørgen Mortensen  wrote:

> Sorry!  Stupid me, asking scipy questions on numpy-discussion.  Now
> continuing on scipy-user.  Any help is much appreciated.  See short
> numpy-discussion thread here:
> https://mail.python.org/pipermail/numpy-discussion/2019-August/079945.html
>
>
> Hi!
>
> I'm trying to use dgemm, zgemm and friends from scipy.linalg.blas to
> multiply matrices efficiently.  As an example, I'd like to do:
>
>  c += a.dot(b)
>
> using whatever BLAS scipy is linked to and I want to avoid copies of
> large matrices.  This works the way I want it:
>
>  >>> import numpy as np
>  >>> from scipy.linalg.blas import dgemm
>  >>> a = np.ones((2, 3), order='F')
>  >>> b = np.ones((3, 4), order='F')
>  >>> c = np.zeros((2, 4), order='F')
>  >>> dgemm(1.0, a, b, 1.0, c, 0, 0, 1)
> array([[3., 3., 3., 3.],
> [3., 3., 3., 3.]])
>  >>> print(c)
> [[3. 3. 3. 3.]
>   [3. 3. 3. 3.]]
>
> but if c is not contiguous, then c is not overwritten:
>
>  >>> c = np.zeros((7, 4), order='F')[:2, :]
>  >>> dgemm(1.0, a, b, 1.0, c, 0, 0, 1)
> array([[3., 3., 3., 3.],
> [3., 3., 3., 3.]])
>  >>> print(c)
> [[0. 0. 0. 0.]
>   [0. 0. 0. 0.]]
>
> Which is also what the docs say, but I think the raw BLAS function dgemm
> could do the update of c in-place by setting LDC=7.  See here:
>
>  http://www.netlib.org/lapack/explore-html/d7/d2b/dgemm_8f.html
>
> Is there a way to call the raw BLAS function from Python?
>
> I found this capsule thing, but I don't know if there is a way to call
> that (maybe using ctypes):
>
>  >>> from scipy.linalg import cython_blas
>  >>> cython_blas.__pyx_capi__['dgemm']
>  __pyx_t_5scipy_6linalg_11cython_blas_d *,
> __pyx_t_5scipy_6linalg_11cython_blas_d *, int *,
> __pyx_t_5scipy_6linalg_11cython_blas_d *, int *,
> __pyx_t_5scipy_6linalg_11cython_blas_d *,
> __pyx_t_5scipy_6linalg_11cython_blas_d *, int *)" at 0x7f06fe1d2ba0>
>
> Best,
> Jens Jørgen
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NEP 32: Remove the financial functions from NumPy

2019-09-04 Thread Ilhan Polat
+1 on removing them from NumPy. I think there are plenty of alternatives
already, so many that we might even consider deprecating the functions just
like the SciPy misc module, by pointing to the alternatives.

On Tue, Sep 3, 2019 at 6:38 PM Sebastian Berg 
wrote:

> On Tue, 2019-09-03 at 08:56 -0400, Warren Weckesser wrote:
> > Github issue 2880 ("Get financial functions out of main namespace",
>
> Very briefly, I am absolutely in favor of this.
>
> Keeping the functions in numpy seems more of a liability than a help to
> anyone. And this push is more likely to help users by spurring
> development on a good replacement, than a practically unmaintained
> corner of NumPy that may seem like it solves a problem, but probably
> does so very poorly.
>
> Moving them into a separate pip installable package seems like the best
> way forward until a better replacement, to which we can point users,
> comes up.
>
> - Sebastian
>
>
> > https://github.com/numpy/numpy/issues/2880) has been open since 2013.
> > In a recent community meeting, it was suggested that we create a NEP
> > to propose the removal of the financial functions from NumPy.  I have
> > submitted "NEP 32:  Remove the financial functions from NumPy" in a
> > pull request at https://github.com/numpy/numpy/pull/14399.  A copy of
> > the latest version of the NEP is below.
> >
> > According to the NEP process document, "Once the PR is in place, the
> > NEP should be announced on the mailing list for discussion (comments
> > on the PR itself should be restricted to minor editorial and
> > technical fixes)."  This email is the announcement for NEP 32.
> >
> > The NEP includes a brief summary of the history of the financial
> > functions, and has links to several relevant mailing list threads,
> > dating back to when the functions were added to NumPy in 2008.  I
> > recommend reviewing those threads before commenting here.
> >
> > Warren
> >
> > -
> >
> > ==
> > NEP 32 — Remove the financial functions from NumPy
> > ==
> >
> > :Author: Warren Weckesser 
> > :Status: Draft
> > :Type: Standards Track
> > :Created: 2019-08-30
> >
> >
> > Abstract
> > 
> >
> > We propose deprecating and ultimately removing the financial
> > functions [1]_
> > from NumPy.  The functions will be moved to an independent
> > repository,
> > and provided to the community as a separate package with the name
> > ``numpy_financial``.
> >
> >
> > Motivation and scope
> > 
> >
> > The NumPy financial functions [1]_ are the 10 functions ``fv``,
> > ``ipmt``,
> > ``irr``, ``mirr``, ``nper``, ``npv``, ``pmt``, ``ppmt``, ``pv`` and
> > ``rate``.
> > The functions provide elementary financial calculations such as
> > future value,
> > net present value, etc. These functions were added to NumPy in 2008
> > [2]_.
> >
> > In May, 2009, a request by Joe Harrington to add a function called
> > ``xirr`` to
> > the financial functions triggered a long thread about these functions
> > [3]_.
> > One important point that came up in that thread is that a "real"
> > financial
> > library must be able to handle real dates.  The NumPy financial
> > functions do
> > not work with actual dates or calendars.  The preference for a more
> > capable
> > library independent of NumPy was expressed several times in that
> > thread.
> >
> > In June, 2009, D. L. Goldsmith expressed concerns about the
> > correctness of the
> > implementations of some of the financial functions [4]_.  It was
> > suggested then
> > to move the financial functions out of NumPy to an independent
> > package.
> >
> > In a GitHub issue in 2013 [5]_, Nathaniel Smith suggested moving the
> > financial
> > functions from the top-level namespace to ``numpy.financial``.  He
> > also
> > suggested giving the functions better names.  Responses at that time
> > included
> > the suggestion to deprecate them and move them from NumPy to a
> > separate
> > package.  This issue is still open.
> >
> > Later in 2013 [6]_, it was suggested on the mailing list that these
> > functions
> > be removed from NumPy.
> >
> > The arguments for the removal of these functions from NumPy:
> >
> > * They are too specialized for NumPy.
> > * They are not actually useful for "real world" financial
> > calculations, because
> >   they do not handle real dates and calendars.
> > * The definition of "correctness" for some of these functions seems
> > to be a
> >   matter of convention, and the current NumPy developers do not have
> > the
> >   background to judge their correctness.
> > * There has been little interest among past and present NumPy
> > developers
> >   in maintaining these functions.
> >
> > The main arguments for keeping the functions in NumPy are:
> >
> > * Removing these functions will be disruptive for some users.
> > Current users
> >   will have to add the new ``numpy_financial`` package to their
> > dependencies,
> >   and then modify their code to u

Re: [Numpy-discussion] Disallow Accelerate as a LAPACK backend for NumPy

2019-11-15 Thread Ilhan Polat
We have a wiki page on the SciPy repo with all the details of the rationale
for why we wanted to drop it. There is no need for further discussion of
the situation of Accelerate.

On Fri, Nov 15, 2019, 15:41 Ralf Gommers  wrote:

>
>
> On Fri, Nov 15, 2019 at 5:27 AM Matti Picus  wrote:
>
>> On Tue, Nov 12, 2019 at 12:41 AM Matti Picus 
>> wrote:
>>
>> Apple has dropped support for Accelerate. It has bugs that have not been
>> fixed, and is closed source so we cannot fix them ourselves. We have
>> been getting a handful of reports from users who end up building NumPy
>> on macOS, and inadvertently link to Accelerate, then end up with wrong
>> linalg results. In PR 14880 (https://github.com/numpy/numpy/pull/14880) I
>> propose to disallow finding it when building NumPy. At this time it will
>> remain in distutils as one of the backends to support users, but how do
>> people feel about a future PR to totally remove it?
>>
>> Someone pointed out that Apple has not officially dropped support as far
>> as it can be determined. Sorry for the bad information. However, I still
>> stand by the "has bugs that have not been fixed, and is closed source". An
>> alternative to dropping automatic support for it would be to find a channel
>> for engaging with Apple to report and fix the bugs.
>>
>
> That's been tried, repeatedly. I would suggest not to spend time on that.
> Apple knows, they have just decided it's not important to them.
>
> Spending time on contributing to either OpenBLAS or BLIS/libFLAME seems
> like a more useful activity.
>
> Cheers,
> Ralf
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Output type of round is inconsistent with python built-in

2020-02-26 Thread Ilhan Polat
Does this mean that np.round(np.float32(5)) returns a 64-bit upcast int?

That would be really awkward for many reasons: a pandas frame bloating just
from rounding, for example, or a NumPy array growing in size for no
apparent reason.

I am not really sure I understand why the LSP (Liskov substitution
principle) should hold in this case, to be honest. Rounding is an operation
specific to the number instance and not to the generic class.
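
For reference, the built-in behavior being matched; since Python 3,
float.__round__ returns an unbounded int when ndigits is omitted and a
float otherwise:

round(5.4)     # 5, a Python int
round(5.4, 1)  # 5.4, still a float when ndigits is given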




On Wed, Feb 26, 2020, 21:38 Robert Kern  wrote:

> On Wed, Feb 26, 2020 at 3:19 PM Hameer Abbasi 
> wrote:
>
>>
>> There still remains the question, do we return Python ints or np.int64s?
>>
>>- Python ints have the advantage of not overflowing.
>>- If we decide to add __round__ to arrays in the future, Python ints
>>may become inconsistent with our design, as such a method will return an
>>int64 array.
>>
>>
>>
>> This was issue was discussed in the weekly triage meeting today, and the
>> following plan of action was proposed:
>>
>>- change scalar floats to return integers for __round__ (which
>>integer type was not discussed, I propose np.int64)
>>- not change anything else: not 0d arrays and not other numpy
>>functionality
>>
>> The only reason that float.__round__() was allowed to change to returning
> ints was because ints became unbounded. If we also change to returning an
> integer type, it should be a Python int.
>
> --
> Robert Kern
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Output type of round is inconsistent with python built-in

2020-02-26 Thread Ilhan Polat
It's not about what I want, but this changes the output of round. In my
example I didn't use any arrays but a scalar type, which looks like it will
be upcast.

On Wed, Feb 26, 2020, 23:04 Robert Kern  wrote:

> On Wed, Feb 26, 2020 at 5:41 PM  wrote:
>
>>
>>
>> On Wed, Feb 26, 2020 at 5:30 PM Ilhan Polat  wrote:
>>
>>> Does this mean that np.round(np.float32(5)) return a 64 bit upcasted int?
>>>
>>> That would be really awkward for many reasons pandas frame size being
>>> bloated just by rounding for an example. Or numpy array size growing for no
>>> apparent reason
>>>
>>> I am not really sure if I understand why LSP should hold in this case to
>>> be honest. Rounding is an operation specific for the number instance and
>>> not for the generic class.
>>>
>>>
>>>
>>>
>>> On Wed, Feb 26, 2020, 21:38 Robert Kern  wrote:
>>>
>>>> On Wed, Feb 26, 2020 at 3:19 PM Hameer Abbasi <
>>>> einstein.edi...@gmail.com> wrote:
>>>>
>>>>>
>>>>> There still remains the question, do we return Python ints or np.int64
>>>>> s?
>>>>>
>>>>>- Python ints have the advantage of not overflowing.
>>>>>- If we decide to add __round__ to arrays in the future, Python ints
>>>>>may become inconsistent with our design, as such a method will return 
>>>>> an
>>>>>int64 array.
>>>>>
>>>>>
>>>>>
>>>>> This was issue was discussed in the weekly triage meeting today, and
>>>>> the following plan of action was proposed:
>>>>>
>>>>>- change scalar floats to return integers for __round__ (which
>>>>>integer type was not discussed, I propose np.int64)
>>>>>- not change anything else: not 0d arrays and not other numpy
>>>>>functionality
>>>>>
>>>>>
>> I think making numerical behavior different between arrays and numpy
>> scalars with the same dtype will create many happy debugging hours.
>>
>
> round(some_ndarray) isn't implemented, so there is no difference to worry
> about.
>
> If you want the float->float rounding, use np.around(). That function
> should continue to behave like it currently does for both arrays and
> scalars.
>
> --
> Robert Kern
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Output type of round is inconsistent with python built-in

2020-02-26 Thread Ilhan Polat
Oh sorry. That's trigger-finger np-dotting.

What I mean is that if someone was using the round method on float32 or
other small-width datatypes, they would get a silent upcast.

Maybe not a big problem, but it can have a significant impact.
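
For reference, np.around (of which np.round is an alias) keeps the input
scalar's dtype, so it is free of the upcast concern:

import numpy as np

y = np.around(np.float32(2.5))  # 2.0 (rounds half to even)
y.dtype                         # dtype('float32'): no upcast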

On Thu, Feb 27, 2020, 05:12 Robert Kern  wrote:

> Your example used np.round(), not the builtin round(). np.round() is not
> changing. If you want the dtype of the output to be the dtype of the input,
> you can certainly keep using np.round() (or its canonical spelling,
> np.around()).
>
> On Thu, Feb 27, 2020, 12:05 AM Ilhan Polat  wrote:
>
>> It's not about what I want but this changes the output of round. In my
>> example I didn't use any arrays but a scalar type which looks like will
>> upcasted.
>>
>> On Wed, Feb 26, 2020, 23:04 Robert Kern  wrote:
>>
>>> On Wed, Feb 26, 2020 at 5:41 PM  wrote:
>>>
>>>>
>>>>
>>>> On Wed, Feb 26, 2020 at 5:30 PM Ilhan Polat 
>>>> wrote:
>>>>
>>>>> Does this mean that np.round(np.float32(5)) return a 64 bit upcasted
>>>>> int?
>>>>>
>>>>> That would be really awkward for many reasons pandas frame size being
>>>>> bloated just by rounding for an example. Or numpy array size growing for 
>>>>> no
>>>>> apparent reason
>>>>>
>>>>> I am not really sure if I understand why LSP should hold in this case
>>>>> to be honest. Rounding is an operation specific for the number instance 
>>>>> and
>>>>> not for the generic class.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 26, 2020, 21:38 Robert Kern  wrote:
>>>>>
>>>>>> On Wed, Feb 26, 2020 at 3:19 PM Hameer Abbasi <
>>>>>> einstein.edi...@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>> There still remains the question, do we return Python ints or
>>>>>>> np.int64s?
>>>>>>>
>>>>>>>- Python ints have the advantage of not overflowing.
>>>>>>>- If we decide to add __round__ to arrays in the future, Python
>>>>>>>ints may become inconsistent with our design, as such a method
>>>>>>>will return an int64 array.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> This issue was discussed in the weekly triage meeting today, and
>>>>>>> the following plan of action was proposed:
>>>>>>>
>>>>>>>- change scalar floats to return integers for __round__ (which
>>>>>>>integer type was not discussed, I propose np.int64)
>>>>>>>- not change anything else: not 0d arrays and not other numpy
>>>>>>>functionality
>>>>>>>
>>>>>>>
>>>> I think making numerical behavior different between arrays and numpy
>>>> scalars with the same dtype, will create many happy debugging hours.
>>>>
>>>
>>> round(some_ndarray) isn't implemented, so there is no difference to
>>> worry about.
>>>
>>> If you want the float->float rounding, use np.around(). That function
>>> should continue to behave like it currently does for both arrays and
>>> scalars.
>>>
>>> --
>>> Robert Kern
>>> ___
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Feelings about type aliases in NumPy

2020-04-26 Thread Ilhan Polat
I agree that parking all these in a secondary namespace sounds like a
better option; I can't say that I feel for the word "typing" though. There
are already too many: type, dtype, ctypeslib, etc. Maybe we can go for a
somewhat more distant name like "numpy.annotations" or whatever.
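
For illustration, this is the kind of usage such aliases enable (the
namespace that eventually shipped in NumPy 1.20 is numpy.typing, with
ArrayLike among the aliases):

    import numpy as np
    import numpy.typing as npt

    def normalize(a: npt.ArrayLike) -> np.ndarray:
        # accepts anything coercible to an array: lists, scalars, ndarrays
        arr = np.asarray(a, dtype=np.float64)
        return arr / arr.sum()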

On Sat, Apr 25, 2020 at 8:51 AM Kevin Sheppard 
wrote:

> Typing is for library developers more than end users. I would also worry
> that putting it into the top level might discourage other typing classes
> since it is more difficult to add to the top level than to a lower level
> module. np.typing seems very clear to me.
>
> On Sat, Apr 25, 2020, 07:41 Stephan Hoyer  wrote:
>
>>
>>
>> On Fri, Apr 24, 2020 at 11:31 AM Sebastian Berg <
>> sebast...@sipsolutions.net> wrote:
>>
>>> On Fri, 2020-04-24 at 11:10 -0700, Stefan van der Walt wrote:
>>> > On Fri, Apr 24, 2020, at 08:45, Joshua Wilson wrote:
>>> > > But, Stephan pointed out that it might be confusing to users for
>>> > > objects to only exist at typing time, so we came around to the
>>> > > question of whether people are open to the idea of including the
>>> > > type
>>> > > aliases in NumPy itself. Ralf's concrete proposal was to make a
>>> > > module
>>> > > numpy.types (or maybe numpy.typing) to hold the aliases so that
>>> > > they
>>> > > don't pollute the top-level namespace. The module would initially
>>> > > contain the types
>>> >
>>> > That sounds very sensible.  Having types available with NumPy should
>>> > also encourage their use, especially if we can add some documentation
>>> > around it.
>>>
>>> I agree, I might have a small tendency for `numpy.types` if we ever
>>> find any usage other than direct typing that may be the better name?
>>
>>
>> Unless we anticipate adding a long list of type aliases (more than the
>> three suggested so far), I would lean towards adding ArrayLike to the top
>> level NumPy namespace as np.ArrayLike.
>>
>> Type annotations are becoming an increasingly core part of modern Python
>> code. We should make it easy to appropriately type check functions that act
>> on NumPy arrays, and a top level np.ArrayLike is definitely more convenient
>> than np.types.ArrayLike.
>>
>> Out of curiosity, I guess `ArrayLike` would be an ABC that a
>>> downstream project can register with?
>>
>>
>> ArrayLike will be a typing Protocol, automatically recognizing attributes
>> like __array__ to indicate that something can be cast to an array.
>>
>>
>>>
>>> - Sebastian
>>>
>>>
>>> >
>>> > Stéfan
>>> > ___
>>> > NumPy-Discussion mailing list
>>> > NumPy-Discussion@python.org
>>> > https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>>>
>>> ___
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion@python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Feelings about type aliases in NumPy

2020-04-27 Thread Ilhan Polat
> Interestingly this was proposed independently here:

Wow, apologies for missing the entire thread about it, and for the noise.
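
For readers who also missed it, a rough sketch of the Protocol idea being
referenced (the names here are illustrative; the numpy.typing.ArrayLike
that later shipped is a Union that additionally covers scalars and nested
sequences, and typing.Protocol needs Python 3.8+):

    from typing import Any, Protocol, runtime_checkable

    @runtime_checkable
    class SupportsArray(Protocol):
        # anything exposing __array__ can be coerced to an ndarray
        def __array__(self) -> Any: ...

    class MyContainer:
        def __array__(self):
            import numpy as np
            return np.array([1.0, 2.0])

    assert isinstance(MyContainer(), SupportsArray)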


On Sun, Apr 26, 2020 at 11:19 PM Joshua Wilson 
wrote:

> To try and add some more data points to the conversation:
>
> > Maybe we can go for a bit more distant name like "numpy.annotations" or
> whatever.
>
> Interestingly this was proposed independently here:
>
> https://github.com/numpy/numpy-stubs/pull/66#issuecomment-619131274
>
> Related to that, Ralf was opposed to numpy.typing because it would
> shadow a stdlib module name:
>
> https://github.com/numpy/numpy-stubs/pull/66#issuecomment-619123629
>
> But, types is _also_ a stdlib module name. Maybe the above points give
> some extra weight to "numpy.annotations"?
>
> > Unless we anticipate adding a long list of type aliases (more than the
> three suggested so far)
>
> While working on some types in SciPy here:
>
> https://github.com/scipy/scipy/pull/11936#discussion_r415280894
>
> we ran into the issue of typing things that are "integer types" or
> "floating types". For the time being we just inlined a definition like
> Union[float, np.floating], but conceivably we would want to unify
> those definitions somewhere instead of redefining them in every
> project. (Note that existing types like SupportsInt etc. were not what
> we wanted.) This perhaps suggests that the ultimate number of type
> aliases might be larger than we initially thought.
>
> On Sun, Apr 26, 2020 at 6:25 AM Ilhan Polat  wrote:
> >
> > I agree that parking all these in a secondary namespace sounds like a
> better option; I can't say that I feel for the word "typing" though. There
> are already too many: type, dtype, ctypeslib, etc. Maybe we can go for a
> somewhat more distant name like "numpy.annotations" or whatever.
> >
> > On Sat, Apr 25, 2020 at 8:51 AM Kevin Sheppard <
> kevin.k.shepp...@gmail.com> wrote:
> >>
> >> Typing is for library developers more than end users. I would also
> worry that putting it into the top level might discourage other typing
> classes since it is more difficult to add to the top level than to a lower
> level module. np.typing seems very clear to me.
> >>
> >> On Sat, Apr 25, 2020, 07:41 Stephan Hoyer  wrote:
> >>>
> >>>
> >>>
> >>> On Fri, Apr 24, 2020 at 11:31 AM Sebastian Berg <
> sebast...@sipsolutions.net> wrote:
> >>>>
> >>>> On Fri, 2020-04-24 at 11:10 -0700, Stefan van der Walt wrote:
> >>>> > On Fri, Apr 24, 2020, at 08:45, Joshua Wilson wrote:
> >>>> > > But, Stephan pointed out that it might be confusing to users for
> >>>> > > objects to only exist at typing time, so we came around to the
> >>>> > > question of whether people are open to the idea of including the
> >>>> > > type
> >>>> > > aliases in NumPy itself. Ralf's concrete proposal was to make a
> >>>> > > module
> >>>> > > numpy.types (or maybe numpy.typing) to hold the aliases so that
> >>>> > > they
> >>>> > > don't pollute the top-level namespace. The module would initially
> >>>> > > contain the types
> >>>> >
> >>>> > That sounds very sensible.  Having types available with NumPy should
> >>>> > also encourage their use, especially if we can add some
> documentation
> >>>> > around it.
> >>>>
> >>>> I agree, I might have a small tendency for `numpy.types` if we ever
> >>>> find any usage other than direct typing that may be the better name?
> >>>
> >>>
> >>> Unless we anticipate adding a long list of type aliases (more than the
> three suggested so far), I would lean towards adding ArrayLike to the top
> level NumPy namespace as np.ArrayLike.
> >>>
> >>> Type annotations are becoming an increasingly core part of modern
> Python code. We should make it easy to appropriately type check functions
> that act on NumPy arrays, and a top level np.ArrayLike is definitely more
> convenient than np.types.ArrayLike.
> >>>
> >>>> Out of curiosity, I guess `ArrayLike` would be an ABC that a
> >>>> downstream project can register with?
> >>>
> >>>
> >>> ArrayLike will be a typing Protocol, automatically recognizing
> attributes like __array__ to indicate that something can be cast to an
> array.

Re: [Numpy-discussion] log of negative real numbers -> RuntimeWarning: invalid value encountered in log

2020-05-25 Thread Ilhan Polat
I wasted a good two weeks because of that behavior of Matlab back in the
day, and I think it is one of the cardinal sins that Matlab commits. If
need be, there are alternatives, as mentioned before, but I definitely do
not prefer this coercion at all.
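
The coercion in question, side by side (np.emath is the alias for
np.lib.scimath mentioned above):

    import numpy as np

    np.log(-1.0)        # nan, with "RuntimeWarning: invalid value
                        # encountered in log"
    np.emath.log(-1.0)  # 3.141592653589793j -- silently switches to a
                        # complex result, the Matlab-like behavior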

On Mon, May 25, 2020 at 5:18 PM Sebastian Berg 
wrote:

> On Mon, 2020-05-25 at 11:10 -0400, Robert Kern wrote:
> > On Mon, May 25, 2020 at 10:36 AM Sebastian Berg <
> > sebast...@sipsolutions.net>
> > wrote:
> >
> > > On Mon, 2020-05-25 at 10:09 -0400, Brian Racey wrote:
> > > > Would a "complex default" mode ever make it into numpy, to behave
> > > > more like
> > > > Matlab and other packages with respect to complex number
> > > > handling?
> > > > Sure it
> > > > would make it marginally slower if enabled, but it might open the
> > > > door to
> > > > better compatibility when porting code to Python.
> > > >
> > >
> > > I think the SciPy versions may have such a default, or there is
> > > such a
> > > functionality hidden somewhere (maybe even the switching
> > > behaviour).
> > > I am not sure anyone actually uses those, so it may not be a good
> > > idea
> > > to use them to be honest.
> > >
> >
> > The versions in `np.lib.scimath` behave like this. Of course, people
> > do use
> > them when they want to deal with real numbers as subsets of the
> > complex
> > numbers.
> >
>
> True, I guess I just used complex numbers too rarely in programs (i.e.
> never central to any programming project).
>
> It seems this is actually also exposed as `np.emath`, which is maybe a
> better entry point? And I guess the scipy namespace uses them.
>
> - Sebastian
>
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] NumPy team update

2020-06-19 Thread Ilhan Polat
This is great. Can we have a SciPy version too, Ralf?

On Fri, Jun 19, 2020 at 6:31 PM Devulapalli, Raghuveer <
raghuveer.devulapa...@intel.com> wrote:

> Hi Ralf,
>
>
>
> Thank you for the acknowledgement. I am happy to contribute and hope to
> continue to do so in the future.
>
>
>
> Raghuveer
>
>
>
> *From:* NumPy-Discussion *On Behalf Of* Ralf Gommers
> *Sent:* Thursday, June 18, 2020 2:58 PM
> *To:* Discussion of Numerical Python 
> *Subject:* [Numpy-discussion] NumPy team update
>
>
>
> Hi all,
>
>
>
> The NumPy team is growing, and it's awesome to see everything that is
> going on. Hard to keep up with, but that's a good thing! I think it's a
> good time for an update on people who gained commit rights, or joined one
> of the teams we now have.
>
>
>
> For those who haven't seen it yet, we have a team gallery at
> https://numpy.org/gallery/team.html. It isn't yet updated for the changes
> in this email, but gives a good picture of the current state.
>
>
>
> Matti Picus joined the Steering Council. He has been one of the driving
> forces behind NumPy for well over two years now, and we're very glad to
> have him join the council.
>
>
>
> Ross Barnowski, Melissa Weber Mendonça, Josh Wilson and Bas van Beek
> gained commit rights. Ross has worked on the docs and reviewed lots of doc
> PRs for the last six months.  Melissa has led the doc structuring and
> tutorial writing effort and has done a good amount of f2py maintenance as
> well. Josh and Bas have been pushing the type annotation work forward,
> first in the numpy-stubs repo and now in master. It's great to have experts
> in all these topics join the team.
>
>
>
> Furthermore, we now have 10+ people in the community calls, the triage
> calls and the docs team calls (all bi-weekly and on the NumPy community
> calendar [1] - everyone is welcome). And there's more going on - I feel
> like I should mention some of the other excellent work going on:
>
>
>
> A lot of work is going into SIMD optimizations. Sayed Adel has made very
> nice progress on implementing universal intrinsics (NEP 38), and Raghuveer
> Devulapalli, Chunlin Fang and others have contributed SSE/AVX and ARM Neon
> implementations for many functions.
>
>
>
> For the website, Shaloo Shalini has continued working on new case studies,
> we're about to merge a really nice one on tracking animal movement. Ben
> Nathanson has contributed his technical writing and editing skills to
> improve our website and documentation content. And Isabela Presedo-Floyd
> has taken up the challenge of redesigning the NumPy logo, and we're nearing
> the end of the process there.
>
>
>
> The survey team has also been working hard. Inessa Pawson, Xiaoyi Deng,
> Stephanie Mendoza, Ross Barnowski, Sebastian Berg and a number of
> volunteers for translations are getting a really well-designed survey
> together.
>
>
>
> And then of course there's both old hands and new people doing the regular
> maintenance and enhancement work on the main repo.
>
>
>
> Writing this email started with "we just gave out some commit rights, we
> should put that on the mailing list". Then I realized there's lots of other
> people and activities that deserve a shout out. And probably more that I
> forgot (if so, apologies!). I'll stop here - thanks everyone for all you do!
>
>
>
> Cheers,
>
> Ralf
>
>
>
>
>
> [1]
> https://calendar.google.com/calendar?cid=YmVya2VsZXkuZWR1X2lla2dwaWdtMjMyamJobGRzZmIyYzJqODFjQGdyb3VwLmNhbGVuZGFyLmdvb2dsZS5jb20
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal to add clause to license prohibiting use by oil and gas extraction companies

2020-07-02 Thread Ilhan Polat
Ralf basically wrote the email that I was about to send, in a much more
structured way, so thanks for that. I'd also like to mention that the
oil & gas industry practically cannot be cornered by these restrictions.
So even though the cause is very noble and I wholeheartedly agree with it,
forcing this type of exclusion will only strengthen their hand: they will
move to other commercial software (they can easily afford even acquiring
whole companies) and force their employees to use it, and this will
finally boomerang back as a reduction in the pool of potential open-source
contributors who would otherwise have contributed just because they liked
the tools (like most of us did back in the day). For example, Shell and
Intel are corporate-level collaborators. Should we then also ban the usage
of MKL? Of course not, because this is not about driving Shell and others
into software starvation but about forcing them to take concrete steps on
the climate crisis. This is not to say we are desperate, quite the
contrary; however, this strategy seems dire compared with the possible
outcomes.

I would really like to take the more concrete approach that Ralf outlined.
Again, it is not a crusade against commercial software; I truly think they
all have different shoes to fill. However, making the switch from
commercial software to open source as smooth as possible would send the
message that we are not bound to conglomerate structures to achieve noble
goals, and thus would make a bolder statement about what software can
accomplish. Signal processing can make fuel-consumption notebooks, stats
can display bicycle-usage results and their impact, etc. Again, it is a
mentality that we are trying to build, so it shouldn't rise to the level
of annoyance; that way everyone can hop on the bandwagon.



On Thu, Jul 2, 2020 at 12:14 PM Ralf Gommers  wrote:

>
>
> On Thu, Jul 2, 2020 at 10:58 AM Juan Nunez-Iglesias 
> wrote:
>
>> Hi everyone,
>>
>> If you live in Australia, this has been a rough year to think about
>> climate change. After the hottest and driest year on record, over 20% of
>> the forest surface area of the south east was burned in the bushfires.
>> Although I was hundreds of kilometres from the nearest fire, the air
>> quality was rated as hazardous for several days in my city. This brought
>> home for me two points.
>>
>> One, that "4ºC" is not about taking off a jumper and going to the beach
>> more often, but actually represents a complete transformation of our
>> planet. 4ºC is what separates us from the last ice age, so we can expect
>> our planet in 80 years to be as unrecognisable from today as today is from
>> the ice age.
>>
>> Two, that climate change is already with us, and we can't just continue
>> to ignore the problem and enjoy whatever years of climate peace we thought
>> we had left. Greta has it right, we are running out of time and absolutely
>> drastic action is needed.
>>
>> All this is a prelude to add my voice to everyone who has already said
>> that *messing with the NumPy license is absolutely *not* the drastic
>> action needed*, and will be counter-productive, as many have noted.
>>
>> Having said this, I'm happy that the community is getting involved and
>> getting active and coming up with creative ideas to do their part. If
>> someone wants to start a "Pythonistas for Climate Action" user group, I'll
>> be the first to join. I had planned to give a lightning talk in the vein of
>> the above at SciPy, which, and believe me that I hate to hate on my
>> favourite conference, recently loudly thanked Shell [1] for being a
>> platinum sponsor. (Not to mention that Enthought derives about a third of
>> its income from fossil fuel companies.) Unfortunately and for obvious
>> reasons I won't make it to SciPy after all, but again, I'm happy to see the
>> community rising.
>>
>> Perhaps this is derailing the discussion, but, anyone up for a "Python
>> for Climate Action" BoF at the conference? I can probably make the
>> late-afternoon BoFs given the time difference.
>>
>
> Thanks for this Juan. I don't think it's derailing the discussion.
> Thinking about things we *can* do that may have a positive influence on the
> climate emergency we're in, or the state of the world in general, are valid
> and probably the most productive turn this conversation can take. Changing
> the NumPy license isn't feasible, because of many of the pragmatic reasons
> already pointed out. That said, the "NumPy is just a tool" point of view is
> fairly naive; I think we do have a responsibility to at least think about
> the wider issues and possibly make some changes.
>
> One thing I have been thinking about recently is the educational material
> and high level documentation we produce. When we use data sources or write
> tutorials, we can incorporate data and examples related to climate issues,
> social issues, ethics in ML/AI, etc.
>
> Another thing to think about is: what do we, NumPy maintainers and
> contributors, choose to spend our time on? Not each i

[Numpy-discussion] Type declaration to include all valid numerical NumPy types for Cython

2020-08-09 Thread Ilhan Polat
Hi all,

As you might have seen from my recent mails on the Cython list, I'm trying
to cook up an input validator for the linalg.solve() function. The
machinery of SciPy linalg is as follows:

Some input comes in and passes through np.asarray(); then, depending on
the resulting dtype of the NumPy array, we choose a LAPACK flavor
(s, d, c, z), and off it goes through f2py to lalaland, coming back with
some result.

For the backslash polyalgorithm I need the arrays, after intake, to be
contiguous (C- or F- doesn't matter) and to be one of the four types
float, double, float complex, double complex (possibly via making new
copies), because we are using wrapped Fortran code (LAPACK) in SciPy. So
my difficulty is how to type such a function input, say,
ctypedef fused numeric_numpy_t:
    bint
    cnp.npy_bool
    cnp.int_t
    cnp.intp_t
    cnp.int8_t
    cnp.int16_t
    cnp.int32_t
    cnp.int64_t
    cnp.uint8_t
    cnp.uint16_t
    cnp.uint32_t
    cnp.uint64_t
    cnp.float32_t
    cnp.float64_t
    cnp.complex64_t
    cnp.complex128_t

Is this acceptable, or does something else need to be used? Then there is
the story of np.complex256 and the mysterious np.float16. Then there is
the Linux vs. Windows platform-dependence issue, and possibly some more
that I can't comprehend. Then there are datetime, str, unicode, etc.,
which need to be rejected. So this is quickly getting out of hand for my
small brain.

To be honest, I am running out of steam a bit while working on this issue:
I managed to finish the actual difficult algorithmic part but got stuck
here. I am quite surprised how fantastically complicated and confusing
both the NumPy and the Cython docs are about this stuff. Shouldn't we keep
a generic fused type for such usage? Or maybe one already exists, but I
don't know of it and would be really grateful for pointers.

Here I wrote a dummy typed Cython function just for type checking:

cpdef inline bint ncc(numeric_numpy_t[:, :] a):
    print(a.is_f_contig())
    print(a.is_c_contig())
    return a.is_f_contig() or a.is_c_contig()

And this is a dummy loop (with aliases) just to check whether the fused
type is working or not (on Windows I couldn't make it work for float16):

for x in (np.uint, np.uintc, np.uintp, np.uint0, np.uint8, np.uint16,
          np.uint32, np.uint64, np.int, np.intc, np.intp, np.int0,
          np.int8, np.int16, np.int32, np.int64, np.float, np.float32,
          np.float64, np.float_, np.complex, np.complex64, np.complex128,
          np.complex_):
    print(x)
    C = np.arange(25., dtype=x).reshape(5, 5)
    ncc(C)


Thanks in advance,
ilhan
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Type declaration to include all valid numerical NumPy types for Cython

2020-08-10 Thread Ilhan Polat
Yes, it seems like I don't have any other option anyway. There is a bit of
a penalty, but I guess this should do the trick.

Thanks Eric (again! :D)
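
For reference, a quick hypothetical session with the convert_one() quoted
below, showing both the copy/upcast penalty and the pass-through:

    import numpy as np

    a = np.arange(16, dtype=np.int32).reshape(4, 4)[:, :2]  # int, not contiguous
    b = convert_one(a)
    b.dtype, b.flags.forc   # (dtype('float64'), True) -- upcast and copied

    c = np.eye(3)           # already float64 and C-contiguous
    convert_one(c) is c     # True -- acceptable arrays pass through uncopied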

On Mon, Aug 10, 2020 at 2:51 AM Eric Moore  wrote:

> If that is really all you need, then the version in Python is:
>
> def convert_one(a):
>     """
>     Converts input with arbitrary layout and dtype to a blas/lapack
>     compatible dtype with either C or F order.  Acceptable objects are
>     passed through without making copies.
>     """
>     a_arr = np.asarray(a)
>     dtype = np.result_type(a_arr, 1.0)
>
>     # need to handle these separately
>     if dtype == np.longdouble:
>         dtype = np.dtype('d')
>     elif dtype == np.clongdouble:
>         dtype = np.dtype('D')
>     elif dtype == np.float16:
>         dtype = np.dtype('f')
>
>     # explicitly force a copy if a_arr isn't one segment
>     return np.array(a_arr, dtype, copy=not a_arr.flags.forc, order='K')
>
> In Cython, you could just run exactly this code and it's probably fine.
> It could also be rewritten using the C calls if you really wanted.
>
> You need to either provide your own or use a casting table and the copy /
> conversion routines from somewhere.  Cython, to my knowledge, doesn't
> provide these things, but Numpy does.
>
> Eric
>
> On Sun, Aug 9, 2020 at 6:16 PM Ilhan Polat  wrote:
>
>> Hi all,
>>
>> As you might have seen my recent mails in Cython list, I'm trying to cook
>> up an input validator for the linalg.solve() function. The machinery of
>> SciPy linalg is as follows:
>>
>> Some input comes in passes through np.asarray() then depending on the
>> resulting dtype of the numpy array we choose a LAPACK flavor (s,d,c,z) and
>> off it goes through f2py to lalaland and comes back with some result.
>>
>> For the backslash polyalgorithm I need the arrays to be contiguous (C- or
>> F- doesn't matter) and any of the four (possibly via making new copies)
>> float, double, float complex, double complex after the intake because we
>> are using wrapped fortran code (LAPACK) in SciPy. So my difficulty is how
>> to type such function input, say,
>>
>> ctypedef fused numeric_numpy_t:
>>     bint
>>     cnp.npy_bool
>>     cnp.int_t
>>     cnp.intp_t
>>     cnp.int8_t
>>     cnp.int16_t
>>     cnp.int32_t
>>     cnp.int64_t
>>     cnp.uint8_t
>>     cnp.uint16_t
>>     cnp.uint32_t
>>     cnp.uint64_t
>>     cnp.float32_t
>>     cnp.float64_t
>>     cnp.complex64_t
>>     cnp.complex128_t
>>
>> Is this acceptable or something else needs to be used? Then there is the
>> story of np.complex256 and mysterious np.float16. Then there is the Linux vs
>> Windows platform dependence issue and possibly some more that I can't
>> comprehend. Then there are datetime, str, unicode etc. that need to be
>> rejected. So this is quickly getting out of hand for my small brain.
>>
>> To be honest, I am a bit running out of steam working with this issue
>> even though I managed to finish the actual difficult algorithmic part but
>> got stuck here. I am quite surprised how fantastically complicated and
>> confusing both NumPy and Cython docs about this stuff. Shouldn't we keep a
>> generic fused type for such usage? Or maybe there already exists but I
>> don't know and would be really grateful for pointers.
>>
>> Here I wrote a dummy typed Cython function just for type checking:
>>
>> cpdef inline bint ncc(numeric_numpy_t[:, :] a):
>>     print(a.is_f_contig())
>>     print(a.is_c_contig())
>>     return a.is_f_contig() or a.is_c_contig()
>>
>> And this is a dummy loop (with aliases) just to check whether fused type
>> is working or not (on windows I couldn't make it work for float16).
>>
>> for x in (np.uint, np.uintc, np.uintp, np.uint0, np.uint8, np.uint16,
>>           np.uint32, np.uint64, np.int, np.intc, np.intp, np.int0,
>>           np.int8, np.int16, np.int32, np.int64, np.float, np.float32,
>>           np.float64, np.float_, np.complex, np.complex64, np.complex128,
>>           np.complex_):
>>     print(x)
>>     C = np.arange(25., dtype=x).reshape(5, 5)
>>     ncc(C)
>>
>>
>> Thanks in advance,
>> ilhan
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-12 Thread Ilhan Polat
For what it's worth, as a potential consumer in SciPy: it really doesn't
say anything (in either the NEP or the PR) about how regular users of
NumPy will benefit from this. If only third parties are going to benefit
from it, I am not sure adding a new keyword to an already confusing
function is the right thing to do.

Let me clarify,

- This is already a very (I mean extremely very) easy keyword name to
confuse with ones_like, zeros_like and, by its nature, any other
interpretation. It does not signal anything about the functionality that
is being discussed, and I would seriously consider reserving such obvious
names for really obvious tasks. You would also expect the shape and ndim
to be mimicked by the "like"d argument, but it turns out it is acting more
like "typeof=" and not "like=" at all. If we follow the semantics, it
reads as "make your argument an array like the other thing", but it is
actually doing "make your argument an array with the other thing's type",
which might not be a NumPy array after all (see the sketch after these
points).

- Again, if this is meant for downstream libraries (because that's what I
got out of the PR discussion; CuPy, Dask and JAX were the only examples I
could read), then hiding it in another function and writing in capital
letters "this is not meant for NumPy users" would be a much more
convenient way to separate the target audience from regular users.
numpy.astypedarray([[some data], [...]], type_of=x), or whatever else it
may be, would be quite clean and to the point, with no ambiguous keywords.
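
The sketch mentioned above: what like= actually does (it shipped
experimentally in NumPy 1.20), contrasted with the _like functions it is
so easy to confuse it with:

    import numpy as np

    x = np.zeros((2, 3), dtype=np.float32)
    np.ones_like(x)             # mimics the shape AND dtype of x
    np.asarray([1, 2], like=x)  # mimics only the array *type* of x (here a
                                # plain ndarray); shape and dtype come from
                                # the first argument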

I think arriving at an agreement would be much faster if there were an
executive summary of who this is intended for and what the regular usage
is. Because, with no offense, all I see is "dispatch", "__array_function__"
and a lot of technical details of which I am absolutely ignorant.

Finally, as a minor point: I know we are mostly (ex-)academics, but this
necessity of formal language in NEPs is self-imposed (probably PEPs are to
blame) and not quite helping. It could be a bit more descriptive, in my
external opinion.

best,
ilhan







On Tue, Aug 11, 2020 at 12:18 AM Ralf Gommers 
wrote:

>
>
> On Mon, Aug 10, 2020 at 8:37 PM Sebastian Berg 
> wrote:
>
>> On Mon, 2020-08-10 at 17:35 +0200, Hameer Abbasi wrote:
>> > Hi,
>> >
>> > We should have a higher-bandwidth meeting/communication for all
>> > stakeholders, and particularly some library authors, to see what
>> > would be good for them.
>>
>
> I'm not sure that helps. At this point there's little progress since the
> last meeting, I think the plan is unchanged: we need implementations of all
> the options on offer, and then try them out in PRs for scikit-learn, SciPy
> and perhaps another package who's maintainers are interested, to test
> like=, __array_module__ in realistic situations.
>
>
> >
>> > We should definitely have language in the NEP that says it won’t be
>> > in a release unless the NEP is accepted.
>>
>> In that case, I think the important part is to have language right now
>> in the implementation, although that can refer to the NEP itself of
>> course.
>> You can't expect everyone who may be tempted to use it to actually read
>> the NEP draft, at least not without pointing it out.
>>
>
> Agreed, I think the decision is on this list not in the NEP, and to make
> sure we won't forget we need an issue opened with the 1.20 milestone.
>
> Cheers,
> Ralf
>
>
>> I will say that I think it is not very high risk, because I think
>> annoying or not, the argument could be deprecated again with a
>> transition short phase. Admittedly, that argument only works if we have
>> a replacement solution.
>>
>> Cheers,
>>
>> Sebastian
>>
>>
>> >
>> > Best regards,
>> > Hameer Abbasi
>> >
>> > --
>> > Sent from Canary (https://canarymail.io)
>> >
>> > > On Monday, Aug 10, 2020 at 5:31 PM, Sebastian Berg <
>> > > sebast...@sipsolutions.net (mailto:sebast...@sipsolutions.net)>
>> > > wrote:
>> > > Hi all,
>> > >
>> > > as a heads up that Peter Entschev has a PR open to add `like=` to
>> > > most array creation functions, my current plan is to merge it soon
>> > > as a preliminary API and bring it up again before the actual
>> > > release (in a few months). This allows overriding for array-likes,
>> > > e.g. it will allow:
>> > >
>> > >
>> > > arr = np.asarray([3], like=dask_array)
>> > > type(arr) is dask.array.Array
>> > >
>> > > This was proposed in NEP 35:
>> > >
>> > >
>> https://numpy.org/neps/nep-0035-array-creation-dispatch-with-array-function.html
>> > >
>> > > Although that has not been accepted as of now, the PR is:
>> > >
>> > > https://github.com/numpy/numpy/pull/16935
>> > >
>> > >
>> > > This was discussed in a smaller group, and is an attempt to see how
>> > > we
>> > > can make the array-function protocol viable to allow packages such
>> > > as
>> > > sklearn to work with non-NumPy arrays.
>> > >
>> > > As of now, this would be experimental and can revisit it before the
>> > > actual NumPy release. We should probably discuss accepting NEP 35
>> >

Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Ilhan Polat
>> > I’ve generally been on the “let the NumPy devs worry about it” side of
>> things, but I do agree with Ilhan that `like=` is confusing and `typeof=`
>> would be a much more appropriate name for that parameter.
>> >
>> > I do think library writers are NumPy users and so I wouldn’t really
>> make that distinction, though. Users writing their own analysis code could
>> very well be interested in writing code using numpy functions that will
>> transparently work when the input is a CuPy array or whatever.
>> >
>> > I also share Ilhan’s concern (and I mentioned this in a previous NEP
>> discussion) that NEPs are getting pretty inaccessible. In a sense these are
>> difficult topics and readers should be expected to have *some* familiarity
>> with the topics being discussed, but perhaps more effort should be put into
>> the context/motivation/background of a NEP before accepting it. One way to
>> ensure this might be to require a final proofreading step by someone who
>> has not been involved at all in the discussions, like peer review does for
>> papers.
>> >
>> > Food for thought.
>> >
>> > Juan.
>> >
>> > On 13 Aug 2020, at 9:24 am, Ilhan Polat  wrote:
>> >
>> > For what is worth, as a potential consumer in SciPy, it really doesn't
>> say anything (both in NEP and the PR) about how the regular users of NumPy
>> will benefit from this. If only and only 3rd parties are going to benefit
>> from it, I am not sure adding a new keyword to an already confusing
>> function is the right thing to do.
>> >
>> > Let me clarify,
>> >
>> > - This is already a very (I mean extremely very) easy keyword name to
>> confuse with ones_like, zeros_like and by its nature any other
>> interpretation. It is not signalling anything about the functionality that
>> is being discussed. I would seriously consider reserving such obvious names
>> for really obvious tasks. Because you would also expect the shape and ndim
>> would be mimicked by the "like"d argument but it turns out it is acting
>> more like "typeof=" and not "like=" at all. Because if we follow the
>> semantics it reads as "make your argument asarray like the other thing" but
>> it is actually doing, "make your argument an array with the other thing's
>> type" which might not be an array after all.
>> >
>> > - Again, if this is meant for downstream libraries (because that's what
>> I got out of the PR discussion, cupy, dask, and JAX were the only examples
>> I could read) then hiding it in another function and writing with capital
>> letters "this is not meant for numpy users" would be a much more convenient
>> way to separate the target audience and regular users.
>> numpy.astypedarray([[some data], [...]], type_of=x) or whatever else it may
>> be would be quite clean and to the point with no ambiguous keywords.
>> >
>> > I think, arriving to an agreement would be much faster if there is an
>> executive summary of who this is intended for and what the regular usage
>> is. Because with no offense, all I see is "dispatch", "_array_function_"
>> and a lot of technical details of which I am absolutely ignorant.
>> >
>> > Finally as a minor point, I know we are mostly (ex-)academics but this
>> necessity of formal language on NEPs is self-imposed (probably PEPs are to
>> blame) and not quite helping. It can be a bit more descriptive in my
>> external opinion.
>>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Experimental `like=` attribute for array creation functions

2020-08-13 Thread Ilhan Polat
Yes, the underlying gory details should be spelled out, of course, but if
it is also modifying or adding to the API, then it is best to sound the
horn and invite zombies to take a stab at it. Often people arrive with
interesting use cases that you wouldn't have thought about.

And I am very familiar with the pushback feeling you are having right now,
probably internally shouting "where have you been all this time, you
slackers?". As you might have seen from my questions here and on the
Cython lists, when I am done with a new feature for SciPy it is also going
to be a very, very long and tiring process. I am really not looking
forward to it :-) but I guess it is part of the deal. Maybe I can give
some comfort: if more people start to flock over, that means it has
morphed into a finished product that people can shoot at. But I honestly
thought this was a new NEP; that's a mistake on my part.

For the like, typeof and other candidates: by esoteric I mean foreign
enough to most users. We already have a nice candidate, I think; ehm...
"dispatch" or "dispatch_like" or something like that. Nobody sober enough
would confuse this with anything else. And since this won't be typed in
daily usage, or so I understood, I guess it is OK to make it verbose. But
still, take it as an initial guess and feel free to dismiss it.

I would still be in platonic love with a "numpy.DIY" or "numpy.hermes"
namespace offering a nice "bring your own __array_function__" service.
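
For context, a minimal sketch of the dispatch protocol that like= plugs
into (NEP 18; a real library would substitute its own implementations
instead of delegating back to NumPy, as this toy class does):

    import numpy as np

    class MyArray:
        def __init__(self, data):
            self.data = np.asarray(data)

        def __array_function__(self, func, types, args, kwargs):
            # unwrap our objects and delegate; this is where dask/cupy/JAX
            # would dispatch to their own version of `func`
            unwrapped = [a.data if isinstance(a, MyArray) else a for a in args]
            return func(*unwrapped, **kwargs)

    np.sum(MyArray([1, 2, 3]))  # dispatches through __array_function__ -> 6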








On Thu, Aug 13, 2020 at 4:16 PM Peter Andreas Entschev 
wrote:

> Ilhan,
>
> Thanks, that does clarify things.
>
> I think the main point -- and correct me here if I'm still wrong -- is
> that we want the NEP to have some very clear example of when/why/how
> to use it, preferably as early in the text as possible, maybe just
> below the Abstract, in a Motivation and Scope section, as the NEP
> Template [6] pointed out to by Ralf earlier suggests. That is a
> totally valid ask, and I'll try to address it as soon as possible
> (hopefully today or tomorrow).
>
> To the point of whether NEPs are to be read by users, I normally don't
> expect users to be required to read and understand those NEPs other
> than by pure curiosity. If we need them to do so, then there's
> definitely a big problem in the API. This may sound counterintuitive
> with what I said before about the "like=" name, but that's really the
> piece of the NumPy API that I with a somewhat reasonable understand of
> arrays don't quite get or like, for instance "asarray" and "like"
> sound exactly the same thing, but they're not in the NumPy context,
> and on the other hand it's quite difficult to find a reasonable name
> to clarify that. And once more, I do like the "typeof=" suggestion
> more than "like=" to be perfectly honest, I'm just afraid it could be
> mistaken by the "dtype=" keyword somehow and thus still not solve the
> clarity problem. Going back to users reading NEPs or not, I would
> really expect that the docstring from the function is sufficiently
> clear to keep users off of it, but still give them an understanding of
> why that exists, the current docstring is in [9], please do comment on
> it if you have ideas of how to make it more accessible to users.
>
> You also mentioned you'd like that the name is as esoteric as
> possible, do you have any suggestions for an esoteric name that is
> hopefully unambiguous too? Naming has definitely been very much on the
> table since the NEP was written, but the consensus was more that
> "like=" is reasonably similar enough in both application and the name
> itself to "empty_like" and derived functions, that's why we just stuck
> to it.
>
> Best,
> Peter
>
> [9]
> https://github.com/numpy/numpy/pull/16935/files#diff-e5969453e399f2d32519d305b2582da9R16-R22
>
> On Thu, Aug 13, 2020 at 3:43 PM Ilhan Polat  wrote:
> >
> > To maybe lighten up the discussion a bit and to make my outsider
> confusion more tangible, let me start by apologizing for diving head first
> without weighing the past luggage :-) I always forget how much effort goes
> into these things and for outsiders like me, it's a matter of dipping the
> finger and tasting it just before starting to complain how much salt is
> missing etc. What I was mentioning about NEPs wasn't only related
> specifically to this one by the way. It's the generic feeling that I have.
> >
> > First let me start what I mean by NumPy users and downstreamers
> distinction. This is very much related to how data-science and huge-array
> users are magnetizing every tool out there in the Python world which is
> fine though the maj

Re: [Numpy-discussion] NEP Procedure Discussion

2020-08-14 Thread Ilhan Polat
Also, not to be a complete slacker, I'd like to add to this list:

- How can I help as an external library maintainer?
- Do you even want us to get involved before the final draft, or should we
wait until the internal discussion finishes?




On Fri, Aug 14, 2020 at 1:23 PM Peter Andreas Entschev 
wrote:

> Hi all,
>
> During the discussion about NEP-35, there have been lots of
> discussions around the NEP process itself. In the interest of allowing
> people who are mostly interested in this discussion and to avoid
> drifting so much off-topic in that thread, I'm starting this new
> thread to discuss the NEP procedure.
>
> A few questions that have been raised so far:
>
> - Is the NEP Template [1] a guideline to be strictly followed or a
> suggestion for authors?
> - Who should decide when a NEP is sufficiently clear?
> - Should a NEP PR be merged at all until it's sufficiently clear or
> should it only be merged even in Draft state only after it's
> sufficiently clear?
> - What parts of the NEP are necessary to be clear for everyone? Just
> Abstract? Motivation and Scope? Everything, including the real
> technical details of implementation?
> - Would it be possible to have proof-readers -- preferably people who
> are not at all involved in the NEP's topic?
>
> Please feel free to comment on that and add any major points I might
> have missed.
>
> Best,
> Peter
>
> [1] https://github.com/numpy/numpy/blob/master/doc/neps/nep-template.rst
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Cythonize and add backslash logic to "scipy.linalg.solve"

2020-09-06 Thread Ilhan Polat
Dear all,

I've finally managed to draft a PR [1] for the functionality given in the
title (see also the "Algorithms" section of [2]). It is almost halfway
done, but the gist is already in place.

I'm posting this to both the NumPy and SciPy lists since I think it is
important enough to get feedback from all parties involved. The particular
details I need to be taught are the Cython parts and fleshing out
anti-patterns and code smells. NumPy folks are probably better equipped to
spot the C-related issues. There is also the ILP64 issue that I am aware
of, which would cause a bit of trouble, and I would appreciate it if we
could tackle it at this early stage.

Thanks in advance,
ilhan

[1] : https://github.com/scipy/scipy/pull/12824
[2] : https://nl.mathworks.com/help/matlab/ref/mldivide.html
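
For those who want a feel for what the polyalgorithm targets: the released
SciPy API already lets you hint the structure by hand via assume_a; the
point of the PR is detecting such structure automatically. A sketch with
the existing keyword:

    import numpy as np
    from scipy.linalg import solve

    A = np.random.rand(5, 5)
    A = A @ A.T + 5 * np.eye(5)       # symmetric positive definite
    b = np.random.rand(5)

    x1 = solve(A, b)                  # generic LU path
    x2 = solve(A, b, assume_a='pos')  # Cholesky path, hinted by hand
    np.allclose(x1, x2)               # True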
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] start of an array (tensor) and dataframe API standardization initiative

2020-11-11 Thread Ilhan Polat
This is great work. Thanks to everyone who contributed. Very clean user
interface too.

One question: can we already propose feature requests, or is that
discussion closed?

On Tue, Nov 10, 2020 at 7:21 PM Ralf Gommers  wrote:

> Hi all,
>
> I'd like to share an update on this topic. The draft array API standard is
> now ready for wider review:
>
> - Blog post: https://data-apis.org/blog/array_api_standard_release
> - Array API standard document:
> https://data-apis.github.io/array-api/latest/
> - Repo: https://github.com/data-apis/array-api/
>
> It would be great if people - and in particular, NumPy maintainers - could
> have a look at it and see if that looks sensible from a NumPy perspective
> and whether the goals and benefits of adopting it are described clearly
> enough and are compelling.
>
> I'm sure a NEP will be needed for proposing adoption of the standard once
> it is closer to completion, and work out what that means for interaction
> with the array protocol NEPs and/or NEP 37, and how an implementation would
> look. It's a bit early for that now, I'm thinking maybe by the end of the
> year. Some initial discussion now would be useful though, since it's easier
> to make changes now rather than when that API standard is already further
> along.
>
> Cheers,
> Ralf
>
>
> On Mon, Aug 17, 2020 at 9:34 PM Ralf Gommers 
> wrote:
>
>> Hi all,
>>
>> I'd like to share this announcement blog post about the creation of a
>> consortium for array and dataframe API standardization here:
>> https://data-apis.org/blog/announcing_the_consortium/. It's still in the
>> beginning stages, but starting to take shape. We have participation from
>> one or more maintainers of most array and tensor libraries - NumPy,
>> TensorFlow, PyTorch, MXNet, Dask, JAX, Xarray. Stephan Hoyer, Travis
>> Oliphant and myself have been providing input from a NumPy perspective.
>>
>> The effort is very much related to some of the interoperability work
>> we've been doing in NumPy (e.g. it could provide an answer to what's
>> described in
>> https://numpy.org/neps/nep-0037-array-module.html#requesting-restricted-subsets-of-numpy-s-api
>> ).
>>
>> At this point we're looking for feedback from maintainers at a high level
>> (see the blog post for details).
>>
>> Also important: the python-record-api tooling and data in its repo has
>> very granular API usage data, of the kind we could really use when making
>> decisions that impact backwards compatibility.
>>
>> Cheers,
>> Ralf
>>
>> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How did Numpy get its latest version of the documentation to appear at the top of Google search results?

2020-11-13 Thread Ilhan Polat
Have a look here for "some" background:
https://github.com/scipy/docs.scipy.org/issues/39

On Fri, Nov 13, 2020 at 5:37 PM efremdan1  wrote:

> I'm working with Bokeh (https://docs.bokeh.org/en/latest/), another
> open-source Python package. The developers would like to have the latest
> version of their documentation appear at the top of Google search results
> when users search for information, but knowledge of how to do this is
> lacking.
>
> I've noticed that Numpy seems to have gotten this problem figured out,
> e.g.,
> googling "numpy interpolate" results in the first hit being
> https://numpy.org/doc/stable/reference/generated/numpy.interp.html. This
> is
> unlike Python itself, where googling "python string formatting" results in
> the first hit being https://docs.python.org/3.4/library/string.html.
>
> So apparently someone in the Numpy developer world knows how to setup the
> doc pages in a manner that allows for this. Would that person be willing to
> post to the Bokeh message board on the topic
> (https://discourse.bokeh.org/t/some-unsolicited-feedback/6643/17) with
> some
> advice?
>
> Thank you!
>
>
>
> --
> Sent from: http://numpy-discussion.10968.n7.nabble.com/
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python

2020-11-24 Thread Ilhan Polat
Do we have to take it seriously to start with? Because, with absolutely no
offense meant, I am having significant difficulty doing so.

On Tue, Nov 24, 2020 at 4:58 PM PIERRE AUGIER <
pierre.aug...@univ-grenoble-alpes.fr> wrote:

> Hi,
>
> I recently took a bit of time to study the comment "The ecological impact
> of high-performance computing in astrophysics" published in Nature
> Astronomy (Zwart, 2020, https://www.nature.com/articles/s41550-020-1208-y,
> https://arxiv.org/pdf/2009.11295.pdf), where it is stated that "Best
> however, for the environment is to abandon Python for a more
> environmentally friendly (compiled) programming language.".
>
> I wrote a simple Python-Numpy implementation of the problem used for this
> study (https://www.nbabel.org) and, accelerated by Transonic-Pythran,
> it's very efficient. Here are some numbers (elapsed times in s, smaller is
> better):
>
> | # particles |  Py | C++ | Fortran | Julia |
> |-|-|-|-|---|
> | 1024|  29 |  55 |   41|   45  |
> | 2048| 123 | 231 |  166|  173  |
>
> The code and a modified figure are here: https://github.com/paugier/nbabel
> (There is no check on the results for https://www.nbabel.org, so one
> still has to be very careful.)
>
> I think that the Numpy community should spend a bit of energy to show what
> can be done with the existing tools to get very high performance (and low
> CO2 production) with Python. This work could be the basis of a serious
> reply to the comment by Zwart (2020).
>
> Unfortunately the Python solution in https://www.nbabel.org is very bad
> in terms of performance (and therefore CO2 production). It is also true for
> most of the Python solutions for the Computer Language Benchmarks Game in
> https://benchmarksgame-team.pages.debian.net/benchmarksgame/ (codes here
> https://salsa.debian.org/benchmarksgame-team/benchmarksgame#what-else).
>
> We could try to fix this so that people see that in many cases, it is not
> necessary to "abandon Python for a more environmentally friendly (compiled)
> programming language". One of the longest and hardest task would be to
> implement the different cases of the Computer Language Benchmarks Game in
> standard and modern Python-Numpy. Then, optimizing and accelerating such
> code should be doable and we should be able to get very good performance at
> least for some cases. Good news for this project, (i) the first point can
> be done by anyone with good knowledge in Python-Numpy (many potential
> workers), (ii) for some cases, there are already good Python
> implementations and (iii) the work can easily be parallelized.
>
> It is not a criticism, but the (beautiful and very nice) new Numpy website
> https://numpy.org/ is not very convincing in terms of performance. It's
> written "Performant The core of NumPy is well-optimized C code. Enjoy the
> flexibility of Python with the speed of compiled code." It's true that the
> core of Numpy is well-optimized C code but to seriously compete with C++,
> Fortran or Julia in terms of numerical performance, one needs to use other
> tools to move the compiled-interpreted boundary outside the hot loops. So
> it could be reasonable to mention such tools (in particular Numba, Pythran,
> Cython and Transonic).
>
> Is there already something planned to answer to Zwart (2020)?
>
> Any opinions or suggestions on this potential project?
>
> Pierre
>
> PS: Of course, alternative Python interpreters (PyPy, GraalPython, Pyjion,
> Pyston, etc.) could also be used, especially if HPy (
> https://github.com/hpyproject/hpy) is successful (C core of Numpy written
> in HPy, Cython able to produce HPy code, etc.). However, I tend to be a bit
> skeptical in the ability of such technologies to reach very high
> performance for low-level Numpy code (performance that can be reached by
> replacing whole Python functions with optimized compiled code). Of course,
> I hope I'm wrong! IMHO, it does not remove the need for a successful HPy!
>
> --
> Pierre Augier - CR CNRS http://www.legi.grenoble-inp.fr
> LEGI (UMR 5519) Laboratoire des Ecoulements Geophysiques et Industriels
> BP53, 38041 Grenoble Cedex, Francetel:+33.4.56.52.86.16
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python

2020-11-24 Thread Ilhan Polat
Measuring the running time of a program in an arbitrary programming
language is not an objective metric. Otherwise, force everyone to code in
assembler and we would be done as quickly as possible. Hire five people to
come to the workplace for six months to optimize it, and the emissions
from their transportation alone would cancel the savings. There is a
reason for not doing so. Alternatively, any time shaved off the running
code will be spent on the extremely inefficient i9 laptops that developers
use while debugging type issues. As the author themselves admits, the
development speed would justify the loss incurred by the actual code
running.

So this study is, at the very most, suggestive and, just like my rebuttal,
very difficult to verify. I do industrial IoT for a living, and while I
wholeheartedly agree with the intentions, I would seriously question the
power metrics given here, because I can similarly show a steel factory to
be very efficient if I am not careful. In particular, tying code quality
to the programming language is a very slippery slope, one that I have been
listening to for the last 20 years from Fortran people.

> I think we, the community, does have to take it seriously. NumPy and the
rest of the ecosystem is trying to raise money to hire developers. This
sentiment, which is much wider than a single paper, is a prevalent
roadblock.

I don't get this sentence.



On Tue, Nov 24, 2020 at 7:29 PM Hameer Abbasi 
wrote:

> Hello,
>
> We’re trying to do a part of this in the TACO team, and with a Python
> wrapper in the form of PyData/Sparse. It will allow an abstract
> array/scheduling to take place, but there are a bunch of constraints, the
> most important one being that a C compiler cannot be required at runtime.
>
> However, this may take a while to materialize, as we need an LLVM backend,
> and a Python wrapper (matching the NumPy API), and support for arbitrary
> functions (like universal functions).
>
> https://github.com/tensor-compiler/taco
> http://fredrikbk.com/publications/kjolstad-thesis.pdf
>
> --
> Sent from Canary 
>
> On Dienstag, Nov. 24, 2020 at 7:22 PM, YueCompl 
> wrote:
> Is there some community interest to develop fusion based high-performance
> array programming? Something like
> https://github.com/AccelerateHS/accelerate#an-embedded-language-for-accelerated-array-computations
>  ,
> but that embedded  DSL is far less pleasing compared to Python as the
> surface language for optimized Numpy code in C.
>
> I imagine that we might be able to transpile a Numpy program into fused
> LLVM IR, then deploy part as host code on CPUs and part as CUDA code on
> GPUs?
>
> I know Numba is already doing the array part, but it is too limited in
> addressing more complex non-array data structures. I had been approaching
> ~20K separate data series with some intermediate variables for each, then
> it took up to 30+GB RAM keep compiling yet gave no result after 10+hours.
>
> Compl
>
>
> On 2020-11-24, at 23:47, PIERRE AUGIER <
> pierre.aug...@univ-grenoble-alpes.fr> wrote:
>
> Hi,
>
> I recently took a bit of time to study the comment "The ecological impact
> of high-performance computing in astrophysics" published in Nature
> Astronomy (Zwart, 2020, https://www.nature.com/articles/s41550-020-1208-y,
> https://arxiv.org/pdf/2009.11295.pdf), where it is stated that "Best
> however, for the environment is to abandon Python for a more
> environmentally friendly (compiled) programming language.".
>
> I wrote a simple Python-Numpy implementation of the problem used for this
> study (https://www.nbabel.org) and, accelerated by Transonic-Pythran,
> it's very efficient. Here are some numbers (elapsed times in s, smaller is
> better):
>
> | # particles |  Py | C++ | Fortran | Julia |
> |-|-|-|-|---|
> | 1024|  29 |  55 |   41|   45  |
> | 2048| 123 | 231 |  166|  173  |
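>
> The Transonic usage itself is tiny. Roughly, and written from memory
> rather than copied from the nbabel code (so treat the type strings as an
> approximation):
>
> import numpy as np
> from transonic import boost
>
> @boost
> def kinetic_energy(masses: "float[:]", velocities: "float[:, :]"):
>     # compiled ahead of time by Pythran via Transonic; falls back to
>     # plain Python-NumPy when no compiled extension is present
>     return 0.5 * np.sum(masses * np.sum(velocities**2, axis=1))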
>
> The code and a modified figure are here: https://github.com/paugier/nbabel
> (There is no check on the results for https://www.nbabel.org, so one
> still has to be very careful.)
>
> I think that the Numpy community should spend a bit of energy to show what
> can be done with the existing tools to get very high performance (and low
> CO2 production) with Python. This work could be the basis of a serious
> reply to the comment by Zwart (2020).
>
> Unfortunately the Python solution in https://www.nbabel.org is very bad
> in terms of performance (and therefore CO2 production). It is also true for
> most of the Python solutions for the Computer Language Benchmarks Game in
> https://benchmarksgame-team.pages.debian.net/benchmarksgame/ (codes here
> https://salsa.debian.org/benchmarksgame-team/benchmarksgame#what-else).
>
> We could try to fix this so that people see that in many cases, it is not
> necessary to "abandon Python for a more environmentally friendly (compiled)
> programming language". One of the longest and hardest task would be to
> implement the different cases of the Computer

Re: [Numpy-discussion] problem with numpy 1.19.4 install via pip on Win 10

2020-12-02 Thread Ilhan Polat
Yes, this is known, and we are waiting for MS to roll out a solution.
Here are more details:
https://developercommunity2.visualstudio.com/t/fmod-after-an-update-to-windows-2004-is-causing-a/1207405
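
In the meantime, pinning the previous release with "pip install
numpy==1.19.3" works around it, as Alan also notes below.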

On Thu, Dec 3, 2020 at 12:57 AM Alan G. Isaac  wrote:

> numpy 1.19.3 installs fine.
> numpy 1.19.4 appears to install but does not work.
> (Details below. The supplied tinyurl appears relevant.)
> Alan Isaac
>
> PS test> python38 -m pip install -U numpy
> Collecting numpy
>Using cached numpy-1.19.4-cp38-cp38-win_amd64.whl (13.0 MB)
> Installing collected packages: numpy
> Successfully installed numpy-1.19.4
> PS test> python38
> Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64
> bit (AMD64)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import numpy
>   ** On entry to DGEBAL parameter number  3 had an illegal value
>   ** On entry to DGEHRD  parameter number  2 had an illegal value
>   ** On entry to DORGHR DORGQR parameter number  2 had an illegal value
>   ** On entry to DHSEQR parameter number  4 had an illegal value
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "C:\Program Files\Python38\lib\site-packages\numpy\__init__.py",
> line 305, in <module>
>  _win_os_check()
>File "C:\Program Files\Python38\lib\site-packages\numpy\__init__.py",
> line 302, in _win_os_check
>  raise RuntimeError(msg.format(__file__)) from None
> RuntimeError: The current Numpy installation ('C:\\Program
> Files\\Python38\\lib\\site-packages\\numpy\\__init__.py') fails to pass a
> sanity check due to a bug
> in the windows runtime. See this issue for more information:
> https://tinyurl.com/y3dm3h86
>  >>>


Re: [Numpy-discussion] updated backwards compatibility and deprecation policy NEP

2020-12-30 Thread Ilhan Polat
Hi Ralf,

This reads really nice. Thanks to everyone who contributed.

Before nitpicking here and there, and sticking my neck out for others: is
this a finished discussion where only stylistic feedback is expected?
Also, is feedback preferred here or in the PR? GitHub is really not
designed for extended discussions, and here, if two subjects are discussed
simultaneously, it just becomes difficult to follow (maybe it's a bias due
to my dislike of mailing lists).

One of the less mentioned points is what the tipping point is for the
benefits outweighing the compatibility breakage sin, and how to get a
feeling for it. Because for a typical user, every break is just a break.
Nobody will squint their eyes to see the reasoning behind it downstream.
Thus this is more of a declaration of "yes, as maintainers we are ready to
face the consequences, but it had to be done because such and such".

I am not asking to initiate a power discussion ala "who has the mod hammer"
but rather what constitutes a valid business case for a breakage proposal.
A few generic lines about that would go a long way. We are in the same
situation with scipy.linalg, in which what to do is crystal clear, but how
to do it without breaking anything is like herding cats, hence I am
genuinely curious how to go about this.

Best,
ilhan


On Wed, Dec 30, 2020 at 3:07 PM Ralf Gommers  wrote:

> Hi all,
>
> Here is a long overdue update of the draft NEP about backwards
> compatibility and deprecation policy:
> https://github.com/numpy/numpy/pull/18097
>
> - This is NEP 23:
> https://numpy.org/neps/nep-0023-backwards-compatibility.html
> - Link to the previous mailing list discussion:
> https://mail.python.org/pipermail/numpy-discussion/2018-July/078432.html
>
> It would be nice to get this NEP to Accepted status. Main changes are:
>
> - Removed all examples that people objected to
> - Removed all content regarding versioning
> - Restructured sections, and added "Strategies related to deprecations"
> (using suggestions by @njsmith and @shoyer).
> - Added concrete examples of deprecations, and a more thorough description
> of how to go about adding warnings incl. Sphinx directives, using
> `stacklevel`, etc.
>
> As always, feedback here or on the PR is very welcome!
>
> Cheers,
> Ralf
>
>
> Abstract
> 
>
> In this NEP we describe NumPy's approach to backwards compatibility,
> its deprecation and removal policy, and the trade-offs and decision
> processes for individual cases where breaking backwards compatibility
> is considered.
>
>
> Motivation and Scope
> 
>
> NumPy has a very large user base.  Those users rely on NumPy being stable
> and the code they write that uses NumPy functionality to keep working.
> NumPy is also actively maintained and improved -- and sometimes
> improvements
> require, or are made much easier by, breaking backwards compatibility.
> Finally, there are trade-offs in stability for existing users vs. avoiding
> errors or having a better user experience for new users.  These competing
> needs often give rise to long debates and to delays in accepting or
> rejecting
> contributions.  This NEP tries to address that by providing a policy as
> well
> as examples and rationales for when it is or isn't a good idea to break
> backwards compatibility.
>
> In scope for this NEP are:
>
> - Principles of NumPy's approach to backwards compatibility.
> - How to deprecate functionality, and when to remove already deprecated
>   functionality.
> - Decision making process for deprecations and removals.
>
> Out of scope are:
>
> - Making concrete decisions about deprecations of particular functionality.
> - NumPy's versioning scheme.
>
>
> General principles
> --
>
> When considering proposed changes that are backwards incompatible, the
> main principles the NumPy developers use when making a decision are:
>
> 1. Changes need to benefit users more than they harm them.
> 2. NumPy is widely used so breaking changes should by default be assumed
> to be
>fairly harmful.
> 3. Decisions should be based on data and actual effects on users and
> downstream
>packages rather than, e.g., appealing to the docs or for stylistic
> reasons.
> 4. Silently getting a wrong answer is much worse than getting a loud error.
>
> When assessing the costs of proposed changes, keep in mind that most users
> do
> not read the mailing list, do not look at deprecation warnings, and
> sometimes
> wait more than one or two years before upgrading from their old version.
> And
> that NumPy has millions of users, so "no one will do or use this" is very
> likely incorrect.
>
> Benefits include improved functionality, usability and performance, as
> well as
> lower maintenance cost and improved future extensibility.
>
> Fixes for clear bugs are exempt from this backwards compatibility policy.
> However in case of serious impact on users (e.g. a downstream library
> doesn't
> build anymore or would start giving incorr

Re: [Numpy-discussion] Pearu Peterson has joined the NumPy developers team.

2021-02-08 Thread Ilhan Polat
This is very comforting news :) Welcome back

On Sun, Feb 7, 2021 at 9:10 PM Stefan van der Walt 
wrote:

> On Sun, Feb 7, 2021, at 10:12, Charles R Harris wrote:
>
> Pearu Peterson has joined the NumPy developers team. Pearu was responsible
> for contributing f2py and much of distutils in the early days of NumPy.
> Welcome back Pearu.
>
>
> Welcome back, it's good to see you around more, Pearu!
>
> Best regards,
> Stéfan
>


Re: [Numpy-discussion] Guidelines for floating point comparison

2021-02-24 Thread Ilhan Polat
Matrix powers are annoyingly tricky to keep under control due to the fact
that things tend to explode or implode rather quickly. In fact there is
the famous quote from Moler and Van Loan: "Unfortunately, the roundoff
errors in the mth power of a matrix, say B^m, are usually small relative
to ||B||^m rather than ||B^m||". So thanks for nothing, smart people.
Then there is a typical bound on the rounding errors of matrix
multiplication which says that if C is the exact result and Ce the
computed result, then for some number k the expression
|C - Ce| <= k * |A| * |B| holds entrywise. From that, hoping that the
matrix power won't produce an error too far from that of the manual
multiplication, it is a matter of selecting a sensible k for atol
and rtol=0. I would go about this as

(some arbitrary constant I am randomly throwing in)*(matrix size n)*
np.finfo(dtype).eps*norm(A, 1)**k

As an example, get a matrix and artificially bloat the (0,0) entry

import numpy as np

n = 100
A = np.random.rand(n, n)
A += np.diag([10.] + [0.]*(n - 1))  # bloat the (0, 0) entry
A4 = np.linalg.matrix_power(A, 4)
AA = A @ A @ A @ A  # the same power, multiplied out manually
print('Max entry error', np.max(np.abs(AA - A4)))
print('My atol value', 100*n*np.finfo(A.dtype).eps*np.linalg.norm(A, 1)*4)

This accidentally makes it a tight bound but depending on how wildly your A
varies or how the spectrum of A is structured you might need to change
these constants.

On Wed, Feb 24, 2021 at 1:53 PM Kevin Sheppard 
wrote:

> In my experience it is most common to use reasonable but not exceedingly
> tight bounds in complex applications where there isn’t a proof that the
> maximum error must be smaller than some number.  I would also caution
> against using a single system to find the tightest tolerance a test passes
> at.  For example, if you can pass at an rtol of 1e-13 on Linux/AMD64/GCC 9, then
> you might want to set a tolerance around 1e-11 so that you don’t get caught
> out on other platforms. Notoriously challenging platforms in my experience
> (mostly from statsmodels) are 32-bit Windows, 32-bit Linux and OSX (and I
> suspect OSX/ARM64 will be another difficult one).
>
>
>
> This advice is moot if you have a precise bound for the error.
>
>
>
> Kevin
>
>
>
>
>
> *From: *Ralf Gommers 
> *Sent: *Wednesday, February 24, 2021 12:25 PM
> *To: *Discussion of Numerical Python 
> *Subject: *Re: [Numpy-discussion] Guidelines for floating point comparison
>
>
>
>
>
>
>
> On Wed, Feb 24, 2021 at 11:29 AM Bernard Knaepen 
> wrote:
>
> Hi all,
>
> We are developing a code that heavily relies on NumPy. Some of our
> regression tests rely on floating point number comparisons and we are a bit
> lost in determining how to choose atol and rtol (we are trying to do all
> operations in double precision). We would like to set atol and rtol as low
> as possible but still have the tests pass if we run on different
> architectures or introduce some ‘cosmetic’ changes like using different
> similar NumPy routines.
>
> For example, let’s say we want some powers of the matrix A and compute
> them as:
>
> A = np.array(some_array)
> A2 = np.dot(A, A)
> A3 = np.dot(A2, A)
> A4 = np.dot(A3, A)
>
> If we alternatively computed A4 like:
>
> A4 = np.linalg.matrix_power(A, 4),
>
> we get different values in our final outputs because obviously the
> operations are not equivalent up to machine accuracy.
>
> Is there any reference that one could share providing guidelines on how to
> choose reasonable values for atol and rtol in this kind of situation? For
> example, does the NumPy package use a fixed set of values for its own
> development? the default ones?
>
>
>
> I don't think there's a clear guide in docs or blog post anywhere. You can
> get a sense of what works by browsing the unit tests for numpy and scipy.
> numpy.linalg, scipy.linalg and scipy.special are particularly relevant
> probably. For a rough rule of thumb: if you test on x86_64 and precision is
> on the order of 1e-13 to 1e-16, then set a relative tolerance 10 to 100
> times higher to account for other hardware, BLAS implementations, etc.
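>
> As a concrete sketch of that rule (the numbers here are illustrative,
> not measured):
>
> import numpy as np
>
> computed = np.linalg.matrix_power(np.eye(3) * 2.0, 4)
> desired = np.eye(3) * 16.0
> # if the dev machine shows ~1e-14 relative error, leave two orders of
> # magnitude of slack for other platforms and BLAS builds:
> np.testing.assert_allclose(computed, desired, rtol=1e-12)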
>
>
>
> Cheers,
>
> Ralf
>
>
>
>
> Thanks in advance for any help,
> Cheers,
> Bernard.
>
>
>
>


Re: [Numpy-discussion] ENH: Discussions about 'add numpy.topk'

2021-05-29 Thread Ilhan Polat
Since this is going into the top namespace, I'd also vote against the
matlab-y "topk" name. And even Matlab didn't do what I would expect and
went with maxk:

https://nl.mathworks.com/help/matlab/ref/maxk.html

I think "max_k" is a good generalization of the regular "max". Even when
auto-completing, this showing up under max makes sense to me instead of
searching them inside "t"s. Besides, "argmax_k" also follows suite, that of
the previous convention. To my eyes this is an acceptable disturbance to an
already very crowded namespace.



a few moments later

But then again an ugly idea rears its head, proposing to fold this into
the existing max function. But I'll shut up now :)
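
For reference, what the new function would presumably wrap is already
expressible with a partial sort; a rough sketch with made-up data:

import numpy as np

a = np.array([5, 1, 9, 7, 3])
k = 2
idx = np.argpartition(a, -k)[-k:]    # indices of the k largest, unordered
idx = idx[np.argsort(a[idx])[::-1]]  # order them by descending value
print(idx, a[idx])                   # [2 3] [9 7]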

On Sun, May 30, 2021 at 12:50 AM Robert Kern  wrote:

> On Sat, May 29, 2021 at 3:35 PM Daniele Nicolodi 
> wrote:
>
>> What does k stand for here? As someone that never encountered this
>> function before I find both names equally confusing. If I understand
>> what the function is supposed to be doing, I think largest() would be
>> much more descriptive.
>>
>
> `k` is the number of elements to return. `largest()` can connote that it's
> only returning the one largest value. It's fairly typical to include a
> dummy variable (`k` or `n`) in the name to indicate that the function lets
> you specify how many you want. See, for example, the stdlib `heapq`
> module's `nlargest()` function.
>
> https://docs.python.org/3/library/heapq.html#heapq.nlargest
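>
> For example, a sketch:
>
> import heapq
> print(heapq.nlargest(3, [5, 1, 9, 7, 3]))  # [9, 7, 5]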
>
> "top-k" comes from the ML community where this function is used to
> evaluate classification models (`k` instead of `n` being largely an
> accident of history, I imagine). In many classification problems, the
> number of classes is very large, and they are very related to each other.
> For example, ImageNet has a lot of different dog breeds broken out as
> separate classes. In order to get a more balanced view of the relative
> performance of the classification models, you often want to check whether
> the correct class is in the top 5 classes (or whatever `k` is appropriate)
> that the model predicted for the example, not just the one class that the
> model says is the most likely. "5 largest" doesn't really work in the
> sentences that one usually writes when talking about ML classifiers; they
> are talking about the 5 classes that are associated with the 5 largest
> values from the predictor, not the values themselves. So "top k" is what
> gets used in ML discussions, and that transfers over to the name of the
> function in ML libraries.
>
> It is a top-down reflection of the higher level thing that people want to
> compute (in that context) rather than a bottom-up description of how the
> function is manipulating the input, if that makes sense. Either one is a
> valid way to name things. There is a lot to be said for numpy's
> domain-agnostic nature that we should prefer the bottom-up description
> style of naming. However, we are also in the midst of a diversifying
> ecosystem of array libraries, largely driven by the ML domain, and adopting
> some of that terminology when we try to enhance our interoperability with
> those libraries is also a factor to be considered.
>
> --
> Robert Kern


Re: [Numpy-discussion] ENH: Discussions about 'add numpy.topk'

2021-05-30 Thread Ilhan Polat
after a coffee, I don't see the point of calling it still "k" so "max_n" is
my vote for what its worth.



[Numpy-discussion] is_triangular, is_diagonal, is_symmetric et al. in NumPy or SciPy linalg

2021-06-29 Thread Ilhan Polat
Dear all,

I'm writing some helper Cython functions for scipy.linalg which are kinda
performant and usable. And there is still quite some wiggle room for more.

In many linalg routines there is a lot of performance benefit if the
structure can be discovered in a cheap and reliable way at the outset. For
example, if symmetric then eig can delegate to eigh, or if triangular then
triangular solvers can be used in linalg.solve and lstsq, and so forth.

Here is the Cythonized version, pasteable into a Jupyter notebook, that
discovers the lower/upper bandwidth of a square array A. It competes well
with A != 0, to compare against some low-level function (note the latter
returns an array, hence more cost is involved). There is a higher-level
supervisor function that checks C-contiguousness and otherwise specializes
to different versions of it.

Initial cell

%load_ext Cython
%load_ext line_profiler
import cython
import line_profiler

Then another cell

%%cython
# cython: language_level=3
# cython: linetrace=True
# cython: binding = True
# distutils: define_macros=CYTHON_TRACE=1
# distutils: define_macros=CYTHON_TRACE_NOGIL=1

cimport cython
cimport numpy as cnp
import numpy as np
import line_profiler
ctypedef fused np_numeric_t:
    cnp.int8_t
    cnp.int16_t
    cnp.int32_t
    cnp.int64_t
    cnp.uint8_t
    cnp.uint16_t
    cnp.uint32_t
    cnp.uint64_t
    cnp.float32_t
    cnp.float64_t
    cnp.complex64_t
    cnp.complex128_t
    cnp.int_t
    cnp.long_t
    cnp.longlong_t
    cnp.uint_t
    cnp.ulong_t
    cnp.ulonglong_t
    cnp.intp_t
    cnp.uintp_t
    cnp.float_t
    cnp.double_t
    cnp.longdouble_t


@cython.linetrace(True)
@cython.initializedcheck(False)
@cython.boundscheck(False)
@cython.wraparound(False)
cpdef inline (int, int) band_check_internal(np_numeric_t[:, ::1] A):
    cdef Py_ssize_t n = A.shape[0], lower_band = 0, upper_band = 0, r, c
    cdef np_numeric_t zero = 0

    for r in xrange(n):
        # Only bother if outside the existing band:
        for c in xrange(r - lower_band):
            if A[r, c] != zero:
                lower_band = r - c
                break

        for c in xrange(n - 1, r + upper_band, -1):
            if A[r, c] != zero:
                upper_band = c - r
                break

    return lower_band, upper_band

Final cell for use-case ---

import numpy as np

# Make an arbitrary lower-banded array
n = 50  # array size
k = 3   # k'th subdiagonal
R = np.zeros([n, n], dtype=np.float32)
R[[x for x in range(n)], [x for x in range(n)]] = 1       # main diagonal
R[[x for x in range(n-1)], [x for x in range(1, n)]] = 1  # first superdiagonal
R[[x for x in range(1, n)], [x for x in range(n-1)]] = 1  # first subdiagonal
R[[x for x in range(k, n)], [x for x in range(n-k)]] = 2  # k'th subdiagonal
zzz = np.random.rand(n, 1)  # arbitrary right-hand side for the solve below

Some very haphazardly put together metrics

%timeit band_check_internal(R)
2.59 µs ± 84.7 ns per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit np.linalg.solve(R, zzz)
824 µs ± 6.24 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit R != 0.
1.65 µs ± 43.1 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)

So the worst case cost is negligible in general (note that the given code
is slower as it uses the fused type however if I go with tempita standalone
version is faster)
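
For cross-checking, a plain NumPy version of the same quantity (far
slower, but handy as a reference; the function name is mine):

import numpy as np

def bandwidths(A):
    r, c = np.nonzero(A)
    d = r - c  # diagonal offset of each nonzero entry
    # largest offset below the diagonal, largest above
    return d.max(initial=0), (-d).max(initial=0)

print(bandwidths(R))  # (3, 1) for the R constructed above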

Two questions:

1) This is missing np.half/float16 functionality, since any arithmetic
with float16 might not be reliable, including the nonzero check. Is it
safe to view it as np.uint16 and use that specialization? I'm not sure
about the sign bit, hence the question. I can leave this out, since almost
the entire linalg suite rejects this datatype due to the well-known lack
of support.
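
Concretely, the sign-bit worry is this (a sketch of the concern, not an
answer): float16 has a signed zero, so a raw uint16 view would flag -0.0
as nonzero:

import numpy as np

a = np.array([0., -0., 1.5], dtype=np.float16)
bits = a.view(np.uint16)
print(bits)       # [    0 32768 15872]  (-0.0 keeps the sign bit set)
print(bits != 0)  # [False  True  True]
print(a != 0)     # [False False  True]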

2) Should this be in NumPy or SciPy linalg? It is quite relevant to SciPy,
but then again this stuff is purely about array structures. But if the
opinion is for NumPy, then I would need a volunteer, because the NumPy
codebase flies way above my head.


All feedback welcome

Best
ilhan


Re: [Numpy-discussion] is_triangular, is_diagonal, is_symmetric et al. in NumPy or SciPy linalg

2021-07-02 Thread Ilhan Polat
Ah right. So two things: the original reason for this question is that I
can't decide in https://github.com/scipy/scipy/pull/12824 whether others
would also benefit from quick structure determination.

I can keep it a private function, or we can put these in some misc or lib
folder so all can use them. Say there is a special method for triangular
matrices but you can't guarantee the structure; then you can quickly check
for it. At worst O(n**2) complexity for diagonal arrays, and almost O(2n)
for full arrays, makes it quite appealing.

But then again maybe NumPy is a better place since probably it will be
faster to have this in pure C with the right headers and without the extra
Cython overhead.

Funny you mention the container idea. This is precisely what I'm doing in
the PR mentioned above (I'll push when I'm done). I stole the idea from
Tim Davis himself, in a Julia discussion, for keeping the factorization as
an attribute to be used later if need be. So yes, it makes a lot of sense,
sparse or not.

On Wed, 30 Jun 2021, 19:14 Evgeni Burovski, 
wrote:

> Hi Ilhan,
>
> Overall I think something like this would be great. However, I wonder
> if you considered having a specialized container with a structure tag
> instead of trying to discover the structure. If it's a container, it
> can neatly wrap various lapack storage schemes and dispatch to an
> appropriate lapack functionality. Possibly even sparse storage
> schemes. And it seems a bit more robust than trying to discover the
> structure (e.g. what about off-band elements of  \sim 1e-16 etc).
>
> The next question is of course if this should live in scipy/numpy
> .linalg or as a separate repo, at least for some time (maybe in the
> scipy organization?). So that it can iterate faster, among other
> things.
> (I'd be interested in contributing FWIW)
>
> Cheers,
>
> Evgeni
>
>
> On Wed, Jun 30, 2021 at 1:22 AM Ilhan Polat  wrote:
> >
> > Dear all,
> >
> > I'm writing some helper Cythpm functions for scipy.linalg which is kinda
> performant and usable. And there is still quite some wiggle room for more.
> >
> > In many linalg routines there is a lot of performance benefit if the
> structure can be discovered in a cheap and reliable way at the outset. For
> example if symmetric then eig can delegate to eigh or if triangular then
> triangular solvers can be used in linalg.solve and lstsq so forth
> >
> > Here is the Cythonized version for Jupyter notebook to paste to discover
> the lower/upper bandwidth of square array A that competes well with A != 0
> just to use some low level function (note the latter returns an array hence
> more cost is involved) There is a higher level supervisor function that
> checks C-contiguousness otherwise specializes to different versions of it
> >
> > Initial cell
> >
> > %load_ext Cython
> > %load_ext line_profiler
> > import cython
> > import line_profiler
> >
> > Then another cell
> >
> > %%cython
> > # cython: language_level=3
> > # cython: linetrace=True
> > # cython: binding = True
> > # distutils: define_macros=CYTHON_TRACE=1
> > # distutils: define_macros=CYTHON_TRACE_NOGIL=1
> >
> > cimport cython
> > cimport numpy as cnp
> > import numpy as np
> > import line_profiler
> > ctypedef fused np_numeric_t:
> > cnp.int8_t
> > cnp.int16_t
> > cnp.int32_t
> > cnp.int64_t
> > cnp.uint8_t
> > cnp.uint16_t
> > cnp.uint32_t
> > cnp.uint64_t
> > cnp.float32_t
> > cnp.float64_t
> > cnp.complex64_t
> > cnp.complex128_t
> > cnp.int_t
> > cnp.long_t
> > cnp.longlong_t
> > cnp.uint_t
> > cnp.ulong_t
> > cnp.ulonglong_t
> > cnp.intp_t
> > cnp.uintp_t
> > cnp.float_t
> > cnp.double_t
> > cnp.longdouble_t
> >
> >
> > @cython.linetrace(True)
> > @cython.initializedcheck(False)
> > @cython.boundscheck(False)
> > @cython.wraparound(False)
> > cpdef inline (int, int) band_check_internal(np_numeric_t[:, ::1]A):
> > cdef Py_ssize_t n = A.shape[0], lower_band = 0, upper_band = 0, r, c
> > cdef np_numeric_t zero = 0
> >
> > for r in xrange(n):
> > # Only bother if outside the existing band:
> > for c in xrange(r-lower_band):
> > if A[r, c] != zero:
> > lower_band = r - c
> > break
> >
> > for c in xrange(n - 1, r + upper_band, -1):
> > if A[r, c] != zero:
> > upper_band = c - r
> > break
>

Re: [Numpy-discussion] is_triangular, is_diagonal, is_symmetric et al. in NumPy or SciPy linalg

2021-07-02 Thread Ilhan Polat
Yes, they go by the name of morally triangular matrices (quite a stupid
name, but in their defense I think it was an insider joke); this is also
given in Tim Davis' book as an exercise via linked lists. The issue is
that LAPACK doesn't support these permuted matrices. Hence we are left
with two options:

Either copy rows/columns around so that the array stays contiguous, or
permute a copy of the array. Both can be a significant cost while trying
to shave off solving time.

But you are right this can be present even though solvers and eig routines
won't use it. I'll put my Cython code back in.



On Fri, 2 Jul 2021, 20:05 Oscar Benjamin, 
wrote:

> If you're going to provide routines for structure determination it
> might be worth looking at algorithms that can identify more general or
> less obvious structure as well. SymPy's matrices module needs a lot of
> work and is improving a lot which will become noticeable over the next
> few releases but one of the important optimisations being used is
> Tarjan's algorithm for finding the strongly connected components of a
> graph. This is a generalisation of checking for triangular or diagonal
> matrices. With this approach you can identify any permutation of the
> rows and columns of a square matrix that can bring it into block
> triangular or block diagonal form which can reduce many O(n**3)
> algorithms substantially. The big-O for Tarjan's algorithm itself is
> basically the same as checking whether a matrix is
> triangular/diagonal.
>
> For example the matrix determinant is invariant under permutations of
> the rows and columns. If you can permute a matrix into block
> triangular form then the determinant is just the product of the
> determinants of the diagonal blocks. If the base case algorithm has
> n**3 operations then reducing it to two operations of size n/2 is a
> speed up of ~4x. In the extreme this discovers that a matrix is
> triangular and reduces the whole operation to O(n) (plus the cost of
> Tarjan's algorithm). However the graph-based approach also benefits
> wider classes e.g. you get almost all the same benefit for a matrix
> that is almost diagonal but has a few off-diagonal elements.
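>
> A rough NumPy/SciPy sketch of the same idea, using scipy.sparse.csgraph
> instead of SymPy (the permutation comes out block triangular up to the
> ordering of the blocks):
>
> import numpy as np
> from scipy.sparse import csr_matrix
> from scipy.sparse.csgraph import connected_components
>
> M = np.array([[1, 0, 2, 0], [9, 3, 1, 2], [3, 0, 4, 0], [5, 8, 6, 7]])
> n_comp, labels = connected_components(csr_matrix(M != 0), directed=True,
>                                       connection='strong')
> perm = np.argsort(labels, kind='stable')  # group rows/cols by component
> print(M[np.ix_(perm, perm)])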
>
> Using sympy master branch (as .strongly_connected_components() is not
> released yet):
>
> In [19]: from sympy import Matrix
>
> In [20]: M = Matrix([[1, 0, 2, 0], [9, 3, 1, 2], [3, 0, 4, 0], [5, 8, 6, 7]])
>
> In [21]: M
> Out[21]:
> ⎡1  0  2  0⎤
> ⎢9  3  1  2⎥
> ⎢3  0  4  0⎥
> ⎣5  8  6  7⎦
>
> In [22]: M.strongly_connected_components()  # Tarjan's algorithm
> Out[22]: [[0, 2], [1, 3]]
>
> In [23]: M[[0, 2, 1, 3], [0, 2, 1, 3]] # outer indexing for permutation
> Out[23]:
> ⎡1  2  0  0⎤
> ⎢3  4  0  0⎥
> ⎢9  1  3  2⎥
> ⎣5  6  8  7⎦
>
> In [24]: M.det()
> Out[24]: -10
>
> In [25]: M[[0,2],[0,2]].det() * M[[1, 3], [1, 3]].det()
> Out[25]: -10
>
> --
> Oscar
>

[Numpy-discussion] Re: spam on the mailing lists

2021-09-29 Thread Ilhan Polat
I'd like to reheat the proposition that we enable the Discussions feature
on GitHub for the repos. Not only does this make things a bit more
streamlined (transferring non-bug reports to Discussions to handle the
noise there), but it also makes it easier to control the discussions.
Moreover, since it is GitHub, there are API-based ways to import the
mailing lists for NumPy and SciPy with a bit less effort.

On Wed, Sep 29, 2021 at 12:11 PM Andras Deak  wrote:

> On Wed, Sep 29, 2021 at 12:02 PM Ralf Gommers 
> wrote:
>
>>
>>
>> On Wed, Sep 29, 2021 at 11:33 AM Andras Deak 
>> wrote:
>>
>>> On Wed, Sep 29, 2021 at 11:28 AM Andras Deak 
>>> wrote:
>>>
 On Wed, Sep 29, 2021 at 11:15 AM Ralf Gommers 
 wrote:

>
>
> On Wed, Sep 29, 2021 at 9:32 AM Andras Deak 
> wrote:
>
>> Hi All,
>>
>> Today both of the python.org mailing lists I'm subscribed to (numpy
>> and scipy-dev) got the same kind of link shortener spam. I assume all the
>> mailing lists started getting these, and that these won't go away for a
>> while.
>>
>
> I don't see these on
> https://mail.python.org/archives/list/numpy-discussion@python.org/,
> nor did I receive them (and I did check my spam folder). Do you see it in
> the archive, or do you understand why you do receive them?
>

 Sorry for not being specific: they were sent as replies to the latest
 thread on each list, see e.g. at the bottom (6th email, 5th reply) of
 https://mail.python.org/archives/list/numpy-discussion@python.org/thread/BLCIC2WMJQ5VT6HJSUW4V5TNGQ36JQXI/

>>>
>>> Found the permalink: (warning, spam link there)
>>> https://mail.python.org/archives/list/numpy-discussion@python.org/message/MWI6AKF4QNQ45532MVA3XOXYW5GDFL6O/
>>>
>>
>> Thanks!
>>
>>
>
>> Is there any way to prevent these, short of moderating emails from
>> new list members? Assuming the engine even supports that. There aren't 
>> many
>> emails, especially from new members, and I can't think of other ways that
>> ensure no false positives in filtering.
>>
>
>> We don't have admin access to the python.org lists, so this is a bit of
>> a problem. We have never had a spam problem, so we can ask to block this
>> user first. If it continues to happen, we may be able to moderate new
>> subscriber emails, but we do need to ask for permissions first and I'm not
>> sure we'll get them.
>>
>
> Unfortunately (but unsurprisingly) there are multiple accounts doing this
> https://mail.python.org/archives/search?q=bit.ly&page=1&sort=date-desc
> This is why I figured that an _a posteriori_ whack-a-mole against these
> specific users might not be a feasible solution to the underlying problem.
>
> András
>
>
>
>> A better solution longer term is migrating to Discourse, which has far
>> better moderation tools than Mailman and is also more approachable for
>> people not used to mailing lists (which is most newcomers to open source).
>> Migrating is a bit of a pain, but with the new CZI grant having a focus on
>> improving the contributor experience, we should be able to do this.
>>
>> Cheers,
>> Ralf
>>
>>
>>
>>>
>> Since maintainer time is precious, I can volunteer to moderate such
>> emails if needed.
>> Cheers,
>>
>> András

[Numpy-discussion] Re: spam on the mailing lists

2021-10-01 Thread Ilhan Polat
The reason why I mentioned GH Discussions is that literally everybody who
is engaged with the code is familiar with the format, it is included in
the codebase product, and it has replies built in, unlike Discourse's
(opinion is mine) useless flat discussion design where replies are all
over the place, just like a mailing list when you are not using a client
that supports tree views. Hence topic hijacking is one of the main
usability difficulties of emails.

The goal here is to have coherent engagement with everyone, not just
within a small circle, such that there is indeed a discussion happening
rather than a few people chiming in. It would be a nice analytics exercise
to see how many active users these lists have. I'd say 20-25 max for
contributors and team members, which is really not much. I know some
people are still using IRC and mailing lists, but I wouldn't argue that
these are the modern media for proper, engaging discussions. "Who said to
whom" is the bread and butter of such discussions. And I do think that
Discourse is exactly the same thing as mailing lists, with a slightly
better UI, while virtually everyone else in the world is doing threaded
replies.

I would be willing to help with the objections raised, since I have been
using GH Discussions for quite a while now and there are many tools
available for administering the discussions. For example,

https://github.blog/changelog/2021-09-14-notification-emails-for-discussions/

is a recent feature. I don't work for GitHub, obviously, and have nothing
to do with them, but I'm willing to hear the reasons against.






On Fri, Oct 1, 2021 at 3:07 PM Matthew Brett 
wrote:

> Hi,
>
> On Fri, Oct 1, 2021 at 1:57 PM Rohit Goswami 
> wrote:
> >
> > I guess then the approach overall would evolve to something like using
> the mailing list to announce discourse posts which need input. Though I
> would assume that the web interface essentially makes the mailing list
> almost like discourse, even for new users.
> >
> > The real issue IMO is still the moderation efforts and additional
> governance needed for maintaining discourse.
>
> Yes - that was what I meant.   I do see that mailing lists are harder
> to moderate, in that once the email has gone out, it is difficult to
> revoke.  So is the argument just that you *can* moderate on Discourse,
> therefore you need to think about it more?  Do we have any reason to
> think that more moderation will in fact be needed?  We've needed very
> little so far on the mailing list, as far as I can see.
>
> Chers,
>
> Matthew


[Numpy-discussion] Re: spam on the mailing lists

2021-10-01 Thread Ilhan Polat
> I’m firmly against GH discussions because of the upvoting mechanism. We
don’t need to be Reddit or SO. .NET had a bad experience with the
discussions as well [1].

They are not used; by default it's date ordered, and you can choose whichever.
Voting plays no role unless you want to sort by votes.

> Given that we've had a literal order of magnitude more messages about the
spam than the spam itself, maybe it's just a blip?

Indeed that is the case :) Guilty as charged. I'm probably being a bit
opportunistic, since hijacking is easy here.

> GitHub Discussions is more of a Q&A platform, like Stackoverflow. I don't
think it really makes sense for free form discussion.

I don't see how it is, to be honest. I'm hearing this complaint quite
often but I can't see how that is; it's not my experience at all.
Especially in the node.js repo and among other participants of the
Discussions beta, people are quite happy with it.

Maybe I should rephrase why I am mentioning this. Very often, something
pops up in the issues asking whether X is suitable for Sci/NumPy, we lead
the user here, and more often than not they don't follow up. I can't blame
them, because the whole mailing list experience, especially for newcomers,
is dreadful, and most of the time you don't get any feedback. Also, things
can't move forward: in the issue we told them to come here, nobody here is
interested, and then everything stops unless someone nudges the repo
issue, which was the idea in the first place. So in a way we are putting
up this barrier as in "go talk to the elders in the mountain and bring
some shiny gems on your way back", which makes not much sense. We are
using the issues and PRs anyway to discuss stuff, willingly or not, so I
can't say I follow the argument for the holistic mailing list format. This
doesn't mean that I ignore the convenience, because that has been the case
in the last decades; I'm totally fine with it. But if we are going to
move, let's make it count and not switch to an identical platform just for
the sake of it. If not GitHub, then something that actually encourages the
community to join and doesn't get in the way.





On Fri, Oct 1, 2021 at 6:31 PM Stephan Hoyer  wrote:

> On Fri, Oct 1, 2021 at 8:55 AM Matthew Brett 
> wrote:
>
>> Only to say that:
>>
>> * I used to have a very firm preference for mail, because I'm pretty
>> happy with Gmail as a mail interface, and I didn't want to have
>> another channel I had to monitor, but
>> * I've spent more time on Discourse over the last year, mainly on
>> Jupyter, but I have also set up instances for my own projects.  I now
>> have a fairly strong preference for Discourse, because of its very
>> nice Markdown authoring, pleasant web interface for reviewing
>> discussions and reasonable mailing list mode.
>>
>
> +1 Markdown support, the ability to edit/delete posts, a good web
> interface and the possibility for new-comers to jump into an ongoing
> discussion are all major advantages to Discourse.
>
> I am not concerned about spam management or moderation. NumPy-Discussion
> is not a very popular form, and we have plenty of mature contributors to
> help moderate.
>
>
>> * I have hardly used Github Discussions, so I can't comment on them.
>> Are there large projects that are happy with them?   How does that
>> compare to Discourse, for example?
>>
>
> GitHub Discussions is more of a Q&A platform, like Stackoverflow. I don't
> think it really makes sense for free form discussion.
>
>
>> * It will surely cause some harm if it is not clear where discussions
>> happen, mainly (mailing list, Discourse, Github Discussions) so it
>> seems to me better to decide on one standard place, and commit to
>> that.
>>
>
> +1 let's pick a place and stick to it!
>
>
>>
>> Cheers,
>>
>> Matthew
>>
>> On Fri, Oct 1, 2021 at 4:39 PM Rohit Goswami 
>> wrote:
>> >
>> > I’m firmly against GH discussions because of the upvoting mechanism. We
>> don’t need to be Reddit or SO. .NET had a bad experience with the
>> discussions as well [1].
>> >
>> > [1] https://github.com/dotnet/aspnetcore/issues/29935
>> >
>> > — Rohit
>> >

[Numpy-discussion] Re: spam on the mailing lists

2021-10-01 Thread Ilhan Polat
Judging by the support for it, I'll check in the meantime whether I missed
the whole point of Discourse when I was trying to use it.

On Fri, Oct 1, 2021 at 7:57 PM Stephan Hoyer  wrote:

> I agree, "go talk to the elders in the mountain" is not a great experience.
>
> One of the other problems about mailing lists is that it's awkward or
> impossible to ping old discussions. E.g., if you find a mailing list thread
> discussing an issue from two years ago, you pretty much have to start a new
> thread to discuss it.
>
> I think GitHub discussions is a perfectly fine web-based platform and
> definitely an improvement over a mailing list, but do like Discourse a
> little better. It's literally one click for a user to sign up to post on
> Discourse if they already have a GitHub account.
>

[Numpy-discussion] Re: What happened to the numpy.random documentation?

2021-10-15 Thread Ilhan Polat
As a cosmic coincidence, this happened to me yesterday. My goal: generate
n-long arrays of 1s and -1s picked from a uniform distribution, to mimic
the LAPACK function ?LARNV.

Full disclosure: I know a bit about the subject, but I'm not an expert and
am too lazy to gather my thoughts about it at the moment. Because I think
I'll need "choice", I just want to end up there quickly.

- So I went to the docs, and picked numpy.random from the left menu. It
started with some paragraphs, OK, something something BitGenerator, OK,
fine, interesting.
- Started scrolling down. OK, old vs new, hmm, what is RandomState...
nevermind, replaced with something new; where is the numpy.random.rand
stuff? ... nevermind, scroll more.
- OK, some more stuff about what's new... then a try/except block. None of
this seems familiar.
- An Introduction section!? In the middle of the page. OK, am I on the
correct page?
- Things are getting tough; now I'm in the PCG64 something something...
What is all this stuff about?
- OK, the intro is over, then a section "What's new or different"... I am
not introduced sufficiently yet...
- Giving up and scrolling all the way down and up a few more times.

- OK, go back to Google, search for "numpy choice", end up in
numpy.random.choice.
- There is a warning ... it says this is old stuff, see QuickStart.
- Click and we are back  :)
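
For the record, the one-liner I was after, now that I know where it lives:

import numpy as np

rng = np.random.default_rng()
signs = rng.choice([1., -1.], size=10)  # n-long array of 1s and -1s
print(signs)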

This is really not a complaint; I think there is an occupational hazard
around random stuff throughout the internet: an insatiable urge to teach
people about pseudorandom processes. But I don't think that is the right
way to go about it. Unless you are knee deep in this area, almost none of
this matters to an average user. We had this discussion before over SciPy,
about making setting a seed as easy as the good ol' "numpy.random.seed",
but all I am getting is extensive details about how the engine works. I
appreciate the state-of-the-art implementation of any method, but it can
be saved for the interested who want to geek out about it. Otherwise it
would be really beneficial if the quick start were indeed quick. Currently
it is a bit unquick :)

Best,
ilhan








On Fri, Oct 15, 2021 at 12:13 PM Kevin Sheppard 
wrote:

>
>
> On Thu, Oct 14, 2021 at 7:22 PM Ralf Gommers 
> wrote:
>
>>
>>
>> On Thu, Oct 14, 2021 at 7:19 PM Kevin Sheppard <
>> kevin.k.shepp...@gmail.com> wrote:
>>
>>> I think the issue in random specifically is that a raw list of
>>> available functions does not provide suitable guidance for someone looking
>>> for random variate generating function.  This is because the module-level
>>> API is mostly dominated by methods of the singleton RandomState instance.
>>> Best practice going forward is to use the methods of a Generator instance,
>>> most likely provided by default_rng(). A simple API-list will not be able
>>> to provide this guidance.
>>>
>>
>> The list can be annotated with headings and one-line or one-paragraph
>> descriptions, something like:
>>
>> ```
>> ## Generator interface
>>
>> This is the recommended interface ... 
>>
>> ## Interface for unit testing and legacy code
>>
>> 
>> ```
>>
>> The complaint is very much valid here, I have made the same one before.
>> The way the page currently is written makes little sense - it addresses a
>> user transitioning from the old to the new interface, or explicitly
>> comparing the two for some reason. To a user just looking for information
>> on NumPy today, that's more confusing than helpful.
>>
>> The page also talks about "The new interface", "What's new and
>> different", "Some long-overdue API cleanup", and "Since Numpy version
>> 1.17.0" - that all belongs in a NEP, and not in the API reference docs.
>>
>> Cheers,
>> Ralf
>>
>>
>>
> I don't think the doc style there is ideal.  I would still say that a
> relatively naive dump of `np.random` (something that would be everything in
> [v for v in dir(np.random) if not v.startswith("_")] would not lead to an
> ideal set of docs because most of the "obvious" functions are methods of the
> legacy RandomState.  A good set would need something like (excluding
> headers)
>
> default_rng
> Generator
> Generator.random
> Generator.integers
> ...
>
> 
> Legacy Methods
> ==
> 
> np.random.random_sample
> np.random.randint
> RandomState
> ...
>
> IMO many (likely most) methods exposed in np.random should not be on the
> default landing page for np.random.
>
> Best,
> Kevin
>
>
>
>>> FFT has a very simple API and so a simple list makes sense.  Similarly,
>>> np.random before the generators were revamped, which is why the old-style
>>> was adequate for <=1.16, but not for >=1.17
>>>
>>> Kevin
>>>
>>>
>>> On Thu, Oct 14, 2021 at 6:09 PM Paul M.  wrote:
>>>
 Hi Melissa,

 I think that's the right approach.  Looking through the current docs, I
 think the page on the FFT module is exemplary in this regard:

 https://numpy.org/doc/stable/reference/routines.fft.html

 It lists all the available functions (with links to details), and then
 has a section on "Background Information", "Implementation De

[Numpy-discussion] Conversion from C-layout to Fortran-layout in Cython

2021-11-10 Thread Ilhan Polat
I've asked this on the Cython mailing list, but probably I should also get
some feedback here too.

I have the following function defined in Cython and using flat memory
pointers to hold n by n array data.


cdef some_C_layout_func(double[:, :, ::1] Am) nogil:
    # ...
    cdef double *work1 = <double *>malloc(n*n*sizeof(double))
    cdef double *work2 = <double *>malloc(n*n*sizeof(double))
    # ...
    # Lots of C-layout operations here
    # ...
    dgetrs('T', &n, &n, &work1[0], &n, &ipiv[0], &work2[0], &n, &info)
    dcopy(&n2, &work2[0], &int1, &Am[0, 0, 0], &int1)
    free(...)
Here, I have done everything in C layout with work1 and work2 but I have to
convert work2 into Fortran layout to be able to solve AX = B. A can be
transposed in Lapack internally via the flag 'T' so the only obstacle I
have now is to shuffle work2 which holds B transpose in the eyes of Fortran
since it is still in C layout.

If I go naively and write loops to convert one layout to the other, that
actually spoils all the speed benefits from this Cythonization due to
cache misses. In fact 60% of the time is spent in that naive loop across
the whole function. Same goes for the copy_fortran() of memoryviews.
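
For context, the standard remedy I know of is loop blocking, so that both
buffers are touched in cache-sized tiles; a sketch of the idea only, with
the block size a guess to be tuned rather than a measured value:

cdef void c_to_fortran(double *src, double *dst, Py_ssize_t n) nogil:
    # copy the C-ordered n x n `src` into `dst` transposed, tile by tile
    cdef Py_ssize_t i, j, ii, jj, i_end, j_end, bs = 64
    for i in range(0, n, bs):
        i_end = i + bs if i + bs < n else n
        for j in range(0, n, bs):
            j_end = j + bs if j + bs < n else n
            for ii in range(i, i_end):
                for jj in range(j, j_end):
                    dst[jj*n + ii] = src[ii*n + jj]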

I have measured the regular NumPy np.asfortranarray() and its performance
is quite good compared to the actual linear solve. Hence, whatever it is
doing underneath, I would like to reach out and do the same, possibly via
the C-API. But my C knowledge basically failed me around this line
https://github.com/numpy/numpy/blob/8dbd507fb6c854b362c26a0dd056cd04c9c10f25/numpy/core/src/multiarray/multiarraymodule.c#L1817
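
For a sense of scale, here is a minimal timing sketch of np.asfortranarray();
the numbers it prints are illustrative and not from the measurements above:

```
import timeit
import numpy as np

a = np.random.rand(2000, 2000)   # C-contiguous source
t = timeit.timeit(lambda: np.asfortranarray(a), number=10) / 10
print(f"asfortranarray copy: {t * 1e3:.1f} ms per call")
assert np.asfortranarray(a).flags['F_CONTIGUOUS']
```
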

I have found the SO post from
https://stackoverflow.com/questions/45143381/making-a-memoryview-c-contiguous-fortran-contiguous
but I am not sure if that is the canonical way to do it in newer Python
versions.

Can anyone show me how to go about it without interacting with Python
objects?

Best,
ilhan
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Conversion from C-layout to Fortran-layout in Cython

2021-11-10 Thread Ilhan Polat
Indeed, for matrix multiplication and many other L3 BLAS functions we are
lucky; however, for the linear solve function ?getrs there is unfortunately
no such flag available.

On Thu, Nov 11, 2021 at 12:31 AM Benjamin Root  wrote:

> I have found that a bunch of lapack functions seem to have arguments for
> stating whether or not the given arrays are C or F ordered. Then you
> wouldn't need to worry about handling the layout yourself. For example, I
> have some C++ code like so:
>
> extern "C" {
>
> /**
>  * Forward declaration for LAPACK's Fortran dgemm function to allow use in
> C/C++ code.
>  *
>  * This function is used for matrix multiplication between two arrays of
> doubles.
>  *
>  * For complete reference:
> http://www.netlib.org/lapack/explore-html/d1/d54/group__double__blas__level3_gaeda3cbd99c8fb834a60a6412878226e1.html
>  */
> void dgemm_(const char* TRANSA, const char* TRANSB, const int* M, const
> int* N, const int* K,
> const double* ALPHA, const double* A, const int* LDA, const double* B,
> const int* LDB,
> const double* BETA, double* C, const int* LDC);
> }
>
> ...
>
> dgemm_("C", "C", &nLayers, &N, &nVariables, &alpha, matrices.IW->data(),
> &nVariables,
> inputs.data(), &N, &beta, intermediate.data(), &nLayers);
>
> (in this case, I was using boost multiarrays, but the basic idea is the
> same). IIRC, a bunch of other lapack functions had similar features.
>
> I hope this is helpful.
>
> Ben Root
>
>
>
> On Wed, Nov 10, 2021 at 6:02 PM Ilhan Polat  wrote:
>
>> I've asked this in Cython mailing list but probably I should also get
>> some feedback here too.
>>
>> I have the following function defined in Cython and using flat memory
>> pointers to hold n by n array data.
>>
>>
>> cdef some_C_layout_func(double[:, :, ::1] Am) nogil:
>>     # ...
>>     cdef double *work1 = <double *>malloc(n*n*sizeof(double))
>>     cdef double *work2 = <double *>malloc(n*n*sizeof(double))
>>     # ...
>>     # Lots of C-layout operations here
>>     # ...
>>     dgetrs('T', &n, &n, &work1[0], &n, &ipiv[0], &work2[0], &n, &info)
>>     dcopy(&n2, &work2[0], &int1, &Am[0, 0, 0], &int1)
>>     free(...)
>>
>> Here, I have done everything in C layout with work1 and work2 but I have
>> to convert work2 into Fortran layout to be able to solve AX = B. A can be
>> transposed in Lapack internally via the flag 'T' so the only obstacle I
>> have now is to shuffle work2 which holds B transpose in the eyes of Fortran
>> since it is still in C layout.
>>
>> If I go naively and make loops to get one layout to the other that
>> actually spoils all the speed benefits from this Cythonization due to cache
>> misses. In fact 60% of the time is spent in that naive loop across the
>> whole function. Same goes for the copy_fortran() of memoryviews.
>>
>> I have measured the regular NumPy np.asfortranarray()  and the
>> performance is quite good enough compared to the actual linear solve. Hence
>> whatever it is doing underneath I would like to reach out and do the same
>> possibly via the C-API. But my C knowledge basically failed me around this
>> line
>> https://github.com/numpy/numpy/blob/8dbd507fb6c854b362c26a0dd056cd04c9c10f25/numpy/core/src/multiarray/multiarraymodule.c#L1817
>>
>> I have found the SO post from
>> https://stackoverflow.com/questions/45143381/making-a-memoryview-c-contiguous-fortran-contiguous
>> but I am not sure if that is the canonical way to do it in newer Python
>> versions.
>>
>> Can anyone show me how to go about it without interacting with Python
>> objects?
>>
>> Best,
>> ilhan
>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>> To unsubscribe send an email to numpy-discussion-le...@python.org
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: ben.v.r...@gmail.com
>>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: ilhanpo...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Conversion from C-layout to Fortran-layout in Cython

2021-11-10 Thread Ilhan Polat
Hmm, not sure I understand the question, but this is what I mean by naive
looping; suppose I allocate a scratch register work3, then

for i in range(n):
    for j in range(n):
        work3[j*n+i] = work2[i*n+j]



This is basically doing row-to-column indexing, and obviously we create a lot
of cache misses, since work3 entries are accessed in a shuffled fashion. The
idea of this whole Cython attempt is to avoid such access; hence, if the
original some_C_layout_func takes 10 units of time, 6 of them are spent on
this loop when the data doesn't fit the cache. When I discard the correctness
of the function, comment out this loop, and remeasure, the original func
spends roughly 3 units of time. However, take any random array in C order in
NumPy using regular Python and apply np.asfortranarray(): it spends roughly
0.1 units of time. So apparently it is possible to do this somehow at the low
level in a performant way. That's what I would like to understand, or clear
up my misunderstanding.
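
For illustration, a blocked variant of the same loop (my sketch, not from the
measurements above; the block size of 16 is an arbitrary assumption) that
reduces the cache misses being described:

```
def blocked_transpose(src, dst, n, bs=16):
    """Copy the C-ordered n*n flat array src into dst in Fortran order,
    one bs-by-bs tile at a time so both tiles stay cache-resident."""
    for ii in range(0, n, bs):
        for jj in range(0, n, bs):
            for i in range(ii, min(ii + bs, n)):
                for j in range(jj, min(jj + bs, n)):
                    dst[j * n + i] = src[i * n + j]
```
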





On Thu, Nov 11, 2021 at 12:56 AM Andras Deak  wrote:

> On Thursday, November 11, 2021, Ilhan Polat  wrote:
>
>> I've asked this in Cython mailing list but probably I should also get
>> some feedback here too.
>>
>> I have the following function defined in Cython and using flat memory
>> pointers to hold n by n array data.
>>
>>
>> cdef some_C_layout_func(double[:, :, ::1] Am) nogil:
>>     # ...
>>     cdef double *work1 = <double *>malloc(n*n*sizeof(double))
>>     cdef double *work2 = <double *>malloc(n*n*sizeof(double))
>>     # ...
>>     # Lots of C-layout operations here
>>     # ...
>>     dgetrs('T', &n, &n, &work1[0], &n, &ipiv[0], &work2[0], &n, &info)
>>     dcopy(&n2, &work2[0], &int1, &Am[0, 0, 0], &int1)
>>     free(...)
>>
>> Here, I have done everything in C layout with work1 and work2 but I have
>> to convert work2 into Fortran layout to be able to solve AX = B. A can be
>> transposed in Lapack internally via the flag 'T' so the only obstacle I
>> have now is to shuffle work2 which holds B transpose in the eyes of Fortran
>> since it is still in C layout.
>>
>> If I go naively and make loops to get one layout to the other that
>> actually spoils all the speed benefits from this Cythonization due to cache
>> misses. In fact 60% of the time is spent in that naive loop across the
>> whole function.
>>
>>
> Sorry if this is a dumb question, but is this true whether or not you loop
> over contiguous blocks of the input vs the output array? Or is the faster
> of the two options still slower than the linsolve?
>
> András
>
>
>>
>>  Same goes for the copy_fortran() of memoryviews.
>>
>> I have measured the regular NumPy np.asfortranarray()  and the
>> performance is quite good enough compared to the actual linear solve. Hence
>> whatever it is doing underneath I would like to reach out and do the same
>> possibly via the C-API. But my C knowledge basically failed me around this
>> line
>> https://github.com/numpy/numpy/blob/8dbd507fb6c854b362c26a0dd056cd04c9c10f25/numpy/core/src/multiarray/multiarraymodule.c#L1817
>>
>> I have found the SO post from
>> https://stackoverflow.com/questions/45143381/making-a-memoryview-c-contiguous-fortran-contiguous
>> but I am not sure if that is the canonical way to do it in newer Python
>> versions.
>>
>> Can anyone show me how to go about it without interacting with Python
>> objects?
>>
>> Best,
>> ilhan
>>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: ilhanpo...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Conversion from C-layout to Fortran-layout in Cython

2021-11-10 Thread Ilhan Polat
Here are some actual numbers within the context of the operation (nogil
removed and the function def'd for line tracing):


Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    80                                           # Bilinear identity to shave off some flops
    81                                           # inv(V-U) (V+U) = inv(V-U) (V-U+2V) = I + 2 inv(V-U) U
    82         1         15.0     15.0      0.1  daxpy(&n2, &neg_one, &work2[0], &int1, &work3[0], &int1)
    83
    84                                           # Convert array layout for solving AX = B
    85         1          3.0      3.0      0.0  for i in range(n):
    86        60        137.0      2.3      0.9      for j in range(n):
    87      3600       8437.0      2.3     57.7          work4[j*n+i] = work2[i*n+j]
    88
    89         1        408.0    408.0      2.8  dgetrf(&n, &n, &work3[0], &n, &ipiv[0], &info)
    90         1       4122.0   4122.0     28.2  dgetrs('T', &n, &n, &work3[0], &n, &ipiv[0], &work4[0], &n, &info)
    91         1         25.0     25.0      0.2  dscal(&n2, &two, &work4[0], &int1)
    92                                           # Add identity matrix
    93         1          4.0      4.0      0.0  for i in range(n):
    94        60        146.0      2.4      1.0      work4[i*(n+1)] += 1.
    95         1         16.0     16.0      0.1  dcopy(&n2, &work4[0], &int1, &Am[0, 0, 0], &int1)

On Thu, Nov 11, 2021 at 1:04 AM Ilhan Polat  wrote:

> Hmm not sure I understand the question but this is what I mean by naive
> looping, suppose I allocate a scratch register work3, then
>
> for i in range(n):
>     for j in range(n):
>         work3[j*n+i] = work2[i*n+j]
>
>
>
> This basically doing the row to column based indexing and obviously we
> create a lot of cache misses since work3 entries are accessed in the
> shuffled fashion. The idea of all this Cython attempt is to avoid such
> access hence if the original some_C_layout_func takes 10 units of time, 6
> of it is spent on this loop when the data doesn't fit the cache. When I
> discard the correctness of the function and comment out this loop and then
> remeasure the original func spends roughly 3 units of time. However take
> any random array in C order in NumPy using regular Python and use
> np.asfortranarray() it spends roughly about 0.1 units of time. So
> apparently it is possible to do this somehow at the low level in a
> performant way. That's what I would like to understand or clear out my
> misunderstanding.
>
>
>
>
>
> On Thu, Nov 11, 2021 at 12:56 AM Andras Deak 
> wrote:
>
>> On Thursday, November 11, 2021, Ilhan Polat  wrote:
>>
>>> I've asked this in Cython mailing list but probably I should also get
>>> some feedback here too.
>>>
>>> I have the following function defined in Cython and using flat memory
>>> pointers to hold n by n array data.
>>>
>>>
>>> cdef some_C_layout_func(double[:, :, ::1] Am) nogil:
>>>     # ...
>>>     cdef double *work1 = <double *>malloc(n*n*sizeof(double))
>>>     cdef double *work2 = <double *>malloc(n*n*sizeof(double))
>>>     # ...
>>>     # Lots of C-layout operations here
>>>     # ...
>>>     dgetrs('T', &n, &n, &work1[0], &n, &ipiv[0], &work2[0], &n, &info)
>>>     dcopy(&n2, &work2[0], &int1, &Am[0, 0, 0], &int1)
>>>     free(...)
>>>
>>> Here, I have done everything in C layout with work1 and work2 but I have
>>> to convert work2 into Fortran layout to be able to solve AX = B. A can be
>>> transposed in Lapack internally via the flag 'T' so the only obstacle I
>>> have now is to shuffle work2 which holds B transpose in the eyes of Fortran
>>> since it is still in C layout.
>>>
>>> If I go naively and make loops to get one layout to the other that
>>> actually spoils all the speed benefits from this Cythonization due to cache
>>> misses. In fact 60% of the time is spent in that naive loop across the
>>> whole function.
>>>
>>>
>> Sorry if this is a dumb question, but is this true whether or not you
>> loop over contiguous blocks of the input vs the output array? Or is the
>> faster of the two options still slower than the linsolve?
>>
>> András
>>
>>
>>>
>>>  Same goes for the copy_fortran() of memoryviews.
>>>
>>> I have measured the regular NumPy np.asfortranarray()  and the
>>> performance is quite good enough compared to the actual linear solve. Hence
>>> whatever it is doing underneath I would like to reach out and do the same
>>> possibly via the C-API. But my C knowledge basically failed me around this
>>> line
>>> https://github.com/numpy/numpy/blob/8dbd507fb6c854b362c26a0dd056cd04c9c10f25/numpy/core/src/multiarray/multiarraymodule.c#L1817
>>>
>>> I have found the SO post from
>>> https://stackoverflow.com/questions/45143381/m

[Numpy-discussion] Re: Conversion from C-layout to Fortran-layout in Cython

2021-11-10 Thread Ilhan Polat
Ah, I see. Thank you, Sebastian. I was hoping to avoid all that blocking
(since hardware dependence leaves some performance on the table) or recursive
zooming stuff with some off-the-shelf tool, but apparently I'm walking in the
dusty corners again, collecting spider webs :) As you said, there is quite a
lot of low-hanging fruit we might collect regarding such data manipulations,
which would boost basically everything, since these ops are ubiquitous.

In case any one is wondering the context; this is for the scipy.linalg.expm
overhaul mainly kept updated at https://github.com/scipy/scipy/issues/12838



On Thu, Nov 11, 2021 at 2:40 AM Sebastian Berg 
wrote:

> On Thu, 2021-11-11 at 01:04 +0100, Ilhan Polat wrote:
> > Hmm not sure I understand the question but this is what I mean by naive
> > looping, suppose I allocate a scratch register work3, then
> >
> > for i in range(n):
> >     for j in range(n):
> >         work3[j*n+i] = work2[i*n+j]
> >
>
> NumPy does not end up doing anything special.  Special would be to use
> a blocked iteration and NumPy doesn't have it unfortunately.
> The only thing it does is use pointers to cut some overheads, something
> (very rough) like:
>
> ptr1 = arr1.data
> ptr2_col = arr2.data
>
> strides2_col = arr.strides[0]
> strides2_row = arr2.strides[1]
>
> for i in range(n):
> ptr2 = ptr2_col
> for j in range(n):
>  *ptr2 = *ptr1
>  ptr1++
>  ptr2 += strides2_row
>
> ptr2_col += strides2_col
>
> And if you write that in cython, you are likely faster since you can
> cut quite a few corners (all is aligned, contiguous, etc.).
> (with potentially, loop unrolling/compiler optimization fluctuations,
> numpy probably tells GCC to unroll and optimize the innermost loop
> there)
>
> I would not be surprised if you can find a lightweight fast copy-
> transpose out there, or if some tools like MKL/Cuda just include it. It
> is too bad NumPy is missing it.
>
> Cheers,
>
> Sebastian
>
>
> >
> >
> > This basically doing the row to column based indexing and obviously we
> > create a lot of cache misses since work3 entries are accessed in the
> > shuffled fashion. The idea of all this Cython attempt is to avoid such
> > access hence if the original some_C_layout_func takes 10 units of time,
> > 6
> > of it is spent on this loop when the data doesn't fit the cache. When I
> > discard the correctness of the function and comment out this loop and
> > then
> > remeasure the original func spends roughly 3 units of time. However
> > take
> > any random array in C order in NumPy using regular Python and use
> > np.asfortranarray() it spends roughly about 0.1 units of time. So
> > apparently it is possible to do this somehow at the low level in a
> > performant way. That's what I would like to understand or clear out my
> > misunderstanding.
> >
> >
> >
> >
> >
> > On Thu, Nov 11, 2021 at 12:56 AM Andras Deak 
> > wrote:
> >
> > > On Thursday, November 11, 2021, Ilhan Polat 
> > > wrote:
> > >
> > > > I've asked this in Cython mailing list but probably I should also
> > > > get
> > > > some feedback here too.
> > > >
> > > > I have the following function defined in Cython and using flat
> > > > memory
> > > > pointers to hold n by n array data.
> > > >
> > > >
> > > > cdef some_C_layout_func(double[:, :, ::1] Am) nogil:
> > > >     # ...
> > > >     cdef double *work1 = <double *>malloc(n*n*sizeof(double))
> > > >     cdef double *work2 = <double *>malloc(n*n*sizeof(double))
> > > >     # ...
> > > >     # Lots of C-layout operations here
> > > >     # ...
> > > >     dgetrs('T', &n, &n, &work1[0], &n, &ipiv[0], &work2[0], &n, &info)
> > > >     dcopy(&n2, &work2[0], &int1, &Am[0, 0, 0], &int1)
> > > >     free(...)
> > > >
> > > > Here, I have done everything in C layout with work1 and work2 but I
> > > > have
> > > > to convert work2 into Fortran layout to be able to solve AX = B. A
> > > > can be
> > > > transposed in Lapack internally via the flag 'T' so the only
> > > > obstacle I
> > > > have now is to shuffle work2 which holds B transpose in the eyes of
> > > > Fortran
> > > > since it is still in C layout.

[Numpy-discussion] Re: Conversion from C-layout to Fortran-layout in Cython

2021-11-11 Thread Ilhan Polat
In case anyone needs this in the future, here is what I managed to put
together, and please let me know if I am doing something reckless or wrong.
It is slightly faster than numpy.asfortranarray and it doesn't show any cache
miss symptoms, but I can't say I did thorough benchmark testing. I chose the
block size of 16 completely based on the current locations of the planets; 32
or 64 can also work, but NumPy/SciPy is used in all kinds of esoteric places,
so I went for a small number.


@cython.cdivision(True)
@cython.wraparound(False)
@cython.boundscheck(False)
@cython.initializedcheck(False)
cdef void swap_c_and_f_layout(double *a, double *b, int r, int c, int n) nogil:
    """Recursive matrix transposition from a to b, both n**2-long flat arrays"""
    cdef int i, j, ith_row, r2, c2
    cdef double *bb = b
    cdef double *aa = a
    if c < 16:
        for j in range(c):
            ith_row = 0
            for i in range(r):
                bb[ith_row] = aa[i]
                ith_row += n
            aa += n
            bb += 1
    else:
        # If tall
        if (r > c):
            r2 = r//2
            swap_c_and_f_layout(a, b, r2, c, n)
            swap_c_and_f_layout(a + r2, b + (r2)*n, r - r2, c, n)
        else:  # Nope
            c2 = c//2
            swap_c_and_f_layout(a, b, r, c2, n)
            swap_c_and_f_layout(a + (c2)*n, b + c2, r, c - c2, n)

For the desperate souls reading this in the future; I feel your pain :)
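
(As a sanity check, here is a hypothetical pure-Python mirror of the
recursion above, with explicit offsets standing in for the moving pointers;
it is not part of the original post, but it verifies the index arithmetic:)

```
import numpy as np

def swap_py(a, b, r, c, n, a0=0, b0=0):
    # Mirrors swap_c_and_f_layout; a0/b0 play the role of the pointers.
    if c < 16:
        for j in range(c):
            for i in range(r):
                b[b0 + j + i * n] = a[a0 + j * n + i]
    elif r > c:
        r2 = r // 2
        swap_py(a, b, r2, c, n, a0, b0)
        swap_py(a, b, r - r2, c, n, a0 + r2, b0 + r2 * n)
    else:
        c2 = c // 2
        swap_py(a, b, r, c2, n, a0, b0)
        swap_py(a, b, r, c - c2, n, a0 + c2 * n, b0 + c2)

n = 40
a = np.random.rand(n * n)
b = np.empty_like(a)
swap_py(a, b, n, n, n)
assert np.allclose(b.reshape(n, n), a.reshape(n, n).T)
```
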

On Thu, Nov 11, 2021 at 3:36 AM Ilhan Polat  wrote:

> Ah I see. Thank you Sebastian, I was hoping to avoid all that blocking
> (since HW dependency leaves some performance at many tables) or recursive
> zooming stuff with some off-the-shelf tool but apparently I'm walking in
> the dusty corners again collecting spider webs :) As you said, there are
> quite a lot of low hanging fruits we might collect regarding such data
> manipulations which will boost basically everything since these ops are
> ubiquitous.
>
> In case any one is wondering the context; this is for the
> scipy.linalg.expm overhaul mainly kept updated at
> https://github.com/scipy/scipy/issues/12838
>
>
>
> On Thu, Nov 11, 2021 at 2:40 AM Sebastian Berg 
> wrote:
>
>> On Thu, 2021-11-11 at 01:04 +0100, Ilhan Polat wrote:
>> > Hmm not sure I understand the question but this is what I mean by naive
>> > looping, suppose I allocate a scratch register work3, then
>> >
>> > for i in range(n):
>> >     for j in range(n):
>> >         work3[j*n+i] = work2[i*n+j]
>> >
>>
>> NumPy does not end up doing anything special.  Special would be to use
>> a blocked iteration and NumPy doesn't have it unfortunately.
>> The only thing it does is use pointers to cut some overheads, something
>> (very rough) like:
>>
>> ptr1 = arr1.data
>> ptr2_col = arr2.data
>>
>> strides2_col = arr.strides[0]
>> strides2_row = arr2.strides[1]
>>
>> for i in range(n):
>> ptr2 = ptr2_col
>> for j in range(n):
>>  *ptr2 = *ptr1
>>  ptr1++
>>  ptr2 += strides2_row
>>
>> ptr2_col += strides2_col
>>
>> And if you write that in cython, you are likely faster since you can
>> cut quite a few corners (all is aligned, contiguous, etc.).
>> (with potentially, loop unrolling/compiler optimization fluctuations,
>> numpy probably tells GCC to unroll and optimize the innermost loop
>> there)
>>
>> I would not be surprised if you can find a lightweight fast copy-
>> transpose out there, or if some tools like MKL/Cuda just include it. It
>> is too bad NumPy is missing it.
>>
>> Cheers,
>>
>> Sebastian
>>
>>
>> >
>> >
>> > This basically doing the row to column based indexing and obviously we
>> > create a lot of cache misses since work3 entries are accessed in the
>> > shuffled fashion. The idea of all this Cython attempt is to avoid such
>> > access hence if the original some_C_layout_func takes 10 units of time,
>> > 6
>> > of it is spent on this loop when the data doesn't fit the cache. When I
>> > discard the correctness of the function and comment out this loop and
>> > then
>> > remeasure the original func spends roughly 3 units of time. However
>> > take
>> > any random array in C order in NumPy using regular Python and use
>> > np.asfortranarray() it spends roughly about 0.1 units of time. So
>> > apparently it is possible to do this somehow at the low level in a
>> > performant way. That's what I would like to understand or clear out my
>> > misunderstanding.
>> >
>> >
>> >
>> >
>> >
>> > On Thu, Nov 11, 2021 at 12:56 AM Andras Deak 
>> > wrote:
>> >
>> > > On Thursday, November 11, 2021, Ilh

[Numpy-discussion] Re: [Job] NumPy Open Source Developer at NVIDIA

2022-03-02 Thread Ilhan Polat
Found the original in the spam folder.



Hi all,

I'm excited to share that I'm hiring a remote NumPy developer at NVIDIA!
The majority of their time will be focused on open source contributions to
the NumPy project, so I wanted to share the opportunity on this list.

We are continuing to expand our support of the PyData ecosystem and looking
to hire strong engineers who are or can become contributors to NumPy,
Pandas, Scikit-learn, SciPy, and NetworkX. Please see the job posting for
more details.

Non-US based applicants are eligible in certain countries. I can follow up
with individuals to confirm eligibility.

Thanks,
Mike



On Wed, Mar 2, 2022 at 10:39 PM Stephan Hoyer  wrote:

> Hi Inessa -- could you share the original job description? It looks like
> it got lost from your message :)
>
> On Wed, Mar 2, 2022 at 12:28 PM Inessa Pawson  wrote:
>
>> Hi, Mike!
>> This is wonderful news! NumPy could certainly use more help.
>>
>> Cheers,
>> Inessa
>>
>> Inessa Pawson
>> Contributor Experience Lead | NumPy
>> email: ine...@albuscode.org
>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>> To unsubscribe send an email to numpy-discussion-le...@python.org
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: sho...@gmail.com
>>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: ilhanpo...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Dropping the pdf documentation.

2022-05-23 Thread Ilhan Polat
As the person who initiated the PDF drop in SciPy, I'll give my reasoning for
why it bugged me in the first place:

- The typography is \subsubpar (as a TeX person should say) and just an
eyesore; this actually matters a lot more than you would assume, and the PDF
is unreadable on mobile without constant zooming because of its nonresponsive
format
- Almost all links are broken and left as double backticks since it is not
originally designed for PDF navigation
- Code copy/pasting is broken (due to how the TeX package for listings
setup) regardless of the PDF viewer
- It is mostly empty space, which bloats the page count, because it comes
from the HTML format and not the other way around as, say, a TeX4ht workflow
would do.
- It is an absolute waste of resources on CI/CD since it fires up per pull
request (maybe we can argue to reduce it to per main-branch merge, but that
doesn't change the fact that it is just wasteful and burdensome)
- Like Ralf mentioned the infrastructure for a TeX run is unacceptable for
today's standards (but it is the LaTeX maintainers to blame for it and I
know some of them, they know this very well and trying hard to reduce it)
- It is a very unstable workflow and errors out depending on the planets'
alignment because, again, it is coming from an awkward Markdown-style source
that was not designed for this. It becomes very annoying for maintainers to
see it fail for otherwise perfectly valid code.

The API reference PDF (7.2 MB) is also difficult to find compared to the
front-page version, which is the User Guide (3.x MB). So probably there is no
demand for it anyway, because it didn't cause too much noise as far as I
know.







On Mon, May 23, 2022 at 8:34 AM Ralf Gommers  wrote:

>
>
> On Mon, May 23, 2022 at 6:51 AM Matti Picus  wrote:
>
>>
>> On 23/5/22 01:51, Rohit Goswami wrote:
>> >
>> > Being very hard to read should not be reason enough to stop generating
>> > them. In places with little to no internet connectivity often the PDF
>> > documentation is invaluable.
>> >
>> > I personally use the PDF documentation both on my phone and e-reader
>> > when I travel simply because it is more accessible and has better
>> > search capabilities.
>> >
>> > It is true that SciPy has removed them, but that doesn't necessarily
>> > mean we need to follow suit. Especially relevant (IMO) is that large
>> > parts of the NumPy documentation still make sense when read
>> > sequentially (going back to when it was at some point partially kanged
>> > from Travis' book).
>> >
>> > I'd be happy to spend time (and plan to) working on fixing concrete
>> > issues other than straw-man and subjective arguments.
>> >
>> > Personally I'd like to see the NumPy documentation have PDFs in a
>> > fashion where each page / chapter can be downloaded individually.
>> >
>> > -- Rohit
>> >
>> > P.S.: If we have CI timeout issues, for the PDF docs we could also
>> > have a dedicated repo and only build for releases.
>> >
>> > P.P.S: FWIW the Python docs are also still distributed in PDF form.
>> >
>> > On 22 May 2022, at 21:41, Stephan Hoyer wrote:
>> >
>> > +1 let’s drop the PDF docs. They are already very hard to read.
>> >
>> > On Sun, May 22, 2022 at 1:06 PM Charles R Harris
>> >  wrote:
>> >
>> > Hi All,
>> >
>> > This is a proposal to drop the generation of pdf documentation
>> > and only generate the html version. This is a one way change
>> > due to the difficulty maintaining/fixing the pdf versions. See
>> > minimal discussion here
>> > <
>> https://github.com/numpy/numpy/issues/21557#issuecomment-1133920412>.
>> >
>> > Chuck
>> >
>>
>> Thanks Rohit for the offer to take on this project.
>>
>> I don't think we should block the release on the existence of PDF
>> documentation. It is a "nice to have", not a hard requirement.
>>
>>
>> One strategy to discover problems with the PDF builds in CI would be to
>> add a weekly build of PDF.
>>
>
> That would just mean more CI maintenance/breakage, that the same folks who
> always take care of CI issues inevitably are going to have to look at.
>
> I'm +1 for removing pdf builds, they are not worth the maintainer effort -
> we shouldn't put them in CI, and they break at release time too often. It
> will remain possible for interested users to rebuild the docs themselves -
> and we can/will accept patches for docstring issues that trip up the pdf
> but not the html build. That's the same support level we have for other
> things that we do not run in CI.
>
> When we removed the SciPy pdf docs, the one concern was that there was no
> longer an offline option (by Juan, a very knowledgeable user and occasional
> contributor). So I suspect that most of the pdf downloads are for users who
> want that offline option, but we don't tell them that html+zip is the
> preferred one.
>
> Another benefit of removal is to slim down our dev Docker images a lot -
> right now the numpy-dev image is 300 MB larger than the scipy-dev one
> because of the inclusi

[Numpy-discussion] Re: Dropping the pdf documentation.

2022-05-23 Thread Ilhan Polat
On Mon, May 23, 2022 at 11:12 AM Rohit Goswami 
wrote:

> I am unaware of the state of the SciPy documentation at the time it was
> dropped. However, many of these arguments do not seem to apply to the NumPy
> documentation hosted at https://numpy.org/doc/.
>
They were almost identical, same machinery (like most things). Well, this is
subjective, but the typography is unfit for any code-based format.

> The typography is \\subsubpar (as a TeX person should say) and just an
> eyesore, this actually matters a lot more than you would assume and
> unreadable in mobile without constant zooming because of nonresponsive
> format
>
> This is a valid concern, but there are third-party tools to deal with
> reflow (both at the mobile level and by preprocessing like with k2pdfopt:
> https://www.willus.com/k2pdfopt/)
>
The contents are nonresponsive, and no tool can fix native responsiveness
issues. I am familiar with those tools. The question is why work so hard when
you have the HTML already?

> There are no broken links in our User Guide, and even external links (e.g.
> PyObject links to the Python documentation) work. Internal links to
> different parts of the PDF also work.
>
> I have not read our Reference Guide cover to cover in a while (other than
> the NumPy-C API chapter) but I do not remember any backticks anywhere.
> Please correct me if this is incorrect.
>
OK this one was on me. I've updated the reader and now things went back to
normal. Sorry for the noise.

> This isn't the case for Firefox's PDF viewer and others I have tried
> (Adobe, Zathura). Though on Linux most pdf copy-pastes can be a little
> difficult.
>
All the mentioned viewers fail to retain the formatting of the code when
copying as text; you have to paste it somewhere to see the problem. Unless it
is a single line in the examples, the rest does not really work properly. In
case you are not familiar, there are quite a number of ways to fix this, but
it is definitely not worth the effort.

> Untrue, our typography and layout might not be perfect but we do not have
> a lot of empty space.
>
This is demonstrably true; the document margins violate quite a few technical
document typography rules (mostly in how page setup and typeface choices are
done). It is the way it is because the documentation also follows an
indentation format very much like Python code, which uses whitespace too
generously. The document itself is quite an eyesore if you care about those
things. np.einsum is a prime example of how a PDF document shouldn't render.
All the choices come from the indentation of the markup, so it uses none of
the advantages of a PDF, namely typography and font layout. It cannot use
them because the source does not provide the context, since it comes from a
function signature. This is also related to the Markdown comment below.

> We have a reasonable 30 minute timeout for the pdf build and we have
> discussed running this less frequently.
>
This doesn't change the fact that you are downloading way too many complex
tools and moving bloated images around. Just because it is free does not
justify its use. It is just a huge waste to repeat that excessive compilation
each and every time; I would also say it is a bit on the irresponsible side.

> Also can be mitigated, we can shift to tectonic or simply use a custom
> texlive install to have less packages (for the size issue).
>
No. This is still a very large payload, mainly due to the typography tools
used and their dependencies. Maintaining a custom TeXLive is just asking for
trouble since the packages are updated very frequently (I know because we
tried this many times at work to keep a mobile receipt generator going).

> It is a very unstable workflow and errors out depending on the planets
> alignment because, again, it is coming from an awkward Markdown source
> which is not designed for. Becomes very annoying for maintainers to see it
> fail for otherwise a perfectly valid code.
>
> We don't have markdown sources?
>
What I mean is that a LaTeX source is text-based with context in it, but we
are providing Markdown-style sources. This causes problems both in
translation and in layout.

> I understand that perhaps SciPy's documentation was in far worse shape
> than NumPy, but we shouldn't paint with a broad brush
>
That's not true; they are almost identical. These are common issues that
still exist in the NumPy version. To be honest, it is very hard to make a
case for PDF in its given condition. You can still compile and use it, but we
shouldn't continue bothering with it at the CI level just because there is
marginal interest in it. I am not ranting about NumPy because of SciPy. This
is a very bad TeX design, and to fix it we would have to get away from
auto-doc generation, which I am sure none of us wants for now. That is
unfortunately how good docs are made today; MathWorks is constantly praised
for its documentation despite its notoriety. Hence I don't see any case for
continuing to generate this PDF. If you want to have a prope

[Numpy-discussion] Re: Dropping the pdf documentation.

2022-05-23 Thread Ilhan Polat
On Mon, May 23, 2022 at 2:00 PM Rohit Goswami 
wrote:

> The contents are nonresponsive. No tool can fix a native responsiveness
> issues.  I am familiar with those tools. The questions is why work so hard
> when you have the HTML already?
>
> I'm afraid I don't understand this argument. It is true that PDFs are not
> responsive without software assistance, but HTML documents when printed
> (e.g. CTRL+P) do not have any way of generating the Appendix / outlinks to
> related sections etc. Yet we still have HTML documentation. IMO this simply
> means they are not mutually exclusive.
>
The argument is about why one should use PDF on a mobile device. I am not
even going to bother with the argument. The world moved on. See any app on
your device. No one renders PDF. Because this is not what it is designed
for. But everybody sends you a custom PDF for archival purposes. You might
think they are all wrong but that's your opinion.

> About this, the full page layout on the HTML pages has exactly the same
> amount of whitespace. It can be argued that for a full width layout there
> is exactly the same whitespace and indentation.
>
> Additionally, even trying to print out say, np.einsum will first have 3
> pages of the sidebar when using a naive CTRL+P approach.
>
> The argument that the typography is poor goes beyond the documentation
> format.
>
> In fact, even the "responsiveness" is rather overrated at the moment. With
> a mobile device again the first few screenfulls are simply the sidebar with
> routines and other things. After which there's still whitespace, and things
> are still just as indented as in the PDF. Only now I also don't have a
> global TOC which is easy to see.
>
It is not overrated. You are basically saying UX people are just doing
bloated fancy work. This is not how things are designed. I am not
recommending Ctrl+P on the HTML document. We are saying use the HTML files
offline. This is again not an issue to argue about. The pages are following
a Markdown format. If you want to have this subpar document you can
generate it yourself. Putting a burden on the CI system because maybe someone
uses it is not a sufficient reason to keep it. Or at least that's my
motivation.

> The assertion that there is parity between serving ~2000 pages of
> documentation as HTML and ZIP files as opposed to a PDF seems to be flawed
> from the get go.
>
Again, you are comparing doc formats. The argument is not to distribute
PDFs. If you like that document regardless of its current state, then fine,
do it. But as I mentioned, current state of affairs in the documentation
world is not even bothering with PDF. When do you get PDFs? Exactly in
technical manuals which are custom designed and provided with the products
for archiving specific to that product. A documentation that invalidates
its version every 6 months is again not a valid argument. PDF is a document
format. And you have to generate it properly. Currently it is a very bad
copy of the HTML version with no attention to the medium with which the
information is presented. And the burden is on you to provide significant
demand for it, the traffic to the site shows how much HTML is used.

> I should add that NumPy does indeed have a dedicated docs team and
> consolidated effort. As mentioned earlier we meet regularly about these
> issues and it would be nice if the meetings are not unequivocally
> sidestepped by the mailing list. We also apply for funding (GSoD / NumFocus
> SDG) for our docs.
>
> I understand there are frustrations with the PDF, but I am still not
> convinced at this point that the HTML versions are even at par with the PDF
> experience.
>
You are assuming that everyone shares your experience of PDF. I am also not
convinced that abusing the PDF format warrants its use in documentation. As I
mentioned, the burden is on you to prove its worth. That's why we are
proposing to remove it; otherwise it wouldn't be discussed here, and it
wouldn't have been removed in SciPy.

> It is nice that I have the time and ability to generate my documentation
> locally for my niche needs should I so wish it. It is less nice that we
> assume that it must be niche and everyone would have the same energy
> because HTML is theoretically more responsive, even though our docs are not.
>
That's what we started with: it is a long, annoying, nontrivial process with
diminishing returns. We shouldn't spend this energy on a document that is not
requested in general; if there is demand for it, we don't see it anywhere.
Glad that we are aggressively agreeing.

Also, HTML responsiveness needs no proof: just look at the page source in
your browser and change the size of your window. The docs are responsive,
though not perfect (for example, they collapse to the wrong frame at the
smallest size, but that is fixable), and definitely much more readable. PDFs
are substantially more powerful than HTML, but you need to exploit that with
custom documentation, not with auto-generated signatures. Technical
writers, UX and documenta

[Numpy-discussion] Re: Dropping the pdf documentation.

2022-05-23 Thread Ilhan Polat
…first time we have this on the mailing list, so we should simply defer
> the discussion. We can add analytics to the /doc page and check back in a
> few releases. IMO the argument about "load" is a bit fallacious, we don't
> actually seem to be generating or serving PDFs per commit (but I might be
> wrong).
>
> Also, definitely not agreeing at all yet :)
>
> Also HTML responsiveness need no proof. Just look at the page source on
> your browser and change the size of your window. The docs are responsive
> though not perfect (it collapses to the wrong frame in the smallest size
> for example but fixable) but definitely much more readable.
>
> Here are some steps to reproduce the mobile experience. They rely on
> either using an actual mobile device or "web inspector" or "developer
> tools":
>
>- Switch to a standard format, and watch the sidebar take up most of
>the real-estate
>- Try using the zip as discussed above
>
> Additionally, I really don't intend to bash on the HTML, of course we
> could add more breakpoints, and special casing until it looks / behaves
> better.
>
> Until we do so, the removal of the PDFs seem awkward.
>
> As for CI load and metrics, we don't have hard numbers for a lot of these
> things and it feels strange to discuss what we feel is "responsible use".
>
> --- Rohit
>
> P.S. By happy coincidence, I see we have an upcoming documentation meeting
> in around 3 hours. As always, everyone is welcome to come discuss here:
> https://hackmd.io/oB_boakvRqKR-_2jRV-Qjg
>
> On 23 May 2022, at 12:42, Ilhan Polat wrote:
>
>
>
> On Mon, May 23, 2022 at 2:00 PM Rohit Goswami 
> wrote:
>
>> The contents are nonresponsive. No tool can fix a native responsiveness
>> issues.  I am familiar with those tools. The questions is why work so hard
>> when you have the HTML already?
>>
>> I'm afraid I don't understand this argument. It is true that PDFs are not
>> responsive without software assistance, but HTML documents when printed
>> (e.g. CTRL+P) do not have any way of generating the Appendix / outlinks to
>> related sections etc. Yet we still have HTML documentation. IMO this simply
>> means they are not mutually exclusive.
>>
> The argument is about why one should use PDF on a mobile device. I am not
> even going to bother with the argument. The world moved on. See any app on
> your device. No one renders PDF. Because this is not what it is designed
> for. But everybody sends you a custom PDF for archival purposes. You might
> think they are all wrong but that's your opinion.
>
>> About this, the full page layout on the HTML pages has exactly the same
>> amount of whitespace. It can be argued that for a full width layout there
>> is exactly the same whitespace and indentation.
>>
>> Additionally, even trying to print out say, np.einsum will first have 3
>> pages of the sidebar when using a naive CTRL+P approach.
>>
>> The argument that the typography is poor goes beyond the documentation
>> format.
>>
>> In fact, even the "responsiveness" is rather overrated at the moment.
>> With a mobile device again the first few screenfulls are simply the sidebar
>> with routines and other things. After which there's still whitespace, and
>> things are still just as indented as in the PDF. Only now I also don't have
>> a global TOC which is easy to see.
>>
> It is not overrated. You are basically saying UX people are just doing
> bloated fancy work. This is not how things are designed. I am not
> recommending Ctrl+P on the HTML document. We are saying use the HTML files
> offline. This is again not an issue to argue about. The pages are following
> a Markdown format. If you want to have this subpar document you can
> generate it yourself. Having a burden on the CI system because maybe
> someone uses it is not sufficient reason to keep it.  Or at least that's my
> motivation.
>
>> The assertion that there is parity between serving ~2000 pages of
>> documentation as HTML and ZIP files as opposed to a PDF seems to be flawed
>> from the get go.
>>
> Again, you are comparing doc formats. The argument is to not to distribute
> PDFs. If you like that document regardless of its current state then fine
> do it. But as I mentioned, current state of affairs in the documentation
> world is not even bothering with PDF. When do you get PDFs? Exactly in
> technical manuals which are custom designed and provided with the products
> for archiving specific to that product. A documentation that invalidates
> its version every 6 months is again not a valid argument. 

[Numpy-discussion] Re: Feature request: dot product along arbitrary axes

2022-07-03 Thread Ilhan Polat
I don't understand. Both theoretically and coding-wise, matmul is the most
readable option among those. That is in fact what the definition is.

Can you give an example?

On Mon, Jul 4, 2022, 04:49  wrote:

> Currently there are lots of ways to compute dot products (dot, vdot,
> inner, tensordot, einsum...), but none of them are really convenient for
> the case of arrays of vectors, where one dimension (usually the last or the
> first) is the vector dimension. The simplest way to do this currently is
> `np.sum(a * b, axis=axis)`, but this makes vector algebra less readable
> without a wrapper function, and it's probably not optimized as much as
> matrix products. Another way to do it is by adding appropriate dimensions
> and using matmul, but that's arguably less readable and not obvious to do
> generically for arbitrary axes. I think either np.dot or np.vdot could
> easily be extended with an `axis` parameter that would convert it into a
> bulk vector operation, with the same semantics as `np.sum(a * b,
> axis=axis)`. It should also maybe have a `keep_dims` parameter, which is
> useful for preserving broadcasting.
>
> I submitted a corresponding issue at
> https://github.com/numpy/numpy/issues/21915
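
(For reference, a runnable sketch of the spellings being compared; the
einsum variant is an extra illustration, not from the message above:)

```
import numpy as np

a = np.random.rand(3, 4, 5)   # arrays of 5-vectors
b = np.random.rand(3, 4, 5)

d1 = np.sum(a * b, axis=-1)                                      # simple, but allocates a*b
d2 = np.einsum('...k,...k->...', a, b)                           # avoids the temporary
d3 = (a[..., np.newaxis, :] @ b[..., :, np.newaxis])[..., 0, 0]  # matmul trick
assert np.allclose(d1, d2) and np.allclose(d1, d3)
```
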
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: ilhanpo...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Feature request: dot product along arbitrary axes

2022-07-05 Thread Ilhan Polat
It might be just me, but the @ product is way more readable than chaining the
different operators below, which I don't find readable at all; anyway, that's
taste, I guess. Also, if you are going to do this for better-performing code,
you shouldn't bend the ops; you should wrangle the array into the correct
form so that you end up with straightforward array ops. Anyway, never mind my
noise if you are happy with it.



On Wed, Jul 6, 2022 at 1:36 AM  wrote:

> Maybe I wasn't clear, I'm talking about the 1-dimensional vector product,
> but applied to N-D arrays of vectors. Certainly dot products can be
> realized as matrix products, and often are in mathematics for convenience,
> but matrices and vectors are not the same thing, theoretically or coding
> wise. If I have two (M, N, k) arrays a and b where k is the vector
> dimension, to dot product them using matrix notation I have to do:
>
> (a[:, :, np.newaxis, :] @ b[:, :, :, np.newaxis])[:, :, 0, 0]
>
> Which I certainly don't find readable (I always have to scratch my head a
> little bit to figure out whether the newaxis's are in the right places). If
> this is a common operation in larger expressions, then it basically has to
> be written as a separate function, which then someone reading the code may
> have to look at for the semantics. It also breaks down if you want to write
> generic vector functions that may be applied along different axes; then you
> need to do something like
>
> np.squeeze(np.expand_dims(a, axis=axis) @ np.expand_dims(b, axis=axis+1),
> (axis, axis+1))
>
> (after normalizing the axis; if it's negative you'd need to do axis-1 and
> axis instead).
>
> Compare this to the simplicity, composability and consistency of:
>
> a.dot(b, axis=-1) * np.cross(c, d, axis=-1).dot(e, axis=-1) /
> np.linalg.norm(f, axis=-1)
>
> (the cross and norm operators already support an axis parameter)
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: ilhanpo...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: How will Google DeepMind's AlphaTensor effect numpy

2022-10-26 Thread Ilhan Polat
Moreover, the result is mostly valid only for mod-2 arithmetic, something the
authors chose to mention in the fine print, causing this (in my opinion a bit
overblown) excitement.

So for now it seems like we don't need to take action for regular matmul.
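
(For context, here is one level of Strassen's scheme, which Robert mentions
below, as a NumPy sketch; it illustrates why a lower multiplication count is
only part of the story: seven products replace eight at the cost of many
extra additions and temporaries.)

```
import numpy as np

def strassen_once(A, B):
    # One recursion level of Strassen matrix multiplication (even n assumed).
    n = A.shape[0] // 2
    A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
    B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])

A = np.random.rand(128, 128)
B = np.random.rand(128, 128)
assert np.allclose(strassen_once(A, B), A @ B)
```
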


On Wed, Oct 26, 2022, 16:03 Robert Kern  wrote:

> On Wed, Oct 26, 2022 at 9:31 AM  wrote:
>
>> Hello!
>>
>> I was curious on how AlphaTensor will effect NumPy and other similar
>> applications considering it has found a way to perform 3x3 matrix
>> multiplication efficiently.
>> https://www.deepmind.com/blog/discovering-novel-algorithms-with-alphatensor.
>> I am not even sure how NumPy does this under the hood, is it 2x2?
>>
>> Is anyone working on implementing this 3x3 algorithm for NumPy? Is it too
>> early, and if so why? Are there any concerns about this algorithm?
>>
>
> numpy links against accelerated linear algebra libraries like OpenBLAS and
> Intel MKL to provide the matrix multiplication. If they find that the
> AlphaTensor results are better than the options they currently have, then
> numpy will get them. In general, I think it is unlikely that they will be
> used. Even the older state of the art that they compare with, like
> Strassen's algorithm, are not often used in practice. Concerns like memory
> movement and the ability to use instruction-level parallelism on each kind
> of CPU tend to dominate over a marginal change in the number of
> multiplication operations. The answers to this StackOverflow question give
> some more information:
>
>
> https://stackoverflow.com/questions/1303182/how-does-blas-get-such-extreme-performance
>
> --
> Robert Kern
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: ilhanpo...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: status of long double support and what to do about it

2022-11-17 Thread Ilhan Polat
Thanks Ralf, this indeed strikes a chord from the SciPy point of view as well.

In my limited opinion, it's not just the dtype; everything else this dtype is
going to be used with also defines its proper implementation. Upstream
(LAPACK, FFT, and so on) and downstream (practically everything else) do not
have generic support on all platforms; hence we are providing convenience to
some users and then taking it away by not providing the necessary machinery,
essentially making it a storage format with some valuable and convenient
arithmetic on top (a similar but less problematic situation exists with
half).
Hence I'd lean toward (B), but I'm curious how we could pull off (C), because
that is going to be tricky for Cythonization etc., which would in practice
probably force downcasting to double to do anything with it. Or maybe I'm
missing the context of its utilization.
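
(A quick way to see the platform dependence Ralf describes below; the output
varies by OS and compiler, which is rather the point:)

```
import numpy as np

# 128 bits of storage on Linux/x86-64 (80-bit extended precision, padded);
# 64 bits where long double is just an alias for double (e.g. MSVC builds).
print(np.dtype(np.longdouble).itemsize * 8, "bits of storage")
print(np.finfo(np.longdouble))   # machine epsilon, precision, etc.
```
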



On Thu, Nov 17, 2022 at 11:12 PM Ralf Gommers 
wrote:

> Hi all,
>
> We have to do something about long double support. This is something I
> wanted to propose a long time ago already, and moving build systems has
> resurfaced the pain yet again.
>
> This is not a full proposal yet, but the start of a discussion and gradual
> plan of attack.
>
> The problem
> ---
> The main problem is that long double support is *extremely* painful to
> maintain, probably far more than justified. I could write a very long story
> about that, but instead I'll just illustrate with some of the key points:
>
> (1) `long double` is the main reason why we're having such a hard time
> with building wheels on Windows, for SciPy in particular. This is because
> MSVC makes long double 64-bit, and Mingw-w64 defaults to 80-bit. So we have
> to deal with Mingw-w64 toolchains, proposed compiler patches, etc. This
> alone has been a massive time sink. A couple of threads:
>   https://github.com/numpy/numpy/issues/20348
>
> https://discuss.scientific-python.org/t/releasing-or-not-32-bit-windows-wheels/282
> The first issue linked above is one of the key ones, with a lot of detail
> about the fundamental problems with `long double`. The Scientific Python
> thread focused more on Fortran, however that Fortran + Windows problem is
> at least partly the fault of `long double`. And Fortran may be rejuvenated
> with LFortran and fortran-lang.org; `long double` is a dead end.
>
> (2) `long double` is not a well-defined format. We support 9 specific
> binary representations, and have a ton of code floating around to check for
> those, and manually fiddle with individual bits in long double numbers.
> Part of that is the immediate pain point for me right now: in the configure
> stage of the build we consume object files produced by the compiler and
> parse them, matching some known bit patterns. This check is so weird that
> it's the only one that I cannot implement in Meson (short of porting the
> hundreds of lines of Python code for it to C), see
> https://github.com/mesonbuild/meson/issues/11068 for details. To get an
> idea of the complexity here:
>
> https://github.com/numpy/numpy/blob/9e144f7c1598221510d49d8c6b79c66dc000edf6/numpy/core/setup_common.py#L264-L434
>
> https://github.com/numpy/numpy/blob/9e144f7c1598221510d49d8c6b79c66dc000edf6/numpy/core/src/npymath/npy_math_private.h#L179-L484
>
> https://github.com/numpy/numpy/blob/main/numpy/core/src/npymath/npy_math_complex.c.src#L598-L619
>
> https://github.com/numpy/numpy/blob/main/numpy/core/src/multiarray/dragon4.c#L2480-L3052
> Typically `long double` has multiple branches, and requires more code than
> float/double.
>
> (3) We spend a lot of time dealing with issues and PR to keep `long
> double` working, as well as dealing with hard-to-diagnose build issues.
> Which sometimes even stops people from building/contributing, especially on
> Windows. Some recent examples:
> https://github.com/numpy/numpy/pull/20360
> https://github.com/numpy/numpy/pull/18536
> https://github.com/numpy/numpy/pull/21813
> https://github.com/numpy/numpy/pull/22405
> https://github.com/numpy/numpy/pull/19950
> https://github.com/numpy/numpy/pull/18330/commits/aa9fd3c7cb
> https://github.com/scipy/scipy/issues/16769
> https://github.com/numpy/numpy/issues/14574
>
> (4) `long double` isn't all that useful. On both Windows and macOS `long
> double` is 64-bit, which means it is just a poor alias to `double`. So it
> does literally nothing for the majority of our users, except confuse them
> and take up extra memory. On Linux, `long double` is 80-bit precision,
> which means it doesn't do all that much there either, just a modest bump in
> precision.
>
> Let me also note that it's not just the user-visible dtypes that we have
> to consider; long double types are also baked into the libnpymath static
> library that we ship with NumPy. That's a thing we have to do something
> about anyway (shipping static libraries is not the best idea, see
> https://github.com/numpy/numpy/issues/20880). We just have to make sure
> to not forget about it when thinking about solu

[Numpy-discussion] Re: Addition of useful new functions from the array API specification

2022-12-07 Thread Ilhan Polat
On matrix_transpose():
Every time this discussion was brought up, there was huge resistance to
adding more methods to the array object or new functions (I have been
involved in some of those discussions on the pro-.H side; see the links you
have given and more in the mailing list), and now we are adding .mT and not
.H? That is very surprising to me (and disappointing) after forcing people to
write A.conj().T for years, which is fundamentally the most common 2D
operation regarding transpose from a user-experience perspective. Having a
function name with the word "matrix" in it is already problematic, but I can
live with that. But adding .mT to the main namespace seems to go against all
the decisions made in the past. I also wish for cache-oblivious in-place
transposition, which would make many linalg functions perform faster, but I
wouldn't dare to propose inplace_transpose() or .iT because it is not that
important for *all* users. And in a way, neither is mT.

Again, not trying to start an old dumpster fire, but surely there must have
been some consideration of .H before we ended up with milliTranspose. In
fact, as I am typing this, I am already regretting it. Just a rant about
typing too much conj().T lately, I guess.
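
(For reference, the two operations under discussion spelled with today's
NumPy; .mT/matrix_transpose() do not exist yet at this point, so swapaxes
stands in, per the description in Aaron's message below:)

```
import numpy as np

A = np.arange(8).reshape(2, 2, 2) + 1j     # a stack of 2x2 complex matrices

At = np.swapaxes(A, -1, -2)                # what matrix_transpose() / A.mT would return
Ah = np.swapaxes(A.conj(), -1, -2)         # the batched Hermitian transpose (.H)
A2h = A[0].conj().T                        # for a single 2-D matrix: today's spelling
```
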







On Wed, Dec 7, 2022 at 10:26 PM Aaron Meurer  wrote:

> Hi all.
>
> As discussed in today's community meeting, I plan to start working on
> adding some useful functions to NumPy which are part of the array API
> standard https://data-apis.org/array-api/latest/index.html.
>
> Although these are all things that will be needed for NumPy to be
> standard compliant, my focus for now at least is going to be on new
> functionality that is useful for NumPy independent of the standard.
> The things that I (and possibly others) plan on working on are:
>
> - A new function matrix_transpose() and corresponding ndarray
> attribute x.mT. Unlike transpose(), matrix_transpose() will require at
> least 2 dimensions and only operate on the last two dimensions (it's
> effectively an alias for swapaxes(x, -1, -2)). This was discussed in
> the past at https://github.com/numpy/numpy/issues/9530 and
> https://github.com/numpy/numpy/issues/13797. See
>
> https://data-apis.org/array-api/latest/API_specification/generated/signatures.linear_algebra_functions.matrix_transpose.html
>
> - namedtuple outputs for eigh, qr, slogdet and svd. This would only
> apply to the instances where they currently return a tuple (e.g.,
> svd(compute_uv=False) would still just return an array). See the
> corresponding pages at
> https://data-apis.org/array-api/latest/extensions/index.html for the
> namedtuple names. These four functions are the ones that are part of
> the array API spec, but if there are other functions that aren't part
> of the spec which we'd like to update to namedtuples as well for
> consistency, I can look into that.
>
> - New functions matrix_norm() and vector_norm(), which split off the
> behavior of norm() between vector and matrix specific functionalities.
> This is a cleaner API and would allow these functions to be proper
> gufuncs. See
> https://data-apis.org/array-api/latest/extensions/generated/signatures.linalg.vector_norm.html
> and
> https://data-apis.org/array-api/latest/extensions/generated/signatures.linalg.matrix_norm.html
> .
>
> - New function vecdot() which does a broadcasted 1-D dot product along
> a specified axis
>
> https://data-apis.org/array-api/latest/API_specification/generated/signatures.linear_algebra_functions.vecdot.html#signatures.linear_algebra_functions.vecdot
>
> - New function svdvals(), which is equivalent to
> svd(compute_uv=False). The idea here is that functions that have
> different return types depending on keyword arguments are problematic
> for various reasons (e.g., they are hard to type annotate), so it's
> cleaner to split these APIs. Functionality-wise there's not much new
> here, so this is lower priority than the rest.
>
> - New function permute_dims(), which works just like transpose() but
> it has a required axis argument. This is more explicit and can't be
> confused with doing a matrix transpose, which transpose() does not do
> for stacked matrices by default.
>
> - Adding a copy argument to reshape(). This has already been discussed
> at https://github.com/numpy/numpy/issues/9818. The main motivation is
> to replace the current usage of modifying array.shape inplace. (side
> note: this also still needs to be added to numpy.array_api)
>
> You can see the source code of numpy.array_api for an idea of what
> pure Python implementations of these changes look like, but to be
> clear, the proposal here is to add these to the main NumPy namespace,
> not to numpy.array_api.
>
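> As a rough, hypothetical sketch (not the actual implementations; the
> namedtuple field names follow the spec), a few of these as pure Python
> wrappers over existing NumPy calls could look like:
>
> import numpy as np
> from collections import namedtuple
>
> SVDResult = namedtuple("SVDResult", ["U", "S", "Vh"])
>
> def matrix_transpose(x):
>     if x.ndim < 2:
>         raise ValueError("requires at least 2 dimensions")
>     return np.swapaxes(x, -1, -2)  # a view; touches only the last two axes
>
> def vecdot(x1, x2, axis=-1):
>     # broadcasted 1-D dot product; the spec conjugates the first argument
>     return np.sum(np.conj(x1) * x2, axis=axis)
>
> def svdvals(x):
>     return np.linalg.svd(x, compute_uv=False)
>
> def svd(x):
>     return SVDResult(*np.linalg.svd(x))
>
> def vector_norm(x, axis=-1, ord=2):
>     return np.linalg.norm(x, ord=ord, axis=axis)
>
> def matrix_norm(x, ord="fro"):
>     # reduces over the last two axes, so it works on stacked matrices
>     return np.linalg.norm(x, ord=ord, axis=(-2, -1))
>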
> One question I have is which of the new functions proposed should be
> implemented as pure Python wrappers and which should be implemented in
> C as ufuncs/gufuncs?
>
> Unless there are any objections, I plan to start working on
> implementing these right away.
>
> Aaron Meurer
> ___

[Numpy-discussion] Re: Addition of useful new functions from the array API specification

2022-12-08 Thread Ilhan Polat
I am familiar with that issue and many older ones in this mailing list too.

The argument I am trying to make is that something being a view should not
directly imply that it should go in the NumPy main namespace. I don't know
what the array API designers think, but .H is an order of magnitude more
common than the tensor transpose. ndarray.sort() is also in-place and also
tricky, but we have it. Plus, ".H" could do the correct transpose for
tensors and behave like matrix_transpose for real arrays or whatever. The
point I was trying to make is that this array API spec should also involve
the usability aspects of the tool, and if complex operations are not
discussed in the array API, then either the spec is incomplete or it is a
float array API.

But like I said, I don't want to start any discussion; I did a bit too much
spectral work on time series lately, so the finger wounds are still fresh,
I guess. Apologies for the rant.




On Thu, Dec 8, 2022 at 11:54 PM Aaron Meurer  wrote:

> On Wed, Dec 7, 2022 at 3:49 PM Ilhan Polat  wrote:
> >
> >
> > On matrix_transpose() :
> > Every time this discussion brought up, there was a huge resistance to
> add more methods to array object or new functions (I have been involved in
> some of them on the pro .H side, links you have given and more in the
> mailing list) and now we are adding .mT and not .H? That is very surprising
> to me (and disappointing) after forcing people to write A.conj().T for
> years which is fundamentally the most common 2D operation regarding
> transpose and from a user experience perspective. Having a function name
> with the word "matrix" is already problematic but I can live with that. But
> adding .mT to the main namespace seems really going against all decisions
> made in the past. I also wish for cache-oblivious inplace transposition too
> that would make many linalg functions perform faster but I wouldn't dare to
> propose inplace_transpose() or .iT because it is not that important for
> *all* users. And in a way, neither is mT.
> >
> > Again not trying to starting an old dumpster fire but surely there must
> have been some consideration for .H before we ended up with milliTranspose.
> In fact, as I am typing this, I am already regretting it. Just a rant about
> typing too much conj().T lately I guess.
>
> .H was discussed in this issue https://github.com/numpy/numpy/issues/13797
>
> The problem with .H is that it wouldn't be a view, since it takes a
> conjugate. Some ideas were suggested to fix this, but they are much
> more nontrivial to implement, and it's not even clear if they are
> desired (basically you'd need a new complex conjugate dtype). x.mT on
> the other hand can easily be a view, since it's basically just a
> shorthand for swapaxes(x, -1, -2).
>
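> A tiny demonstration of the "view" point (a sketch using today's
> swapaxes, since x.mT is only proposed at this point):
>
> import numpy as np
> x = np.arange(24.0).reshape(2, 3, 4)
> y = np.swapaxes(x, -1, -2)     # what x.mT would return
> print(np.shares_memory(x, y))  # True: no data is copied
> # x.conj() necessarily materializes new values, which is why a lazy .H
> # would need something like a conjugate dtype or flag.
>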
> More to the point, my plan here is only to work on functions that are
> part of the array API specification (and possibly extending these
> features to related things like adding namedtuples for other functions
> if there are any). Hermitian transpose has not yet been discussed for
> addition to the array API specification, which only recently gained
> support for complex numbers.
>
> Aaron Meurer
>
> >
> >
> >
> >
> >
> >
> >
> > On Wed, Dec 7, 2022 at 10:26 PM Aaron Meurer  wrote:
> >>
> >> Hi all.
> >>
> >> As discussed in today's community meeting, I plan to start working on
> >> adding some useful functions to NumPy which are part of the array API
> >> standard https://data-apis.org/array-api/latest/index.html.
> >>
> >> Although these are all things that will be needed for NumPy to be
> >> standard compliant, my focus for now at least is going to be on new
> >> functionality that is useful for NumPy independent of the standard.
> >> The things that I (and possibly others) plan on working on are:
> >>
> >> - A new function matrix_transpose() and corresponding ndarray
> >> attribute x.mT. Unlike transpose(), matrix_transpose() will require at
> >> least 2 dimensions and only operate on the last two dimensions (it's
> >> effectively an alias for swapaxes(x, -1, -2)). This was discussed in
> >> the past at https://github.com/numpy/numpy/issues/9530 and
> >> https://github.com/numpy/numpy/issues/13797. See
> >>
> https://data-apis.org/array-api/latest/API_specification/generated/signatures.linear_algebra_functions.matrix_transpose.html
> >>
> >> - namedtuple outputs for eigh, qr, slogdet and svd. This would only
> >> apply to the instances where they currently return a tuple (e.g.,
> >> svd(compute_uv=False) would still just return an array). See t

[Numpy-discussion] Re: non normalised eigenvectors

2023-02-25 Thread Ilhan Polat
Could you elaborate a bit more on what you mean by the original
eigenvectors? They only denote a direction, hence you can scale them to any
size anyway.
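
For instance, the unit eigenvectors returned by np.linalg.eig can be
rescaled freely and remain eigenvectors (an illustrative sketch):

import numpy as np
A = np.array([[2.0, 1.0], [0.0, 3.0]])
w, V = np.linalg.eig(A)              # columns of V are unit-norm
v = 5 * V[:, 0]                      # any nonzero multiple works
print(np.allclose(A @ v, w[0] * v))  # True: still an eigenvector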

On Sat, Feb 25, 2023 at 5:38 PM  wrote:

> Dear all,
>
> I am not an expert in NumPy, but my undergraduate student is having some
> issues with the way NumPy returns the normalized eigenvectors corresponding
> to the eigenvalues. We do understand that an eigenvector is divided by its
> norm to get the unit eigenvector; however, we do need the original vectors
> for the purposes of my research. This has been a really frustrating
> experience, as NumPy returns the normalized vectors by default. I would
> appreciate any suggestions on how to go about this issue. This seems to be
> an outstanding issue for people using NumPy.
>
> Thanks
>
> LP
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: ilhanpo...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Linking F2PY to Openblas on Windows

2023-03-21 Thread Ilhan Polat
Dear all,

I'd like to submit to our f2py crew here a seemingly trivial problem that
has been bugging me for a while. I am trying to compile a Fortran library
(SLICOT [1]) on Windows with the MinGW UCRT64 toolchain. But let me
simplify the issue.

Let's say I have a Fortran file, MB01OE.f. It is standalone code whose only
external dependency is a bunch of LAPACK routines, and I'd like to compile
this into a module (if I can pull this off, I can include the rest). Hence
I thought I could just link it against the existing OpenBLAS on my
computer.

So I use the following command:

 f2py -L/opt/64/lib/ -lopenblas -c --verbose
.\SLICOT-Reference\src\MB01OE.f --lower -m slicot

which then gives me the output below. My original plan was to use meson for
this, but I can't even make the manual route work, so that will have to
wait until I fix this issue. I'd be grateful if someone could spot the
mistake I'm making.


[1] : https://github.com/SLICOT/SLICOT-Reference




running build
running config_cc
INFO: unifing config_cc, config, build_clib, build_ext, build commands
--compiler options
running config_fc
INFO: unifing config_fc, config, build_clib, build_ext, build commands
--fcompiler options
running build_src
INFO: build_src
INFO: building extension "slicot" sources
INFO: f2py options: ['--lower']
INFO: f2py:>
C:\Users\ilhan\AppData\Local\Temp\tmpr1ct3ys_\src.win-amd64-3.10\slicotmodule.c
creating C:\Users\ilhan\AppData\Local\Temp\tmpr1ct3ys_\src.win-amd64-3.10
Reading fortran codes...
Reading file '.\\SLICOT-Reference\\src\\MB01OE.f'
(format:fix,strict)
Line #107 in .\SLICOT-Reference\src\MB01OE.f:"  PARAMETER (
ZERO = 0.0D0, ONE = 1.0D0, TWO = 2.0D0 )"
get_parameters: got "eval() arg 1 must be a string, bytes or code
object" on 4
Line #107 in .\SLICOT-Reference\src\MB01OE.f:"  PARAMETER (
ZERO = 0.0D0, ONE = 1.0D0, TWO = 2.0D0 )"
get_parameters: got "eval() arg 1 must be a string, bytes or code
object" on 4
Line #107 in .\SLICOT-Reference\src\MB01OE.f:"  PARAMETER (
ZERO = 0.0D0, ONE = 1.0D0, TWO = 2.0D0 )"
get_parameters: got "eval() arg 1 must be a string, bytes or code
object" on 4
rmbadname1: Replacing "max" with "max_bn".
Post-processing...
Block: slicot
Block: mb01oe
In: :slicot:.\SLICOT-Reference\src\MB01OE.f:mb01oe
get_parameters: got "eval() arg 1 must be a string, bytes or code object"
on 4
In: :slicot:.\SLICOT-Reference\src\MB01OE.f:mb01oe
get_parameters: got "eval() arg 1 must be a string, bytes or code object"
on 4
In: :slicot:.\SLICOT-Reference\src\MB01OE.f:mb01oe
get_parameters: got "eval() arg 1 must be a string, bytes or code object"
on 4
Applying post-processing hooks...
  character_backward_compatibility_hook
Post-processing (stage 2)...
Building modules...
Building module "slicot"...
Generating possibly empty wrappers"
Maybe empty "slicot-f2pywrappers.f"
Constructing wrapper function "mb01oe"...
getarrdims:warning: assumed shape array, using 0 instead of '*'
getarrdims:warning: assumed shape array, using 0 instead of '*'
getarrdims:warning: assumed shape array, using 0 instead of '*'
  mb01oe(uplo,trans,n,alpha,beta,r,h,e,[ldr,ldh,lde])
Wrote C/API module "slicot" to file
"C:\Users\ilhan\AppData\Local\Temp\tmpr1ct3ys_\src.win-amd64-3.10\slicotmodule.c"
INFO:   adding
'C:\Users\ilhan\AppData\Local\Temp\tmpr1ct3ys_\src.win-amd64-3.10\fortranobject.c'
to sources.
INFO:   adding
'C:\Users\ilhan\AppData\Local\Temp\tmpr1ct3ys_\src.win-amd64-3.10' to
include_dirs.
copying
C:\Users\ilhan\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\f2py\src\fortranobject.c
-> C:\Users\ilhan\AppData\Local\Temp\tmpr1ct3ys_\src.win-amd64-3.10
copying
C:\Users\ilhan\AppData\Local\Programs\Python\Python310\lib\site-packages\numpy\f2py\src\fortranobject.h
-> C:\Users\ilhan\AppData\Local\Temp\tmpr1ct3ys_\src.win-amd64-3.10
INFO:   adding
'C:\Users\ilhan\AppData\Local\Temp\tmpr1ct3ys_\src.win-amd64-3.10\slicot-f2pywrappers.f'
to sources.
INFO: build_src: building npy-pkg config files
running build_ext
INFO: No module named 'numpy.distutils._msvccompiler' in numpy.distutils;
trying from distutils
DEBUG: new_compiler returns 
INFO: customize MSVCCompiler
INFO: customize MSVCCompiler using build_ext


libraries = []
library_dirs  =
['C:\\Users\\ilhan\\AppData\\Local\\Programs\\Python\\Python310\\libs',
'C:\\Users\\ilhan\\AppData\\Local\\Programs\\Python\\Python310\\PCbuild\\amd64']
include_dirs  =
['C:\\Users\\ilhan\\AppData\\Local\\Programs\\Python\\Python310\\include',
'C:\\Users\\ilhan\\AppData\\Local\\Programs\\Python\\Python310\\Include']

Unable to find productdir in registry
Env var VS140COMNTOOLS is not set or invalid
No productdir found
INFO: get_default_fcompiler: matching types: '['gnu', 'intelv', 'absoft',
'compaqv', 'int

[Numpy-discussion] Re: Add to NumPy a function to compute cumulative sums from 0.

2023-08-15 Thread Ilhan Polat
On Tue, Aug 15, 2023 at 2:44 PM  wrote:

> > From my point of view, such a function is a bit of a corner case to be
> > added to numpy. And it doesn't justify its naming anymore. It is not one
> > operation anymore. It is a cumsum plus prepending a 0. And it is very
> > difficult to argue why prepending 0 to cumsum is part of cumsum.
>
> That is backwards. Consider the array [x0, x1, x2].
>
> The sum of the first 0 elements is 0.
> The sum of the first 1 elements is x0.
> The sum of the first 2 elements is x0+x1.
> The sum of the first 3 elements is x0+x1+x2.
>
> Hence, the array of partial sums is [0, x0, x0+x1, x0+x1+x2].
>
> Thus, the operation [x0, x1, x2] -> [0, x0, x0+x1, x0+x1+x2] is a natural
> and primitive one.
>
>
You are describing ndarray.sum() behavior here, with intermediate results
recorded in an array; sum is an aggregator that produces a single item from
a list of items. Then you can argue about the behavior for missing items,
and the values you have provided are exactly the values the accumulator
would get. However, cumsum, cumprod, diff etc. are "array functions". In
other words, they provide fast vectorized access to otherwise laborious for
loops. You have to consider the equivalent for loops working on the array
*data*, not the ideal math framework over the number field. For an array
function you don't start with the element that comes before the first
element; hence "no elements -> 0" applies to sum but not to the array
function. Or at least that would be my argument.
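
To make that concrete, cumsum is the vectorized form of a loop like this (a
rough sketch, ignoring dtype subtleties); for an empty input the body never
runs, so the result is empty:

import numpy as np
arr = np.array([])           # empty input
out = np.empty_like(arr)
acc = 0
for i, v in enumerate(arr):  # zero iterations for an empty array
    acc += v
    out[i] = acc
print(out)                   # [] -- an empty array, not 0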

If you have no elements, meaning 0 elements, the cumulative sum is not 0;
it is the empty array, because there is no array to cumulatively "sum"
(remember, we are working on the array to generate another array, not
aggregating). You can argue about what the empty set translates to under
summation etc., but I don't think it applies here. That's my opinion,
though. I'm not sure why folks wanted to have this at all. It is the same
as asking whether this code

for k in range(0):
    ... some code ...

should spin at least once (Fortran-ish behavior). I don't know why it
should. But then again, it becomes bikeshedding, with conflicting
idealistic mathy axioms thrown at each other.

NumPy's cumsum returns an empty array for an empty array (I think all
software does this, including matlab). ndarray.sum(), however, returns
scalar 0 (and I think most software does this too), because the aggregation
is pretty much a no-op over the initialization value, as in this example:

x = 0
for k in range(0):
    x += 1
return x  # returns 0

I think all these point to missing convenient functionality for extending
arrays. In matlab, "[0 arr 10]" nicely extends the array to a new one, but
in NumPy you need to punch in quite some code, and muster some courage to
remember whether hstack, vstack, concatenate, or block is the correct name,
which decreases the "code morale". So if people want to quickly extend
arrays, they either have to change the code for their needs or create
larger arrays, which is pretty much #6044. So I think this is a feature
request for convenient "prepend"/"append" functionality, not on ufuncs but
on ndarray, because concatenation is just a pain in NumPy and a ubiquitous
operation all around. Hence we should probably get a decision on that
instead of discussing each case separately.
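
For reference, the spellings that do exist today (np.r_ is about the
closest NumPy gets to matlab's "[0 arr 10]"); a small sketch:

import numpy as np
arr = np.array([1, 2, 3])
np.r_[0, arr, 10]                      # array([ 0,  1,  2,  3, 10])
np.r_[0, np.cumsum(arr)]               # array([0, 1, 3, 6]) -- cumsum "from 0"
np.concatenate(([0], np.cumsum(arr)))  # same result, more to remember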
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com