Re: [Numpy-discussion] Interface numpy arrays to Matlab?

2017-08-29 Thread Andras Deak
On Tue, Aug 29, 2017 at 1:08 PM, Neal Becker  wrote:
> [...]
> [I] would guess that inside every Matlab array is a numpy array crying to be
> freed - in both cases an array is a block of memory together with shape and
> stride information.  So I would hope a direct conversion could be done, at
> least via C API if not directly with python numpy API.  But it seems nobody
> has done this, so maybe it's not that simple?

I was going to suggest this Stack Overflow post earlier but figured
that you must have found it already:
https://stackoverflow.com/questions/34155829/how-to-efficiently-convert-matlab-engine-arrays-to-numpy-ndarray
Based on that it seems that at least arrays returned from the MATLAB
engine can be reasonably converted using their underlying data
(`_data` attribute, together with the `size` attribute to unravel
multidimensional arrays).
The other way around (i.e. passing numpy arrays to the MATLAB engine)
seems less straightforward: all I could find was
https://www.mathworks.com/matlabcentral/answers/216498-passing-numpy-ndarray-from-python-to-matlab
The comments there suggest that you can instantiate `matlab.double`
objects from lists that you can pass to the MATLAB engine. Explicitly
converting your arrays to lists along the way doesn't sound too good
to me.
Disclaimer: I haven't tried either method.
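For what it's worth, here is a rough sketch of the first direction. I
haven't run this against a real MATLAB engine, so I use a stand-in
class instead of an actual `matlab.double`; the attribute names
(`_data`, `size`) are taken from the Stack Overflow post:

```python
import array
import numpy as np

class FakeMatlabDouble:
    """Stand-in for matlab.double: flat column-major buffer plus shape."""
    def __init__(self, data, size):
        self._data = array.array('d', data)  # flat buffer of doubles
        self.size = size                     # the MATLAB shape, e.g. (2, 3)

def matlab_to_numpy(m_arr):
    # MATLAB lays data out column-major, hence order='F' when reshaping
    return np.array(m_arr._data).reshape(m_arr.size, order='F')

# column-major data for the 2x3 matrix [[1, 2, 3], [4, 5, 6]]
m = FakeMatlabDouble([1, 4, 2, 5, 3, 6], (2, 3))
a = matlab_to_numpy(m)
print(a)
```

If the `_data` buffer really is exposed like this, the conversion
avoids the element-by-element list round trip.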
Regards,

András Deák


> On Mon, Aug 28, 2017 at 5:32 PM Gregory Lee  wrote:
>>
>> I have not used Transplant, but it sounds fairly similar to
>> Python-matlab-bridge.  We currently optionally call Matlab via
>> Python-matlab-bridge in some of the tests for the PyWavelets package.
>>
>> https://arokem.github.io/python-matlab-bridge/
>> https://github.com/arokem/python-matlab-bridge
>>
>> I would be interested in hearing about the benefits/drawbacks relative to
>> Transplant if there is anyone who has used both.
>>
>>
>> On Mon, Aug 28, 2017 at 4:29 PM, CJ Carey 
>> wrote:
>>>
>>> Looks like Transplant can handle this use-case.
>>>
>>> Blog post: http://bastibe.de/2015-11-03-matlab-engine-performance.html
>>> GitHub link: https://github.com/bastibe/transplant
>>>
>>> I haven't given it a try myself, but it looks promising.
>>>
>>> On Mon, Aug 28, 2017 at 4:21 PM, Stephan Hoyer  wrote:

 If you can use Octave instead of Matlab, I've had a very good experience
 with Oct2Py:
 https://github.com/blink1073/oct2py

 On Mon, Aug 28, 2017 at 12:20 PM, Neal Becker 
 wrote:
>
> I've searched but haven't found any decent answer.  I need to call
> Matlab from python.  Matlab has a python module for this purpose, but it
> doesn't understand numpy AFAICT.  What solutions are there for efficiently
> interfacing numpy arrays to Matlab?
>
> Thanks,
> Neal
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>



>>>
>>>
>>
>
>


Re: [Numpy-discussion] different values for ndarray when printed with or without [ ]

2017-10-18 Thread Andras Deak
On Wed, Oct 18, 2017 at 12:44 PM, Nissim Derdiger
 wrote:
> Hi all,
>
> I have a ndarray, that shows different values when called like that:
> print(arr) or like that print(arr[0::]).
>
> When changing it back to a python list (with list = arr.tolist()) – both
> prints return same value, but when converting that list back to np array
> (arr=np.array(list)) – the printing issue returns.
>
> Any ideas what may cause that?

Hi Nissim,

I suggest adding some specifics. What is the shape and dtype of your
array? What are the differences in values? In what way are they
different? The best would be if you could provide a minimal,
reproducible example.
Regards,

András


Re: [Numpy-discussion] different values for ndarray when printed with or without

2017-10-18 Thread Andras Deak
On Wed, Oct 18, 2017 at 7:30 PM, Nissim Derdiger  wrote:
> 3. difference between values are:
> [  2.25699615e+02   5.51561475e-01   3.81394744e+00   1.03807904e-01]
> Instead of:
> [225.69961547851562, 0.5515614748001099, 3.8139474391937256, 
> 0.10380790382623672]

The behaviour you're describing sounds like a matter of
pretty-printing. Numpy uses a shortened format for printing numeric
values by default. When you convert to a list, you leave numpy behind
and you get the native python behaviour. If you want to control how
this pretty-printing happens in numpy, take a close look at
numpy.set_printoptions:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.set_printoptions.html
Now, I still don't see how taking a trivial view of your array would
affect this printing, but I believe your values themselves are
identical (i.e. correct) in both cases, and they are only displayed
differently. If you were to do further computations with your arrays,
the results would be the same.
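For example (reusing the values from your message):

```python
import numpy as np

arr = np.array([225.69961547851562, 0.5515614748001099,
                3.8139474391937256, 0.10380790382623672])
print(arr)                        # abbreviated display by default
np.set_printoptions(precision=16)
print(arr)                        # more digits shown
np.set_printoptions(precision=8)  # restore the default
# the stored values never changed, only how they are displayed
```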
Regards,

András


Re: [Numpy-discussion] building numpy with python3.7

2018-01-18 Thread Andras Deak
Hello,

After failing with several attempts to build numpy on python 3.7.0a4,
the combo that worked with pip was
cython 0.28a0 (current master)
numpy 1.15.0.dev0 (current master)
in a fresh, clean venv.
Older cython (0.27.3 where the aforementioned issue seems to have been
solved https://github.com/cython/cython/issues/1955) didn't help.
Stable numpy 1.14.0 didn't seem to work even with cython master.
This seems to be the same setup that Thomas mentioned; I mostly want
to note that anything less doesn't seem to compile yet.

András

On Wed, Dec 20, 2017 at 9:15 AM, Hannes Breytenbach  wrote:
> Hi Chuck
>
> I'm using Cython 0.28a0.
>
> Hannes
>
> 
> From: "Charles R Harris" 
> To: "Discussion of Numerical Python" 
> Sent: Tuesday, December 19, 2017 10:01:40 PM
> Subject: Re: [Numpy-discussion] building numpy with python3.7
>
>
>
> On Mon, Dec 18, 2017 at 3:20 AM, Hannes Breytenbach 
> wrote:
>>
>> OK, thanks for the link to the issue.  I'm not using virtual environments
>> - I built python3.7 (and 2.7) with the `make altinstall` method. I managed
>> to get the numpy build working on 3.7 by removing the
>> `random/mtrand/mtrand.c` file so that it gets (re-)generated during the
>> build using the latest cython version.
>>
>> Thanks for the help!
>
>
> Just to be sure, which Cython version is that?
>
> Chuck
>


Re: [Numpy-discussion] building numpy with python3.7

2018-01-18 Thread Andras Deak
On Thursday, January 18, 2018, Charles R Harris 
wrote:
>
>
> On Thu, Jan 18, 2018 at 8:54 AM, Andras Deak 
wrote:
>>
>> Hello,
>>
>> After failing with several attempts to build numpy on python 3.7.0a4,
>> the combo that worked with pip was
>> cython 0.28a0 (current master)
>> numpy 1.15.0.dev0 (current master)
>> in a fresh, clean venv.
>> Older cython (0.27.3 where the aforementioned issue seems to have been
>> solved https://github.com/cython/cython/issues/1955) didn't help.
>> Stable numpy 1.14.0 didn't seem to work even with cython master.
>> This seems to be the same setup that Thomas mentioned, I mostly want
>> to note that anything less doesn't seem to compile yet.
>
> Where did you get 1.14.0 source? If it was with pip, it was generated
using cython 0.26.1.
> Chuck

Yes, with my last attempts I did (in earlier attempts I pulled
versions from GitHub, but that was before I realized I had to upgrade
cython altogether). So that's probably it; sorry, I had no
understanding of the workings of pip.
I'll retry with cython stable and numpy from source, and only post again if
I find something surprising in order to reduce further noise.
Thanks,

András


Re: [Numpy-discussion] Hoop jumping, and other sports

2018-02-07 Thread Andras Deak
On Thu, Feb 8, 2018 at 12:35 AM, Allan Haldane  wrote:
> On 02/07/2018 04:26 PM, Charles R Harris wrote:
>> Hi All,
>>
>> I was thinking about things to do to simplify the NumPy development
>> process. One thing that came to mind was our use of prefixes on commits,
>> BUG, TST, etc. Those prefixes were originally introduced by David
>> Cournapeau when he was managing releases in order help him track commits
>> that might need backports. I like the prefixes, but now that we are
>> organized by PRs, rather than commits, the only place we really need
>> them, for some meaning of "need", is in the commit titles, and
>> maintainers can change and edit those without problems. So I would like
>> to propose that we no longer be picky about having them in the commit
>> summary line. Furthermore, that got me thinking that there are probably
>> other things we could do to simplify the development process. So I'd
>> like folks to weigh in with other ideas for simplification or complaints
>> about nit picky things that have annoyed them.
>>
>> Chuck
>
> When I was first contributing, the main obstacle was not the nitpicks
> but reading through all the contributor guidelines pages, as well as
> learning github. I also remember finding it hard to find that
> documentation in the first place.
>
> It is at
> https://docs.scipy.org/doc/numpy/dev/index.html
> and a shorter summary at
> https://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html
>
> Maybe we should have a much more prominent link about how to contribute,
> eg on the main "README.md" front page, or at the start of the user
> guide, which links to a "really really" short contributing guide for
> someone who does not use github, maybe a screenful or two only. Even the
> short development workflow above has lots of info that usually isn't
> needed and takes a long time to read through.
>
> Allan

As a new (though so far superficial) contributor I can say that I
didn't find it difficult at all to locate the resources needed for
contributing. I found the gitwash link very easily and naturally, and
I didn't find it too long nor confusing (but I did have a fresh
understanding of git itself, so I can't reliably assess the contents
from a git newcomer's standpoint). The willingness to touch numpy
without a thorough understanding of its internals is much more of a
barrier at least in my case.

András


Re: [Numpy-discussion] numpy.pad -- problem?

2018-04-29 Thread Andras Deak
> mean(y):  -1.3778013372117948e-16
> ypad:
>  [-1.37780134e-16 -1.37780134e-16 -1.37780134e-16  0.e+00
>   3.09016994e+00  5.87785252e+00  8.09016994e+00  9.51056516e+00
>   1.e+01  9.51056516e+00  8.09016994e+00  5.87785252e+00
>   3.09016994e+00  1.22464680e-15 -3.09016994e+00 -5.87785252e+00
>  -8.09016994e+00 -9.51056516e+00 -1.e+01 -9.51056516e+00
>  -8.09016994e+00 -5.87785252e+00 -3.09016994e+00 -2.44929360e-15
>  -7.40148683e-17 -7.40148683e-17]
>
> The left pad is correct, but the right pad is different and not the mean of
> y)  --- why?

This is how np.pad computes mean padding:
https://github.com/numpy/numpy/blob/01541f2822d0d4b37b96f6b42e35963b132f1947/numpy/lib/arraypad.py#L1396-L1400
    elif mode == 'mean':
        for axis, ((pad_before, pad_after), (chunk_before, chunk_after)) \
                in enumerate(zip(pad_width, kwargs['stat_length'])):
            newmat = _prepend_mean(newmat, pad_before, chunk_before, axis)
            newmat = _append_mean(newmat, pad_after, chunk_after, axis)

That is, first the mean is prepended, then appended, and in the latter
step the updated (front-padded) array is used for computing the mean
again. Note that with arbitrary precision this is fine, since
appending n*`mean` to an array with mean `mean` should preserve the
mean. But with doubles you can get errors on the order of the machine
epsilon, which is what happens here:

In [16]: ypad[3:-2].mean()
Out[16]: -1.1663302849022412e-16

In [17]: ypad[:-2].mean()
Out[17]: -3.700743415417188e-17

So the prepended values are `y.mean()`, but the appended values are
`ypad[:-2].mean()` which includes the near-zero padding values. I
don't think this error should be a problem in practice, but I agree
it's surprising.
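A quick way to see the setup with a toy zero-mean signal (note: the
exact right-pad error may differ or disappear across numpy versions,
since the padding code could have been rewritten since):

```python
import numpy as np

# a zero-mean signal: one full period of a sine, padded with its mean
x = np.sin(np.linspace(0, 2 * np.pi, 20, endpoint=False)) * 10
padded = np.pad(x, 3, mode='mean')
# both pads should be (numerically) x.mean(), i.e. ~0; the reported bug
# is that the right pad is computed from the already-left-padded array,
# so it can drift from the left pad by ~machine epsilon
print(padded[:3])
print(padded[-3:])
```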

András


Re: [Numpy-discussion] numpy.pad -- problem?

2018-04-29 Thread Andras Deak
PS. my exact numbers are different from yours (probably a
multithreaded thing?), but `ypad[:-2].mean()` agrees with the last 3
elements in `ypad` in my case and I'm sure this is true for yours too.

On Sun, Apr 29, 2018 at 11:36 PM, Andras Deak  wrote:
>> mean(y):  -1.3778013372117948e-16
>> ypad:
>>  [-1.37780134e-16 -1.37780134e-16 -1.37780134e-16  0.e+00
>>   3.09016994e+00  5.87785252e+00  8.09016994e+00  9.51056516e+00
>>   1.e+01  9.51056516e+00  8.09016994e+00  5.87785252e+00
>>   3.09016994e+00  1.22464680e-15 -3.09016994e+00 -5.87785252e+00
>>  -8.09016994e+00 -9.51056516e+00 -1.e+01 -9.51056516e+00
>>  -8.09016994e+00 -5.87785252e+00 -3.09016994e+00 -2.44929360e-15
>>  -7.40148683e-17 -7.40148683e-17]
>>
>> The left pad is correct, but the right pad is different and not the mean of
>> y)  --- why?
>
> This is how np.pad computes mean padding:
> https://github.com/numpy/numpy/blob/01541f2822d0d4b37b96f6b42e35963b132f1947/numpy/lib/arraypad.py#L1396-L1400
>     elif mode == 'mean':
>         for axis, ((pad_before, pad_after), (chunk_before, chunk_after)) \
>                 in enumerate(zip(pad_width, kwargs['stat_length'])):
>             newmat = _prepend_mean(newmat, pad_before, chunk_before, axis)
>             newmat = _append_mean(newmat, pad_after, chunk_after, axis)
>
> That is, first the mean is prepended, then appended, and in the latter
> step the updated (front-padded) array is used for computing the mean
> again. Note that with arbitrary precision this is fine, since
> appending n*`mean` to an array with mean `mean` should preserve the
> mean. But with doubles you can get errors on the order of the machine
> epsilon, which is what happens here:
>
> In [16]: ypad[3:-2].mean()
> Out[16]: -1.1663302849022412e-16
>
> In [17]: ypad[:-2].mean()
> Out[17]: -3.700743415417188e-17
>
> So the prepended values are `y.mean()`, but the appended values are
> `ypad[:-2].mean()` which includes the near-zero padding values. I
> don't think this error should be a problem in practice, but I agree
> it's surprising.
>
> András


Re: [Numpy-discussion] numpy.pad -- problem?

2018-04-29 Thread Andras Deak
On Sun, Apr 29, 2018 at 11:39 PM, Eric Wieser
 wrote:
> I would consider this a bug, and think we should fix this.

In that case `mode='median'` should probably be fixed as well.


Re: [Numpy-discussion] matmul as a ufunc

2018-05-29 Thread Andras Deak
On Tue, May 29, 2018 at 5:40 AM, Stephan Hoyer  wrote:
> But given that idiomatic NumPy code uses 1D arrays in favor of explicit
> row/column vectors with shapes (1,n) and (n,1), I do think it does make
> sense for matrix transpose on 1D arrays to be the identity, because matrix
> transpose should convert back and forth between row and column vectors
> representations.
>
> Certainly, matrix transpose should error on 0d arrays, because it doesn't
> make sense to transpose a scalar.

Apologies for the probably academic nitpick, but if idiomatic code
uses 1d arrays as vectors then shouldn't scalars be compatible with
matrices with dimension (in the mathematical sense) of 1? Since the
matrix product of shapes (1,n) and (n,1) is (1,1) but the same for
shapes (n,) and (n,) is (), it might make sense after all for the
matrix transpose to be identity for scalars.
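To make the shape comparison concrete:

```python
import numpy as np

v = np.arange(3.0)                   # idiomatic 1-d "vector"
row, col = v[None, :], v[:, None]    # explicit row/column vectors
print((row @ col).shape)             # matrix product of (1,3) and (3,1)
print((v @ v).shape)                 # matmul of two 1-d arrays: a scalar
```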
I'm aware that this is tangential to the primary discussion, but I'm
also wondering if I'm being confused about the subject (wouldn't be
the first time that I got confused about numpy scalars).

András


Re: [Numpy-discussion] matmul as a ufunc

2018-05-29 Thread Andras Deak
On Tue, May 29, 2018 at 12:16 PM, Daπid  wrote:
> Right now, np.int(8).T throws an error, but np.transpose(np.int(8)) gives a
> 0-d array. On one hand, it is nice to be able to use the same code for

`np.int` is just python `int`! What you mean is `np.int64(8).T` which
works fine, so does `np.array(8).T`.


Re: [Numpy-discussion] LaTeX version of boolean indexing

2018-10-11 Thread Andras Deak
On Thu, Oct 11, 2018 at 6:54 PM Matthew Harrigan
 wrote:
>
> Hello,
>
> I am documenting some code, translating the core of the algorithm to LaTeX.  
> The style I have currently is very similar to the einsum syntax (which is 
> awesome btw).  Here is an example of some of the basic operations in NumPy.  
> One part I do not know how to capture well is boolean indexing, ie:
>
> mask = np.array([1, 0, 1])
> x = np.array([1, 2, 3])
> y = x[mask]

That is fancy indexing with an index array rather than boolean
indexing. That's why the result is [2, 1, 2] rather than [1, 3].
In case this is really what you need, it's the case of your indices
originating from another sequence: `y_i = x_{m_i}` where `m_i` is your
indexing sequence.
For proper boolean indexing you lose the one-to-one correspondence
between input and output (due to the size almost always changing), so
you might not be able to formalize it this nicely with an index
appearing on both sides. But something with an indicator might work...
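To illustrate the difference in numpy terms:

```python
import numpy as np

x = np.array([1, 2, 3])
mask = np.array([1, 0, 1])
fancy = x[mask]                  # integer (fancy) indexing: x[[1, 0, 1]]
boolean = x[mask.astype(bool)]   # proper boolean indexing
print(fancy)
print(boolean)
```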

András


Re: [Numpy-discussion] LaTeX version of boolean indexing

2018-10-11 Thread Andras Deak
On Thu, Oct 11, 2018 at 7:45 PM Matthew Harrigan
 wrote:
>
> What do you mean by indicator?
>

I mostly meant what wikipedia seems to call "set-builder notation"
(https://en.wikipedia.org/wiki/Set-builder_notation#Sets_defined_by_a_predicate).
Since your "input" is `{x_i | i in [0,1,2]}` but your output is `y_j`
for `j` in [0,1], the straightforward thing I could think of was
defining the set of valid `y_j` values (with an implicit assumption of
the order being preserved, I guess). This would mean you can say
something like
`y_i \in {x_j | m_j}` (omitting the \left/\right/\vert fluff for
simplicity here) where `m_j` are the elements of the boolean mask
(say, `m = [True, False, True]`). In this context I'd understand it
that `m_j` is the predicate and `x_j` are the corresponding values,
however the notation isn't entirely ambiguous (see also a remark on
the above wikipedia page) so you can't really get away with omitting
further explanation in order to resolve ambiguity. Though I guess
calling `m_j` elements of a mask would do the same thing.
The other option that comes to mind is to define the auxiliary indices
`n_i` for which `m_j` are True, then you of course denote the result
with integer indices: `y_i = x_{n_i}` where `i` goes from 0 to the
number of `True`s in `m_j`. But then you have the same difficulty
defining `n_i`.
All in all I'm not sure there's an elegant and concise notation for
boolean masking.
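Concretely, `np.flatnonzero` gives you exactly those auxiliary indices
`n_i`:

```python
import numpy as np

x = np.array([10, 20, 30])
m = np.array([True, False, True])
n = np.flatnonzero(m)   # the integer indices n_i where m_j is True
# boolean masking and fancy indexing with n_i give the same result
print(n)
print(x[m], x[n])
```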

András


Re: [Numpy-discussion] Depreciating asfortranarray and ascontiguousarray

2018-10-25 Thread Andras Deak
On Thu, Oct 25, 2018 at 11:48 PM Joseph Fox-Rabinovitz
 wrote:
>
> In that vein, would it be advisable to re-implement them as aliases for the 
> correctly behaving functions instead?
>
> - Joe

Wouldn't "probably, can't be changed without breaking external code"
still apply? As I understand it, the suggestion for _deprecation_ is
only made because there's (a lot of) code relying on the current
behaviour (or at least there's that risk).

András


Re: [Numpy-discussion] numpy pprint?

2018-11-06 Thread Andras Deak
On Tue, Nov 6, 2018 at 8:26 AM Foad Sojoodi Farimani
 wrote:
>
> Dear Mark,
>
> Thanks for the reply. I will write in between your lines:
>
> On Tue, Nov 6, 2018 at 6:11 AM Mark Harfouche  
> wrote:
>>
>> Foad,
>>
>> Visualizing data is definitely a complex field. I definitely feel your pain.
>
> I have actually been using numpy for a couple of years without noticing these 
> issues. recently I have been trying to encourage my collogues to move from 
> MATLAB to Python and also prepare some workshops for PhD network of my 
> university.
>>
>> Printing your data is but one way of visualizing it, and probably only 
>> useful for very small and constrained datasets.
>
> well actually it can be very useful. Consider Pandas .head() and .tail() 
> methods or Sympy's pretty printing functionalities. for bigger datasets the 
> function can get the terminals width and height and then based on the input 
> (U(n),D(n),L(n),R(n),UR(n,m),UL(n,m),DR(n,m),DL(n,m)) display what can be 
> shown and put horizontal 3-dots \u2026 … or vertical/inclined ones. Or id it 
> is Jupyter then one can use Markdown/LaTeX for pretty printing or even HTML 
> to add sliders as suggested by Eric.
>>
>> Have you looked into set_printoptions to see how numpy’s existing 
>> capabilities might help you with your visualization?
>
> This is indeed very useful. specially the threshold option can help a lot 
> with adjusting the width. but only for specific cases.
>>
>> The code you showed seems quite good. I wouldn’t worry about performance 
>> when it comes to functions that will seldom be called in tight loops.
>
> Thanks but I know it is very bad:
>
> it does not work properly for floats
> it only works for 1D and 2D
> there can be some recursive function I believe.
>>
>> As you’ll learn more about python and numpy, you’ll keep expanding it to 
>> include more use cases.
>> For many of my projects, I create small submodules for visualization 
>> tailored to the specific needs of the particular project.
>> I’ll try to incorporate your functions and see how I use them.
>
> Thanks a lot. looking forwards to your feedback
>>
>> Your original post seems to have some confusion about C Style vs F Style 
>> ordering. I hope that has been resolved.
>
> I actually came to the conclusion that calling it C-Style or F-Style or maybe 
> row-major column-major are bad practices. Numpy's ndarrays are not 
> mathematical multidimensional arrays but Pythons nested, homogenous and 
> uniform lists.  it means for example 1, [1], [[1]] and [[[1]]] are all 
> different, while in all other mathematical languages out there (including 
> Sympy's matrices) they are the same.

I'm probably missing your point, because I don't understand your
claim. Mathematically speaking, 1 and [1] and [[1]] and [[[1]]] are
different objects. One is a scalar, the second is an element of R^n
with n=1 which is basically a scalar too from a math perspective, the
third one is a 2-index object (an operator acting on R^1), the last
one is a three-index object. These are all mathematically distinct.
Furthermore, row-major and column-major order are a purely technical
detail describing how the underlying data that is being represented by
these multidimensional arrays is laid out in memory. So C/F-style
order and the semantics of multidimensional arrays, at least as I see
it, are independent notions.
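A quick demonstration that the memory layout does not change the
array's semantics:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # C (row-major) layout by default
f = np.asfortranarray(a)         # same values, F (column-major) layout
# the two arrays are semantically identical...
print((a == f).all())
# ...only the underlying memory order differs
print(a.flags['C_CONTIGUOUS'], f.flags['F_CONTIGUOUS'])
```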

András


Re: [Numpy-discussion] 'nansqrt' function?

2019-02-14 Thread Andras Deak
On Thu, Feb 14, 2019 at 10:46 AM Mauro Cavalcanti  wrote:
>
> Chuck,
>
> IPython is full of secrets! More traditional users (myself included) usually 
> look for the official documentation, so it would be really useful if such 
> hints were also available there.

Sorry if I'm stating the obvious, but what ipython does is merely give
you the appropriate docstring, in this case the module docstring of
numpy.lib.nanfunctions
(https://github.com/numpy/numpy/blob/master/numpy/lib/nanfunctions.py#L1-L22).
So you have to know where to look to find it, but it's official.
I believe Chuck's remark of "looks like a module level entry should be
added to the documentation under 'NaN functions
(numpy.lib.nanfunctions)'" suggests exactly to have this information
in the online docs, somewhere at
https://docs.scipy.org/doc/numpy/reference/routines.html
Regards,

András


>
> Best regards,
>
> Em qui, 14 de fev de 2019 às 00:44, Charles R Harris 
>  escreveu:
>>
>>
>>
>> On Wed, Feb 13, 2019 at 3:45 PM Mauro Cavalcanti  wrote:
>>>
>>> Chuck,
>>>
>>> I attempted to find such a list from the Numpy website. A complete list 
>>> like yours should be quite handy for users if available there.
>>>
>>
>> In ipython
>>
>> In [1]: numpy.lib.nanfunctions?
>>
>> will give it to you. But it looks like a module level entry should be added 
>> to the documentation under "NaN functions (numpy.lib.nanfunctions)" in a 
>> "Routines" entry. Maybe "Histograms" also.
>>
>> 
>>
>> Chuck
>
>
>
> --
> Dr. Mauro J. Cavalcanti
> E-mail: mauro...@gmail.com
> Web: http://sites.google.com/site/maurobio
> "Life is complex. It consists of real and imaginary parts."


Re: [Numpy-discussion] [SciPy-User] Why slicing Pandas column and then subtract gives NaN?

2019-02-15 Thread Andras Deak
> The original data was in CSV format. I read it in using pd.read_csv(). It 
> does have column names, but no row names. I don’t think numpy reads csv files
I routinely read csv files using numpy.loadtxt
https://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html

> And also, when I do a[2:5]-b[:3], it does not throw any “index out of range” 
> error. I was able to catch that, but in both Matlab and R. You get an error. 
> This is frustrating!!
That's basic slicing behaviour of python. You might like it or not,
but it's baked into the language:
>>> [1,2][:10], [1,2][5:7]
([1, 2], [])
One would need very good reasons to break this in case of a third-party library.

András

> 
> From: NumPy-Discussion 
>  on behalf of Juan 
> Nunez-Iglesias 
> Sent: Friday, February 15, 2019 4:15 AM
> To: Discussion of Numerical Python
> Subject: Re: [Numpy-discussion] [SciPy-User] Why slicing Pandas column and 
> then subtract gives NaN?
>
>
> I don’t have index when I read in the data. I just want to slice two series 
> to the same length, and subtract. That’s it!
>
> I also don’t want numpy methods wrapped within methods. They work, but are
> hard to understand.
>
> How would you do it? In Matlab or R, it’s very simple, one line.
>
>
> Why are you using pandas at all? If you want the Matlab equivalent, use NumPy 
> from the beginning (or as soon as possible). I personally agree with you that 
> pandas is too verbose, which is why I mostly use NumPy for this kind of 
> arithmetic, and reserve pandas for advanced data table type functionality 
> (like groupbys and joining on indices).
>
> As you saw yourself, a.values[1:4] - b.values[0:3] works great. If you read 
> in your data into NumPy from the beginning, it’ll be a[1:4] - b[0:3] just 
> like in Matlab. (Or even better: a[1:] - b[:-1]).


Re: [Numpy-discussion] python 3.8: heads up: numpy/ma/core.py causes warning that breaks numpy detection in boost build

2019-03-01 Thread Andras Deak
On Fri, Mar 1, 2019 at 11:24 AM  wrote:
> python-3.8.0 (alpha) emits the following warning on numpy/ma/core.py
> (verified up to official 1.16.2):
>
> 8<
> /usr/python3/site_python3/lib64/numpy.egg/numpy/ma/core.py:4466: 
> SyntaxWarning: "is" with a literal. Did you mean "=="?
>   if self.shape is ():
> /usr/python3/site_python3/lib64/numpy.egg/numpy/ma/core.py:4466: 
> SyntaxWarning: "is" with a literal. Did you mean "=="?
>   if self.shape is ():
> /usr/python3/site_python3/lib64/numpy.egg/numpy/core/include
> >8
Hi,

I believe the warning is really helpful, testing for an empty tuple
with `is` probably only works because of a CPython implementation
detail (namely that the empty tuple is interned by the interpreter).
In other words, this seems like a bug which is fixed by using ==
instead.
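For instance, for a 0-d (masked) array the value comparison is the
robust test:

```python
import numpy as np

a = np.ma.masked_array(3.0)   # a 0-d masked array, like in ma/core.py
# compare shapes by value; identity (`is`) only happens to work
# because CPython interns the empty tuple
print(a.shape == ())
```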
Regards,

András


>
>
> In turn, this breaks boost's numpy detection, which is done with:
>
> 8<
> notice: [python-cfg] running command '/usr/python3/bin/python -c "import sys; 
> sys.stderr = sys.stdout; import numpy; print(numpy.get_include())"'
> >8
>
> and thus produces an unusable include path (containing the warning
> message from python)
>
> The following:
>
> 8<
> *** numpy/ma/core.py.PYTHON_3_8_COMPAT  Wed Feb 27 12:45:06 2019
> --- numpy/ma/core.pyWed Feb 27 12:45:06 2019
> ***
> *** 4463,4469 
>   if m is nomask:
>   # compare to _count_reduce_items in _methods.py
>
> ! if self.shape is ():
>   if axis not in (None, 0):
>   raise np.AxisError(axis=axis, ndim=self.ndim)
>   return 1
> --- 4463,4469 
>   if m is nomask:
>   # compare to _count_reduce_items in _methods.py
>
> ! if self.shape == ():
>   if axis not in (None, 0):
>   raise np.AxisError(axis=axis, ndim=self.ndim)
>   return 1
> >8
>
> fixes everything for me
>
> thanks
> ciao
> -g
>


Re: [Numpy-discussion] For broadcasting, can m by n by k matrix be multiplied with n by k matrix?

2019-04-19 Thread Andras Deak
On Sat, Apr 20, 2019 at 12:24 AM C W  wrote:
>
> Am I miss reading something? Thank you in advance!

Hey,

You are missing that the broadcasting rules typically apply to
arithmetic operations and methods that are specified explicitly to
broadcast. There is no mention of broadcasting in the docs of np.dot
[1], and its behaviour is a bit more complicated.
Specifically for multidimensional arrays (which you have), the doc says

If a is an N-D array and b is an M-D array (where M>=2), it is a sum
product over the last axis of a and the second-to-last axis of b:
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])

So your (3,4,5) @ (3,5) would want to collapse the 4-length axis of
`a` with the 3-length axis of `b`; this won't work. If you want
elementwise multiplication according to the broadcasting rules, just
use `a * b`:

>>> a = np.arange(3*4*5).reshape(3,4,5)
... b = np.arange(4*5).reshape(4,5)
... (a * b).shape
(3, 4, 5)


[1]: https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html
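A small sketch of that N-D rule, with shapes chosen arbitrarily for illustration:

```python
import numpy as np

a = np.arange(2*3*4).reshape(2, 3, 4)
b = np.arange(5*4*6).reshape(5, 4, 6)
c = np.dot(a, b)   # contracts a's last axis (4) with b's second-to-last (4)
print(c.shape)     # (2, 3, 5, 6)

# Spot-check the documented formula dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
assert c[1, 2, 3, 4] == np.sum(a[1, 2, :] * b[3, :, 4])
```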


Re: [Numpy-discussion] For broadcasting, can m by n by k matrix be multiplied with n by k matrix?

2019-04-19 Thread Andras Deak
I agree with Stephan, I can never remember how np.dot works for
multidimensional arrays, and I rarely need its behaviour. Einsum, on
the other hand, is both intuitive to me and more general.
Anyway, yes, if y has a leading singleton dimension then its transpose
will have shape (28,28,1) which leads to that unexpected trailing
singleton dimension. If you look at how the shape changes in each step
(first transpose, then np.dot) you can see that everything's doing
what it should (i.e. what you tell it to do).
With np.einsum you'd have to consider that you want to pair the last
axis of X with the first axis of y.T, i.e. the last axis of y
(assuming the latter has only two axes, so it doesn't have that
leading singleton). This would correspond to the rule 'abc,dc->abd',
or if you want to allow arbitrary leading dimensions on y,
'abc,...c->ab...':
>>> X = np.arange(3*4*5).reshape(3,4,5)
... y1 = np.arange(6*5).reshape(6,5)
... y2 = y1[:,None]  # inject leading singleton
... print(np.einsum('abc,dc->abd', X, y1).shape)
... print(np.einsum('abc,...c->ab...', X, y2).shape)
(3, 4, 6)
(3, 4, 6, 1)

András

On Sat, Apr 20, 2019 at 1:06 AM Stephan Hoyer  wrote:
>
> You may find np.einsum() more intuitive than np.dot() for aligning axes -- 
> it's certainly more explicit.
>
> On Fri, Apr 19, 2019 at 3:59 PM C W  wrote:
>>
>> Thanks, you are right. I overlooked it's for addition.
>>
>> The original problem was that I have matrix X (RBG image, 3 layers), and 
>> vector y.
>>
>> I wanted to do np(X, y.T).
>> >>> X.shape   # 100 of 28 x 28 matrix
>> (100, 28, 28)
>> >>> y.shape   # Just one 28 x 28 matrix
>> (1, 28, 28)
>>
>> But, np.dot() gives me four axis shown below,
>> >>> z = np.dot(X, y.T)
>> >>> z.shape
>> (100, 28, 28, 1)
>>
>> The fourth axis is unexpected. Should y.shape be (28, 28), not (1, 28, 28)?
>>
>> Thanks again!
>>
>> On Fri, Apr 19, 2019 at 6:39 PM Andras Deak  wrote:
>>>
>>> On Sat, Apr 20, 2019 at 12:24 AM C W  wrote:
>>> >
>>> > Am I miss reading something? Thank you in advance!
>>>
>>> Hey,
>>>
>>> You are missing that the broadcasting rules typically apply to
>>> arithmetic operations and methods that are specified explicitly to
>>> broadcast. There is no mention of broadcasting in the docs of np.dot
>>> [1], and its behaviour is a bit more complicated.
>>> Specifically for multidimensional arrays (which you have), the doc says
>>>
>>> If a is an N-D array and b is an M-D array (where M>=2), it is a sum
>>> product over the last axis of a and the second-to-last axis of b:
>>> dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
>>>
>>> So your (3,4,5) @ (3,5) would want to collapse the 4-length axis of
>>> `a` with the 3-length axis of `b`; this won't work. If you want
>>> elementwise multiplication according to the broadcasting rules, just
>>> use `a * b`:
>>>
>>> >>> a = np.arange(3*4*5).reshape(3,4,5)
>>> ... b = np.arange(4*5).reshape(4,5)
>>> ... (a * b).shape
>>> (3, 4, 5)
>>>
>>>
>>> [1]: https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html
>>
>


Re: [Numpy-discussion] For broadcasting, can m by n by k matrix be multiplied with n by k matrix?

2019-04-19 Thread Andras Deak
Actually, the second version I wrote is inaccurate, because `y.T` will
permute the remaining axes in the result, but the '...' in einsum
won't do this.
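To make that concrete, here is a shrunk-down version of the shapes from the thread (3 and 4 standing in for the original 100 and 28):

```python
import numpy as np

X = np.arange(3*4*4).reshape(3, 4, 4)   # stand-in for the (100, 28, 28) stack
y = np.arange(4*4).reshape(1, 4, 4)     # stand-in for the (1, 28, 28) array

print(np.dot(X, y.T).shape)                      # (3, 4, 4, 1): y.T has shape (4, 4, 1)
print(np.einsum('abc,dc->abd', X, y[0]).shape)   # (3, 4, 4): leading singleton dropped
```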

On Sat, Apr 20, 2019 at 1:24 AM Andras Deak  wrote:
>
> I agree with Stephan, I can never remember how np.dot works for
> multidimensional arrays, and I rarely need its behaviour. Einsum, on
> the other hand, is both intuitive to me and more general.
> Anyway, yes, if y has a leading singleton dimension then its transpose
> will have shape (28,28,1) which leads to that unexpected trailing
> singleton dimension. If you look at how the shape changes in each step
> (first transpose, then np.dot) you can see that everything's doing
> what it should (i.e. what you tell it to do).
> With np.einsum you'd have to consider that you want to pair the last
> axis of X with the first axis of y.T, i.e. the last axis of y
> (assuming the latter has only two axes, so it doesn't have that
> leading singleton). This would correspond to the rule 'abc,dc->abd',
> or if you want to allow arbitrary leading dimensions on y,
> 'abc,...c->ab...':
> >>> X = np.arange(3*4*5).reshape(3,4,5)
> ... y1 = np.arange(6*5).reshape(6,5)
> ... y2 = y1[:,None]  # inject leading singleton
> ... print(np.einsum('abc,dc->abd', X, y1).shape)
> ... print(np.einsum('abc,...c->ab...', X, y2).shape)
> (3, 4, 6)
> (3, 4, 6, 1)
>
> András
>
> On Sat, Apr 20, 2019 at 1:06 AM Stephan Hoyer  wrote:
> >
> > You may find np.einsum() more intuitive than np.dot() for aligning axes -- 
> > it's certainly more explicit.
> >
> > On Fri, Apr 19, 2019 at 3:59 PM C W  wrote:
> >>
> >> Thanks, you are right. I overlooked it's for addition.
> >>
> >> The original problem was that I have matrix X (RBG image, 3 layers), and 
> >> vector y.
> >>
> >> I wanted to do np(X, y.T).
> >> >>> X.shape   # 100 of 28 x 28 matrix
> >> (100, 28, 28)
> >> >>> y.shape   # Just one 28 x 28 matrix
> >> (1, 28, 28)
> >>
> >> But, np.dot() gives me four axis shown below,
> >> >>> z = np.dot(X, y.T)
> >> >>> z.shape
> >> (100, 28, 28, 1)
> >>
> >> The fourth axis is unexpected. Should y.shape be (28, 28), not (1, 28, 28)?
> >>
> >> Thanks again!
> >>
> >> On Fri, Apr 19, 2019 at 6:39 PM Andras Deak  wrote:
> >>>
> >>> On Sat, Apr 20, 2019 at 12:24 AM C W  wrote:
> >>> >
> >>> > Am I miss reading something? Thank you in advance!
> >>>
> >>> Hey,
> >>>
> >>> You are missing that the broadcasting rules typically apply to
> >>> arithmetic operations and methods that are specified explicitly to
> >>> broadcast. There is no mention of broadcasting in the docs of np.dot
> >>> [1], and its behaviour is a bit more complicated.
> >>> Specifically for multidimensional arrays (which you have), the doc says
> >>>
> >>> If a is an N-D array and b is an M-D array (where M>=2), it is a sum
> >>> product over the last axis of a and the second-to-last axis of b:
> >>> dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
> >>>
> >>> So your (3,4,5) @ (3,5) would want to collapse the 4-length axis of
> >>> `a` with the 3-length axis of `b`; this won't work. If you want
> >>> elementwise multiplication according to the broadcasting rules, just
> >>> use `a * b`:
> >>>
> >>> >>> a = np.arange(3*4*5).reshape(3,4,5)
> >>> ... b = np.arange(4*5).reshape(4,5)
> >>> ... (a * b).shape
> >>> (3, 4, 5)
> >>>
> >>>
> >>> [1]: https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html
> >>
> >


Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-23 Thread Andras Deak
On Sun, Jun 23, 2019 at 10:37 PM Sebastian Berg
 wrote:
> Yeah, likely worth a shot. I doubt there are many uses for the
> n-dimensional axis transpose, so maybe a futurewarning approach can
> work. If not, I suppose the solution is the deprecation for ndim != 2.

Any chance that the n-dimensional transpose is being used in code
interfacing fortran/matlab and python? One thing the current
multidimensional transpose is good for is to switch between row-major
and column-major order. I don't know, however, whether this switch
actually has to be done often in code, in practice.
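For what it's worth, the order switch is essentially free, because the n-dimensional transpose is just a view on the same memory (a minimal sketch):

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # C-contiguous (row-major)
b = a.T                              # reverses all axes without copying
print(b.shape)                       # (4, 3, 2)
print(b.flags['F_CONTIGUOUS'])       # True: same buffer, Fortran order
```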

András


Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-25 Thread Andras Deak
On Tue, Jun 25, 2019 at 4:29 AM Cameron Blocker
 wrote:
>
> In my opinion, the matrix transpose operator and the conjugate transpose 
> operator should be one and the same. Something nice about both Julia and 
> MATLAB is that it takes more keystrokes to do a regular transpose instead of 
> a conjugate transpose. Then people who work exclusively with real numbers can 
> just forget that it's a conjugate transpose, and for relatively simple 
> algorithms, their code will just work with complex numbers with little 
> modification.
>

I'd argue that MATLAB's feature of `'` meaning adjoint (conjugate
transpose etc.) and `.'` meaning regular transpose causes a lot of
confusion and probably a lot of subtle bugs. Most people are unaware
that `'` does a conjugate transpose and use it habitually, and when
for once they have a complex array they don't understand why the
values are off (assuming they even notice). Even the MATLAB docs
conflate the two operations occasionally, which doesn't help at all.
Transpose should _not_ incur conjugation automatically. I'm already a
bit wary of special-casing matrix dynamics this much, when ndarrays
are naturally multidimensional objects. Making transposes be more than
transposes would be a huge mistake in my opinion, already for matrices
(2d arrays) and especially for everything else.
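For illustration, in current numpy the two operations are cleanly separated, and the adjoint has to be spelled out explicitly (a small sketch):

```python
import numpy as np

A = np.array([[1 + 2j, 3 - 1j],
              [0 + 1j, 2 + 0j]])

print(A.T)          # transpose only: values are untouched
print(A.conj().T)   # conjugate transpose (adjoint): imaginary parts negated
# The two steps commute, since conjugation is elementwise:
assert np.array_equal(A.conj().T, A.T.conj())
```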

András



> Ideally, I'd like to see a .H that was the defacto Matrix/Linear 
> Algebra/Conjugate transpose that for 2 or more dimensions, conjugate 
> transposes the last two dimensions and for 1 dimension just conjugates (if 
> necessary). And then .T can stay the Array/Tensor transpose for general axis 
> manipulation. I'd be okay with .T raising an error/warning on 1D arrays if .H 
> did not. I commonly write things like u.conj().T@v even if I know both u and 
> v are 1D just so it looks more like an inner product.
>
> -Cameron
>
> On Mon, Jun 24, 2019 at 6:43 PM Ilhan Polat  wrote:
>>
>> I think enumerating the cases along the way makes it a bit more tangible for 
>> the discussion
>>
>>
>> import numpy as np
>> z = 1+1j
>> z.conjugate()  # 1-1j
>>
>> zz = np.array(z)
>> zz  # array(1+1j)
>> zz.T  # array(1+1j)  # OK expected.
>> zz.conj()  # 1-1j ?? what happened; no arrays?
>> zz.conjugate()  # 1-1j ?? same
>>
>> zz1d = np.array([z]*3)
>> zz1d.T  # no change so this is not the regular 2D array
>> zz1d.conj()  # array([1.-1.j, 1.-1.j, 1.-1.j])
>> zz1d.conj().T  # array([1.-1.j, 1.-1.j, 1.-1.j])
>> zz1d.T.conj()  # array([1.-1.j, 1.-1.j, 1.-1.j])
>> zz1d[:, None].conj()  # 2D column vector - no surprises if [:, None] is known
>>
>> zz2d = zz1d[:, None]  # 2D column vector - no surprises if [:, None] is known
>> zz2d.conj()  # 2D col vec conjugated
>> zz2d.conj().T  # 2D col vec conjugated transposed
>>
>> zz3d = np.arange(24.).reshape(2,3,4).view(complex)
>> zz3d.conj()  # no surprises, conjugated
>> zz3d.conj().T  # ?? Why not the last two dims swapped like other stacked ops
>>
>> # For scalar arrays conjugation strips the number
>> # For 1D arrays transpose is a no-op but conjugation works
>> # For 2D arrays conjugate it is the matlab's elementwise conjugation op .'
>> # and transpose is acting like expected
>> # For 3D arrays conjugate it is the matlab's elementwise conjugation op .'
>> # but transpose is the reversing all dims just like matlab's permute()
>> # with static dimorder.
>>
>> and so on. Maybe we can try to identify all the use cases and the quirks 
>> before we can make design the solution. Because these are a bit more 
>> involved and I don't even know if this is exhaustive.
>>
>>
>> On Mon, Jun 24, 2019 at 8:21 PM Marten van Kerkwijk 
>>  wrote:
>>>
>>> Hi Stephan,
>>>
>>> Yes, the complex conjugate dtype would make things a lot faster, but I 
>>> don't quite see why we would wait for that with introducing the `.H` 
>>> property.
>>>
>>> I do agree that `.H` is the correct name, giving most immediate clarity 
>>> (i.e., people who know what conjugate transpose is, will recognize it, 
>>> while likely having to look up `.CT`, while people who do not know will 
>>> have to look up regardless). But at the same time agree that the docstring 
>>> and other documentation should start with "Conjugate tranpose" - good to 
>>> try to avoid using names of people where you have to be in the "in crowd" 
>>> to know what it means.
>>>
>>> The above said, if we were going with the initial suggestion of `.MT` for 
>>> matrix transpose, then I'd prefer `.CT` over `.HT` as its conjugate version.
>>>
>>> But it seems there is little interest in that suggestion, although sadly a 
>>> clear path forward has not yet emerged either.
>>>
>>> All the best,
>>>
>>> Marten
>>>
>>

Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-25 Thread Andras Deak
On Tue, Jun 25, 2019 at 1:03 PM Ilhan Polat  wrote:
>
> I have to disagree, I hardly ever saw such bugs 

I know the exact behaviour of MATLAB isn't very relevant for this
discussion, but anyway the reason I think this is a problem in MATLAB
is that there are a bunch of confused questions on Stack Overflow due
to this behaviour. Just from the first page of this[1] query I could find
examples [2-8] (quite a few are about porting MATLAB to numpy or vice
versa).

>  and moreover  is not compatible if you don't also transpose 
> it but expected in almost all contexts of matrices, vectors and scalars. 
> Elementwise conjugation is well inline with other elementwise operations 
> starting with a dot in matlab hence still consistent.

I probably misunderstood your point here, Ilhan, because it sounds to
me that you're arguing that conjugation should not come without a
transpose. This is different from saying that transpose should not
come without conjugation (although I'd object to both). And `.'` is
exactly _not_ an elementwise operation in MATLAB: it's the transpose,
despite the seemingly element-wise syntax. `arr.'` will _not_
conjugate your array, `arr'` will (while both will transpose).
Finally, I don't think "MATLAB does it" is a very good argument
anyway; my subjective impression is that several of the issues with
np.matrix are due to behaviour that resembles that of MATLAB. But
MATLAB is very much built for matrices, and there are no 1d arrays, so
they don't end up with some of the pitfalls that numpy does.

>
> I would still expect an conjugation+transposition to be the default since 
> just transposing a complex array is way more special and rare than its 
> ubiquitous regular usage.
>

Coming back to numpy, I disagree with your statement. I'd say "just
transposing a complex _matrix_ is way more special and rare than its
ubiquitous regular usage", which is true. Admittedly I have a patchy
math background, but to me it seems that the "conjugation and
transpose go hand in hand" claim is mostly valid for linear algebra,
i.e. actual matrices and vectors. However numpy arrays are much more
general, and I very often want to reverse the shape of a complex
non-matrix 2d array (i.e. transpose it) for purposes of broadcasting
or vectorized matrix operations, and not want to conjugate it in the
process.
Do you at least agree that the feature of conjugate+transpose as
default mostly makes sense for linear algebra, or am I missing other
typical (and general numerical programming) use cases?

András

[1]: 
https://stackoverflow.com/search?q=%5Bmatlab%5D+conjugate+transpose+is%3Aa&mixed=1
[2]: https://stackoverflow.com/a/45272576
[3]: https://stackoverflow.com/a/54179564
[4]: https://stackoverflow.com/a/42320906
[5]: https://stackoverflow.com/a/23510668
[6]: https://stackoverflow.com/a/11416502
[7]: https://stackoverflow.com/a/49057640
[8]: https://stackoverflow.com/a/54309764

> ilhan
>
>
> On Tue, Jun 25, 2019 at 10:57 AM Andras Deak  wrote:
>>
>> On Tue, Jun 25, 2019 at 4:29 AM Cameron Blocker
>>  wrote:
>> >
>> > In my opinion, the matrix transpose operator and the conjugate transpose 
>> > operator should be one and the same. Something nice about both Julia and 
>> > MATLAB is that it takes more keystrokes to do a regular transpose instead 
>> > of a conjugate transpose. Then people who work exclusively with real 
>> > numbers can just forget that it's a conjugate transpose, and for 
>> > relatively simple algorithms, their code will just work with complex 
>> > numbers with little modification.
>> >
>>
>> I'd argue that MATLAB's feature of `'` meaning adjoint (conjugate
>> transpose etc.) and `.'` meaning regular transpose causes a lot of
>> confusion and probably a lot of subtle bugs. Most people are unaware
>> that `'` does a conjugate transpose and use it habitually, and when
>> for once they have a complex array they don't understand why the
>> values are off (assuming they even notice). Even the MATLAB docs
>> conflate the two operations occasionally, which doesn't help at all.
>> Transpose should _not_ incur conjugation automatically. I'm already a
>> bit wary of special-casing matrix dynamics this much, when ndarrays
>> are naturally multidimensional objects. Making transposes be more than
>> transposes would be a huge mistake in my opinion, already for matrices
>> (2d arrays) and especially for everything else.
>>
>> András
>>
>>
>>
>> > Ideally, I'd like to see a .H that was the defacto Matrix/Linear 
>> > Algebra/Conjugate transpose that for 2 or more dimensions, conjugate 
>> > transposes

Re: [Numpy-discussion] Syntax Improvement for Array Transpose

2019-06-26 Thread Andras Deak
Dear Ilhan,

Thanks for writing these up.
I feel that from a usability standpoint most people would support #3
(.H/.mH), especially considering Marten's very good argument about @.
Having to wrap your transposed matrices in function calls half defeats
the purpose of being able to write stacked matrix operations elegantly
within the ndarray class. The question is of course whether it's
feasible from a project management/API design stand point (just to
state the obvious).
Regarding #1 (1d transpose): I just want to make it clear as someone
who switched from MATLAB to python (and couldn't be happier) that we
should treat MATLAB's behaviour as more of a cautionary tale than a
design ethos. I paused for exactly 5 seconds the first time I ran
into the no-op of 1d transposes, and then I thought "yes, this makes
sense", and that was it. To put it differently, I think it's more
about MATLAB injecting false assumptions into users than about numpy
behaving surprisingly. (On a side note, MATLAB's quirks are one of the
reasons that the Spyder IDE, designed to be a MATLAB replacement, has
very weird quirks that regularly trip up python users.)
Regards,

András

On Wed, Jun 26, 2019 at 9:04 AM Ilhan Polat  wrote:
>
> Maybe a bit of a grouping would help, because I am also losing track here. 
> Let's see if I could manage to get something sensible because, just like 
> Marten mentioned, I am confusing myself even when I am thinking about this
>
> 1- Transpose operation on 1D arrays:
> This is a well-known confusion point for anyone that arrives at NumPy 
> usage from, say matlab background or any linear algebra based user. Andras 
> mentioned already that this is a subset of NumPy users so we have to be 
> careful about the user assumptions. 1D arrays are computational constructs 
> and mathematically they don't exist and this is the basis that matlab 
> enforced since day 1. Any numerical object is an at least 2D array including 
> scalars hence transposition flips the dimensions even for a col vector or row 
> vector. That doesn't mean we cannot change it or we need to follow matlab but 
> this is kind of what anybody kinda sorta wouda expect. For some historical 
> reason, on numpy side transposition on 1D arrays did nothing since they have 
> single dimensions. Hence you have to create a 2D vector for transpose from 
> the get go to match the linear algebra intuition. Points that has been 
> discussed so far are about whether we should go further and even intercept 
> this behavior such that 1D transpose gives errors or warnings as opposed to 
> the current behavior of silent no-op. as far as I can tell, we have a 
> consensus that this behavior is here to stay for the foreseeable future.
>
> 2- Using transpose to reshape the (complex) array or flip its dimensions
> This is a usage that has been mentioned above that I don't know much 
> about. I usually go the "reshape() et al." way for this but apparently folks 
> use it to flip dimensions and they don't want the automatically conjugation 
> which is exactly the opposite of a linear algebra oriented user is used to 
> have as an adjoint operator. Therefore points that have been discussed about 
> are whether to inject conjugation into .T behavior of complex arrays or not. 
> If not can we have an extra .H or something that specifically does .conj().T 
> together (or .T.conj() order doesn't matter). The main feel (that I got so 
> far) is that we shouldn't touch the current way and hopefully bring in 
> another attribute.
>
> 3- Having a shorthand notation such as .H or .mH etc.
> If the previous assertion is true then the issue becomes what should be 
> the new name of the attribute and how can it have the nice properties of a 
> transpose such as returning a view etc. However this has been proposed and 
> rejected before e.g., GH-8882 and GH-13797. There is a catch here though, 
> because if the alternative is .conj().T then it doesn't matter whether it 
> copies or not because .conj().T doesn't return a view either and therefore 
> the user receives a new array anyways. Therefore no benefits lost. Since the 
> idea is to have a shorthand notation, it seems to me that this point is 
> artificial in that sense and not necessarily a valid argument for rejection. 
> But from the reluctance of Ralf I feel like there is a historical wear-out on 
> this subject.
>
> 4- transpose of 3+D arrays
> I think we missed the bus on this one for changing the default behavior 
> now and there are glimpses of confirmation of this above in the previous 
> mails. I would suggest discussing this separately.
>
> So if you are not already worn out and not feeling sour about it, I would 
> like to propose the discussion of item 3 opened once again. Because the need 
> is real and we don't need to get choked on the implementation details right 
> away.
>
> Disclaimer: I do applied math so I have a natural bias towards the linalg-y 
> way of doing things. And sorry about

Re: [Numpy-discussion] NEP 28 — A standard community policy for dropping support of old Python and NumPy versions

2019-08-06 Thread Andras Deak
Hi,

This is just a reminder for others like myself who have too limited a
cognitive buffer: this NEP was renumbered and it's NEP 29 now
https://github.com/numpy/numpy/pull/14086/files (just to prevent
possible confusion).

András

On Wed, Jul 24, 2019 at 1:52 AM Thomas Caswell  wrote:
>
> Folks,
>
> This NEP is a proposing a standard policy for the community to determine when 
> we age-out support for old  versions of Python. This came out of in-person 
> discussions at SciPy earlier in July and scattered discussion across github. 
> This is being proposed by maintainers from Matplotlib, scikit-learn,
> IPython, Jupyter, yt, SciPy, NumPy, and scikit-image.
>
> TL;DR:
>
> We propose only supporting versions of CPython initially released in the 
> preceding 42 months of a major or minor release of any of our projects.
>
> Please see https://github.com/numpy/numpy/pull/14086 and keep the discussion 
> there as there are many interested parties (from the other projects) that may 
> not be subscribed to the numpy mailing list.
>
> Tom
>
> --
> Thomas Caswell
> tcasw...@gmail.com


Re: [Numpy-discussion] Calling BLAS functions from Python

2019-08-27 Thread Andras Deak
On Tue, Aug 27, 2019 at 1:18 PM Jens Jørgen Mortensen  wrote:
>
> Hi!
>
> I'm trying to use dgemm, zgemm and friends from scipy.linalg.blas to
> multiply matrices efficiently.  As an example, I'd like to do:
>
>  c += a.dot(b)
>
> using whatever BLAS scipy is linked to and I want to avoid copies of
> large matrices.  This works the way I want it:
(snip)

Hi,

This is not a direct answer to your question, but are you only trying
to use low-level BLAS for the sake of memory, or are there other
considerations? I'm not certain, but I think `a.dot` will call BLAS for
matrices (hence its speed and multithreaded capabilities), so CPU time
might already be optimal. As for memory, most numpy operations (and
definitely numpy arithmetic) have equivalent functions in the main
numpy namespace that take an `out` keyword to specify an existing array
in which to store the result. So for your simple example,

>>> import numpy as np
... a = np.ones((2, 3), order='F')
... b = np.ones((3, 4), order='F')
... c = np.zeros((7, 4), order='F')[:2, :]
... np.add(c, a.dot(b), out=c)
array([[3., 3., 3., 3.],
   [3., 3., 3., 3.]])

>>> c
array([[3., 3., 3., 3.],
   [3., 3., 3., 3.]])

As you can see non-contiguous arrays Just Work™. You will still create
a temporary array for `a.dot(b)` but I'm not sure you can spare that.
Would low-level BLAS allow you to reduce memory at that step as well?
And is there other motivation for you to go down to the metal?
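As a sketch of keeping allocations down with pure numpy (the reusable `buf` is an assumption about your workload; the shapes mirror your example):

```python
import numpy as np

a = np.ones((2, 3), order='F')
b = np.ones((3, 4), order='F')
c = np.zeros((7, 4), order='F')[:2, :]   # non-contiguous target, as in your case

buf = np.empty((2, 4))        # scratch buffer, reusable across iterations
np.matmul(a, b, out=buf)      # product written into buf, no fresh temporary
np.add(c, buf, out=c)         # c += a @ b, updating c's memory in place
print(c)                      # [[3. 3. 3. 3.]
                              #  [3. 3. 3. 3.]]
```

This still allocates `buf` once, but avoids creating a new temporary on every update, and the write lands in `c`'s original (strided) memory.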
Regards,

András



>
>  >>> import numpy as np
>  >>> from scipy.linalg.blas import dgemm
>  >>> a = np.ones((2, 3), order='F')
>  >>> b = np.ones((3, 4), order='F')
>  >>> c = np.zeros((2, 4), order='F')
>  >>> dgemm(1.0, a, b, 1.0, c, 0, 0, 1)
> array([[3., 3., 3., 3.],
> [3., 3., 3., 3.]])
>  >>> print(c)
> [[3. 3. 3. 3.]
>   [3. 3. 3. 3.]]
>
> but if c is not contiguous, then c is not overwritten:
>
>  >>> c = np.zeros((7, 4), order='F')[:2, :]
>  >>> dgemm(1.0, a, b, 1.0, c, 0, 0, 1)
> array([[3., 3., 3., 3.],
> [3., 3., 3., 3.]])
>  >>> print(c)
> [[0. 0. 0. 0.]
>   [0. 0. 0. 0.]]
>
> Which is also what the docs say, but I think the raw BLAS function dgemm
> could do the update of c in-place by setting LDC=7.  See here:
>
>  http://www.netlib.org/lapack/explore-html/d7/d2b/dgemm_8f.html
>
> Is there a way to call the raw BLAS function from Python?
>
> I found this capsule thing, but I don't know if there is a way to call
> that (maybe using ctypes):
>
>  >>> from scipy.linalg import cython_blas
>  >>> cython_blas.__pyx_capi__['dgemm']
>  __pyx_t_5scipy_6linalg_11cython_blas_d *,
> __pyx_t_5scipy_6linalg_11cython_blas_d *, int *,
> __pyx_t_5scipy_6linalg_11cython_blas_d *, int *,
> __pyx_t_5scipy_6linalg_11cython_blas_d *,
> __pyx_t_5scipy_6linalg_11cython_blas_d *, int *)" at 0x7f06fe1d2ba0>
>
> Best,
> Jens Jørgen


Re: [Numpy-discussion] round / set_printoptions discrepancy

2019-09-13 Thread Andras Deak
On Fri, Sep 13, 2019 at 12:58 PM Irvin Probst
 wrote:
>
> Hi,
> Is it expected/documented that np.round and np.set_printoptions do not
> output the same result on screen ?
> I tumbled into this running this code:
>
> import numpy as np
> mes = np.array([
>  [16.06, 16.13, 16.06, 16.00, 16.06, 16.00, 16.13, 16.00]
> ])
>
> avg = np.mean(mes, axis=1)
> print(np.round(avg, 2))
> np.set_printoptions(precision=2)
> print(avg)
>
>
> Which outputs:
>
> [16.06]
> [16.05]
>
> Is that worth a bug report or did I miss something ? I've been able to
> reproduce this on many windows/linux PCs with python/numpy releases from
> 2017 up to last week.
>
> Thanks.

Hi,

I just want to add that you can use literal 16.055 to reproduce this:
>>> import numpy as np
>>> np.set_printoptions(precision=2)
>>> np.array([16.055]).round(2)
array([16.06])
>>> np.array([16.055])
array([16.05])

I would think it has to do with "round to nearest even":
>>> np.array(16.055)
array(16.05)
>>> np.array(16.065)
array(16.07)
>>> np.array(16.065).round(2)
16.07

But it's as if `round` rounded decimal digits upwards (16.055 ->
16.06, 16.065 -> 16.07), whereas the `repr` rounded to the nearest
odd(!) digit (16.055 -> 16.05, 16.065 -> 16.07). Does this make any
sense? I'm on numpy 1.17.2.
(Scalars or 1-length 1d arrays don't seem to make a difference).
Regards,

András


Re: [Numpy-discussion] round / set_printoptions discrepancy

2019-09-13 Thread Andras Deak
On Fri, Sep 13, 2019 at 2:59 PM Philip Hodge  wrote:
>
> On 9/13/19 8:45 AM, Irvin Probst wrote:
> > On 13/09/2019 14:05, Philip Hodge wrote:
> >>
> >> Isn't that just for consistency with Python 3 round()?  I agree that
> >> the discrepancy with np.set_printoptions is not necessarily expected,
> >> except possibly for backwards compatibility.
> >>
> >>
> >
> > I've just checked and np.set_printoptions behaves as python's round:
> >
> > >>> round(16.055,2)
> > 16.05
> > >>> np.round(16.055,2)
> > 16.06
> >
> > I don't know why round and np.round do not behave the same, actually I
> > would even dare to say that I don't care :-)
> > However np.round and np.set_printoptions should provide the same
> > output, shouldn't they ? This discrepancy is really disturbing whereas
> > consistency with python's round looks like the icing on the cake but
> > in no way a required feature.
> >
>
> Python round() is supposed to round to the nearest even value, if the
> two closest values are equally close.  So round(16.055, 2) returning
> 16.05 was a surprise to me.  I checked the documentation and found a
> note that explained that this was because "most decimal fractions can't
> be represented exactly as a float."  round(16.55) returns 16.6.
>
> Phil
>

Ah, of course, endless double-precision shenanigans...
>>> format(16.055, '.30f')
'16.054999999999999715782905695960'

>>> format(16.55, '.30f')
'16.550000000000000710542735760100'
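The `decimal` module makes this visible directly, since `Decimal(float)` shows the exact stored binary value without any rounding:

```python
from decimal import Decimal

print(Decimal(16.055))   # the exactly stored value is slightly *below*
                         # 16.055, so the repr rounds down to 16.05
print(Decimal(16.55))    # slightly *above* 16.55, so it rounds up to 16.6
```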

András


Re: [Numpy-discussion] Np.genfromtxt Problem

2019-10-04 Thread Andras Deak
On Fri, Oct 4, 2019 at 7:31 PM Stephen P. Molnar  wrote:
>
>
> I have a snippet of code
>
> #!/usr/bin/env python3
> # -*- coding: utf-8 -*-
> """
>
> Created on Tue Sep 24 07:51:11 2019
>
> """
> import numpy as np
>
> files = []
>
> data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
> skip_footer=1, encoding=None)
>
> print(data)
>
>
> If file is a single file the code generates the data that I want.
> However I have a list of files that I want to process. According to
> numpy.genfromtxt fname can be a "File, filename, list, or generator to
> read."  If I use [13-7a_apo-1acl.RMSD, 13-7_apo-1acl.RMSD,
> 14-7_apo-1acl.RMSD, 15-7_apo-1acl.RMSD, 17-7_apo-1acl.RMSD] I get the
> error:

Hi Stephen,

As far as I know genfromtxt is designed to read the contents of a
single file. Consider this quote from the docs for the first
parameter:
"The strings in a list or produced by a generator are treated as lines."
And the general description of the function says
"Load data from a text file, with missing values handled as specified."
("a text file", singular)
So if I understand correctly the list case is there so that you can
pass `f.readlines()` or equivalent into genfromtxt. From a
higher-level standpoint, how would reading multiple files behave if
the files have different structure, and what type and shape should the
function return in that case?
If one file can be read just fine then I suggest looping over them to
read each, one after the other. You can then tell python what to do
with each returned array and so it doesn't have to guess.
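For example, something along these lines (a sketch only: `StringIO` objects stand in for your real .RMSD files, and the column index and skip arguments would be the ones from your script):

```python
import io
import numpy as np

# Hypothetical stand-ins for the real files; each behaves like an open file.
fake_files = {
    "13-7_apo-1acl.RMSD": io.StringIO("a b c 1.0\na b c 2.0\n"),
    "14-7_apo-1acl.RMSD": io.StringIO("a b c 3.0\na b c 4.0\n"),
}

results = {}
for name, fh in fake_files.items():
    # One genfromtxt call per file; real code would pass the file name
    # plus your skip_header/skip_footer/dtype arguments.
    results[name] = np.genfromtxt(fh, usecols=(3,))

print(results["13-7_apo-1acl.RMSD"])  # -> [1. 2.]
```

You then have one array per file and can decide yourself how to combine them.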
Regards,

András




>
> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/Results/RMSDTable/Test/DeltaGTable_s.py',
> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/Results/RMSDTable/Test',
> current_namespace=True)
> Traceback (most recent call last):
>
>File
> "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/Results/RMSDTable/Test/DeltaGTable_s.py",
> line 12, in 
>  data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
> skip_footer=1, encoding=None)
>
>File
> "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
> line 1762, in genfromtxt
>  next(fhd)
>
> StopIteration
>
> I have tried very combination of search terms that I can think of in
> order to find an example of how to make this work without success.
>
> How can I make this work?
>
> Thanks in advance.
>
> --
> Stephen P. Molnar, Ph.D.
> www.molecular-modeling.net
> 614.312.7528 (c)
> Skype:  smolnar1
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Problem with np.savetxt

2019-10-08 Thread Andras Deak
On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar  wrote:
>
> I am embarrassed to be asking this question, but I have exhausted Google
> at this point .
>
> I have a number of identically formatted text files from which I want to
> extract data, as an example (hopefully, putting these in as quotes will
> persevere the format):
>
> > ===
> > PSOVina version 2.0
> > Giotto H. K. Tai & Shirley W. I. Siu
> >
> > Computational Biology and Bioinformatics Lab
> > University of Macau
> >
> > Visit http://cbbio.cis.umac.mo for more information.
> >
> > PSOVina was developed based on the framework of AutoDock Vina.
> >
> > For more information about Vina, please visit http://vina.scripps.edu.
> >
> > ===
> >
> > Output will be 13-7_out.pdbqt
> > Reading input ... done.
> > Setting up the scoring function ... done.
> > Analyzing the binding site ... done.
> > Using random seed: 1828390527
> > Performing search ... done.
> >
> > Refining results ... done.
> >
> > mode |   affinity | dist from best mode
> >      | (kcal/mol) | rmsd l.b.| rmsd u.b.
> > -----+------------+----------+----------
> >    1   -8.862004149  0.000  0.000
> >    2   -8.403522829  2.992  6.553
> >    3   -8.401384636  2.707  5.220
> >    4   -7.886402037  4.907  6.862
> >    5   -7.845519031  3.233  5.915
> >    6   -7.837434227  3.954  5.641
> >    7   -7.834584887  3.188  7.294
> >    8   -7.694395765  3.746  7.553
> >    9   -7.691211177  3.536  5.745
> >   10   -7.670759445  3.698  7.587
> >   11   -7.661882758  4.882  7.044
> >   12   -7.636280303  2.347  3.284
> >   13   -7.635788052  3.511  6.250
> >   14   -7.611175249  2.427  3.449
> >   15   -7.586368357  2.142  2.864
> >   16   -7.531307666  2.976  4.980
> >   17   -7.520501084  3.085  5.775
> >   18   -7.512906514  4.220  7.672
> >   19   -7.307403528  3.240  4.354
> >   20   -7.256063348  3.694  7.252
> > Writing output ... done.
>   At this point, my python script consists of only the following:
>
> > #!/usr/bin/env python3
> > # -*- coding: utf-8 -*-
> > """
> >
> > Created on Tue Sep 24 07:51:11 2019
> >
> > """
> > import numpy as np
> >
> > data = []
> >
> > data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
> > skip_header=27, skip_footer=1, encoding=None)
> >
> > print(data)
> >
> > np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
>
> The problem lies in tfe np.savetxt line, on execution I get:
>
> > runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
> > wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
> > current_namespace=True)
> > ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
> >  '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
> >  '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
> >  '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
> >  '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
> > Traceback (most recent call last):
> >
> >   File
> > "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
> > line 16, in 
> > np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
> >
> >   File "<__array_function__ internals>", line 6, in savetxt
> >
> >   File
> > "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
> > line 1438, in savetxt
> > % (str(X.dtype), format))
> >
> > TypeError: Mismatch between array dtype ('<U12') and format specifier
> > ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> > %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> > %16.9f')
>
> The data is in the data file, but the only entry in '13-7', the saved
> file, is the label. Obviously, the error is in the format argument.

Hi,

One problem is the format: the error is telling you that you have
strings in your array (compare the `'<U12'` dtype in the error
message). Since you read the column with `dtype=None`, genfromtxt kept
the values as strings; convert them to floats (for instance with
`data.astype(float)`, or by passing `dtype=float` to genfromtxt) and
the `%15.9f` format should work.
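A minimal sketch of the fix (hypothetical data standing in for your parsed column, and a StringIO standing in for the output file):

```python
import io
import numpy as np

data = np.array(['-8.839713733', '-8.743377250'])  # what genfromtxt returned
numeric = data.astype(float)                       # strings -> float64

buf = io.StringIO()  # a real file name works the same way
np.savetxt(buf, [numeric], fmt='%15.9f', header='13-7')
print(buf.getvalue())  # header line followed by one row of floats
```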
> Help will be much appreciated.
>
> Thanks in advance.
>
> --
> Stephen P. Molnar, Ph.D.
> www.molecular-modeling.net
> 614.312.7528 (c)
> Skype:  smolnar1
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Problem with np.savetxt

2019-10-08 Thread Andras Deak
PS. if you just want to specify the width of the fields you wouldn't
have to convert anything, because you can specify the size and
justification of a %s format. But arguably having float data as floats
is more natural anyway.
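For instance (a sketch, with hypothetical string data standing in for the parsed column):

```python
import io
import numpy as np

data = np.array(['-8.839713733', '-8.743377250'])  # still strings
buf = io.StringIO()
# a width-15, right-justified string field; no float conversion needed
np.savetxt(buf, [data], fmt='%15s', header='13-7')
print(buf.getvalue())
```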

On Tue, Oct 8, 2019 at 3:42 PM Andras Deak  wrote:
>
> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar  
> wrote:
> >
> > I am embarrassed to be asking this question, but I have exhausted Google
> > at this point .
> >
> > I have a number of identically formatted text files from which I want to
> > extract data, as an example (hopefully, putting these in as quotes will
> > persevere the format):
> >
> > > ===
> > > PSOVina version 2.0
> > > Giotto H. K. Tai & Shirley W. I. Siu
> > >
> > > Computational Biology and Bioinformatics Lab
> > > University of Macau
> > >
> > > Visit http://cbbio.cis.umac.mo for more information.
> > >
> > > PSOVina was developed based on the framework of AutoDock Vina.
> > >
> > > For more information about Vina, please visit http://vina.scripps.edu.
> > >
> > > ===
> > >
> > > Output will be 13-7_out.pdbqt
> > > Reading input ... done.
> > > Setting up the scoring function ... done.
> > > Analyzing the binding site ... done.
> > > Using random seed: 1828390527
> > > Performing search ... done.
> > >
> > > Refining results ... done.
> > >
> > > mode |   affinity | dist from best mode
> > >      | (kcal/mol) | rmsd l.b.| rmsd u.b.
> > > -----+------------+----------+----------
> > >    1   -8.862004149  0.000  0.000
> > >    2   -8.403522829  2.992  6.553
> > >    3   -8.401384636  2.707  5.220
> > >    4   -7.886402037  4.907  6.862
> > >    5   -7.845519031  3.233  5.915
> > >    6   -7.837434227  3.954  5.641
> > >    7   -7.834584887  3.188  7.294
> > >    8   -7.694395765  3.746  7.553
> > >    9   -7.691211177  3.536  5.745
> > >   10   -7.670759445  3.698  7.587
> > >   11   -7.661882758  4.882  7.044
> > >   12   -7.636280303  2.347  3.284
> > >   13   -7.635788052  3.511  6.250
> > >   14   -7.611175249  2.427  3.449
> > >   15   -7.586368357  2.142  2.864
> > >   16   -7.531307666  2.976  4.980
> > >   17   -7.520501084  3.085  5.775
> > >   18   -7.512906514  4.220  7.672
> > >   19   -7.307403528  3.240  4.354
> > >   20   -7.256063348  3.694  7.252
> > > Writing output ... done.
> >   At this point, my python script consists of only the following:
> >
> > > #!/usr/bin/env python3
> > > # -*- coding: utf-8 -*-
> > > """
> > >
> > > Created on Tue Sep 24 07:51:11 2019
> > >
> > > """
> > > import numpy as np
> > >
> > > data = []
> > >
> > > data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
> > > skip_header=27, skip_footer=1, encoding=None)
> > >
> > > print(data)
> > >
> > > np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
> >
> > The problem lies in tfe np.savetxt line, on execution I get:
> >
> > > runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
> > > wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
> > > current_namespace=True)
> > > ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
> > >  '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
> > >  '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
> > >  '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
> > >  '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
> > > Traceback (most recent call last):
> > >
> > >   File
> > > "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
> 

Re: [Numpy-discussion] np.genfromtxt StopIteration Error

2019-10-11 Thread Andras Deak
Hi Stephen,

Is this not what your original question to this list was about? See
https://mail.python.org/pipermail/numpy-discussion/2019-October/080130.html
and replies.
I still believe that you _can't_ give genfromtxt file names in an
iterable. The iterable input is only intended to contain the contents
of a single file, probably as if read using `f.readlines()`.
Genfromtxt seems to read data from at most one file with each call.
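So, in a sketch, the list form behaves like this:

```python
import numpy as np

# A list of strings is treated as the *lines of one file*, not as file names:
lines = ["1.0 2.0", "3.0 4.0"]
data = np.genfromtxt(lines)
print(data)  # -> [[1. 2.]
             #     [3. 4.]]
```

Passing a list of file names instead makes genfromtxt try to parse the names themselves as data lines, which is why it fails.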
Regards,

András

On Fri, Oct 11, 2019 at 8:41 PM Stephen P. Molnar
 wrote:
>
> Thanks for the reply.
>
> Keep in mind that i am a Chemist, not an IT person. I used to be a
> marginally proficient FORTRAN II user in the ancient past.
>
> I tried running your code. Please see my comments/questing below:
>
> On 10/11/2019 01:12 PM, Bennet Fauber wrote:
> > I think genfromtxt() wants a filename as the first argument, and you
> > have to tell it the entries in the file are strings not numerics.
> >
> > test.py
> > --
> > import os
> > import glob
> > import numpy as np
> >
> > fileList = []
> > filesList = []
> >
> > for files in glob.glob("*.log"):
> >  fileName, fileExtension = os.path.splitext(files)
> >  fileList.append(fileName)
> >  filesList.append(files)
> >
> > print('fileList = ', fileList)
> > print('filesList = ', filesList)
> >
> > fname = '/tmp/foo.txt'
> There is no '/temp/foo.txt' Where did it come from in your example?
> > print('fname = ', fname)
> > data = np.genfromtxt(fname, dtype=str)
> > print(data)
> > --
> >
> > Contents of /tmp/foo.txt
> > --
> > 15-7.log
> > 18-7.log
> > 14-7.log
> > C-VX3.log
> > --
> >
> > Sample run
> I'm using python 3.7.3, should this make a difference?
> >
> > $ python --version
> > Python 2.7.15+
> >
> > $ python t.py
> > ('fileList = ', ['15-7', '18-7', '14-7', 'C-VX3'])
> > ('filesList = ', ['15-7.log', '18-7.log', '14-7.log', 'C-VX3.log'])
> > ('fname = ', '/tmp/foo.txt')
> > ['15-7.log' '18-7.log' '14-7.log' 'C-VX3.log']
> >
> > Is that any help?
> if I use data = np.genfromtxt('14-7.log', dtype=str, usecols=(1),
> skip_header=27, skip_footer=1, encoding=None) with a specific file name.
> in this example 14-7, I get the resutt I desired:
>
> # 14-7
> -9.960902669
> -8.979504781
> -8.942611364
> -8.915523010
> -8.736508831
> -8.663387139
> -8.410739711
> -8.389146347
> -8.296798909
> -8.168454106
> -8.127990818
> -8.127103774
> -7.979090739
> -7.941872682
> -7.900766215
> -7.881485228
> -7.837826485
> -7.815909505
> -7.722540286
> -7.720346742
>
> so, my question is: why the StopIteration error message in my original
> query? Why is the script not iterating over the log files?
>
> Sorry to be so dense.
>
>
> >
> > On Fri, Oct 11, 2019 at 12:41 PM Stephen P. Molnar
> >  wrote:
> >> I have been fighting with the genfromtxt function in numpy for a while now 
> >> and am trying a slightly different approach.
> >>
> >> Here is the code:
> >>
> >>
> >> import os
> >> import glob
> >> import numpy as np
> >>
> >> fileList = []
> >> filesList = []
> >>
> >> for files in glob.glob("*.log"):
> >>     fileName, fileExtension = os.path.splitext(files)
> >>     fileList.append(fileName)
> >>     filesList.append(files)
> >>
> >> print('fileList = ', fileList)
> >> print('filesList = ', filesList)
> >>
> >> fname = filesList
> >> print('fname = ', fname)
> >> data = np.genfromtxt(fname, usecols=(1), skip_header=27, skip_footer=1, 
> >> encoding=None)
> >> print(data)
> >>
> >> np.savetxt('fileList.dG', data, fmt='%12.9f', header='${d}')
> >> print(data.dG)
> >>
> >> I am using the Spyder IDE which has a variable explorer which shows:
> >>
> >> filesList = ['C-VX3.log', '18-7.log', '14-7.log', '15-7.log']
> >> fileList = ['C-VX3', '18-7', '14-7', '15-7']
> >>
> >> so the lists that genfromtxt needs are being generated.
> >>
> >> Googling 'numpy genfromtxt stopiteration error' does not seem to address 
> >> this problem. At least, I didn't find anything that I thought applied.
> >>
> >> I would greatly appreciate some assistance here.
> >>
> >> Thanks in advance.
> >>
> >> --
> >> Stephen P. Molnar, Ph.D.
> >> www.molecular-modeling.net
> >> 614.312.7528 (c)
> >> Skype:  smolnar1
> >>
> >> ___
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion@python.org
> >> https://mail.python.org/mailman/listinfo/numpy-discussion
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>
> --
> Stephen P. Molnar, Ph.D.
> www.molecular-modeling.net
> 614.312.7528 (c)
> Skype:  smolnar1
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] new numpy.org is live

2020-05-24 Thread Andras Deak
Dear Inessa,

The new design looks great, thanks for all the hard work from everyone!
Is there a well-defined channel where we can file potential bug
reports and feature requests for the website? Or do those just go on
the main numpy repo as issues?
Regards,

András

On Sun, May 24, 2020 at 2:11 PM Inessa Pawson  wrote:
>
> The NumPy web team is excited to announce the launch of the newly redesigned 
> numpy.org. To transform the website into a comprehensive, yet user-centric, 
> resource of all things NumPy was a primary focus of this months-long effort. 
> We thank Joe LaChance, Ralf Gommers, Shaloo Shalini, Shekhar Prasad Rajak, 
> Ross Barnowski, and Mars Lee for their extensive contributions to the project.
>
> The new site features a curated collection of NumPy related educational 
> resources for every user level, an overview of the entire Python scientific 
> computing ecosystem, and several case studies highlighting the importance of 
> the library to the many advances in scientific research as well as the 
> industry in recent years. The “Install” and “Get Help” pages offer advice on 
> how to find answers to installation and usage questions, while those who are 
> looking to connect with others within our large and diverse community will 
> find the “Community” page very helpful.
>
> The new website will be updated on a regular basis with news about the NumPy 
> project development milestones, community initiatives and events. Visitors 
> are encouraged to explore the website and sign up for the newsletter.
>
> Next, the NumPy web team will focus on updating graphics and project identity 
> (a new logo is coming!), adding an installation widget and translations, 
> better integrating the project documentation via the new Sphinx theme, and 
> improving the interactive terminal experience. Also, we are looking to expand 
> our portfolio of case studies and would appreciate any assistance in this 
> matter.
>
> Best regards,
> Inessa Pawson
> NumPy web team
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Numpy FFT normalization options issue (addition of new option)

2020-06-28 Thread Andras Deak
On Sun, Jun 28, 2020 at 9:37 PM Neal Becker  wrote:
>
> Honestly, I don't find "forward" very informative.  There isn't any real 
> convention on whether FFT or IFFT have any normalization.
> To the best of my experience, either forward or inverse could be normalized 
> by 1/N, or each normalized by 1/sqrt(N), or neither
> could be normalized.  I will say my expertise is in signal processing and 
> communications.
>
> Perhaps
> norm = {full, half, none} would be clearest to me.

If I understand your point correctly and the discussion so far, the
intention here is to use the keyword to denote the convention for an
FFT-IFFT pair rather than just normalization in a single
transformation (either FFT or IFFT).
The idea being that calling ifft on the output of fft while using the
same `norm` would be more or less identity. This would work for
"half", but not for, say, "full". We need to come up with a name that
specifies where normalization happens with regards to the
forward-inverse pair.
Does this make sense, considering your point?
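To illustrate with the options that already exist (a sketch; the proposed "forward" would complete the picture by putting the full 1/n factor on the forward transform instead):

```python
import numpy as np

x = np.random.rand(8)

# default ("backward"): no scaling on fft, the full 1/n on ifft
assert np.allclose(np.fft.ifft(np.fft.fft(x)), x)

# "ortho": 1/sqrt(n) on each half of the pair -- Neal's "half"
assert np.allclose(np.fft.ifft(np.fft.fft(x, norm="ortho"), norm="ortho"), x)

# In every case, using the same `norm` on both calls round-trips,
# which is why the name has to describe where in the fft/ifft pair
# the normalization lives, not just "how much" normalization there is.
```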

András

>
> Thanks,
> Neal
>
> On Sat, Jun 27, 2020 at 10:40 AM Sebastian Berg  
> wrote:
>>
>> On Fri, 2020-06-26 at 21:53 -0700, leofang wrote:
>> > Hi all,
>> >
>> >
>> > Since I brought this issue from CuPy to Numpy, I'd like to see a
>> > decision
>> > made sooner than later so that downstream libraries like SciPy and
>> > CuPy can
>> > act accordingly. I think norm='forward' is fine. If there're still
>> > people
>> > unhappy with it after my reply, I'd suggest norm='reverse'. It has
>> > the same
>> > meaning, but is less confusing (than 'inverse' or other choices on
>> > the
>> > table) to me.
>> >
>>
>> I expect "forward" is good (if I misread something please correct me),
>> and I think we can go ahead with it, sorry for the delay.  However, I
>> have send an email to scipy-dev, since we should give them at least a
>> heads-up, and if you do not mind, I would wait a few days to actually
>> merge (although we can also simply reverse, as long as CuPy does not
>> have a release with it).
>>
>> It might be nice to expand the kwarg docs slightly with a sentence for
>> each normalization mode?  Refering to `np.fft` docs is good, but if we
>> can squeeze in a short refresher and refer there for details/formula it
>> would be nicer.
>> I feel "forward" is very intuitive, but only after pointing out that it
>> is related to whether the fft or ifft has the normalization factor.
>>
>> Cheers,
>>
>> Sebastian
>>
>>
>> >
>> > Best,
>> > Leo
>> >
>> >
>> >
>> > --
>> > Sent from: http://numpy-discussion.10968.n7.nabble.com/
>> > ___
>> > NumPy-Discussion mailing list
>> > NumPy-Discussion@python.org
>> > https://mail.python.org/mailman/listinfo/numpy-discussion
>> >
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
>
>
> --
> Those who don't understand recursion are doomed to repeat it
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] An alternative to vectorize that lets you access the array?

2020-07-12 Thread Andras Deak
On Sun, Jul 12, 2020 at 3:02 PM Ram Rachum  wrote:
>
> Hi everyone,
>
> Here's a problem I've been dealing with. I wonder whether NumPy has a tool 
> that will help me, or whether this could be a useful feature request.
>
> In the upcoming EuroPython 2020, I'll do a talk about live-coding a music 
> synthesizer. It's going to be a fun talk, I'll use the sounddevice module to 
> make a program that plays music. Do attend, or watch it on YouTube when it's 
> out :)
>
> There's a part in my talk that I could make simpler, and thus shave 3-4 
> minutes of cumbersome explanations. These 3-4 minutes matter a great deal to 
> me. But for that I need to do something with NumPy and I don't know whether 
> it's possible or not.
>
>
> The sounddevice library takes an ndarray of sound data and plays it. 
> Currently I use `vectorize` to produce that array:
>
> output_array = np.vectorize(f, otypes='d')(input_array)
>
> And I'd like to replace it with this code, which is supposed to give the same 
> output:
>
> output_array = np.ndarray(input_array.shape, dtype='d')
> for i, item in enumerate(input_array):
> output_array[i] = f(item)
>
> The reason I want the second version is that I can then have sounddevice 
> start playing `output_array` in a separate thread, while it's being 
> calculated. (Yes, I know about the GIL, I believe that sounddevice releases 
> it.)
>
> Unfortunately, the for loop is very slow, even when I'm not processing the 
> data on separate thread. I benchmarked it on both CPython and PyPy3, which is 
> my target platform. On CPython it's 3 times slower than vectorize, and on 
> PyPy3 it's 67 times slower than vectorize! That's despite the fact that the 
> Numpy documentation says "The `vectorize` function is provided primarily for 
> convenience, not for performance. The implementation is essentially a `for` 
> loop."
>
> So here are a few questions:
>
> 1. Is there something like `vectorize`, except you get to access the output 
> array before it's finished? If not, what do you think about adding that as an 
> option to `vectorize`?
>
> 2. Is there a more efficient way of writing the `for` loop I've written 
> above? Or any other kind of solution to my problem?

Hi Ram,

I find your description of the behaviour really surprising! My
experience with np.vectorize has been consistent with the
documentation's note. Can you please provide some more context?
  1. What shape is your array?
  2. How exactly did you compute the runtimes?
  3. How large runtimes are we talking? Are you sure you're not
measuring some kind of overhead?
  4. What kind of work does f do? This is mostly relevant for your
question about alternatives for your loop.

Unfortunately I don't believe it's possible, or that it would even _be_
possible, to give access to half-done results of computations. As far
as I know even asynchronous libraries make you wait until some
result is done. So unless you chop up your array along the first
dimension and explicitly work with each slice independently, I'm
pretty sure this is not possible. Just imagine the wealth of possible
race conditions if you could have access to half-initialized arrays.

The only actionable suggestion I have for your loop is to replace the
`np.ndarray` call with one to `np.empty`. My impression has always
been that arrays should be instantiated with one of the helper
functions rather than directly from the type. Personally, I don't use
vectorize at all because I tend to find that it only misleads the
reader.
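For example (a sketch with a stand-in `f`; `np.empty` just allocates the memory without initializing it, which is all you need before a fill loop):

```python
import math

import numpy as np

def f(x):
    # hypothetical stand-in for the synth's per-sample function
    return math.sin(x)

input_array = np.linspace(0.0, 1.0, 10)

output_array = np.empty(input_array.shape, dtype='d')  # not np.ndarray(...)
for i, item in enumerate(input_array):
    output_array[i] = f(item)

# same result as the vectorize version
assert np.allclose(output_array, np.vectorize(f, otypes='d')(input_array))
```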
Regards,

András

>
> Thanks for your help,
> Ram Rachum.
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do not understand what f2py is reporting

2020-11-02 Thread Andras Deak
On Sun, Nov 1, 2020 at 2:33 AM Samuel Dupree  wrote:
>
> I'm attempting to build wrappers around two Fortran routines. One is a
> Fortran 77 subroutine (see file gravity_derivs.f) that calls a Fortran
> 90 package that performs automatic differentiation (see file
> auto_deriv.f90).
>
> I'm running he Anaconda distribution for Python 3.7.6 on a Mac Pro
> (2019) under Mac OS X Catalina (ver. 10.15.6). The version of NumPy I'm
> running is 1.18.3. The commands I used to attempt the build are
> contained in the file auto_deriv_build. The messages output by f2py are
> captured in the file auto_derivs_build_report.txt.
>
> I don't understand the cause behind the error messages I got, so any
> advice would be welcomed.
>
> Sam Dupree.

Hi Sam,

I've got a partial solution.
I haven't used f2py yet but at least the error from your first `f2py`
call seems straightforward. Near the top:

Line #119 in gravity_derivs.f:"  integer * 4degree"
updatevars: no name pattern found for entity='*4degree'. Skipping.

This shows that the fortran code gets parsed as `(integer)
(*4degree)`. That can't be right. There might be a way to tell f2py to
do this right, but anyway I could make your code compile by replacing
every such declaration with `integer * 4 :: degree` etc (i.e. adding
double colons everywhere).
Once that's fixed your first f2py call raises another error:

Fatal Error: Cannot open module file ‘deriv_class.mod’ for reading
at (1): No such file or directory

I could generate these mod files by manually running `gfortran -c
auto_deriv.f90`. After that the .mod files appear and your first
`f2py` call will succeed.
You can now `import gravity_derivs`, but of course this will lead to
an error because `auto_deriv` is not available in python.
Unfortunately your _second_ `f2py` call also dies on `auto_deriv.f90`,
with such offending lines:

In: :auto_deriv:auto_deriv.f90:ad_auxiliary
get_parameters: got "invalid syntax (<string>, line 1)" on '(/((i,
i=j,n), j=1,n)/)'

I'm guessing that again f2py can't parse that syntax.
My hunch is that if you can get f2py to work with `auto_deriv.f90` you
should first run that. This should hopefully generate the .mod files
after which the second call to `f2py` with `gravity_derivs.f` should
work. If `f2py` doesn't generate the .mod files you could at worst run
your fortran compiler yourself between the two calls to `f2py`.
Cheers,

András

> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] np.{bool,float,int} deprecation

2020-12-05 Thread Andras Deak
On Sun, Dec 6, 2020 at 12:31 AM Juan Nunez-Iglesias  wrote:
>
> Hi all,
>
> At the prodding [1] of Sebastian, I’m starting a discussion on the decision 
> to deprecate np.{bool,float,int}. This deprecation broke our prerelease 
> testing in scikit-image (which, hooray for rcs!), and resulted in a large 
> amount of code churn to fix [2].
>
> To be honest, I do think *some* sort of deprecation is needed, because for 
> the longest time I thought that np.float was what np.float_ actually is. I 
> think it would be worthwhile to move to *that*, though it’s an even more 
> invasive deprecation than the currently proposed one. Writing `x = 
> np.zeros(5, dtype=int)` is somewhat magical, because someone with a strict 
> typing mindset (there’s an increasing number!) might expect that this is an 
> array of pointers to Python ints. This is why I’ve always preferred to write 
> `dtype=np.int`, resulting in the current code churn.
>
> I don’t know what the best answer is, just sparking the discussion Sebastian 
> wants to see. ;) For skimage we’ve already merged a fix (even if it is one of 
> dubious quality, as Stéfan points out [3] ;), so I don’t have too much stake 
> in the outcome.

Hi Juan,

Let me start with a disclaimer that I'm an end user, and as such it's
very easy for me to be bold when it comes to deprecations :)

But I experienced the same thing that you describe in
https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739429373
:

> [I]t was very surprising to me when I found out that np.float is float. For 
> the longest time I thought that np.float was equivalent to "whatever the 
> default float value is on my platform", and considered it best practice to 
> use that instead of plain float. 😅 I think that is a common misconception.

And I'm pretty sure the vast majority of end users faces this. The
proper np.float32 and other types are intuitive enough that people
don't go out of their way to read the documentation in detail, and
it's highly unexpected that some `np.*` types are mere aliases.
Now, this should probably not be a problem as long as people only
stick these aliases into `dtype` keyword arguments, because that works
as expected (based on the wrong premise). But once you extrapolate
from the `dtype=np.int` behaviour to "`np.int` must be my native numpy
int type" you can easily get subtle bugs. For instance, you might
expect `isinstance(this_type, np.int)` to give you True if `this_type`
is the type of an item of an array with `dtype=np.int`.
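A sketch of the kind of subtlety involved (it so happens that np.float64 subclasses Python's float, while the numpy integer scalar types do not subclass int):

```python
import numpy as np

a = np.zeros(3, dtype=int)  # platform-default integer dtype
item = a[0]                 # a numpy integer scalar, e.g. np.int64

assert isinstance(np.float64(1.0), float)  # np.float64 IS a float subclass
assert not isinstance(item, int)           # but an int-array item is not an int
# ...so `isinstance(item, np.int)` -- which is really `isinstance(item, int)`
# because np.int is just an alias for the builtin -- silently does the wrong
# thing if you believed np.int was the numpy scalar type.
```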

To be fair I'm not sure that I've ever been bitten by this
personally... but once you're aware of the pitfall it seems really
ominous. I guess one helpful question is this: among all the code
churn needed to fix the breakage did you find any bugs that were
revealed by the deprecation? If that's the case (in scikit-image or
any other large downstream library) then that would be a good argument
for going forward with the deprecation.
Cheers,

András


> Juan.
>
> [1]: 
> https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739334463
> [2]: https://github.com/scikit-image/scikit-image/pull/5103
> [3]: 
> https://github.com/scikit-image/scikit-image/pull/5103#issuecomment-739368765
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Programmatically contracting multiple tensors

2021-03-12 Thread Andras Deak
On Sat, Mar 13, 2021 at 1:32 AM Eric Wieser  wrote:
>
> Einsum has a secret integer argument format that appears in the Examples 
> section of the `np.einsum` docs, but appears not to be mentioned at all in 
> the parameter listing.

It's mentioned (albeit somewhat cryptically) earlier in the Notes:

"einsum also provides an alternative way to provide the subscripts and
operands as einsum(op0, sublist0, op1, sublist1, ..., [sublistout]).
If the output shape is not provided in this format einsum will be
calculated in implicit mode, otherwise it will be performed
explicitly. The examples below have corresponding einsum calls with
the two parameter methods.

New in version 1.10.0."

Not that this helps much, because I definitely wouldn't understand
this API without the examples.
But I'm not sure _where_ this could be highlighted among the
parameters; after all, it is all covered by the `*operands` parameter.
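For reference, a minimal illustration of the two equivalent call styles (my own example, not taken from the docs):

```python
import numpy as np

a = np.random.rand(3, 4)
b = np.random.rand(4, 5)

# Subscript-string form:
s1 = np.einsum('ij,jk->ik', a, b)
# Equivalent integer-sublist form: each operand is followed by a list
# of axis labels, and the final list gives the output axes.
s2 = np.einsum(a, [0, 1], b, [1, 2], [0, 2])

print(np.allclose(s1, s2))  # True
```

Since the labels are arbitrary integers, this form also avoids the 26-letter alphabet limit when constructing contractions programmatically.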

András



> Eric
>
> On Sat, 13 Mar 2021 at 00:25, Michael Lamparski  
> wrote:
>>
>> Greetings,
>>
>> I have something in my code where I can receive an array M of unknown 
>> dimensionality and a list of "labels" for each axis.  E.g. perhaps I might 
>> get an array of shape (2, 47, 3, 47, 3) with labels ['spin', 'atom', 
>> 'coord', 'atom', 'coord'].
>>
>> For every axis that is labeled "coord", I want to multiply in some rotation 
>> matrix R.  So, for the above example, this could be done with the following 
>> handwritten line:
>>
>> return np.einsum('Cc,Ee,abcde->abCdE', R, R, M)
>>
>> But since I want to do this programmatically, I find myself in the awkward 
>> situation of having to construct this string (and e.g. having to arbitrarily 
>> limit the number of axes to 26 or something like that).  Is there a more 
>> idiomatic way to do this that would let me supply integer labels for 
>> summation indices?  Or should I just bite the bullet and start generating 
>> strings?
>>
>> ---
>> Michael
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] How to get Boolean matrix for similar lists in two different-size numpy arrays of lists

2021-03-14 Thread Andras Deak
On Sun, Mar 14, 2021 at 8:35 PM Robert Kern  wrote:
>
> On Sun, Mar 14, 2021 at 3:06 PM Ali Sheikholeslam 
>  wrote:
>>
>> I have written a question in:
>> https://stackoverflow.com/questions/66623145/how-to-get-boolean-matrix-for-similar-lists-in-two-different-size-numpy-arrays-o
>> It was recommended by numpy to send this subject to the mailing lists.
>>
>> The question is as follows. I would be appreciated if you could advise me to 
>> solve the problem:
>>
>> At first, I write a small example of two lists:
>>
>> F = [[1,2,3],[3,2,7],[4,4,1],[5,6,3],[1,3,7]]  # (1*5) 5 lists
>> S = [[1,3,7],[6,8,1],[3,2,7]]  # (1*3) 3 lists
>>
>> I want to get Boolean matrix for the same 'list's in two F and S:
>>
>> [False, True, False, False, True]  #  (1*5)5 
>> Booleans for 5 lists of F
>>
>> By using IM = reduce(np.in1d, (F, S)) it gives results for each number in 
>> each lists of F:
>>
>> [ True  True  True  True  True  True False False  True False  True  True
>>   True  True  True]   # (1*15)
>>
>> By using IM = reduce(np.isin, (F, S)) it gives results for each number in 
>> each lists of F, too, but in another shape:
>>
>> [[ True  True  True]
>>  [ True  True  True]
>>  [False False  True]
>>  [False  True  True]
>>  [ True  True  True]]   # (5*3)
>>
>> The true result will be achieved by code IM = [i in S for i in F] for the 
>> example lists, but when I'm using this code for my two main bigger numpy 
>> arrays of lists:
>>
>> https://drive.google.com/file/d/1YUUdqxRu__9-fhE1542xqei-rjB3HOxX/view?usp=sharing
>>
>> numpy array: 3036 lists
>>
>> https://drive.google.com/file/d/1FrggAa-JoxxoRqRs8NVV_F69DdVdiq_m/view?usp=sharing
>>
>> numpy array: 300 lists
>>
>> It gives wrong answer. For the main files it must give 3036 Boolean, in 
>> which 'True' is only 300 numbers. I didn't understand why this get wrong 
>> answers?? It seems it applied only on the 3rd characters in each lists of F. 
>> It is preferred to use reduce function by the two functions, np.in1d and 
>> np.isin, instead of the last method. How could to solve each of the three 
>> above methods??
>
>
> Thank you for providing the data. Can you show a complete, runnable code 
> sample that fails? There are several things that could go wrong here, and we 
> can't be sure which is which without the exact code that you ran.
>
> In general, you may well have problems with the floating point data that you 
> are not seeing with your integer examples.
>
> FWIW, I would continue to use something like the `IM = [i in S for i in F]` 
> list comprehension for data of this size.

Although somewhat off-topic for the numpy aspect, for completeness'
sake let me add that you'll probably want to first turn your list of
lists `S` into a set of tuples, and then look up each list in `F`
converted to a tuple (`[tuple(lst) in setified_S for lst in F]`). That
would probably be a lot faster for large lists.
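With the example lists from the question, the suggestion looks like this:

```python
F = [[1, 2, 3], [3, 2, 7], [4, 4, 1], [5, 6, 3], [1, 3, 7]]
S = [[1, 3, 7], [6, 8, 1], [3, 2, 7]]

# Build the set once: tuple membership tests are O(1) on average,
# versus O(len(S)) for `i in S` with a list of lists.
setified_S = {tuple(lst) for lst in S}
IM = [tuple(lst) in setified_S for lst in F]
print(IM)  # [False, True, False, False, True]
```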

András



> You aren't getting any benefit trying to convert to arrays and using
> our array set operations. They are written for 1D arrays of numbers,
> not 2D arrays (attempting to treat them as 1D arrays of lists) and
> won't really work on your data.
>
> --
> Robert Kern
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] savetxt -> gzip: nondeterministic because of time stamp

2021-04-14 Thread Andras Deak
On Wed, Apr 14, 2021 at 10:36 PM Joachim Wuttke  wrote:
>
> If argument fname of savetxt(fname, X, ...) ends with ".gz" then
> array X is not only converted to text, but also compressed using gzip.
>
> The format gzip [1] has a timestamp. The Python module gzip.py [2]
> sets the timestamp according to an optional constructor argument
> "mtime". By default, the current time is used.
>
> This makes the file written by savetxt(*.gz, ...) non-deterministic.
> This is unexpected and confusing in a numerics context.

Related: same for np.savez https://github.com/numpy/numpy/issues/9439

András


> I let different versions of a program generate *.gz files, and ran
> the "diff" util over pairs of output files to check whether any bit
> had changed. To my surprise, confusion, and desperation, output
> always had changed, and kept changing when I ran unchanged versions
> of my program over and again. So I learned the hard way that the
> *.gz files contain a timestamp.
>
> Regarding the module gzip.py, I submitted a pull request to improve
> description of the optional argument mtime, and hint at the possible
> choice mtime = 0 that makes outputs deterministic [3].
>
> Regarding numpy, I'd propose a bolder measure:
> To let savetxt(fname, X, ...) store exactly the same information in
> compressed and uncompressed files, always invoke gzip with mtime = 0.
>
> I would like to follow up with a pull request, but I am unable to
> find out how numpy.savetxt is invoking gzip.
>
> Joachim
>
> [1] https://www.ietf.org/rfc/rfc1952.txt
> [2] https://docs.python.org/3/library/gzip.html
> [3] https://github.com/python/cpython/pull/25410
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Add count (and dtype) to packbits

2021-07-21 Thread Andras Deak
On Wed, Jul 21, 2021 at 2:40 PM Neal Becker  wrote:

> In my application I need to pack bits of a specified group size into
> integral values.
> Currently np.packbits only packs into full bytes.
> For example, I might have a string of bits encoded as a np.uint8
> vector with each uint8 item specifying a single bit 1/0.  I want to
> encode them 4 bits at a time into a np.uint32 vector.
>
> python code to implement this:
>
> ---
> def pack_bits (inp, bits_per_word, dir=1, dtype=np.int32):
> assert bits_per_word <= np.dtype(dtype).itemsize * 8
> assert len(inp) % bits_per_word == 0
> out = np.empty (len (inp)//bits_per_word, dtype=dtype)
> i = 0
> o = 0
> while i < len(inp):
> ret = 0
> for b in range (bits_per_word):
> if dir > 0:
> ret |= inp[i] << b
> else:
> ret |= inp[i] << (bits_per_word - b - 1)
> i += 1
> out[o] = ret
> o += 1
> return out
> ---
>

Can't you just `packbits` into a uint8 array and then convert that to
uint32? If I change `dtype` in your code from `np.int32` to `np.uint32` (as
you mentioned in your email) I can do this:

rng = np.random.default_rng()
arr = (rng.uniform(size=32) < 0.5).astype(np.uint8)
group_size = 4
original = pack_bits(arr, group_size, dtype=np.uint32)
new = np.packbits(arr.reshape(-1, group_size), axis=-1,
                  bitorder='little').ravel().astype(np.uint32)
print(np.array_equal(new, original))
# True

There could be edge cases where the result dtype is too small, but I
haven't thought about that part of the problem. I assume this would work as
long as `group_size <= 8`.

András


> It looks like unpackbits has a "count" parameter but packbits does not.
> Also would be good to be able to specify an output dtype.
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] spam on the mailing lists

2021-09-29 Thread Andras Deak
Hi All,

Today both of the python.org mailing lists I'm subscribed to (numpy and
scipy-dev) got the same kind of link shortener spam. I assume all the
mailing lists started getting these, and that these won't go away for a
while.

Is there any way to prevent these, short of moderating emails from new list
members? Assuming the engine even supports that. There aren't many emails,
especially from new members, and I can't think of other ways that ensure no
false positives in filtering.

Since maintainer time is precious, I can volunteer to moderate such emails
if needed.
Cheers,

András
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: spam on the mailing lists

2021-09-29 Thread Andras Deak
On Wed, Sep 29, 2021 at 11:15 AM Ralf Gommers 
wrote:

>
>
> On Wed, Sep 29, 2021 at 9:32 AM Andras Deak  wrote:
>
>> Hi All,
>>
>> Today both of the python.org mailing lists I'm subscribed to (numpy and
>> scipy-dev) got the same kind of link shortener spam. I assume all the
>> mailing lists started getting these, and that these won't go away for a
>> while.
>>
>
> I don't see these on
> https://mail.python.org/archives/list/numpy-discussion@python.org/, nor
> did I receive them (and I did check my spam folder). Do you see it in the
> archive, or do you understand why you do receive them?
>

Sorry for not being specific: they were sent as replies to the latest
thread on each list, see e.g. at the bottom (6th email, 5th reply) of
https://mail.python.org/archives/list/numpy-discussion@python.org/thread/BLCIC2WMJQ5VT6HJSUW4V5TNGQ36JQXI/

András



> Ralf
>
>
>> Is there any way to prevent these, short of moderating emails from new
>> list members? Assuming the engine even supports that. There aren't many
>> emails, especially from new members, and I can't think of other ways that
>> ensure no false positives in filtering.
>>
>> Since maintainer time is precious, I can volunteer to moderate such
>> emails if needed.
>> Cheers,
>>
>> András
>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>> To unsubscribe send an email to numpy-discussion-le...@python.org
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: ralf.gomm...@gmail.com
>>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: deak.and...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: spam on the mailing lists

2021-09-29 Thread Andras Deak
On Wed, Sep 29, 2021 at 11:28 AM Andras Deak  wrote:

> On Wed, Sep 29, 2021 at 11:15 AM Ralf Gommers 
> wrote:
>
>>
>>
>> On Wed, Sep 29, 2021 at 9:32 AM Andras Deak 
>> wrote:
>>
>>> Hi All,
>>>
>>> Today both of the python.org mailing lists I'm subscribed to (numpy and
>>> scipy-dev) got the same kind of link shortener spam. I assume all the
>>> mailing lists started getting these, and that these won't go away for a
>>> while.
>>>
>>
>> I don't see these on
>> https://mail.python.org/archives/list/numpy-discussion@python.org/, nor
>> did I receive them (and I did check my spam folder). Do you see it in the
>> archive, or do you understand why you do receive them?
>>
>
> Sorry for not being specific: they were sent as replies to the latest
> thread on each list, see e.g. at the bottom (6th email, 5th reply) of
> https://mail.python.org/archives/list/numpy-discussion@python.org/thread/BLCIC2WMJQ5VT6HJSUW4V5TNGQ36JQXI/
>

Found the permalink: (warning, spam link there)
https://mail.python.org/archives/list/numpy-discussion@python.org/message/MWI6AKF4QNQ45532MVA3XOXYW5GDFL6O/



> András
>
>
>
>> Ralf
>>
>>
>>> Is there any way to prevent these, short of moderating emails from new
>>> list members? Assuming the engine even supports that. There aren't many
>>> emails, especially from new members, and I can't think of other ways that
>>> ensure no false positives in filtering.
>>>
>>> Since maintainer time is precious, I can volunteer to moderate such
>>> emails if needed.
>>> Cheers,
>>>
>>> András
>>> ___
>>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>>> To unsubscribe send an email to numpy-discussion-le...@python.org
>>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>>> Member address: ralf.gomm...@gmail.com
>>>
>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>> To unsubscribe send an email to numpy-discussion-le...@python.org
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: deak.and...@gmail.com
>>
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: spam on the mailing lists

2021-09-29 Thread Andras Deak
On Wed, Sep 29, 2021 at 12:02 PM Ralf Gommers 
wrote:

>
>
> On Wed, Sep 29, 2021 at 11:33 AM Andras Deak 
> wrote:
>
>> On Wed, Sep 29, 2021 at 11:28 AM Andras Deak 
>> wrote:
>>
>>> On Wed, Sep 29, 2021 at 11:15 AM Ralf Gommers 
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Sep 29, 2021 at 9:32 AM Andras Deak 
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> Today both of the python.org mailing lists I'm subscribed to (numpy
>>>>> and scipy-dev) got the same kind of link shortener spam. I assume all the
>>>>> mailing lists started getting these, and that these won't go away for a
>>>>> while.
>>>>>
>>>>
>>>> I don't see these on
>>>> https://mail.python.org/archives/list/numpy-discussion@python.org/,
>>>> nor did I receive them (and I did check my spam folder). Do you see it in
>>>> the archive, or do you understand why you do receive them?
>>>>
>>>
>>> Sorry for not being specific: they were sent as replies to the latest
>>> thread on each list, see e.g. at the bottom (6th email, 5th reply) of
>>> https://mail.python.org/archives/list/numpy-discussion@python.org/thread/BLCIC2WMJQ5VT6HJSUW4V5TNGQ36JQXI/
>>>
>>
>> Found the permalink: (warning, spam link there)
>> https://mail.python.org/archives/list/numpy-discussion@python.org/message/MWI6AKF4QNQ45532MVA3XOXYW5GDFL6O/
>>
>
> Thanks!
>
>
>>>>
>>>>> Is there any way to prevent these, short of moderating emails from new
>>>>> list members? Assuming the engine even supports that. There aren't many
>>>>> emails, especially from new members, and I can't think of other ways that
>>>>> ensure no false positives in filtering.
>>>>>
>>>>
> We don't have admin access to the python.org lists, so this is a bit of a
> problem. We have never had a spam problem, so we can ask to block this user
> first. If it continues to happen, we may be able to moderate new subscriber
> emails, but we do need to ask for permissions first and I'm not sure we'll
> get them.
>

Unfortunately (but unsurprisingly) there are multiple accounts doing this
https://mail.python.org/archives/search?q=bit.ly&page=1&sort=date-desc
This is why I figured that an _a posteriori_ whack-a-mole against these
specific users might not be a feasible solution to the underlying problem.

András



> A better solution longer term is migrating to Discourse, which has far
> better moderation tools than Mailman and is also more approachable for
> people not used to mailing lists (which is most newcomers to open source).
> Migrating is a bit of a pain, but with the new CZI grant having a focus on
> improving the contributor experience, we should be able to do this.
>
> Cheers,
> Ralf
>
>
>
>>
>>>>> Since maintainer time is precious, I can volunteer to moderate such
>>>>> emails if needed.
>>>>> Cheers,
>>>>>
>>>>> András
>>>>> ___
>>>>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>>>>> To unsubscribe send an email to numpy-discussion-le...@python.org
>>>>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>>>>> Member address: ralf.gomm...@gmail.com
>>>>>
>>>> ___
>>>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>>>> To unsubscribe send an email to numpy-discussion-le...@python.org
>>>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>>>> Member address: deak.and...@gmail.com
>>>>
>>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>> To unsubscribe send an email to numpy-discussion-le...@python.org
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: ralf.gomm...@gmail.com
>>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: deak.and...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: spam on the mailing lists

2021-10-01 Thread Andras Deak
On Fri, Oct 1, 2021 at 4:27 PM Ilhan Polat  wrote:

> The reason why I mentioned GH discussions is that literally everybody who
> is engaged with the code, is familiar with the format, included in the
> codebase product and has replies in built unlike the Discourse (opinion is
> mine) useless flat discussion design where replies are all over the place
> just like the mailing list in case you are not using a tree view supporting
> client. Hence topic hijacking is one of the main usability difficulties of
> emails.
>
> The goal here is to have a coherent engagement with everyone not just
> within a small circle, such that there is indeed a discussion happening
> rather than a few people chiming in. It would be a nice analytics exercise
> to have how many active users using these lists. I'd say 20-25 max for
> contribs and team members which is really not much. I know some people are
> still using IRC and mailing lists but I wouldn't argue that these are the
> modern media to have proper engaging discussions. "Who said to whom" is the
> bread and butter of such discussions. And I do think that discourse is
> exactly the same thing with mailing lists with a slightly better UI while
> virtually everyone else in the world is doing replies.
>

(There are probably a lot of users like myself who follow the mailing list
discussions but rarely feel the need to speak up themselves. Not that this
says much either way in the discussion, just pointing it out).

I'm not intimately familiar with github discussions (I've only used it a
few times), but as far as I can tell it only has answers (or "comments")
and comments (or "replies") on answers, i.e. 2 levels of replies rather
than a flat single level of replies. If this is indeed the case then I'm
not sure it's that much better than a flat system, since when things really
get hairy then 2 levels are probably also insufficient to ensure "who said
to whom". The "clear replies" argument would hold stronger (in my
peanut-gallery opinion) for a medium that supports full reply trees like
many comment sections do on various websites.

András


> I would be willing to help with the objections raised since I have been
> using GH discussions for quite a while now and there are many tools
> available for administration of the discussions. For example,
>
>
> https://github.blog/changelog/2021-09-14-notification-emails-for-discussions/
>
> is a recent feature. I don't work for GitHub obviously and have nothing to
> do with them but the reasons I'm willing to hear about.
>
>
>
>
>
>
> On Fri, Oct 1, 2021 at 3:07 PM Matthew Brett 
> wrote:
>
>> Hi,
>>
>> On Fri, Oct 1, 2021 at 1:57 PM Rohit Goswami 
>> wrote:
>> >
>> > I guess then the approach overall would evolve to something like using
>> the mailing list to announce discourse posts which need input. Though I
>> would assume that the web interface essentially makes the mailing list
>> almost like discourse, even for new users.
>> >
>> > The real issue IMO is still the moderation efforts and additional
>> governance needed for maintaining discourse.
>>
>> Yes - that was what I meant.   I do see that mailing lists are harder
>> to moderate, in that once the email has gone out, it is difficult to
>> revoke.  So is the argument just that you *can* moderate on Discourse,
>> therefore you need to think about it more?  Do we have any reason to
>> think that more moderation will in fact be needed?  We've needed very
>> little so far on the mailing list, as far as I can see.
>>
>> Cheers,
>>
>> Matthew
>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>> To unsubscribe send an email to numpy-discussion-le...@python.org
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: ilhanpo...@gmail.com
>>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: deak.and...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: What happened to the numpy.random documentation?

2021-10-14 Thread Andras Deak
On Thursday, October 14, 2021, Joseph Fox-Rabinovitz <
jfoxrabinov...@gmail.com> wrote:

> I second that reinstating such a list would be extremely useful. My issue
> has been with the polynomial package, but the end result is the same.
>

There's a mostly relevant issue: https://github.com/numpy/numpy/issues/19420

András


>
>
> - Joe
>

> On Thu, Oct 14, 2021, 12:45 Melissa Mendonça  wrote:
>
>> Hi Paul,
>>
>> Do you think having a page with the flat list of routines back, in
>> addition to the explanations, would solve this?
>>
>> - Melissa
>>
>> On Thu, Oct 14, 2021 at 1:34 PM Paul M.  wrote:
>>
>>> Hi All,
>>>
>>> The documentation of Numpy's submodules  used to have a fairly standard
>>> structure as shown here in the 1.16 documentation:
>>>
>>>   https://docs.scipy.org/doc/numpy-1.16.1/reference/routines.random.html
>>>
>>> Now the same page in the API documentation looks like this:
>>>
>>>   https://numpy.org/doc/stable/reference/random/index.html
>>>
>>> While I appreciate the expository text in the new documentation about
>>> how the generators work, this new version is much less useful as a
>>> reference to the API.  It seems like it might fit better in the user manual
>>> rather than the API reference.
>>>
>>> From my perspective it seems like the new version of the documentation
>>> is harder to navigate in terms of finding information quickly (more
>>> scrolling, harder to get a bird's eye view of functions in various
>>> submodules, etc).
>>>
>>> Has anyone else had a similar reaction to the changes? I teach a couple
>>> of courses in scientific computing and bioinformatics and my students seem
>>> to also struggle to get a sense of what the different modules offer based
>>> on the new version of the documentation. For now, I'm referring them to the
>>> old (1.70) reference manuals as a better way to get acquainted with the
>>> libraries.
>>>
>>> Cheers,
>>> Paul Magwene
>>> ___
>>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>>> To unsubscribe send an email to numpy-discussion-le...@python.org
>>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>>> Member address: meliss...@gmail.com
>>>
>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>> To unsubscribe send an email to numpy-discussion-le...@python.org
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: jfoxrabinov...@gmail.com
>>
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Get a function definition/implementation hint similar to the one shown in pycharm.

2021-10-19 Thread Andras Deak
On Tue, Oct 19, 2021 at 7:26 AM  wrote:

> I've written the following python code snippet in pycharm:
> ```python
> import numpy as np
> from numpy import pi, sin
>
> a = np.array([1], dtype=bool)
> if np.in|vert(a) == ~a:
> print('ok')
> ```
> When putting the point/cursor in the above code snippet at the position
> denoted by `|`, I would like to see information similar to that provided by
> `pycharm`, as shown in the following screenshots:
>

Hi,

Could you explain exactly what you're asking about? Are you using pycharm,
but want to see similar tooltips with your custom (non-numpy-library) code?
Or do you want to see these numpy hints outside pycharm? If the latter,
what kind of IDE do you mean?

András


>
> https://user-images.githubusercontent.com/11155854/137619512-674e0eda-7564-4e76-af86-04a194ebeb8e.png
>
> https://user-images.githubusercontent.com/11155854/137619524-a0b584a3-1627-4612-ab1f-05ec1af67d55.png
>
> But I wonder if there are any other python packages/tools that can help me
> achieve this goal?
>
> Regards,
> HZ
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: deak.and...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: dtype=(bool) vs dtype=bool

2021-10-19 Thread Andras Deak
On Tue, Oct 19, 2021 at 3:42 PM  wrote:

> See the following testing in IPython shell:
>
> In [6]: import numpy as np
>
> In [7]: a = np.array([1], dtype=(bool))
>
> In [8]: b = np.array([1], dtype=bool)
>
> In [9]: a
> Out[9]: array([ True])
>
> In [10]: b
> Out[10]: array([ True])
>
> It seems that dtype=(bool) and dtype=bool are both correct usages. If so,
> which is preferable?
>

This is not really a numpy issue: in Python itself `(bool)` is the exact
same thing as `bool`. Superfluous parentheses are just ignored. You could
use `dis.dis` to compare the two expressions and see that they compile to
the same bytecode.
So go with `dtype=bool`.

András



> Regards,
> HZ
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: deak.and...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: dtype=(bool) vs dtype=bool

2021-10-19 Thread Andras Deak
On Tue, Oct 19, 2021 at 4:07 PM  wrote:

> > You could use `dis.dis` to compare the two expressions and see that they
> compile to the same bytecode.
>
> Do you mean the following:
>

Indeed, that is exactly what I meant. You don't even need the numpy import
for that. Since `bool` and `(bool)` are compiled to the same bytecode,
numpy isn't even aware that you put parentheses around `bool` in your code.



> In [1]: import numpy as np
> In [2]: from dis import dis
> In [7]: dis('bool')
>   1   0 LOAD_NAME0 (bool)
>   2 RETURN_VALUE
>
> In [8]: dis('(bool)')
>   1   0 LOAD_NAME0 (bool)
>   2 RETURN_VALUE
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: deak.and...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Conversion from C-layout to Fortran-layout in Cython

2021-11-10 Thread Andras Deak
On Thursday, November 11, 2021, Ilhan Polat  wrote:

> I've asked this in Cython mailing list but probably I should also get some
> feedback here too.
>
> I have the following function defined in Cython and using flat memory
> pointers to hold n by n array data.
>
>
> cdef some_C_layout_func(double[:, :, ::1] Am) nogil:
>     # ...
>     cdef double *work1 = malloc(n*n*sizeof(double))
>     cdef double *work2 = malloc(n*n*sizeof(double))
>     # ...
>     # Lots of C-layout operations here
>     # ...
>     dgetrs('T', &n, &n, &work1[0], &n, &ipiv[0], &work2[0], &n, &info)
>     dcopy(&n2, &work2[0], &int1, &Am[0, 0, 0], &int1)
>     free(...)
>
> Here, I have done everything in C layout with work1 and work2 but I have
> to convert work2 into Fortran layout to be able to solve AX = B. A can be
> transposed in Lapack internally via the flag 'T' so the only obstacle I
> have now is to shuffle work2 which holds B transpose in the eyes of Fortran
> since it is still in C layout.
>
> If I go naively and make loops to get one layout to the other that
> actually spoils all the speed benefits from this Cythonization due to cache
> misses. In fact 60% of the time is spent in that naive loop across the
> whole function.
>
>
Sorry if this is a dumb question, but is this true whether or not you loop
over contiguous blocks of the input vs the output array? Or is the faster
of the two options still slower than the linsolve?
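For context, a NumPy-level sketch of the layout conversion being discussed (just an illustration of the C-to-Fortran copy, not the Cython solution):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)      # C-contiguous
f = np.asfortranarray(a)             # same values, Fortran-ordered memory

print(f.flags['F_CONTIGUOUS'])       # True
print(np.array_equal(a, f))          # True: only the memory layout differs

# The same result via an explicit transposed copy:
f2 = a.T.copy(order='C').T
print(f2.flags['F_CONTIGUOUS'])      # True
```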

András


>
>  Same goes for the copy_fortran() of memoryviews.
>
> I have measured the regular NumPy np.asfortranarray()  and the performance
> is quite good enough compared to the actual linear solve. Hence whatever it
> is doing underneath I would like to reach out and do the same possibly via
> the C-API. But my C knowledge basically failed me around this line
> https://github.com/numpy/numpy/blob/8dbd507fb6c854b362c26a0dd056cd
> 04c9c10f25/numpy/core/src/multiarray/multiarraymodule.c#L1817
>
> I have found the SO post from https://stackoverflow.com/questions/45143381/making-a-memoryview-c-contiguous-fortran-contiguous
> but I am not sure if that is the canonical way to do it in newer Python
> versions.
>
> Can anyone show me how to go about it without interacting with Python
> objects?
>
> Best,
> ilhan
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Putting in `np.ma.ndenumerate` MaskedArray specific ndenumerate

2021-11-17 Thread Andras Deak
On Wed, Nov 17, 2021 at 7:39 PM Sebastian Berg 
wrote:

> Hi all,
>
> the `np.ndenumerate` does not work well for masked arrays (like many
> main namespace functions, it simply ignores/drops the mask).
>
> There is a PR (https://github.com/numpy/numpy/pull/20020) to add a
> version of it to `np.ma` (masked array specific).  And we thought it
> seemed reasonable and were planning on putting it in.
>
> This version skips all masked elements.  An alternative could be to
> return `np.ma.masked` for masked elements?
>
> So if anyone thinks that may be the better solution, please send a
> brief mail.
>

Would it be a bad idea to add a kwarg that specifies this behaviour (i.e.
offering both alternatives)? Assuming people might need the masked items to
be there under certain circumstances. Perhaps when zipping masked data with
dense data?

András
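
A pure-Python sketch of the kwarg idea (the names `ma_ndenumerate` and `compressed` are illustrative, not the API under review):

```python
import numpy as np

def ma_ndenumerate(arr, compressed=True):
    # compressed=True skips masked elements (the PR's behaviour);
    # compressed=False yields np.ma.masked for them instead.
    mask = np.ma.getmaskarray(arr)
    for idx, is_masked in np.ndenumerate(mask):
        if is_masked:
            if not compressed:
                yield idx, np.ma.masked
        else:
            yield idx, arr[idx]

a = np.ma.array([1, 2, 3], mask=[False, True, False])
assert list(ma_ndenumerate(a)) == [((0,), 1), ((2,), 3)]
assert [v is np.ma.masked
        for _, v in ma_ndenumerate(a, compressed=False)] == [False, True, False]
```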



> (Personally, I don't have opinions on masked arrays for the most part.)
>
> Cheers,
>
> Sebastian


[Numpy-discussion] Re: Putting in `np.ma.ndenumerate` MaskedArray specific ndenumerate

2021-11-17 Thread Andras Deak
On Wed, Nov 17, 2021 at 8:35 PM Sebastian Berg 
wrote:

> On Wed, 2021-11-17 at 19:49 +0100, Andras Deak wrote:
> > On Wed, Nov 17, 2021 at 7:39 PM Sebastian Berg
> > 
> > wrote:
> >
> > > Hi all,
> > >
> > > the `np.ndenumerate` does not work well for masked arrays (like
> > > many
> > > main namespace functions, it simply ignores/drops the mask).
> > >
> > > There is a PR (https://github.com/numpy/numpy/pull/20020) to add a
> > > version of it to `np.ma` (masked array specific).  And we thought
> > > it
> > > seemed reasonable and were planning on putting it in.
> > >
> > > This version skips all masked elements.  An alternative could be to
> > > return `np.ma.masked` for masked elements?
> > >
> > > So if anyone thinks that may be the better solution, please send a
> > > brief mail.
> > >
> >
> > Would it be a bad idea to add a kwarg that specifies this behaviour
> > (i.e.
> > offering both alternatives)? Assuming people might need the masked
> > items to
> > be there under certain circumstances. Perhaps when zipping masked
> > data with
> > dense data?
> >
>
> Sure, if you agree the default should be skipping, I guess we are OK
> with adding it? ;)
>

I don't actually use masked arrays myself, nor ndenumerate, so I'm very
forgiving in this matter...
But if both use cases are plausible (_if_, although I can indeed imagine
that this is the case), supporting both seems straightforward. Considering
the pure python implementation it wouldn't be a problem to expose both
functionalities.

András



> Cheers,
>
> Sebastian
>
>
> > András
> >
> >
> >
> > > (Personally, I don't have opinions on masked arrays for the most
> > > part.)
> > >
> > > Cheers,
> > >
> > > Sebastian


[Numpy-discussion] Re: Proposal for new function to determine if a float contains an integer

2021-12-31 Thread Andras Deak
On Fri, Dec 31, 2021 at 1:36 AM Joseph Fox-Rabinovitz <
jfoxrabinov...@gmail.com> wrote:

> Hi,
>
> I wrote a reference implementation for a C ufunc, `isint`, which returns
> True for integers and False for non-integers, found here:
> https://github.com/madphysicist/isint_ufunc. 
>

Shouldn't we keep the name of the stdlib float method?

>>> (3.0).is_integer()
True

See https://docs.python.org/3/library/stdtypes.html#float.is_integer

András
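
The modulo idiom from the quoted message, for comparison (note it also needs `errstate` to silence warnings for non-finite values):

```python
import numpy as np

x = np.array([1.0, 2.5, -3.0, np.inf, np.nan])
# The currently "recommended" idiom: a division plus a temporary array,
# which is what the proposed ufunc avoids. inf % 1 and nan % 1 are nan,
# so both compare unequal to 0.
with np.errstate(invalid='ignore'):
    mask = (x % 1) == 0
assert mask.tolist() == [True, False, True, False, False]

# The stdlib precedent for the name:
assert (3.0).is_integer() and not (2.5).is_integer()
```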



> The idea came from a Stack Overflow question of mine, which has gotten a
> fair number of views and even some upvotes:
> https://stackoverflow.com/q/35042128/2988730. The current "recommended"
> solution is to use ``((x % 1) == 0)``. This is slower and more cumbersome
> because of the math operations and the temporary storage. My version
> returns a single array of booleans with no intermediaries, and is between 5
> and 40 times faster, depending on the type and size of the input.
>
> If you are interested in taking a look, there is a suite of tests and a
> small benchmarking script that compares the ufunc against the modulo
> expression. The entire thing currently works with bit twiddling on an
> appropriately converted integer representation of the number. It assumes a
> standard IEEE754 representation for float16, float32, float64. The extended
> 80-bit float128 format gets some special treatment because of the explicit
> integer bit. Complex numbers are currently integers only if they are real
> and integral. Integer types (including bool) are always integers. Time and
> text raise TypeErrors, since their integerness is meaningless.
>
> If a consensus forms that this is something appropriate for numpy, I will
> need some pointers on how to package up C code properly. This was an
> opportunity for me to learn to write a basic ufunc. I am still a bit
> confused about where code like this would go, and how to harness numpy's
> code generation. I put comments in my .c and .h file showing how I would
> expect the generators to look, but I'm not sure where to plug something
> like that into numpy. It would also be nice to test on architectures that
> have something other than a 80-bit extended long double instead of a proper
> float128 quad-precision number.
>
> Please let me know your thoughts.
>
> Regards,
>
> - Joe


[Numpy-discussion] Re: Performance mystery

2022-01-19 Thread Andras Deak
On Wed, Jan 19, 2022 at 11:50 AM Francesc Alted  wrote:

>
>
> On Wed, Jan 19, 2022 at 7:33 AM Stefan van der Walt 
> wrote:
>
>> On Tue, Jan 18, 2022, at 21:55, Warren Weckesser wrote:
>> > expr = 'z.real**2 + z.imag**2'
>> >
>> > z = generate_sample(n, rng)
>>
>> 🤔 If I duplicate the `z = ...` line, I get the fast result throughout.
>> If, however, I use `generate_sample(1, rng)` (or any other value than `n`),
>> it does not improve matters.
>>
>> Could this be a memory caching issue?
>>
>
> I can also reproduce that, but only on my Linux boxes.  My MacMini does
> not notice the difference.
>
> Interestingly enough, you don't even need an additional call to
> `generate_sample(n, rng)`. If one use `z = np.empty(...)` and then do an
> assignment, like:
>
> z = np.empty(n, dtype=np.complex128)
> z[:] = generate_sample(n, rng)
>
> then everything runs at the same speed:
>

Just turning the view into a copy inside `generate_sample()` will also make
the difference go away, for probably the same reason as this version.

András
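
The copy variant would look like this (a sketch of the suggestion, not the benchmark itself):

```python
import numpy as np

def generate_sample(n, rng):
    # .copy() materializes the complex view into its own buffer,
    # which was reported to make the timing discrepancy disappear:
    return rng.normal(scale=1000, size=2*n).view(np.complex128).copy()

rng = np.random.default_rng(0)
z = generate_sample(4, rng)
assert z.dtype == np.complex128 and z.shape == (4,)
assert z.flags['OWNDATA']  # owns its data, unlike the original view
```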




>
> numpy version 1.20.3
>
>  142.3667 microseconds
>  142.3717 microseconds
>  142.3781 microseconds
>
>  142.7593 microseconds
>  142.3579 microseconds
>  142.3231 microseconds
>
> As another data point, by doing the same operation but using numexpr I am
> not seeing any difference either, not even on Linux:
>
> numpy version 1.20.3
> numexpr version 2.8.1
>
>   95.6513 microseconds
>   88.1804 microseconds
>   97.1322 microseconds
>
>  105.0833 microseconds
>  100. microseconds
>  100.5654 microseconds
>
> [it is rather like a bit the other way around, the second iteration seems
> a hair faster]
> See the numexpr script below.
>
> I am totally puzzled here.
>
> """
> import timeit
> import numpy as np
> import numexpr as ne
>
>
> def generate_sample(n, rng):
> return rng.normal(scale=1000, size=2*n).view(np.complex128)
>
>
> print(f'numpy version {np.__version__}')
> print(f'numexpr version {ne.__version__}')
> print()
>
> rng = np.random.default_rng()
> n = 25
> timeit_reps = 1
>
> expr = 'ne.evaluate("zreal**2 + zimag**2")'
>
> z = generate_sample(n, rng)
> zreal = z.real
> zimag = z.imag
> for _ in range(3):
> t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
> print(f"{1e6*t/timeit_reps:9.4f} microseconds")
> print()
>
> z = generate_sample(n, rng)
> zreal = z.real
> zimag = z.imag
> for _ in range(3):
> t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
> print(f"{1e6*t/timeit_reps:9.4f} microseconds")
> print()
> """
>
> --
> Francesc Alted


[Numpy-discussion] Re: ndarray repr - put shape & type first

2022-05-07 Thread Andras Deak
On Sat, May 7, 2022, at 11:36, Ilya Kamenshchikov wrote:
> Hello,
>
> I wanted to discuss a pain point I've experienced while debugging numpy 
> code. When dealing with e.g. transformed image arrays or other 
> non-trivial ndarrays in debugger, I'm swamped by a bunch of numbers in 
> their repr that don't help me at all. What I really care about is 
> *shape and dtype*, as they are essentially what distinguishes complete 
> type of numpy arrays.

Hi Ilya,

It sounds like you're not actually looking for the repr when you are debugging. 
If you just need the shape and dtype, why not just print those?

Even "mere" debuggers like pdb support command aliases, see e.g. 
https://docs.python.org/3/library/pdb.html#pdbcommand-alias . I'm sure any 
debugger or IDE worth its salt will let you define a helper function in some 
configuration file which you can use during debugging to print what you need.

András
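
Such a helper can be a one-liner (the name `sd` is arbitrary):

```python
import numpy as np

def sd(a):
    # Summarize what distinguishes the array, without dumping the data.
    return f"array(shape={a.shape}, dtype={a.dtype})"

a = np.zeros((32, 32, 3))
assert sd(a) == "array(shape=(32, 32, 3), dtype=float64)"
```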

>
> Example of not helpful output:
> ```
> array([[[0.49113064, 0.42102904, 0.62108876],
> [0.25435884, 0.18665439, 0.53790145],
> ```
>
> By counting the [ brackets, I can at least get the number of 
> dimensions, but usefulness stops here.
>
> Could repr of an array start with something like:
> `array(*shape=[32,32,3], dtype=float, data=*([[[0.49113064, ...)`
>
> I know one invariant that repr likes to keep is that you can post repr 
> and it should be the initialisation of the represented object, but 
> given that we replace long number sequence with ..., this is already 
> not the case.
>
> Short term: Perhaps a plugin / change to IDE could do what I ask, 
> please let me know if something like this already exists :)
>
> Long term: I think that behavior will be more useful than what is 
> currently there. It could also be conditionally there once we anyhow 
> need to use ... for too long arrays.
>
> Best Regards,
> --
> Ilya Kamen


[Numpy-discussion] Re: ndarray shape permutation

2022-05-17 Thread Andras Deak
On Mon, May 16, 2022, at 17:54, Paul Korir wrote:
> Hellos,
> I would like to propose `numpy.ndarray.permute_shape()` 
> method to predictably permute the shape of an ndarray. In my opinion, 
> the current alternatives (`swapaxes`, `transpose`, `moveaxis` and 
> friends) are counterintuitive and rely on referring to the axis 
> indices. It would be abundantly helpful to have something like reshape 
> but which moves the data around (unlike reshape, which doesn't). 

Hi Paul,

Could you please show a small example of how `permute_shape()` is meant to 
work? It's not clear to me what it's meant to do, especially in contrast to 
`transpose` and `swapaxes` etc. "Like reshape" and "permute" together 
specifically sounds confusing to me. From your description of the problem I'd 
have thought that `transpose` (with a potential `copy`) would do what you need, 
so I'm probably misunderstanding your use case. What would it do when applied 
to a 3d array of, say, shape (2, 3, 5)?

András
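
For the (2, 3, 5) case, `transpose` with an explicit permutation (plus a copy to actually relocate the data) already behaves like this:

```python
import numpy as np

a = np.arange(2 * 3 * 5).reshape(2, 3, 5)
# An explicit axis permutation; copy() materializes the new layout:
b = a.transpose(2, 0, 1).copy()

assert b.shape == (5, 2, 3)
assert b[4, 1, 2] == a[1, 2, 4]  # b[i, j, k] == a[j, k, i]
assert b.flags['C_CONTIGUOUS']   # data really was moved, not just re-strided
```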

> Scenario: structural biology uses MRC files, which define a number of 
> fields that describe a 3D volume. There is a field which describes the 
> dimensions of the 3D image and another which associates each image axis 
> with a physical axis. There are six such assignments between the image 
> shape and the axes assignments (if we keep the shape fixed we can 
> assign XYZ, XZY, YXZ, YZX, ZXY, ZYX to the columns, rows and stacks) 
> and working out how to correctly transform the data is generally 
> non-trivial. I've written a package 
> (https://github.com/emdb-empiar/maptools) which does this but I use 
> swapaxes to reorient the 3D image. It would be so much easier to use 
> `numpy.ndarray.permute_shape()` instead.
> Any thoughts? Also, any helpful hints on how to get started with such a 
> contribution would be helpful.
> Paul


[Numpy-discussion] Re: Exponential function, sine function and cos function

2022-06-04 Thread Andras Deak
On Fri, Jun 3, 2022, at 23:54, Brinley Patterson wrote:
> Hi,
>
> By using the exponential equation:
>
> exp(x) = (sum_{k=0}^{n} 1/k!)^x
>
> the speed and accuracy of calculating exponent greatly increases. Plus 
> it makes it easier to use with imaginary numbers. I have the python 
> function code if you are interested to learn more about this. This 
> equation can then be used to calculate sine and cos more efficiently 
> using the exponent form of them both.

Hi,

From the equation you posted it sounds like your recommendation is to compute 
`e` from a truncated Taylor series, and then raise that number to the `x`th 
power. But we already know "the" double-precision value of `e`, a.k.a. 
`exp(1)`. So the recommended alternative would be more work than using 
double-precision `e` and then raising _that_ to the `x`th power.

So this makes me think that I missed your point. Could you please try 
elaborating on your recommendation?

András
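
Read literally, the posted identity amounts to computing `e` from a truncated series and exponentiating; a quick numeric check (20 terms chosen arbitrarily):

```python
import math

# Truncated Taylor series for e, then raised to the x-th power:
e_approx = sum(1 / math.factorial(k) for k in range(20))
x = 2.5
# Already indistinguishable from math.exp at double precision:
assert abs(e_approx ** x - math.exp(x)) < 1e-9
```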

>
> Kind regards,
>
> Brinley
> MSc Machine Learning
> BSc Mathematical Physics


[Numpy-discussion] Re: Feature request: function to get minimum and maximum values simultaneously (as a tuple)

2022-06-30 Thread Andras Deak
On Thu, Jun 30, 2022, at 22:23, Ewout ter Hoeven wrote:
> A function to get the minimum and maximum values of an array 
> simultaneously could be very useful, from both a convenience and 
> performance point of view. Especially when arrays get larger the 
> performance benefit could be significant, and even more if the array 
> doesn't fit in L2/L3 cache or even memory.

Hi,

There's an open issue asking for this feature: 
https://github.com/numpy/numpy/issues/9836

András
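
The requested behaviour, sketched with today's two-pass primitives (`minmax` is a placeholder name, not the proposed API):

```python
import numpy as np

def minmax(a):
    # Two passes over the data; the proposed aminmax() would do one.
    return a.min(), a.max()

a = np.array([3, 1, 4, 1, 5, 9, 2, 6])
assert minmax(a) == (1, 9)
assert np.ptp(a) == 9 - 1  # one of the functions that could reuse it
```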

>
> There are many cases where not either the minimum or the maximum of an 
> array is required, but both. Think of clipping an array, getting it's 
> range, checking for outliers, normalizing, making a plot like a 
> histogram, etc.
>
> This function could be called aminmax() for example, and also be called 
> like ndarray.minmax(). It should return a tuple (min, max) with the 
> minimum and maximum values of the array, identical to calling 
> (ndarray.min(), ndarray.max()).
>
> With such a function, numpy.ptp() and the special cases of 
> numpy.quantile(a, q=[0,1]) and numpy.percentile(a, q=[0,100]) could 
> also potentially be speeded up, among others.
>
> Potentially argmin and argmax could get the same treatment, being 
> called argminmax().
>
> There is also a very extensive post on Stack Overflow (a bit old 
> already) with discussion and benchmarks: 
> https://stackoverflow.com/questions/12200580/numpy-function-for-simultaneous-max-and-min


[Numpy-discussion] Re: Proposal: Indexing by callables

2022-07-31 Thread Andras Deak
Hi,

For the sake of transparency, a short mailing list thread from November 2021 is 
here: 
https://mail.python.org/archives/list/numpy-discussion@python.org/thread/DNTC3A4CTVDYISVS57GXURJ6QP2PXPHK/
Corresponding (low-activity) feature request where you also commented: 
https://github.com/numpy/numpy/issues/20453

András
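
The proposed semantics can be mimicked with a thin wrapper today (purely illustrative, not a NumPy feature):

```python
import numpy as np

class CallableIndexing:
    # Illustrates the invariant arr[fn] == arr[fn(arr)].
    def __init__(self, arr):
        self.arr = np.asarray(arr)

    def __getitem__(self, key):
        if callable(key):
            key = key(self.arr)  # single-argument callable, as in pandas .loc
        return self.arr[key]

a = CallableIndexing(np.array([-2, 3, -1, 5]))
assert a[lambda x: x > 0].tolist() == [3, 5]
```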

On Sat, Jul 30, 2022, at 16:42, Matteo Santamaria wrote:
> Hi all,
> 
> I’d like to open a discussion on supporting callables within 
> `np.ndarray.__getitem__`. The intent is to make long function-chaining 
> expressions more ergonomic by removing the need for an intermediary, 
> temporary value.
> 
> Instead of 
> 
> ```
> tmp = long_and_complicated_expression(arr)
> return tmp[tmp > 0]
> ```
> 
> we would allow
> 
> ```
> return long_and_complicated_expression(arr)[lambda x: x > 0]
> ```
> 
> This feature has long been supported by pandas’ .loc accessor, where 
> I’ve personally found it very valuable. In accordance with the pandas 
> implementation, the callable would be required to take only a single 
> argument.
> 
> In terms of semantics, it should always be the case that `arr[fn] == 
> arr[fn(arr)]`.
> 
> I do realize that expanding the API and supporting additional indexing 
> methods is not without cost, so I open the floor to anyone who’d like 
> to weigh in for/against the proposal. 
> 


[Numpy-discussion] Re: Newbie's naive question about debug

2022-09-07 Thread Andras Deak
On Wed, Sep 7, 2022, at 07:55, 腾刘 wrote:
> Hello, everyone. I'm a newcomer here and looking forward to 
> contributing to Numpy core code in the future. 
>
> However, there is an obstacle right ahead of me that I don't know how 
> to figure out the corresponding relationship between Numpy python code 
> and its C implementation code. In another word, I really want to know 
> which part of Numpy core C code implements this simple numpy array 
> addition:
>
> >>> a = np.array([1, 2, 3])
> >>> b = np.array([2, 3, 4])
> >>> c = a + b
>
> I have built Numpy from source in a virtual environment (via conda), 
> and I know the addition described above might be implemented in loop.c 
> or something like that from StackOverflow, but I want to find a general 
> approach to solve all these similar problems like: where are array 
> transpose implemented?  where are array reshape implemented? etc. So I 
> am seeking some tools like GDB and track python step by step. I *can 
> *use gdb to track python, but I can't set breakpoint in Numpy because 
> Numpy's C code isn't part of Python interpreter.
>
> So all in all, my final question is that, how can I debug and track 
> Numpy's C code and see which part is executed? I believe there must be 
> a method because the real developers will certainly need to debug.

Hi,

As a first pointer, you might find one of the first videos on the NumPy youtube 
channel (from a NumPy Newcomers’ Hour panel discussion) useful, in case you 
haven't seen it yet: "Find your way in the NumPy codebase :: Melissa Mendonca, 
Sebastian Berg, Tyler Reddy, Matti Picus" 
https://www.youtube.com/watch?v=mTWpBf1zewc . Others in the channel might also 
be relevant.

András
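
For completeness, a typical session could look roughly like the following (the exact symbol names are assumptions; they depend on the NumPy version and require a build with debug symbols):

```
$ gdb --args python my_script.py
(gdb) run                      # run once so the numpy extension modules load
(gdb) info functions PyUFunc   # list candidate C symbols to break on
(gdb) break <a symbol from that list>
(gdb) run
```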



>
> Thanks in advance. And I'm quite new here, so I'm not sure whether I 
> should ask this kind of primitive and naive question here, since the 
> previous discussions seem to be advanced and I can't understand most of 
> them. 


[Numpy-discussion] Re: Newbie's naive question about debug

2022-09-07 Thread Andras Deak
On Thu, Sep 8, 2022, at 03:39, 腾刘 wrote:
> Thanks so much! I will check these videos thoroughly. By the way, 
> another stupid question: does your reply means that my mail has been 
> accepted by the moderator and got permission into the mailing-list 
> (which means that everyone in the mailing-list will see this email) ?

That is correct. When in doubt, check the public archive: 
https://mail.python.org/archives/list/numpy-discussion@python.org/latest

András

>
>
> Andras Deak  于2022年9月7日周三 21:09写道:
>> On Wed, Sep 7, 2022, at 07:55, 腾刘 wrote:
>> > Hello, everyone. I'm a newcomer here and looking forward to 
>> > contributing to Numpy core code in the future. 
>> >
>> > However, there is an obstacle right ahead of me that I don't know how 
>> > to figure out the corresponding relationship between Numpy python code 
>> > and its C implementation code. In another word, I really want to know 
>> > which part of Numpy core C code implements this simple numpy array 
>> > addition:
>> >
>> >>>> a = np.array([1, 2, 3])
>> >>>> b = np.array([2, 3, 4])
>> >>>> c = a + b
>> >
>> > I have built Numpy from source in a virtual environment (via conda), 
>> > and I know the addition described above might be implemented in loop.c 
>> > or something like that from StackOverflow, but I want to find a general 
>> > approach to solve all these similar problems like: where are array 
>> > transpose implemented?  where are array reshape implemented? etc. So I 
>> > am seeking some tools like GDB and track python step by step. I *can 
>> > *use gdb to track python, but I can't set breakpoint in Numpy because 
>> > Numpy's C code isn't part of Python interpreter.
>> >
>> > So all in all, my final question is that, how can I debug and track 
>> > Numpy's C code and see which part is executed? I believe there must be 
>> > a method because the real developers will certainly need to debug.
>> 
>> Hi,
>> 
>> As a first pointer, you might find one of the first videos on the NumPy 
>> youtube channel (from a NumPy Newcomers’ Hour panel discussion) useful, in 
>> case you haven't seen it yet: "Find your way in the NumPy codebase :: 
>> Melissa Mendonca, Sebastian Berg, Tyler Reddy, Matti Picus" 
>> https://www.youtube.com/watch?v=mTWpBf1zewc . Others in the channel might 
>> also be relevant.
>> 
>> András
>> 
>> 
>> 
>> >
>> > Thanks in advance. And I'm quite new here, so I'm not sure whether I 
>> > should ask this kind of primitive and naive question here, since the 
>> > previous discussions seem to be advanced and I can't understand most of 
>> > them. 


[Numpy-discussion] Re: Advanced indexing doesn't follow the Principle of least astonishment

2022-12-31 Thread Andras Deak
On Thu, Dec 29, 2022, at 16:34, Robert Kern wrote:
> On Thu, Dec 29, 2022 at 8:50 AM Diogo Valada  
> wrote:
>> Hi all,
>> 
>> New to the mailing list, so I hope I'm creating a discussion in the right 
>> place.
>> 
>> Am I the only one that thinks that Advanced indexing in numpy doesn't follow 
>> the principle of minimum astonishment?
>> 
>> for example
>> 
>> ```python
>> a = np.random.rand(100, 100)
>> 
>> a[(2,4)] #this yields the element at [2,4]
>> a[[2,4]] #this yields the rows at position 2 and 4
>> a[1, (2,4)] #this yields the 2nd and 4th elements of row 1. (So actually 
>> does advanced indexing)
>> a[1, [2,4]] # Works the same way as the previous one.
>> ```
>> 
>> Worst of all, it's very easy for someone do a mistake and not notice it: it 
>> seems to me that the first method, a[(2,4)], should not be allowed, and 
>> instead only a[*(2,4)] should work.

To add to what Robert wrote, the other problem is that `a[*(2, 4)]` _does not_ 
work, it's a syntax error. And if you look at dicts for instance, `d[2, 4] = 
42` will give you a tuple-valued key (this is just a standard-library example 
of what Robert said about `__getitem__()`). So there's really no choice for 
NumPy here, this is Python syntax.

András
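
A minimal demonstration of the equivalence:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
# The comma creates the tuple; the parentheses are inert:
assert a[(2, 3)] == a[2, 3]              # identical expressions
assert a[2, 3] == a.__getitem__((2, 3))  # what Python actually calls

d = {}
d[2, 3] = 42        # stdlib example: a tuple-valued dict key
assert d[(2, 3)] == 42
```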

> How checked how it works in Julia (which has a similar syntax), and a[(2,4)] 
> would yield an error, which makes sense to me. Could it be an idea to 
> deprecate a[(2,4)]-like usages?
>
> No, that's not possible. In Python syntax, the comma `,` is what 
> creates the tuple, not the parentheses. So `a[(2,4)]` is exactly `a[2, 
> 4]`. `a[2, 4]` translates to `a.__getitem__((2, 4))`. So there's no way 
> for the array to know whether it got `a[2, 4]` or `a[(2, 4)]`.
>
> -- 
> Robert Kern


[Numpy-discussion] Re: openblas location question

2023-05-29 Thread Andras Deak
On Mon, May 29, 2023, at 17:02, timesir wrote:
> Dear community,
>
> I want to find the location of the openblas library that Numpy calls. 

Hi,

I suspect you're tied to Python 3.6 due to some HPC cluster or similar, but in 
case you can upgrade to Python 3.8 or newer, NumPy 1.24 introduced 
show_runtime() that contains this information: 
https://numpy.org/doc/stable//reference/generated/numpy.show_runtime.html .
It's possible that installing `threadpoolctl` and running `python -m 
threadpoolctl -i numpy` works on older Python/NumPy as well to tell you the 
same information, I don't know.

András
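
The difference between the two introspection functions, sketched (runs on any NumPy; `show_runtime` needs >= 1.24):

```python
import numpy as np

# show_runtime reports what is loaded at runtime -- with threadpoolctl
# installed it includes the file path of the OpenBLAS in use -- whereas
# show_config only records build-time settings:
if hasattr(np, "show_runtime"):
    np.show_runtime()
else:
    np.show_config()
```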

> Then I did the following:
>
> [root@computer02 ~]#  python3
> Python 3.6.8 (default, Nov 16 2020, 16:55:22)
> [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy
> >>> numpy.show_config()
> blas_mkl_info:
>   NOT AVAILABLE
> blis_info:
>   NOT AVAILABLE
> openblas_info:
> libraries = ['openblas', 'openblas']
> library_dirs = ['/usr/local/lib']
> language = c
> define_macros = [('HAVE_CBLAS', None)]
> blas_opt_info:
> libraries = ['openblas', 'openblas']
> library_dirs = ['/usr/local/lib']
> language = c
> define_macros = [('HAVE_CBLAS', None)]
> lapack_mkl_info:
>   NOT AVAILABLE
> openblas_lapack_info:
> libraries = ['openblas', 'openblas']
> library_dirs = ['/usr/local/lib']
> language = c
> define_macros = [('HAVE_CBLAS', None)]
> lapack_opt_info:
> libraries = ['openblas', 'openblas']
> library_dirs = ['/usr/local/lib']
> language = c
> define_macros = [('HAVE_CBLAS', None)]
>
> However, the ls /usr/local/lib folder is empty and I would like to know 
> how to see where openblas used by numpy is?
>
>