[Numpy-discussion] Re: Function that searches arrays for the first element that satisfies a condition

2023-10-31 Thread Lev Maximov
I've implemented such functions in Cython and packaged them into a library
called numpy_illustrated <https://pypi.org/project/numpy-illustrated/>

It exposes the following functions:

find(a, v)  # returns the index of the first occurrence of v in a
first_above(a, v)   # returns the index of the first element in a that is
strictly above v
first_nonzero(a)   # returns the index of the first nonzero element

They scan the array and bail out as soon as a match is found, so they give a
significant performance gain when the element to be
found is close to the beginning of the array, and run at roughly the same
speed as the alternative methods when the value is missing.

The complete signatures of the functions look like this:

find(a, v, rtol=1e-05, atol=1e-08, sorted=False, default=-1, raises=False)
first_above(a, v, sorted=False, missing=-1, raises=False)
first_nonzero(a, missing=-1, raises=False)

This covers the most common use cases and does not accept Python callbacks
because accepting them would nullify any speed gain
one would expect from such a function. A Python callback can be implemented
with Numba, but anyone who can write the callback
in Numba has no need for a library that wraps it into a dedicated function.
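To make the semantics concrete, here is a minimal pure-Python sketch of `find` and `first_above` as described above (the names `find_py` and `first_above_py` are mine, not the library's); the library performs the same early-exit scan, only in compiled Cython:

```python
import numpy as np

def find_py(a, v, default=-1):
    # Index of the first occurrence of v in a; bail out at the first match.
    for i, x in enumerate(a):
        if x == v:
            return i
    return default

def first_above_py(a, v, missing=-1):
    # Index of the first element strictly above v.
    for i, x in enumerate(a):
        if x > v:
            return i
    return missing

a = np.array([3, 1, 4, 1, 5, 9, 2, 6])
print(find_py(a, 5))         # 4
print(first_above_py(a, 4))  # 4 (the element 5)
print(first_above_py(a, 9))  # -1 (nothing above 9)
```

The early exit is what makes the compiled versions fast when the match is near the start: the loop simply never visits the rest of the array.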

The library has 100% test coverage and follows the 'black' code style. It
should be easy to add functions like 'first_below' if necessary.

A more detailed description of these functions can be found here
<https://betterprogramming.pub/the-numpy-illustrated-library-7531a7c43ffb?sk=8dd60bfafd6d49231ac76cb148a4d16f>
.

Best regards,
  Lev Maximov

On Tue, Oct 31, 2023 at 3:50 AM Dom Grigonis  wrote:

> I juggled a bit and found a pretty nice solution using numba. It is
> probably not very robust, but it proves that such a thing can be optimised
> while retaining flexibility. Check if it works for your use cases and let
> me know if anything fails or if it is slow compared to what you used.
>
> first_true_str = """
> def first_true(arr, n):
>     result = np.full((n, arr.shape[1]), -1, dtype=np.int32)
>     for j in range(arr.shape[1]):
>         k = 0
>         for i in range(arr.shape[0]):
>             x = arr[i:i + 1, j]
>             if cond(x):
>                 result[k, j] = i
>                 k += 1
>                 if k >= n:
>                     break
>     return result
> """
>
> class FirstTrue:
> CONTEXT = {'np': np}
>
> def __init__(self, expr):
> self.expr = expr
> self.expr_ast = ast.parse(expr, mode='exec').body[0].value
> self.func_ast = ast.parse(first_true_str, mode='exec')
> self.func_ast.body[0].body[1].body[1].body[1].test = self.expr_ast
> self.func_cmp = compile(self.func_ast, filename="", mode="exec")
> exec(self.func_cmp, self.CONTEXT)
> self.func_nb = nb.njit(self.CONTEXT[self.func_ast.body[0].name])
>
> def __call__(self, arr, n=1, axis=None):
> # PREPARE INPUTS
> in_1d = False
> if axis is None:
> arr = np.ravel(arr)[:, None]
> in_1d = True
> elif axis == 0:
> if arr.ndim == 1:
> in_1d = True
> arr = arr[:, None]
> else:
> raise ValueError('axis ~in (None, 0)')
> res = self.func_nb(arr, n)
> if in_1d:
> res = res[:, 0]
> return res
>
> if __name__ == '__main__':
> arr = np.arange(125).reshape((5, 5, 5))
> ft = FirstTrue('np.sum(x) > 30')
> print(ft(arr, n=2, axis=0))
>
> [[1 0 0 0 0]
>  [2 1 1 1 1]]
>
> In [16]: %timeit ft(arr, 2, axis=0)
> 1.31 µs ± 3.94 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
>
> Regards,
> DG
>
> On 29 Oct 2023, at 23:18, rosko37  wrote:
>
> An example with a 1-D array (where it is easiest to see what I mean) is
> the following. I will follow Dom Grigonis's suggestion that the range not
> be provided as a separate argument, as it can be just as easily "folded
> into" the array by passing a slice. So it becomes just:
> idx = first_true(arr, cond)
>
> As Dom also points out, the "cond" would likely need to be a "function
> pointer" (i.e., the name of a function defined elsewhere, turning
> first_true into a higher-order function), unless there's some way to pass a
> parseable expression for simple cases. A few special cases like the first
> zero/nonzero element could be handled with dedicated options (sort of like
> matplotlib colors), but for anything beyond that it gets unwieldy fast.
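For reference, the usual vectorized workaround for `first_true` (no early exit, so it always scans the whole array) is `np.argmax` over the boolean mask. A sketch, with a hypothetical helper name; note the explicit check for the all-False case, which `argmax` would otherwise report as index 0:

```python
import numpy as np

def first_true_idx(arr, cond):
    # Vectorized "first index where cond holds": argmax returns the index of
    # the first True in a boolean mask, but it also returns 0 when the mask
    # is all False, so that case must be checked explicitly.
    mask = cond(arr)
    idx = int(np.argmax(mask))
    return idx if mask[idx] else -1

arr = np.arange(10)
print(first_true_idx(arr, lambda x: x > 6))   # 7
print(first_true_idx(arr, lambda x: x > 99))  # -1
```

This is the baseline that compiled early-exit scans are competing against: it is fast and simple, but its cost is always proportional to the full array length.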
>
> So let's say we have this:
>
> def cond(x):
> return x>50
>
> search_arr = np.exp(np.arange(0,1000))
>
> pr

Re: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface

2021-02-15 Thread Lev Maximov
Hi Friedrich,

Try adding
view->suboffsets = NULL;
view->internal = NULL;
to Image_getbuffer

Best regards,
Lev

On Mon, Feb 15, 2021 at 10:57 PM Sebastian Berg 
wrote:

> On Mon, 2021-02-15 at 10:12 +0100, Friedrich Romstedt wrote:
> > Hi,
> >
> > On Thu, Feb 4, 2021 at 09:07 Friedrich Romstedt
> > wrote:
> > > On Mon, Feb 1, 2021 at 09:46 Matti Picus <
> > > matti.pi...@gmail.com> wrote:
> > > > Typically, one would create a complete example and then pointing
> > > > to the
> > > > code (as repo or pastebin, not as an attachment to a mail here).
> > >
> > > https://github.com/friedrichromstedt/bughunting-01
> >
> > Last week I updated my example code to be more slim.  There now
> > exists
> > a single-file extension module:
> >
> https://github.com/friedrichromstedt/bughunting-01/blob/master/lib/bughuntingfrmod/bughuntingfrmod.cpp
> > .
> > The corresponding test program
> >
> https://github.com/friedrichromstedt/bughunting-01/blob/master/test/2021-02-11_0909.py
> > crashes "properly" both on Windows 10 (Python 3.8.2, numpy 1.19.2) as
> > well as on Arch Linux (Python 3.9.1, numpy 1.20.0), when the
> > ``print``
> > statement contained in the test file is commented out.
> >
> > My hope to be able to fix my error myself by reducing the code to
> > reproduce the problem has not been fulfilled.  I feel that the
> > abovementioned test code is short enough to ask for help with it
> > here.
> > Any hint on how I could solve my problem would be appreciated very
> > much.
>
> I have tried it out, and can confirm that using debugging tools (namely
> valgrind) will allow you to track down the issue (valgrind reports it
> from within Python; running a Python without debug symbols may
> obfuscate the actual problem; if that is limiting you, I can post
> my valgrind output).
> Since you are running a linux system, I am confident that you can run
> it in valgrind to find it yourself.  (There may be other ways.)
>
> Just remember to run valgrind with `PYTHONMALLOC=malloc valgrind` and
> ignore some errors e.g. when importing NumPy.
>
> Cheers,
>
> Sebastian
>
>
> >
> > There are some points which were not clarified yet; I am citing them
> > below.
> >
> > So far,
> > Friedrich
> >
> > > > - There are tools out there to analyze refcount problems. Python
> > > > has
> > > > some built-in tools for switching allocation strategies.
> > >
> > > Can you give me some pointer about this?
> > >
> > > > - numpy.asarray has a number of strategies to convert instances,
> > > > which
> > > > one is it using?
> > >
> > > I've tried to read about this, but couldn't find anything.  What
> > > are
> > > these different strategies?
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>
>


Re: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface

2021-02-16 Thread Lev Maximov
I've reproduced the error you've described and got rid of it without
valgrind.
Those two lines are enough to avoid the segfault.

But feel free to find it yourself :)

Best regards,
Lev

On Tue, Feb 16, 2021 at 5:02 PM Friedrich Romstedt <
friedrichromst...@gmail.com> wrote:

> Hello again,
>
> On Mon, Feb 15, 2021 at 16:57 Sebastian Berg
> wrote:
> >
> > On Mon, 2021-02-15 at 10:12 +0100, Friedrich Romstedt wrote:
> > > Last week I updated my example code to be more slim.  There now
> > > exists
> > > a single-file extension module:
> > >
> https://github.com/friedrichromstedt/bughunting-01/blob/master/lib/bughuntingfrmod/bughuntingfrmod.cpp
> > > .
> > > The corresponding test program
> > >
> https://github.com/friedrichromstedt/bughunting-01/blob/master/test/2021-02-11_0909.py
> > > crashes "properly" both on Windows 10 (Python 3.8.2, numpy 1.19.2) as
> > > well as on Arch Linux (Python 3.9.1, numpy 1.20.0), when the
> > > ``print``
> > > statement contained in the test file is commented out.
> >
> > I have tried it out, and can confirm that using debugging tools (namely
> > valgrind) will allow you to track down the issue (valgrind reports it
> > from within Python; running a Python without debug symbols may
> > obfuscate the actual problem; if that is limiting you, I can post
> > my valgrind output).
> > Since you are running a linux system, I am confident that you can run
> > it in valgrind to find it yourself.  (There may be other ways.)
> >
> > Just remember to run valgrind with `PYTHONMALLOC=malloc valgrind` and
> > ignore some errors e.g. when importing NumPy.
>
> From running ``PYTHONMALLOC=malloc valgrind python3
> 2021-01-11_0909.py`` (with the preceding call of ``print`` in
> :file:`2021-01-11_0909.py` commented out) I found a few things:
>
> -   The call might or might not succeed.  It doesn't always lead to a
> segfault.
> -   "at 0x4A64A73: ??? (in /usr/lib/libpython3.9.so.1.0), called by
> 0x4A64914: PyMemoryView_FromObject (in /usr/lib/libpython3.9.so.1.0)",
> a "Conditional jump or move depends on uninitialised value(s)".  After
> one more block of valgrind output ("Use of uninitialised value of size
> 8 at 0x48EEA1B: ??? (in /usr/lib/libpython3.9.so.1.0)"), it finally
> leads either to "Invalid read of size 8 at 0x48EEA1B: ??? (in
> /usr/lib/libpython3.9.so.1.0) [...] Address 0x1 is not stack'd,
> malloc'd or (recently) free'd", resulting in a segfault, or just to
> another "Use of uninitialised value of size 8 at 0x48EEA15: ??? (in
> /usr/lib/libpython3.9.so.1.0)", after which the program completes
> successfully.
> -   All this happens within "PyMemoryView_FromObject".
>
> So I can only guess that the "uninitialised value" is compared to 0x0,
> and when it is different (e.g. 0x1), it leads via "Address 0x1 is not
> stack'd, malloc'd or (recently) free'd" to the segfault observed.
>
> I suppose I need to compile Python and numpy myself to see the debug
> symbols instead of the "???" marks? Maybe even with ``-O0``?
>
> Furthermore, the shared object belonging to my code isn't involved
> directly in any way, so the segfault possibly has to do with some data
> I am leaving "uninitialised" at the moment.
>
> Thanks for the other replies as well; for the moment I feel that going
> the valgrind way might teach me how to debug errors of this kind
> myself.
>
> So far,
> Friedrich
>


Re: [Numpy-discussion] Unreliable crash when converting using numpy.asarray via C buffer interface

2021-03-29 Thread Lev Maximov
I'm glad you sorted it out as the subject line sounded quite horrifying )

Best regards,
Lev

On Mon, Mar 29, 2021 at 2:54 PM Friedrich Romstedt <
friedrichromst...@gmail.com> wrote:

> Hi Matti, Sebastian and Lev,
>
> On Mon, Feb 15, 2021 at 18:50 Lev Maximov <
> lev.maxi...@gmail.com> wrote:
> >
> > Try adding
> > view->suboffsets = NULL;
> > view->internal = NULL;
> > to Image_getbuffer
>
> finally I got it working easily using Lev's pointer cited above.  I
> didn't follow the valgrind approach furthermore, since I found it
> likely that it'd produce the same finding.
>
> This is just to let you know; I applied the fix several weeks ago.
>
> Many thanks,
> Friedrich
>


[Numpy-discussion] Re: Wrapping up Google Season of Docs 2021

2021-12-24 Thread Lev Maximov
Hi Ralf,

I'm writing an article about numpy types – something I haven't covered in
Numpy Illustrated.
Would you please have a look to confirm I haven't written anything
anticlimactic there? ;)
https://medium.com/@levmaximov/numpy-data-types-8f62cb57ea83

Best regards,
Lev

On Wed, Dec 1, 2021 at 3:14 AM Ralf Gommers  wrote:

>
>
> On Tue, Nov 30, 2021 at 3:18 PM Melissa Mendonça 
> wrote:
>
>> Hello, folks!
>>
>> As of today, our participation in the Google Season of Docs program for
>> 2021 has ended. You can see the case study detailing the work done and some
>> key results in the following link:
>>
>>
>> https://github.com/numpy/numpy/wiki/Google-Season-of-Docs-2021:-NumPy-Case-Study
>>
>> There is one final tutorial in review, but overall the project has been
>> successfully completed.
>>
>> I want to personally thank Mukulika Pahari for her hard work and
>> excellent contributions. She was able to quickly produce relevant
>> documentation on subjects that are not easy or simple. Well done! We hope
>> you stick around and continue working with us :)
>>
>
> Thanks for all the hard work so far Mukulika! From what I have seen -
> which I think is only a small part of everything you did - you did an
> amazing job!
>
> Cheers,
> Ralf
>
>
>
>> I also want to thank Ross Barnowski for co-mentoring and all the other
>> maintainers who helped with ideas and reviews and hope we can participate
>> again next year.
>>
>> Cheers!
>>
>> - Melissa
>>
>


[Numpy-discussion] An article on numpy data types

2021-12-25 Thread Lev Maximov
Hi everyone,

I'm almost done with the article about numpy types – something I haven't
covered in Numpy Illustrated.

Would someone please have a look to confirm I haven't written anything
anticlimactic there?

https://axil.github.io/numpy-data-types.html

--
Best regards,
Lev

PS Earlier today I mistakenly sent an email with the wrong link.


[Numpy-discussion] Re: An article on numpy data types

2021-12-26 Thread Lev Maximov
Dear Michael,

Thank you for your feedback!

I've fixed the x86_64 typo.

I'll think about how to reformulate the int32 part. I work on debian x86_64
and windows 10 64bit. Constructing an array with np.array([1,2,3]) as well
as np.array([1,2,3], dtype=np.int_) gives me int64 dtype on linux, and
int32 on windows.
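A sketch of the platform dependence being discussed (this describes the NumPy 1.x versions in this thread; NumPy 2.0 later switched Windows to a 64-bit default as well):

```python
import numpy as np

# The default integer dtype follows np.int_ (the C `long`): on the NumPy 1.x
# versions discussed here that meant 64-bit on linux/macos but 32-bit on
# 64-bit windows.
a = np.array([1, 2, 3])
print(a.dtype)  # int64 on linux/macos; int32 on windows under NumPy 1.x

# An explicit fixed-width dtype sidesteps the platform dependence entirely:
b = np.array([1, 2, 3], dtype=np.int64)
print(b.dtype)  # int64 everywhere
```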

As suggested by Matti, I've put the rst source (and images) into a separate
github repository

https://github.com/axil/numpy-data-types

PRs are welcome. My primary concern is to exclude serious typos/mistakes
that might mislead/harm the readers if used.

My personal preference is towards explicit width types like np.int32, but
from reading the docs I have a feeling there's a trend of migrating towards
the c-style notation.

Best regards,
Lev

On Sun, Dec 26, 2021 at 7:05 PM Michael Siebert 
wrote:

> Dear Lev,
>
> thank you a lot! Something like this should be part of the Numpy
> documentation. I like the diagram, looks very nice! Also, I’ve opened an
> issue regarding data types
>
> https://github.com/numpy/numpy/issues/20662
>
> Some feedback from my side:
>
> 1. When calling numpy.array([1,2,3,4]) it gives me an int64 data type most
> of the time (two x86_64 systems, one arm64 system). The only time I’ve got
> int32 was on a Raspberry Pi, which is a software limitation, since the CPU
> is 64 bit and they have even replaced their so-far 32bit only Raspberry Pi
> Zero with a 64bit version (yes, one day Raspberry OS with 64 bit might
> actually become the default!). I don’t know what machine you are working
> on, but int64 should be the default.
> 2. x64 refers to the obsolete Intel Itanium architecture (mentioned once).
> Should be x86_64.
> 3. np.errstate looks nice, I could use that for my pull request as well.
>
> Many thanks & best regards, Michael
>
>
> On 25. Dec 2021, at 10:02, Lev Maximov  wrote:
>
> Hi everyone,
>
> I'm almost done with the article about numpy types – something I haven't
> covered in Numpy Illustrated.
>
> Would someone please have a look to confirm I haven't written anything
> anti-climatic there?
>
> https://axil.github.io/numpy-data-types.html
>
> --
> Best regards,
> Lev
>
> PS Earlier today I've mistakenly sent an email with the wrong link.
>
>
>


[Numpy-discussion] Re: An article on numpy data types

2021-12-26 Thread Lev Maximov
Hi, Friedrich

> There seems to be missing an "a" before "more".
Thank you. Fixed. This is a draft. It will be (more or less) professionally
proofread thereafter.

> on my machine it runs::
Which OS does your machine run on?

> FloatingPointError written instead of RuntimeWarning
This is most certainly a typo. Thanks.

> *only once*:
Good point! Added.

> So, unclarity resolved, but maybe I am not the only one stumbling over
> this.
Ok, I'll think about how to improve readability here.

> Maybe the idiom ``>>> c = numpy.int64(2 ** 63 - 1)`` can be used?
It was there in one of the earlier versions of the article, but np.array
fitted the narrative thread better, mostly for the very reason you
provided: although it's good to know that scalars can be constructed this
way, no one does it in real-life use cases.
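For illustration, a small sketch of what happens at the upper bound of such a scalar (the wraparound and its RuntimeWarning are exactly what the article's overflow section deals with):

```python
import numpy as np
import warnings

# The largest value a signed 64-bit integer can hold:
c = np.int64(2**63 - 1)
print(c)  # 9223372036854775807

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # adding 1 emits an overflow RuntimeWarning
    d = c + np.int64(1)              # wraps around (two's complement)

print(d == np.iinfo(np.int64).min)  # True
```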

Thank you for your feedback! Looking forward to reading the next part of
the review.

Best regards,
Lev


On Sun, Dec 26, 2021 at 8:45 PM Michael Siebert 
wrote:

> Hey Lev,
>
> I‘ve forgotten to mention my MacBook M1,
> it‘s also int64 there.
>
> Python on Windows is and is supposed to be, as far as I get it, a dying
> platform. A billion things are broken there (HDF comes to my mind) and it
> seems even Microsoft wants developers to move away from native Windows with
> their introduction of WSL (Windows Subsystem for Linux). Its latest
> version, WSL2 even comes with an actual Linux kernel and since Windows 11,
> it has support for graphical applications (Xorg) out of the box. With
> Visual Studio Code (also Microsoft) and its remote capabilities, one does
> not even feel a difference between developing in an Ubuntu in a WSL in
> Windows and an actual Ubuntu.
>
> Considering the „traditional“ C datatypes, fixed types and prioritizing
> them in Numpy documentation, that‘s what my issue (see below) is about. I
> think they have summarized it nicely in
>
> https://matt.sh/howto-c
>
> Best regards, Michael
>
> On 26. Dec 2021, at 13:49, Lev Maximov  wrote:
>
> 
> Dear Michael,
>
> Thank you for your feedback!
>
> I've fixed the x86_64 typo.
>
> I'll think how to reformulate the int32 part. I work on debian x86_64 and
> windows 10 64bit. Constructing an array with np.array([1,2,3]) as well as
> np.array([1,2,3], dtype=np.int_) gives me int64 dtype on linux, and int32
> on windows.
>
> As suggested by Matti, I've put the rst source (and images) into a
> separate github repository
>
> https://github.com/axil/numpy-data-types
>
> PRs are welcome. My primary concern is to exclude serious typos/mistakes
> that might mislead/harm the readers if used.
>
> My personal preference is towards explicit width types like np.int32, but
> from reading the docs I have a feeling there's a trend of migrating towards
> the c-style notation.
>
> Best regards,
> Lev
>
> On Sun, Dec 26, 2021 at 7:05 PM Michael Siebert <
> michael.sieber...@gmail.com> wrote:
>
>> Dear Lev,
>>
>> thank you a lot! Something like this should be part of the Numpy
>> documentation. I like the diagram, looks very nice! Also, I’ve opened an
>> issue regarding data types
>>
>> https://github.com/numpy/numpy/issues/20662
>>
>> Some feedback from my side:
>>
>> 1. When calling numpy.array([1,2,3,4]) it gives me an int64 data type
>> most of the time (two x86_64 systems, one arm64 system). The only time I’ve
>> got int32 was on a Raspberry Pi, which is a software limitation, since the
>> CPU is 64 bit and they have even replaced their so-far 32bit only Raspberry
>> Pi Zero with a 64bit version (yes, one day Raspberry OS with 64 bit might
>> actually become the default!). I don’t know what machine you are working
>> on, but int64 should be the default.
>> 2. x64 refers to the obsolete Intel Itanium architecture (mentioned
>> once). Should be x86_64.
>> 3. np.errstate looks nice, I could use that for my pull request as well.
>>
>> Many thanks & best regards, Michael
>>
>>
>> On 25. Dec 2021, at 10:02, Lev Maximov  wrote:
>>
>> Hi everyone,
>>
>> I'm almost done with the article about numpy types – something I haven't
>> covered in Numpy Illustrated.
>>
>> Would someone please have a look to confirm I haven't written anything
>> anti-climatic there?
>>
>> https://axil.github.io/numpy-data-types.html
>>
>> --
>> Best regards,
>> Lev
>>
>> PS Earlier today I've mistakenly sent an email with the wrong link.

[Numpy-discussion] Re: An article on numpy data types

2021-12-26 Thread Lev Maximov
Hi Michael,

> Python on Windows is and is supposed to be, as far as I get it, a dying
> platform.
I would join Matti in thinking that it is a misconception.

Have you heard of the enormous daily updated unofficial repository
<https://www.lfd.uci.edu/~gohlke/pythonlibs/> of the binary windows
compilations of
almost 600 python libraries by Christoph Gohlke? (numpy and libs depending
on it are built with MKL there)
It is there for a reason.

If you look at the stats such as this one (Matti already mentioned them
while I was writing this text),

https://www.jetbrains.com/research/python-developers-survey-2018/
https://www.jetbrains.com/lp/python-developers-survey-2020/

you'll see (in addition to the fact that numpy is the #1 library in data
science ;) ) that in
recent years the percentage of windows users among developers has been
quite high:
69% linux - 47% windows - 32% macos (2018)
68% linux - 48% windows - 29% macos (2020)
So it looks as if it is growing rather than dying.

This is due to the popularity of the above mentioned data science and AI,
which have skyrocketed in the
last 10 years. And the vast majority of data scientists work on windows.

Windows as a platform for developers as a whole is also quite flourishing
today.
According to the stackoverflow 2021 developer survey
<https://insights.stackoverflow.com/survey/2021#most-popular-technologies-op-sys>
45% of the respondents use Windows (25% linux, 25% macos).
Among the professional developers the numbers are 41% for windows, 30%
macos, 26% linux.

Also, the primary audience of tutorials like mine (as well as of
stackoverflow?) is windows users.
Linux users can easily figure out the things described there on their own,
through the docstrings and source code
or, as a last resort, through the docs )

> The more experienced the Python developers are, the more likely they are
> to use Linux and macOS as development
> environments, and the less likely they are to choose Windows.
(from the same jetbrains survey of 2018)

I wouldn't like to go into holy wars, though. I'm equally literate in both
unix and windows (somewhat less in macos),
and in my opinion the interests of the users of all three operating
systems should be taken into account
in both the code of the library and the docs.

The documentation is sometimes pretty ignorant of mac/windows users, btw:
> Alias on this platform (Linux x86_64)
https://numpy.org/doc/stable/reference/arrays.scalars.html#numpy.int_
And what about the other platforms?

As for the particular issue of the difference in the default integer types,
in my opinion the default choice of int32 on windows for
array [1,2,3] fits the description

> "If not given, then the type will be determined as the minimum type
> required to hold the objects in the sequence."
https://numpy.org/doc/stable/reference/generated/numpy.array.html

better than int64 on linux/macos.

Best regards,
Lev


On Sun, Dec 26, 2021 at 8:45 PM Michael Siebert 
wrote:

> Hey Lev,
>
> I‘ve forgotten to mention my MacBook M1,
> it‘s also int64 there.
>
> Python on Windows is and is supposed to be, as far as I get it, a dying
> platform. A billion things are broken there (HDF comes to my mind) and it
> seems even Microsoft wants developers to move away from native Windows with
> their introduction of WSL (Windows Subsystem for Linux). Its latest
> version, WSL2 even comes with an actual Linux kernel and since Windows 11,
> it has support for graphical applications (Xorg) out of the box. With
> Visual Studio Code (also Microsoft) and it’s remote capabilities, one does
> not even feel a difference between developing in an Ubuntu in a WSL in
> Windows and an actual Ubuntu.
>
> Considering the „traditional“ C datatypes, fixed types and prioritizing
> them in Numpy documentation, that‘s what my issue (see below) is about. I
> think they have summarized it nicely in
>
> https://matt.sh/howto-c
>
> Best regards, Michael
>
> On 26. Dec 2021, at 13:49, Lev Maximov  wrote:
>
> 
> Dear Michael,
>
> Thank you for your feedback!
>
> I've fixed the x86_64 typo.
>
> I'll think how to reformulate the int32 part. I work on debian x86_64 and
> windows 10 64bit. Constructing an array with np.array([1,2,3]) as well as
> np.array([1,2,3], dtype=np.int_) gives me int64 dtype on linux, and int32
> on windows.
>
> As suggested by Matti, I've put the rst source (and images) into a
> separate github repository
>
> https://github.com/axil/numpy-data-types
>
> PRs are welcome. My primary concern is to exclude serious typos/mistakes
> that might mislead/harm the readers if used.
>
> My personal preference is towards explicit width types like np.int32, but
> from reading the docs I have a feeling there's a trend of migrating towards
> the c-style notation.
>
> Best regards,
> 

[Numpy-discussion] Re: An article on numpy data types

2021-12-26 Thread Lev Maximov
Python 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64 bit (AMD64)] on win32
                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^
                                                          this is relevant               this is not
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.int_
<class 'numpy.int32'>

On Sun, Dec 26, 2021 at 11:42 PM Michael Siebert <
michael.sieber...@gmail.com> wrote:

> Okay, little modification to my last mail: many Android smartphones are
> still 32 bit, but according to
>
> https://www.androidauthority.com/arm-32-vs-64-bit-explained-1232065/
>
> from 2023 on, all (or at least many) new ARM processors will be 64 bit
> only.
>
> Apple's iPhones have been 64-bit only for quite a while already (since
> the iOS 11 release in September 2017).
>
> On 26. Dec 2021, at 17:31, Lev Maximov  wrote:
>
> 
> Hi Michael,
>
> > Python on Windows is and is supposed to be, as far as I get it, a dying
> platform.
> I would join Matti in thinking that it is a misconception.
>
> Have you heard of the enormous daily updated unofficial repository
> <https://www.lfd.uci.edu/~gohlke/pythonlibs/> of the binary windows
> compilations of
> almost 600 python libraries by Christoph Gohlke? (numpy and libs depending
> on it are built with MKL there)
> It is there for a reason.
>
> If you look at the stats such as this one (Matti already mentioned them
> while I was writing this text),
>
> https://www.jetbrains.com/research/python-developers-survey-2018/
> https://www.jetbrains.com/lp/python-developers-survey-2020/
>
> you'll see (in addition to the fact that numpy is the #1 library in data
> science ;) ) that in
> the recent years the percentage of windows user among the developers is
> quite high:
> 69% linux - 47% windows - 32% macos (2018)
> 68% linux - 48% windows - 29% macos (2020)
> So it looks as if it is rather growing than dying.
>
> This is due to the popularity of the above mentioned data science and AI,
> which have skyrocketed in the
> last 10 years. And the vast majority of data scientists work on windows.
>
> Windows as a platform for developers as a whole is also quite flourishing
> today.
> According to the stackoverflow 2021 developer survey
> <https://insights.stackoverflow.com/survey/2021#most-popular-technologies-op-sys>
> 45% of the respondents use Windows (25% linux, 25% macos).
> Among the professional developers the numbers are 41% for windows, 30%
> macos, 26% linux.
>
> Also the primary audience of the tutorials like mine (as well as of
> stackoverflow?) are windows users.
> Linux users can easily figure things described there on their own, through
> the docstrings, source code
> or, as a last resort, through the docs )
>
> >The more experienced the Python developers are, the more likely they are
> to use Linux and macOS as development
> > environments, and the less likely they are to choose Windows.
> (from the same jetbrains survey of 2018)
>
> I wouldn't like to go into holy wars, though. I'm equally literate in both
> unix and windows (somewhat less in macos)
> and in my opinion the interests of all the users of the the three
> operating systems should be taken into account
> in both the code of the library and the docs.
>
> The documentation is sometimes pretty ignorant of mac/windows users, btw:
> > Alias on this platform (Linux x86_64)
> https://numpy.org/doc/stable/reference/arrays.scalars.html#numpy.int_
> And what about the other platforms?
>
> As for the particular issue of the difference in the default integer
> types, in my opinion the default choice of int32 on windows for
> array [1,2,3] fits the description
>
> >" If not given, then the type will be determined as the minimum type
> required to hold the objects in the sequence."
> https://numpy.org/doc/stable/reference/generated/numpy.array.html
>
> better than int64 on linux/macos.
>
> Best regards,
> Lev
>
>
> On Sun, Dec 26, 2021 at 8:45 PM Michael Siebert <
> michael.sieber...@gmail.com> wrote:
>
>> Hey Lev,
>>
>> I‘ve forgotten to mention my MacBook M1,
>> it‘s also int64 there.
>>
>> Python on Windows is and is supposed to be, as far as I get it, a dying
>> platform. A billion things are broken there (HDF comes to my mind) and it
>> seems even Microsoft wants developers to move away from native Windows with
>> their introduction of WSL (Windows Subsystem for Linux). Its latest
>> version, WSL2 even comes with an actual Linux kernel and since Windows 11,
>> it has support for graphical applications (Xorg) out of the box. With
> >> Visual Studio Code (also Microsoft) and its remote capabilities,

[Numpy-discussion] Re: An article on numpy data types

2021-12-26 Thread Lev Maximov
I've tried to take into account all the suggestions from this thread.

https://axil.github.io/numpy-data-types.html now shows the new version, and

https://github.com/axil/numpy-data-types/commit/14d9da053fd67e5569436faa1f58599c0cc8b380#diff-ed7002b439e9ac845f22357d822bac1444730fbdb6016d3ec9432297b9ec9f73
displays most of the changes.

As for the inheritance diagram, I think it is perfectly fine to add it to
the documentation as is,
except that I'd put back the 'void' type I've originally omitted to keep it
simple.

Btw is anyone aware why 'U' is missing from np.typecodes['Character']?

On Sun, Dec 26, 2021 at 11:57 PM Lev Maximov  wrote:

> Python 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64
> bit (AMD64)] on win32
>
>   ^^ this is relevant this
> is not
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy as np
> >>> np.int_
> 
>
> On Sun, Dec 26, 2021 at 11:42 PM Michael Siebert <
> michael.sieber...@gmail.com> wrote:
>
>> Okay, little modification to my last mail: many Android smartphones are
>> still 32 bit, but according to
>>
>> https://www.androidauthority.com/arm-32-vs-64-bit-explained-1232065/
>>
>> from 2023 on, all (or at least many) new ARM processors will be 64 bit
>> only.
>>
>> Apple‘s iPhone 64 bit only since quite a while already (September 2017,
>> iOS 11 release).
>>
>> On 26. Dec 2021, at 17:31, Lev Maximov  wrote:
>>
>> 
>> Hi Michael,
>>
>> > Python on Windows is and is supposed to be, as far as I get it, a dying
>> platform.
>> I would join Matti in thinking that it is a misconception.
>>
>> Have you heard of the enormous daily updated unofficial repository
>> <https://www.lfd.uci.edu/~gohlke/pythonlibs/> of the binary windows
>> compilations of
>> almost 600 python libraries by Christoph Gohlke? (numpy and libs
>> depending on it are built with MKL there)
>> It is there for a reason.
>>
>> If you look at the stats such as this one (Matti already mentioned them
>> while I was writing this text),
>>
>> https://www.jetbrains.com/research/python-developers-survey-2018/
>> https://www.jetbrains.com/lp/python-developers-survey-2020/
>>
>> you'll see (in addition to the fact that numpy is the #1 library in data
>> science ;) ) that in
>> the recent years the percentage of windows users among the developers is
>> quite high:
>> 69% linux - 47% windows - 32% macos (2018)
>> 68% linux - 48% windows - 29% macos (2020)
>> So it looks as if it is rather growing than dying.
>>
>> This is due to the popularity of the above mentioned data science and AI,
>> which have skyrocketed in the
>> last 10 years. And the vast majority of data scientists work on windows.
>>
>> Windows as a platform for developers as a whole is also quite flourishing
>> today.
>> According to the stackoverflow 2021 developer survey
>> <https://insights.stackoverflow.com/survey/2021#most-popular-technologies-op-sys>
>> 45% of the respondents use Windows (25% linux, 25% macos).
>> Among the professional developers the numbers are 41% for windows, 30%
>> macos, 26% linux.
>>
>> Also the primary audience of the tutorials like mine (as well as of
>> stackoverflow?) are windows users.
>> Linux users can easily figure things described there on their own,
>> through the docstrings, source code
>> or, as a last resort, through the docs )
>>
>> >The more experienced the Python developers are, the more likely they are
>> to use Linux and macOS as development
>> > environments, and the less likely they are to choose Windows.
>> (from the same jetbrains survey of 2018)
>>
>> I wouldn't like to go into holy wars, though. I'm equally literate in
>> both unix and windows (somewhat less in macos)
>> and in my opinion the interests of all the users of the three
>> operating systems should be taken into account
>> in both the code of the library and the docs.
>>
>> The documentation is sometimes pretty ignorant of mac/windows users, btw:
>> > Alias on this platform (Linux x86_64)
>> https://numpy.org/doc/stable/reference/arrays.scalars.html#numpy.int_
>> And what about the other platforms?
>>
>> As for the particular issue of the difference in the default integer
>> types, in my opinion the default choice of int32 on windows for
>> array [1,2,3] fits the description
>>
>> >

[Numpy-discussion] Re: An article on numpy data types

2021-12-27 Thread Lev Maximov
> I'm surprised no one has mentioned it already: int and uint are reversed
in the first table.
Not anymore! I know I'm susceptible to this type of blunder )

Thank you for your kind attention!

I've made a few more fixes here and there and added a couple of
illustrations. Looks more or less finished to me.
Giving it away to the editors.

Best regards,
Lev


On Mon, Dec 27, 2021 at 5:17 AM Juan Nunez-Iglesias 
wrote:

> I'm surprised no one has mentioned it already: int and uint are reversed
> in the first table.
>
> I do agree with others that this is a great overview page that should be
> included in the numpy docs. Thanks Lev!
>
> Juan.
>
> On Sun, 26 Dec 2021, at 12:59 PM, Lev Maximov wrote:
>
> I've tried to take into account all the suggestions from this thread.
>
> https://axil.github.io/numpy-data-types.html now shows the new version, and
>
>
> https://github.com/axil/numpy-data-types/commit/14d9da053fd67e5569436faa1f58599c0cc8b380#diff-ed7002b439e9ac845f22357d822bac1444730fbdb6016d3ec9432297b9ec9f73
> displays most of the changes.
>
> As for the inheritance diagram, I think it is perfectly fine to add it to
> the documentation as is,
> except that I'd put back the 'void' type I've originally omitted to keep
> it simple.
>
> Btw is anyone aware why 'U' is missing from np.typecodes['Character']?
>
> On Sun, Dec 26, 2021 at 11:57 PM Lev Maximov 
> wrote:
>
> Python 3.9.7 (tags/v3.9.7:1016ef3, Aug 30 2021, 20:19:38) [MSC v.1929 64
> bit (AMD64)] on win32
>
>   ^^ this is relevant this
> is not
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy as np
> >>> np.int_
> 
>
> On Sun, Dec 26, 2021 at 11:42 PM Michael Siebert <
> michael.sieber...@gmail.com> wrote:
>
>
> Okay, little modification to my last mail: many Android smartphones are
> still 32 bit, but according to
>
> https://www.androidauthority.com/arm-32-vs-64-bit-explained-1232065/
>
> from 2023 on, all (or at least many) new ARM processors will be 64 bit
> only.
>
> Apple‘s iPhone 64 bit only since quite a while already (September 2017,
> iOS 11 release).
>
> On 26. Dec 2021, at 17:31, Lev Maximov  wrote:
>
> 
> Hi Michael,
>
> > Python on Windows is and is supposed to be, as far as I get it, a dying
> platform.
> I would join Matti in thinking that it is a misconception.
>
> Have you heard of the enormous daily updated unofficial repository
> <https://www.lfd.uci.edu/~gohlke/pythonlibs/> of the binary windows
> compilations of
> almost 600 python libraries by Christoph Gohlke? (numpy and libs depending
> on it are built with MKL there)
> It is there for a reason.
>
> If you look at the stats such as this one (Matti already mentioned them
> while I was writing this text),
>
> https://www.jetbrains.com/research/python-developers-survey-2018/
> https://www.jetbrains.com/lp/python-developers-survey-2020/
>
> you'll see (in addition to the fact that numpy is the #1 library in data
> science ;) ) that in
> the recent years the percentage of windows users among the developers is
> quite high:
> 69% linux - 47% windows - 32% macos (2018)
> 68% linux - 48% windows - 29% macos (2020)
> So it looks as if it is rather growing than dying.
>
> This is due to the popularity of the above mentioned data science and AI,
> which have skyrocketed in the
> last 10 years. And the vast majority of data scientists work on windows.
>
> Windows as a platform for developers as a whole is also quite flourishing
> today.
> According to the stackoverflow 2021 developer survey
> <https://insights.stackoverflow.com/survey/2021#most-popular-technologies-op-sys>
> 45% of the respondents use Windows (25% linux, 25% macos).
> Among the professional developers the numbers are 41% for windows, 30%
> macos, 26% linux.
>
> Also the primary audience of the tutorials like mine (as well as of
> stackoverflow?) are windows users.
> Linux users can easily figure things described there on their own, through
> the docstrings, source code
> or, as a last resort, through the docs )
>
> >The more experienced the Python developers are, the more likely they are
> to use Linux and macOS as development
> > environments, and the less likely they are to choose Windows.
> (from the same jetbrains survey of 2018)
>
> I wouldn't like to go into holy wars, though. I'm equally literate in both
> unix and windows (somewhat less in macos)
> and in my opinion the interests of all the users of the three
> operating systems shou

[Numpy-discussion] Re: An article on numpy data types

2021-12-28 Thread Lev Maximov
>I think timedelta64 is missing.  Is that intentional?
Yes, thank you! It has been stuck in my todo list. Still thinking about the
best way to include it.

Btw, does it make sense to include the masked arrays? I know Pandas uses
something
like a masked array for representing null values in the integer columns.
Does anyone use
NumPy masked arrays nowadays?


On Tue, Dec 28, 2021 at 3:11 AM Eric Firing  wrote:

> On 2021/12/27 9:32 AM, Lev Maximov wrote:
> >  > I'm surprised no one has mentioned it already: int and uint are
> > reversed in the first table.
> > Not anymore! I know I'm susceptible to this type of blunders )
> >
> > Thank you for your kind attention!
> >
> > I've made a few more fixes here and there and added a couple of
> > illustrations. Looks more or less finished to me.
> > Giving it away to the editors.
> >
> > Best regards,
> > Lev
> >
>
> Lev,
>
> I think timedelta64 is missing.  Is that intentional?
>
> Eric
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: lev.maxi...@gmail.com
>


[Numpy-discussion] Re: An article on numpy data types

2021-12-28 Thread Lev Maximov
On Tue, Dec 28, 2021 at 9:12 PM Stefano Miccoli 
wrote:
Hi Stefano,


> Nice overview!
>
Thanks!


> Of course the first sentence should be “leap years”, which leads to my
> main point.
>
Sure, I've already fixed this one, it just hasn't found its way online yet.


> It makes no sense to claim “leap year support” without specifying the
> relevant calendar. Thus I would suggest to clearly state, from the very
> begin of this section, that
>
It is not a matter of formal definitions. Leap seconds are
uncompromisingly practical.
If you look at the wall clock on 1 Jan 1970 00:00, then look at the same
clock today and measure the difference with an atomic clock, you won't get
the time delta that np.timedelta64 reports. There will be a difference of ~37
seconds.
One would expect that a library claiming to work with attoseconds would at
least count the seconds correctly )

The Astropy library calculates them properly:
"GPS Time. Seconds from 1980-01-06 00:00:00 UTC. For example, 630720013.0 is
midnight on January 1, 2000."
>>> np.datetime64('2000-01-01', 's') - np.datetime64('1980-01-06', 's')
numpy.timedelta64(630720000,'s')
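The size of that gap can be checked directly; a small sketch (the GPS figure 630720013 comes from the quote above):

```python
import numpy as np

# NumPy counts exactly 86400 seconds per day and ignores leap seconds,
# so over 1980-01-06..2000-01-01 it reports 13 seconds fewer than GPS time.
naive = np.datetime64('2000-01-01', 's') - np.datetime64('1980-01-06', 's')
naive_s = int(naive // np.timedelta64(1, 's'))
print(naive_s)               # 630720000
print(630720013 - naive_s)   # 13 (the leap seconds NumPy ignores)
```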

Everything should be made as simple as possible but not simpler. Leap
seconds are an inherent part of the world we live in.

Eg this is how people deal with them currently: they have to parse times
like 23:59:60.209215 manually
https://stackoverflow.com/questions/21027639/python-datetime-not-accounting-for-leap-second-properly

- calendrical calculations are performed using a proleptic Gregorian
> calendar ,
> - Posix semantics is followed, i.e. each day comprises exactly 86400 SI
> seconds, thus ignoring the existence of leap seconds.
>
> I would also point out that this choice is consistent with python datetime.
>
But not consistent with python time ;) "Unlike the time module, the
datetime module does not support leap seconds."
• time.CLOCK_TAI
International Atomic Time
   The system must have a current leap second table in order for this to
give the correct answer. PTP or NTP software can maintain a leap second
table.
Availability: Linux.
New in version 3.9.


> As what regards the promised future support for leap seconds, I would not
> mention it, for now. In fact leap second support requires a leap second
> table, which is not available on all platforms supported by numpy.
> Therefore the leap second table should be bundled and updated with every
> numpy release with the very undesirable effect that older version (with
> outdated tables) would behave differently from newer ones.
>
The Olson database is much larger, yet it is updated on millions of
computers, phones and whatnot without causing extra difficulties
(except when a government unexpectedly decides to shift a region from one
TZ to another). This way developers have a choice whether
to work with naive datetimes (ok in a single timezone without
daylight saving) or with timezone-aware ones (and take care of updating
pytz).

This is how astropy deals with updating the table:
https://docs.astropy.org/en/stable/api/astropy.utils.iers.LeapSeconds.html
Pytz also has this table both inside the binary tz files and in a text
file: https://github.com/stub42/pytz/blob/master/tz/leap-seconds.list
which it in turn downloads from NIST
ftp://ftp.nist.gov/pub/time/leap-seconds.list
It is in the public domain, NIST updates this file regularly and it even
has an expiration date (presently it is 28 June 2022).
Activation of the 'leap-second-aware mode' could be made dependent on the
presence of the pytz mode and/or this expiration date.

I don't think having a non-default leap-second-aware mode would hurt
anyone, but I also wouldn't consider it a priority. I think when someone
needs them he'll make a patch and until that moment it is safe to have them
as 'proposed' )

I feel that leap seconds should be mentioned somewhere, in the article or in
the docs, because their absence limits precise practical usage of timedelta64
to the period between 2016 (the last time a leap second was injected) and
today (2021). A modest timespan for a library claiming to work with years up
to 9.2e18 BC ;)

Thank you for your suggestions! I've included them into the article, plz
have a look at the updated version.

Best regards,
Lev

Stefano
>


[Numpy-discussion] Re: An article on numpy data types

2021-12-28 Thread Lev Maximov
On Tue, Dec 28, 2021 at 3:43 PM Evgeni Burovski 
wrote:

> Very nice overview!
>
> One question and one suggestion:
>
> 1. Is integer wraparound guaranteed for signed ints, or is it an
> implementation detail? For unsigned ints, sure, it's straight from a C
> standard; what about signed types however.
>
Signed ints wrap around in just the same way as unsigned ones, both in C and
in NumPy. Added an illustration.
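A minimal demonstration of the signed wraparound (with the caveat about the C standard that comes up later in this thread):

```python
import numpy as np

# Incrementing the largest int8 value wraps to the smallest:
# 127 + 1 -> -128 in two's-complement int8 arithmetic.
a = np.array([126, 127], dtype=np.int8)
print((a + np.int8(1)).tolist())  # [127, -128]
```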


> 2. It'd be nice to explicitly stress that dtype=float corresponds to a C
> double, not a C float type. This frequently trips people trying to
> interface with C or Cython (in my experience)
>
Done, thanks!
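The point can be verified in a couple of lines; a sketch comparing NumPy's dtypes against the corresponding ctypes C types:

```python
import ctypes
import numpy as np

# dtype=float means NumPy float64, i.e. a C double (8 bytes),
# not a C float (4 bytes).
print(np.dtype(float) == np.dtype(np.float64))                       # True
print(np.dtype(float).itemsize == ctypes.sizeof(ctypes.c_double))    # True
print(np.dtype(np.float32).itemsize == ctypes.sizeof(ctypes.c_float))  # True
```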

Best regards,
Lev


[Numpy-discussion] Re: An article on numpy data types

2021-12-28 Thread Lev Maximov
On Wed, Dec 29, 2021 at 12:45 AM Eric Firing  wrote:

> On 2021/12/27 10:09 PM, Lev Maximov wrote:
> > Btw, does it make sense to include the masked arrays? I know Pandas uses
> > something
> > like a masked array for representing null values in the integer columns.
> > Does anyone use
> > NumPy masked arrays nowadays?
> >
>
> Yes, numpy masked arrays are very much in use, but a masked array is a
> subclass of ndarray, not a different data type.

In pandas it is a different data type ('Int64' vs 'int64'), so it makes
perfect sense to me to mention it in this article, given that I omitted it in
the main NumPy Illustrated one. I haven't had a chance to use them on my own,
and I've heard from some people that they don't use masked arrays either, for
some reason. Do you know if Pandas uses ma internally or has its own
implementation?

Thanks and regards,
Lev


[Numpy-discussion] Re: representation of valid float type range

2021-12-29 Thread Lev Maximov
• Short answer:

It's because
>>> f64_info.max - f64_info.min
inf

• Long answer:

linspace(a,b,n) tries to calculate the step by (b-a)/n and fails at (b-a).

You need to either
– split your range into two parts and then glue them back:
np.r_[np.linspace(f64_info.min, 0, 5), np.linspace(0, f64_info.max, 5)[1:]]

– or select a range that fits into float64:
np.linspace(f64_info.min/2, f64_info.max/2, 10)

– or select np.float128 as a dtype for linspace (linux/macos only):
np.linspace(np.float128(f64_info.min), np.float128(f64_info.max), 10)
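A quick sanity check of the first two workarounds (a sketch; float128 is skipped since it is not available on all platforms):

```python
import numpy as np

# Halving the endpoints keeps (stop - start) representable in float64,
# and gluing two half-ranges covers the full range exactly.
f = np.finfo(np.float64)

halved = np.linspace(f.min / 2, f.max / 2, 10)
glued = np.r_[np.linspace(f.min, 0, 5), np.linspace(0, f.max, 5)[1:]]

print(np.isfinite(halved).all())              # True: no inf/nan
print(np.isfinite(glued).all())               # True
print(glued[0] == f.min, glued[-1] == f.max)  # True True
```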

Best regards,
Lev


On Wed, Dec 29, 2021 at 8:01 PM Sebastian Gurovich  wrote:

> Could it be you need to get a handle on the "epsilon machine"?
>
> On Wed, 29 Dec 2021, 9:21 am ,  wrote:
>
>> I am getting an interesting result, and I'm wondering if anyone would
>> care to give me some intuition of why.
>>
>> The example is simple enough, I want to get a range of values that are
>> representable by a type:
>>
>> ```python
>> f64_info = np.finfo(np.float64)
>> valid_range = np.linspace(
>> start=f64_info.min, stop=f64_info.max, num=10
>> )
>> valid_range => array([nan, inf, inf,
>>inf,
>>inf, inf, inf, inf,
>>inf, 1.79769313e+308])
>> ```
>>
>> The minimum value is representable by the type, I can see it:
>>
>> ```python
>> f64_info.min => -1.7976931348623157e+308
>> ```
>>
>> I thought that maybe the valid range cannot start with the minimun value,
>> so I've tried a few alternatives:
>>
>> ```python
>>
>> valid_range = np.linspace(
>> start=f64_info.min + f64_info.eps, stop=f64_info.max, num=10
>> )
>> valid_range => array([nan, inf, inf,
>>inf,
>>inf, inf, inf, inf,
>>inf, 1.79769313e+308])
>>
>>
>> valid_range = np.linspace(
>> start=f64_info.min + f64_info.tiny, stop=f64_info.max, num=10
>> )
>> valid_range => array([nan, inf, inf,
>>inf,
>>inf, inf, inf, inf,
>>inf, 1.79769313e+308])
>> ```
>>
>> I thought maybe the range is too wide, but I can do this:
>>
>> ```python
>> valid_range = np.linspace(
>> start=0, stop=f64_info.max, num=10
>> )
>> valid_range => array([0.e+000, 1.99743682e+307, 3.99487363e+307,
>> 5.99231045e+307,
>>7.98974727e+307, 9.98718408e+307, 1.19846209e+308,
>> 1.39820577e+308,
>>1.59794945e+308, 1.79769313e+308])
>>
>> ...
>>
>> valid_range = np.linspace(
>> start=f64_info.tiny, stop=f64_info.max, num=10
>> )
>> valid_range => array([2.22507386e-308, 1.99743682e+307, 3.99487363e+307,
>> 5.99231045e+307,
>>7.98974727e+307, 9.98718408e+307, 1.19846209e+308,
>> 1.39820577e+308,
>>1.59794945e+308, 1.79769313e+308])
>>
>> ...
>>
>> f32_info = np.finfo(np.float32)
>> valid_range = np.linspace(
>> start=f32_info.tiny, stop=f32_info.max, num=10, dtype=np.float32,
>> )
>> valid_range => array([1.1754944e-38, 3.7809150e+37, 7.5618299e+37,
>> 1.1342745e+38,
>>1.5123660e+38, 1.8904575e+38, 2.2685490e+38,
>> 2.6466405e+38,
>>3.0247320e+38, 3.4028235e+38], dtype=float32)
>>
>> ```
>>
>> I know that linear space is arbitrary, and perhaps not that useful. In
>> fact this is valid:
>>
>> ```python
>> valid_range = np.logspace(
>> start=f64_info.minexp, stop=f64_info.maxexp, num=10, base=2,
>> endpoint=False
>> )
>> valid_range => array([2.22507386e-308, 8.67124674e-247, 3.37923704e-185,
>> 1.31690901e-123,
>>5.13207368e-062, 2.e+000, 7.79412037e+061,
>> 3.03741562e+123,
>>1.18369915e+185, 4.61294681e+246])
>> ```
>>
>> But I'm still confused on why linear space is invalid
>>
>> Thanks!


[Numpy-discussion] Re: An article on numpy data types

2021-12-30 Thread Lev Maximov
On Wed, Dec 29, 2021 at 9:59 AM Charles R Harris 
wrote:

> On Tue, Dec 28, 2021 at 2:54 PM Warren Weckesser <
> warren.weckes...@gmail.com> wrote:
>
>> On 12/28/21, Lev Maximov  wrote:
>> > On Tue, Dec 28, 2021 at 3:43 PM Evgeni Burovski
>> > 
>> > wrote:
>> >
>> >> Very nice overview!
>> >>
>> >> One question and one suggestion:
>> >>
>> >> 1. Is integer wraparound guaranteed for signed ints, or is it an
>> >> implementation detail? For unsigned ints, sure, it's straight from a C
>> >> standard; what about signed types however.
>> >>
>> > Signed ints wraparound in just the same way as unsigned, both in C and
>> in
>> > NumPy. Added an illustration.
>>
>> Overflow of *signed* ints in the C language is *undefined behavior*.
>> In practice, most compilers might do what you expect, but the
>> wrap-around behavior is not guaranteed and should not be relied on.
>>
>> Warren
>>
>>
> There used to be one's complement machines. I expect Numpy will break
> badly if it is compiled for anything that doesn't use two's complement.
>

Chuck
>

Yes, according to the C standard, signed integer overflow is undefined
behavior.

So, does NumPy guarantee wraparound on signed int overflow?
(at least provided that the platform is two's complement)

There is an open issue
"Document behavior of casts and overflows for signed integer types" #17982
https://github.com/numpy/numpy/issues/17982
There is some discussion, but no definitive answer.

As a side note, Rust has both checked/unchecked wraparound arithmetic and
saturated arithmetic
as specialized methods:

pub const fn saturating_add(self, rhs: u32) -> u32
pub fn saturating_add_signed(self, rhs: i32) -> u32   (experimental)
pub const fn saturating_mul(self, rhs: u32) -> u32
etc.
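NumPy itself has no saturating integer arithmetic, but it can be emulated by computing in a wider dtype and clipping to the target range; a sketch:

```python
import numpy as np

# NumPy integer arithmetic wraps around; saturation (as in the Rust
# methods above) can be emulated via a wider dtype plus np.clip.
a = np.array([250, 251], dtype=np.uint8)
info = np.iinfo(np.uint8)

wrapped = a + np.uint8(10)  # 260, 261 wrap modulo 256
saturated = np.clip(a.astype(np.int64) + 10, info.min, info.max).astype(np.uint8)

print(wrapped.tolist())    # [4, 5]
print(saturated.tolist())  # [255, 255]
```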

Best wishes,
Lev



[Numpy-discussion] Re: An article on numpy data types

2021-12-30 Thread Lev Maximov
On Fri, Dec 31, 2021 at 12:12 AM Charles R Harris 
wrote:

>
>
> On Thu, Dec 30, 2021 at 4:12 AM Lev Maximov  wrote:
>
>> On Wed, Dec 29, 2021 at 9:59 AM Charles R Harris <
>> charlesr.har...@gmail.com> wrote:
>>
>>> On Tue, Dec 28, 2021 at 2:54 PM Warren Weckesser <
>>> warren.weckes...@gmail.com> wrote:
>>>
>>>> On 12/28/21, Lev Maximov  wrote:
>>>> > On Tue, Dec 28, 2021 at 3:43 PM Evgeni Burovski
>>>> > 
>>>> > wrote:
>>>> >
>>>> >> Very nice overview!
>>>> >>
>>>> >> One question and one suggestion:
>>>> >>
>>>> >> 1. Is integer wraparound guaranteed for signed ints, or is it an
>>>> >> implementation detail? For unsigned ints, sure, it's straight from a
>>>> C
>>>> >> standard; what about signed types however.
>>>> >>
>>>> > Signed ints wraparound in just the same way as unsigned, both in C
>>>> and in
>>>> > NumPy. Added an illustration.
>>>>
>>>> Overflow of *signed* ints in the C language is *undefined behavior*.
>>>> In practice, most compilers might do what you expect, but the
>>>> wrap-around behavior is not guaranteed and should not be relied on.
>>>>
>>>> Warren
>>>>
>>>>
>>> There used to be one's complement machines. I expect Numpy will break
>>> badly if it is compiled for anything that doesn't use two's complement.
>>>
>>
>> Chuck
>>>
>>
>> Yes, according to C standard signed integer overflow is undefined
>> behavior.
>>
>> So, does NumPy guarantee wraparound for signed ints overflow?
>> (at least provided that the platform is two's complement)
>>
>> There is an open issue
>> "Document behavior of casts and overflows for signed integer types" #17982
>> https://github.com/numpy/numpy/issues/17982
>> There is some discussion, but no definitive answer.
>>
>> As a side note, Rust has both checked/unchecked wraparound arithmetic and
>> saturated arithmetic
>> as specialized methods:
>>
>> pub const fn saturating_add(self, rhs: u32) -> u32
>> pub fn saturating_add_signed(self, rhs: i32) -> u32   (experimental)
>> pub const fn saturating_mul(self, rhs: u32) -> u32
>> etc.
>>
>>
> There have been discussions about overflow behavior.  The main problem is
> performance when there is no hardware support. There used to be
> architectures that offered that (VAX), but it has fallen out of favor.
> NumPy doesn't have an official policy that I know of, but it is currently
> pretty much two's complement with overflow wrap. Which is not to say that
> things will never change, but it isn't a priority.
>

Yes, that reflects my expectations. My primary concern right now is how to
formulate this to the readers of the article
so as not to mislead them. Can they rely on the wraparound for the
signed ints in NumPy? Or is it rather 'use at your own risk'?

Btw, is signed integer wrapping covered by regression tests?

Lev


>
> Chuck


[Numpy-discussion] Re: NumPy-Discussion Digest, Vol 183, Issue 33

2021-12-31 Thread Lev Maximov
Hey, Stefano!

The level of being pedantic is absolutely acceptable.

I don't question any of your arguments. They are all perfectly valid.

Except that I'd rather say it is ~29 seconds if measuring against 1970.
Leap seconds were introduced in 1972 and there have been a total of 27 of
them since then, but TAI time had been ticking since 1958 and gained 10
seconds by 1970, i.e. approximately 0.83 second per year, which gives approx
28.67 sec between today and 1970.
So 1970 is a bad choice of epoch if you want to introduce a
leap-second-aware datetime.
In GPS time they chose 1980. In TAI it is 1958, but that is somewhat worse
than 1980 because it is not immediately clear how to perform the conversion
timestamp<->timedelta between 1958 and 1970.

Something like 'proleptic gps time' would be needed to estimate the number
of leap seconds in the years before 1972 when they were introduced. Or
maybe to limit the leap-second timescale
to start at 1972 and not to accept any timestamps before that date.

The system that ignores the existence of the leap seconds has a right to
exist.
But it just has limited applicability.

np.datetime64 keeps time as a delta between the moment in time and a
predefined epoch.
Which standard does it use to translate this delta to human-readable time
in years,
months, and so on?

If it is UTC, then it must handle times like 2016-12-31 23:59:60, because
it is a valid UTC timestamp.
>>> np.datetime64('2016-12-31 12:59:60')
Traceback (most recent call last):
  File "", line 1, in 
ValueError: Seconds out of range in datetime string "2016-12-31 12:59:60"

Datetime also fails (so far) to handle it:
>>> dt(2016,12,31,23,59,60)
Traceback (most recent call last):
  File "", line 1, in 
ValueError: second must be in 0..59

But `time` works. Well, at least it doesn't raise an exception:
>>> t = time.struct_time((2016,12,31,12,59,60,0,0,0)); t
time.struct_time(tm_year=2016, tm_mon=12, tm_mday=31, tm_hour=12,
tm_min=59, tm_sec=60, tm_wday=0, tm_yday=0, tm_isdst=0)
>>> time.asctime(t)
'Mon Dec 31 12:59:60 2016'
>>> time.gmtime(calendar.timegm(t))
time.struct_time(tm_year=2017, tm_mon=1, tm_mday=1, tm_hour=1, tm_min=0,
tm_sec=0, tm_wday=6, tm_yday=1, tm_isdst=0)

Imagine a user who decides which library to use to store some (life-critical!)
measurements taken every 100 ms. He looks at NumPy datetime64, reads that it
is capable of handling attoseconds, and decides that it is a perfect fit. Now
imagine that on 31 Dec 2022 the World Government decides to inject a leap
second. The system receives the announcement from the NTP servers and
prepares to replay this second twice. As soon as this moment chimes in, he'll
run into a ValueError, which he won't notice because he's celebrating the New
Year :) And guess whom he'll blame? ;)

Actually, humanity has already got used to replaying timespans twice. It
happens every year in the countries that observe daylight saving time. And
the solution is to use a more linear scale than local time, namely, UTC.
But now it turns out that UTC is not linear enough either, and it also has
certain timespans happening twice.

The solution once again is to use a _really_ linear time, which is TAI. I
think the python 'time' library did the right thing to introduce
time.CLOCK_TAI, after all.

Astropy handles the UTC scale properly though:
>>> t = Time('2016-12-31 23:59:60')

>>> t0 = Time('2016-12-31 23:59:59')

>>> delta = t-t0

>>> delta.sec
0.99969589
>>> t0 + delta


So the solution for that particular person with regular intervals of time
is to use astropy. I mention it in the article.
I made some corrections to the text. I'd be grateful if you had a look and
pointed me to the particular sentences
that need improvement.

Best regards,
Lev


On Wed, Dec 29, 2021 at 6:54 PM Stefano Miccoli 
wrote:

> Lev, excuse me if I go in super pedantic mode, but your answer and the
> current text of the article fail to grasp an important point.
>
> 1) The proleptic Gregorian calendar is about leap *year* rules. It tracks
> days without making any assumption on the length of days. If we agree on
> using this calendar, dates like -0099-07-12 and 2021-12-29 are defined
> without ambiguity, and we can easily compute the number of days between
> these two dates.
>
> 2) Posix semantics is about the length of a day, and is based on the
> (utterly wrong) assumption that a mean solar day is constant and exactly
> 86400 SI seconds long. (For an authoritative estimate of historical length
> of day variations see  and the related
> papers <https://doi.org/10.1098/rspa.2020.0776>)
>
> Knowing assumption 1) is important when coding dates before 1582-10-15:
> e.g. 1582-10-04 Julian is 1582-10-14 proleptic Gregorian. Once we agree on
> the proleptic Gregorian calendar everything works as expected: time deltas
> expressed in days are correct.
>
> Knowing assumption 2) is important if we pretend to compute time deltas
> for date-time obj

[Numpy-discussion] Re: An article on numpy data types (Lev Maximov)

2022-01-01 Thread Lev Maximov
I've dual-published the article on github and medium under the title 'A
comprehensive guide to NumPy data types'.

Thank you all for your help and happy New Year!

Best regards,
Lev

On Sun, Jan 2, 2022 at 12:31 AM Stefano Miccoli 
wrote:

> First of all, happy new 2022 UTC year!
>
> Let my add just a very brief note to this discussion: I opened
> https://github.com/numpy/numpy/issues/20675 which addresses the
> shortcomings of the current documentation, which is in my opinion not
> sufficiently explicit in stating the datetime64 semantics. It is true that
> these are largely consistent with python ‘datetime.datetime’, but ‘explicit
> is better than implicit’. If nobody objects I will then open a doc-only PR
> adding a very short paragraph to the docs trying to explain the points
> discussed here.
>
> As what regards how much time UTC gained from 1970-01-01 up to today, you
> are right, it’s about 29 s. The UTC timescale was officially born in 1963
> but it can be traced back at least up to 1956/1958, see
> https://github.com/skyfielders/python-skyfield/issues/679 where this is
> discussed with reference to the timescales implemented in python-skyfield.
>
> Stefano
>
> On 31 Dec 2021, at 13:27, numpy-discussion-requ...@python.org wrote:
>
> Hey, Stefano!
>
> The level of being pedantic is absolutely acceptable.
>
> I don't question any of your arguments. They are all perfectly valid.
>
> Except that I'd rather say it is ~29 seconds if measuring against 1970.
> Leap seconds were introduced in 1972 and there were
> a total of 27 seconds since then, but TAI time was ticking since 1958 and
> gained 10 seconds by 1970, so it is approximately 0.83 seconds per year,
> which gives approx 28.67 sec between today and 1970.
> So 1970 is a bad choice of epoch if you want to introduce a
> leap-second-aware datetime.
> In GPS time they chose 1980. In TAI it is 1958, but that is somewhat worse
> than 1980 because it is not immediately clear how to perform the conversion
> timestamp<->timedelta between 1958 and 1970.
>
> Something like 'proleptic gps time' would be needed to estimate the number
> of leap seconds in the years before 1972 when they were introduced. Or
> maybe to limit the leap-second timescale
> to start at 1972 and not to accept any timestamps before that date.
>
> The system that ignores the existence of the leap seconds has a right to
> exist.
> But it just has limited applicability.
>
> np.datetime64 keeps time as a delta between the moment in time and a
> predefined epoch.
> Which standard does it use to translate this delta to human-readable time
> in years,
> months, and so on?
>
> If it is UTC, then it must handle times like 2016-12-31 23:59:60, because
> it is a valid UTC timestamp.
> >>> np.datetime64('2016-12-31 12:59:60')
> Traceback (most recent call last):
>   File "", line 1, in 
> ValueError: Seconds out of range in datetime string "2016-12-31 12:59:60"
>
> Datetime also fails (so far) to handle it:
> >>> dt(2016,12,31,23,59,60)
> Traceback (most recent call last):
>   File "", line 1, in 
> ValueError: second must be in 0..59
>
> But `time` works. Well, at least it doesn't raise an exception:
> >>> t = time.struct_time((2016,12,31,12,59,60,0,0,0)); t
> time.struct_time(tm_year=2016, tm_mon=12, tm_mday=31, tm_hour=12,
> tm_min=59, tm_sec=60, tm_wday=0, tm_yday=0, tm_isdst=0)
> >>> time.asctime(t)
> 'Mon Dec 31 12:59:60 2016'
> >>> time.gmtime(calendar.timegm(t))
> time.struct_time(tm_year=2017, tm_mon=1, tm_mday=1, tm_hour=1, tm_min=0,
> tm_sec=0, tm_wday=6, tm_yday=1, tm_isdst=0)
>
> Imagine a user that decides which library to use to store some (life
> critical!) measurements taken every 100 ms. He looks at NumPy datetime64,
> reads that it is capable of handling attosecods, and decides that it is a
> perfect fit. Now imagine that on 31 Dec 2022 the World Government decided
> to inject a leap second. The system will receive the announcement from the
> NTP servers and will
> prepare to replay this second twice. As soon as this moment chimes in
> he'll run into a ValueError, which he won't notice because he's celebrating
> a New Year :) And guess whom he'll blame? ;)
>
> Actually the humanity has already got used to replaying timespans twice.
> It happens every year in the countries that observe daylight saving time.
> And the solution is to use a more linear scale than local time, namely,
> UTC. But now turns out that UTC is not linear enough and it also has
> certain timespans happening twice.
>
> The solution once again is use a _really_ linear time which is TAI. I
> think python 'time' library did a right thing to introduce time.CLOCK_TAI,
> after all.
>
> Astropy handles the UTC scale properly though:
> >>> t = Time('2016-12-31 23:59:60')
> 
> >>> t0 = Time('2016-12-31 23:59:59')
> 
> >>> delta = t-t0
> 
> >>> delta.sec
> 0.99969589
> >>> t0 + delta
> 
>
> So the solution for that particular person with regular intervals of time
> is to use astropy. I mention it in the article.

[Numpy-discussion] Re: An article on numpy data types (Lev Maximov)

2022-01-01 Thread Lev Maximov
Dear Mark,

They are GitHub
<https://axil.github.io/a-comprehensive-guide-to-numpy-data-types.html> and
Medium
<https://betterprogramming.pub/a-comprehensive-guide-to-numpy-data-types-8f62cb57ea83?sk=a417246f0cf9e24aca734525711299d3>
('friend
link').

Best regards,
Lev

On Sun, Jan 2, 2022 at 9:43 AM Dr. Mark Alexander Mikofski PhD <
mikof...@berkeley.edu> wrote:

> Dear Lev, thanks for this! Can you please post links to the GitHub &
> Medium articles? Sorry for my ignorance. Best Regards! Mark
>
> On Saturday, January 1, 2022, Lev Maximov  wrote:
>
>> I've dual-published the article on github and medium under the title 'A
>> comprehensive guide to NumPy data types'.
>>
>> Thank you all for your help and happy New Year!
>>
>> Best regards,
>> Lev
>>
>> On Sun, Jan 2, 2022 at 12:31 AM Stefano Miccoli <
>> stefano.micc...@polimi.it> wrote:
>>
>>> First of all, happy new 2022 UTC year!
>>>
>>> Let my add just a very brief note to this discussion: I opened
>>> https://github.com/numpy/numpy/issues/20675 which addresses the
>>> shortcomings of the current documentation, which is in my opinion not
>>> sufficiently explicit in stating the datetime64 semantics. It is true that
>>> these are largely consistent with python ‘datetime.datetime’, but ‘explicit
>>> is better than implicit’. If nobody objects I will then open a doc-only PR
>>> adding a very short paragraph to the docs trying to explain the points
>>> discussed here.
>>>
>>> As what regards how much time UTC gained from 1970-01-01 up to today,
>>> you are right, it’s about 29 s. The UTC timescale was officially born in
>>> 1963 but it can be traced back at least up to 1956/1958, see
>>> https://github.com/skyfielders/python-skyfield/issues/679 where this is
>>> discussed with reference to the timescales implemented in python-skyfield.
>>>
>>> Stefano
>>>
>>> On 31 Dec 2021, at 13:27, numpy-discussion-requ...@python.org wrote:
>>>
>>> Hey, Stefano!
>>>
>>> The level of being pedantic is absolutely acceptable.
>>>
>>> I don't question any of your arguments. They are all perfectly valid.
>>>
>>> Except that I'd rather say it is ~29 seconds if measuring against 1970.
>>> Leap seconds were introduced in 1972 and there were
>>> a total of 27 seconds since then, but TAI time was ticking since 1958
>>> and gained 10 seconds by 1970, so it is approximately 0.83 seconds per
>>> year, which gives approx 28.67 sec between today and 1970.
>>> So 1970 is a bad choice of epoch if you want to introduce a
>>> leap-second-aware datetime.
>>> In GPS time they chose 1980. In TAI it is 1958, but that is somewhat
>>> worse than 1980 because it is not immediately clear how to perform the
>>> conversion timestamp<->timedelta between 1958 and 1970.
>>>
>>> Something like 'proleptic gps time' would be needed to estimate the
>>> number of leap seconds in the years before 1972 when they were introduced.
>>> Or maybe to limit the leap-second timescale
>>> to start at 1972 and not to accept any timestamps before that date.
>>>
>>> The system that ignores the existence of the leap seconds has a right to
>>> exist.
>>> But it just has limited applicability.
>>>
>>> np.datetime64 keeps time as a delta between the moment in time and a
>>> predefined epoch.
>>> Which standard does it use to translate this delta to human-readable
>>> time in years,
>>> months, and so on?
>>>
>>> If it is UTC, then it must handle times like 2016-12-31 23:59:60,
>>> because it is a valid UTC timestamp.
>>> >>> np.datetime64('2016-12-31 12:59:60')
>>> Traceback (most recent call last):
>>>   File "", line 1, in 
>>> ValueError: Seconds out of range in datetime string "2016-12-31 12:59:60"
>>>
>>> Datetime also fails (so far) to handle it:
>>> >>> dt(2016,12,31,23,59,60)
>>> Traceback (most recent call last):
>>>   File "", line 1, in 
>>> ValueError: second must be in 0..59
>>>
>>> But `time` works. Well, at least it doesn't raise an exception:
>>> >>> t = time.struct_time((2016,12,31,12,59,60,0,0,0)); t
>>> time.struct_time(tm_year=2016, tm_mon=12, tm_mday=31, tm_hour=12,
>>> tm_min=59, tm_sec=60, tm_wday=0, tm_yday=0, tm_isdst=0)
>>> >>> tim

[Numpy-discussion] Re: Numpy array

2022-01-27 Thread Lev Maximov
Hi,

I believe this question fits Stack Overflow better.

Here're SO guidelines on how to create a minimal reproducible example:
https://stackoverflow.com/help/minimal-reproducible-example

If you're new to NumPy I'd recommend this visual guide:
https://betterprogramming.pub/numpy-illustrated-the-visual-guide-to-numpy-3b1d4976de1d
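As for the halving itself: `stable` is almost certainly a boolean array of the same length, and `vr[stable]` is boolean (mask) indexing — it keeps only the elements where the mask is True, so a 150-element array whose mask has 75 True entries yields 75 elements. A plain-Python sketch of the same idea (the values are made up):

```python
# Plain-Python equivalent of NumPy's boolean (mask) indexing vr[stable].
vr = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]               # stand-in for the data
stable = [True, False, True, False, True, False]  # stand-in boolean mask

masked = [value for value, keep in zip(vr, stable) if keep]
print(masked)  # [0.5, 1.5, 2.5]
```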

Best regards,
Lev

On Thu, Jan 27, 2022 at 8:19 PM  wrote:

> Hi, i am new to numpy. This is first time i am using numpy.
>
> https://github.com/mspieg/dynamical-systems/blob/master/Bifurcations.ipynb
>
> This code i found to create bifurcation graph. There is section of code
> which i am not able to understand
>
> vr[stable], vx[stable]
>
> vr and vx is array of 150 elements.
> unstable and stable array also has 150 elements
>
> But vr[stable], vx[stable] becomes 75 elements in the array. Which i am
> not able to understand how 150 elements in array drops to 75
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: lev.maxi...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Numpy array

2022-01-27 Thread Lev Maximov
Yes, this page has a ton of interesting stuff, too! I gave just one link
because it might take time to
choose between 10 excellent tutorials. It's just my personal preference :)

Btw, is it possible to automate the detection of dead links on that page?
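In case it helps, a rough sketch of such a checker in pure Python (the sample page below is made up, and an actual run over the learn page would of course need network access):

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

class LinkExtractor(HTMLParser):
    """Collect every href found in <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

def is_dead(url):
    """Treat any HTTP error or network failure as a dead link."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=10) as resp:
            return resp.status >= 400
    except Exception:
        return True

page = '<a href="https://numpy.org/">NumPy</a> <a href="/learn/">Learn</a>'
print(extract_links(page))  # ['https://numpy.org/', '/learn/']
```

Fetching the learn page, running `extract_links` on it, and filtering with `is_dead` would give the list of broken entries; relative links would need to be resolved with `urllib.parse.urljoin` first.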

Lev

On Thu, Jan 27, 2022 at 11:25 PM Rohit Goswami 
wrote:

> Agreed, however, the NumPy learn section of the official documentation
> <https://numpy.org/learn/> is probably a better place to point to (though
> your article is justifiably also linked from there).
> --
>
> Rohit
>
> On 27 Jan 2022, at 16:15, Lev Maximov wrote:
>
> Hi,
>
> I believe this question fits Stack Overflow better.
>
> Here're SO guidelines on how to create a minimal reproducible example:
> https://stackoverflow.com/help/minimal-reproducible-example
>
> If you're new to NumPy I'd recommend this visual guide:
>
> https://betterprogramming.pub/numpy-illustrated-the-visual-guide-to-numpy-3b1d4976de1d
>
> Best regards,
> Lev
>
> On Thu, Jan 27, 2022 at 8:19 PM  wrote:
>
>> Hi, i am new to numpy. This is first time i am using numpy.
>>
>> https://github.com/mspieg/dynamical-systems/blob/master/Bifurcations.ipynb
>>
>> This code i found to create bifurcation graph. There is section of code
>> which i am not able to understand
>>
>> vr[stable], vx[stable]
>>
>> vr and vx is array of 150 elements.
>> unstable and stable array also has 150 elements
>>
>> But vr[stable], vx[stable] becomes 75 elements in the array. Which i am
>> not able to understand how 150 elements in array drops to 75
>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org
>> To unsubscribe send an email to numpy-discussion-le...@python.org
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: lev.maxi...@gmail.com
>>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>
> Member address: rgosw...@quansight.com
>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: lev.maxi...@gmail.com
>


[Numpy-discussion] Re: request to remove the numpy-aarch64 package from PyPI

2022-01-31 Thread Lev Maximov
Here's a story about how malicious pypi packages help break into corporate
networks.
It is not necessarily the goal this particular person was aiming for. Just
a side note.
"Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of
Other Companies"
https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610

Best regards,
Lev

On Sun, Jan 30, 2022 at 6:48 PM Ralf Gommers  wrote:

>
>
> On Mon, Jun 14, 2021 at 3:22 AM Charles R Harris <
> charlesr.har...@gmail.com> wrote:
>
>>
>>
>> On Sun, Jun 13, 2021 at 10:47 AM Ralf Gommers 
>> wrote:
>>
>>> Hi all,
>>>
>>> FYI, I noticed this package that claimed to be maintained by us:
>>> https://pypi.org/project/numpy-aarch64/. That's not ours, so I tried to
>>> contact the author (no email provided, but guessed the same username on
>>> GitHub) and asked to remove it:
>>> https://github.com/tomasriv/DNA_Sequence/issues/1.
>>>
>>> There are a very large number of packages with "numpy" in the name on
>>> PyPI, and there's no way we can audit/police that effectively, but if it's
>>> a rebuild that pretends like it's official then I think it's worth doing
>>> something about. It could contain malicious code for all we know.
>>>
>>>
>> That is a pretty misleading package description, would have fooled me if
>> I didn't know better. I didn't get the impression it was malicious, but
>> still . .
>>
>
> Hard to know whether it was malicious or not.
>
> I finally filed a PyPI issue to hand over the package to me so I can
> delete the wheel and replace the README:
> https://github.com/pypa/pypi-support/issues/1635
>
> Cheers,
> Ralf
>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: lev.maxi...@gmail.com
>


[Numpy-discussion] Re: Dropping the pdf documentation.

2022-05-23 Thread Lev Maximov
What do you guys think of the chm format ("Windows Help")? This offline
documentation format is shipped with all Python releases (e.g.
https://www.python.org/downloads/release/python-3913/).
It is simple to build from a hierarchy of HTML files; it is downloadable,
searchable, bookmarkable, has an index, supports hyperlinks, and can be
opened on Linux as well.

One downside is that recent Windows versions (>= Windows 10) block the
"execution" of this file if it was downloaded from an "untrusted source"
(= the internet), so it needs a checkbox in the file properties to lift
this "security block".
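For reference, a chm build is driven by a small project file fed to Microsoft's HTML Help compiler (hhc.exe); a minimal sketch, with made-up file names, looks roughly like this:

```ini
; numpy-docs.hhp -- minimal HTML Help project file (compiled with hhc.exe)
[OPTIONS]
Compiled file=numpy-docs.chm
Contents file=toc.hhc
Index file=index.hhk
Default topic=index.html
Full-text search=Yes

[FILES]
index.html
reference\routines.html
```

The .hhc (table of contents) and .hhk (index) files are themselves just small HTML-like lists, so in principle they could be generated from the Sphinx HTML output.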

Afaik, NumPy used to ship docs in this format many years ago, but later
dropped support for it.

Best regards,
Lev

On Mon, May 23, 2022 at 1:33 PM Ralf Gommers  wrote:

>
>
> On Mon, May 23, 2022 at 6:51 AM Matti Picus  wrote:
>
>>
>> On 23/5/22 01:51, Rohit Goswami wrote:
>> >
>> > Being very hard to read should not be reason enough to stop generating
>> > them. In places with little to no internet connectivity often the PDF
>> > documentation is invaluable.
>> >
>> > I personally use the PDF documentation both on my phone and e-reader
>> > when I travel simply because it is more accessible and has better
>> > search capabilities.
>> >
>> > It is true that SciPy has removed them, but that doesn't necessarily
>> > mean we need to follow suit. Especially relevant (IMO) is that large
>> > parts of the NumPy documentation still make sense when read
>> > sequentially (going back to when it was at some point partially kanged
>> > from Travis' book).
>> >
>> > I'd be happy to spend time (and plan to) working on fixing concrete
>> > issues other than straw-man and subjective arguments.
>> >
>> > Personally I'd like to see the NumPy documentation have PDFs in a
>> > fashion where each page / chapter can be downloaded individually.
>> >
>> > -- Rohit
>> >
>> > P.S.: If we have CI timeout issues, for the PDF docs we could also
>> > have a dedicated repo and only build for releases.
>> >
>> > P.P.S: FWIW the Python docs are also still distributed in PDF form.
>> >
>> > On 22 May 2022, at 21:41, Stephan Hoyer wrote:
>> >
>> > +1 let’s drop the PDF docs. They are already very hard to read.
>> >
>> > On Sun, May 22, 2022 at 1:06 PM Charles R Harris
>> >  wrote:
>> >
>> > Hi All,
>> >
>> > This is a proposal to drop the generation of pdf documentation
>> > and only generate the html version. This is a one way change
>> > due to the difficulty maintaining/fixing the pdf versions. See
>> > minimal discussion here
>> > <
>> https://github.com/numpy/numpy/issues/21557#issuecomment-1133920412>.
>> >
>> > Chuck
>> >
>>
>> Thanks Rohit for the offer to take on this project.
>>
>> I don't think we should block the release on the existence of PDF
>> documentation. It is a "nice to have", not a hard requirement.
>>
>>
>> One strategy to discover problems with the PDF builds in CI would be to
>> add a weekly build of PDF.
>>
>
> That would just mean more CI maintenance/breakage, that the same folks who
> always take care of CI issues inevitably are going to have to look at.
>
> I'm +1 for removing pdf builds, they are not worth the maintainer effort -
> we shouldn't put them in CI, and they break at release time too often. It
> will remain possible for interested users to rebuild the docs themselves -
> and we can/will accept patches for docstring issues that trip up the pdf
> but not the html build. That's the same support level we have for other
> things that we do not run in CI.
>
> When we removed the SciPy pdf docs, the one concern was that there was no
> longer an offline option (by Juan, a very knowledgeable user and occasional
> contributor). So I suspect that most of the pdf downloads are for users who
> want that offline option, but we don't tell them that html+zip is the
> preferred one.
>
> Another benefit of removal is to slim down our dev Docker images a lot -
> right now the numpy-dev image is 300 MB larger than the scipy-dev one
> because of the inclusion of TeX Live.
>
> Cheers,
> Ralf
>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: lev.maxi...@gmail.com
>


[Numpy-discussion] Re: Numpy with eigen c++ binding

2023-06-11 Thread Lev Maximov
It looks as though pybind11 can serve as a bridge between NumPy and Eigen:
https://pybind11.readthedocs.io/en/stable/advanced/cast/eigen.html


On Sun, Jun 11, 2023 at 2:39 AM Matti Picus  wrote:

> On 6/6/23 06:46, darshan patel wrote:
>
> > it seems like numpy is moving toward c++ implementation, so is there any
> plan to have eigen c++ library also inline with numpy to get better
> performance?
> > is there any ongoing work happening around this?
>
> NumPy does not currently use eigen and I am not aware of any plans to do
> so.
>
> Matti
>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: lev.maxi...@gmail.com
>