[Numpy-discussion] Formally accept NEP 51: Changing the Representation of NumPy Scalars
Hi all,

I would like to formally propose accepting NEP 51. If no concerns are voiced, we will consider it accepted within 7 days.

As a reminder, this is to change the representation of NumPy scalars to be consistent and include the type name. That means the following representations:

    np.float64(6.4) -> np.float64(6.4)
    np.float64(np.nan) -> np.float64(nan)

rather than just `6.4` or `nan`. All scalars would follow this exact pattern of `np.<type>(value)`.

There are some further details; for these, please check the full NEP:

    https://numpy.org/neps/nep-0051-scalar-representation.html

For those interested in more details, a few notes:

* To implement the NEP, we need to update the NumPy docs. I plan to automate this (mostly), and such automation should also help others. (I will make a brief note of this in the NEP.) Help with this automation would be greatly appreciated, since this is its own project.
* I am not sure that the underscored versions `np.str_` and `np.bool_` will be the correct names for long. If we adjust them, then this would propagate to the NEP.
* There are a few implementation details in the NEP; I don't mind adjusting them. But I do wish to be pragmatic about progressing if there is no clearly formulated alternative.
* Clearly we can always adjust the printing conventions, e.g. whether to include the `np.` or whether NaNs should be `np.float64(nan)` or not. But bike-sheds happening now have a much better chance of being heard :).

1. The current NEP states that we use `np.str_` and `np.bytes_`. There is some chance that the top-level names could be changed; in that case the representation would change accordingly. (I consider this an adjustment we can do without the NEP.)
2. To properly implement the NEP, we need to automate some of the necessary documentation changes. This should also enable downstream to do the same or at least have a blueprint as a starting point.
   (Help with this work is greatly appreciated, since it is its own small project to hook into the doctest utilities.)

I plan on adding a brief note about helping with doc updates to the NEP when accepting it. Ross was planning to add a table of changed examples, although I don't think that is necessary for accepting.

Cheers,

Sebastian

On Fri, 2022-10-28 at 10:54 +0200, Sebastian Berg wrote:
> Hi all,
>
> As mentioned earlier, I would like to propose changing the
> representation of scalars in NumPy. Discussion and ideas on changes
> are much appreciated!
>
> The main change is to show scalars as:
>
> * `np.float64(3.0)` instead of just `3.0`
> * `np.True_` instead of `True`
> * `np.void((3, 5), dtype=[('a', ' `(3, 5)`
> * Use `np.` rather than `numpy.` for datetime/timedelta.
>
> This way it is clear for users that they are dealing with NumPy
> scalars, which behave differently from Python scalars.
> The `str()` that is given when using `print()` and the way arrays are
> shown will be unchanged.
>
> The NEP draft can be found here:
>
> https://numpy.org/neps/nep-0051-scalar-representation.html
>
> and it includes more details and related changes.
>
> The implementation is largely finished and can be found here:
>
> https://github.com/numpy/numpy/pull/22449
>
> We are fairly late in the release cycle and the change should not
> block other things. So, the aim is to merge it early in the next
> release cycle. That way downstream has time to fix documentation if
> wanted.
>
> Depending on how discussion goes, I hope to formally propose the NEP
> fairly soon, so that merging the implementation doesn't need to
> wait on NEP approval.
>
> Cheers,
>
> Sebastian
>
>
> On Thu, 2022-09-08 at 11:38 +0200, Sebastian Berg wrote:
> >
> > TL;DR: NumPy scalars representation is e.g. `34.3` instead of
> > `float32(34.3)`. So the representation is missing the type
> > information. What are your thoughts on changing that?
> >
> > Hi all,
> >
> > I am thinking about the next steps for NEP 50 (The NEP wants to fix
> > the NumPy promotion rules, especially with respect to scalars):
> >
> > https://numpy.org/neps/nep-0050-scalar-promotion.html
> >
> > In relation to that, there was one point that Stéfan brought up
> > previously.
> >
> > The NumPy scalars (representation) currently print as numbers:
> >
> >     >>> np.float32(34.3)
> >     34.3
> >     >>> np.uint8(5)
> >     5
> >
> > That can already be confusing now. However, it gets more
> > problematic if NEP 50 is introduced since the behavior between a
> > Python `34.3` and `np.float32(34.3)` would differ more than it does
> > now (please refer to the NEP).
> >
> > The change would be that we should print as:
> >
> >     float64(34.3) (or similar?)
> >
> > This Email is mainly to ask for any feedback or concern on such a
> > change. I suspect we may have to write a very brief NEP about it.
> >
> > If there is little concern, maybe we could move forward such a
> >
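
The difference between `repr()` and `str()` that this thread turns on can be checked directly. The following is a minimal sketch, not part of the NEP: the helper name `has_typed_repr` is hypothetical, and which `repr()` form you see depends on whether the installed NumPy release implements NEP 51.

```python
import numpy as np

value = np.float64(6.4)

# str() is what print() uses; the thread says this stays unchanged.
print(str(value))

# repr() is what the interactive prompt shows; this is what the NEP changes.
# Releases implementing NEP 51 include the type name, older ones do not.
print(repr(value))


def has_typed_repr(scalar):
    """Hypothetical helper: True if repr() carries the `np.` type prefix."""
    return repr(scalar).startswith("np.")


print(has_typed_repr(value))
```

Documentation examples and doctests that show a bare scalar at the prompt are the ones affected by the change, which is why the message above asks for help automating doc updates.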
[Numpy-discussion] Re: Adding bit_count ufunc
Thanks for bringing this up again. The Python method exists and it seems like relatively basic functionality.

Overall, I am slightly in favor of adding the ufunc. So if nobody voices an opinion that it doesn't seem a good fit for NumPy, I would be happy to move forward with it.

- Sebastian

PS: One of my main concerns would be if we were to add many bitwise functions, in which case a `bitwise` namespace might be nice. But I am not convinced that should stop us here.

On Thu, 2022-11-24 at 15:56 +, Doug Turnbull wrote:
> 👋Long time numpy user, and big fan of all your work
>
> TL; DR - there's a `bit_count` method on numpy scalars, and Python
> ints, I'm advocating for a `ufunc` `bit_count` as has been
> implemented in this PR:
>
> https://github.com/numpy/numpy/pull/21429
>
> A number of people have requested this as a numpy feature and there's
> a lot of great discussion at this issue
>
> https://github.com/numpy/numpy/issues/16325
>
> A bit count comes up in certain similarity situations (like hamming
> distance). This involves an xor with numpy arrays and a bit count,
> where the bit count is currently the bottleneck by a few orders of
> magnitude in my local benchmarking. Currently there are a number of
> not particularly fast workarounds for doing this with numpy arrays,
> like the bit-twiddling solutions here
>
> https://stackoverflow.com/a/68943135/8123
> https://stackoverflow.com/a/109025/8123
>
> So I'd love to be able to continue the work of bit_count -> numpy
> array ufunc to get performance gains at the array level 🙏
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: sebast...@sipsolutions.net
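
The xor-plus-bit-count pattern described in the quoted message can be sketched with a byte lookup table. This is only an illustration of the kind of workaround being discussed, not the code from the linked PR or the Stack Overflow answers; the names `POPCOUNT_TABLE`, `popcount`, and `hamming_distance` are hypothetical.

```python
import numpy as np

# 256-entry table: POPCOUNT_TABLE[b] is the number of set bits in byte b.
POPCOUNT_TABLE = np.array([bin(b).count("1") for b in range(256)], dtype=np.uint8)


def popcount(arr):
    """Count set bits per element of an unsigned integer array."""
    arr = np.ascontiguousarray(arr)
    if arr.dtype.kind != "u":
        raise TypeError("expected an unsigned integer array")
    # Reinterpret each element as its raw bytes, one trailing axis per element.
    as_bytes = arr.view(np.uint8).reshape(arr.shape + (arr.dtype.itemsize,))
    # Look up the per-byte counts and add them up for each element.
    return POPCOUNT_TABLE[as_bytes].sum(axis=-1, dtype=np.int64)


def hamming_distance(a, b):
    """Total number of differing bits between two equal-shaped arrays."""
    return int(popcount(np.bitwise_xor(a, b)).sum())


counts = popcount(np.array([0, 1, 255, 2**64 - 1], dtype=np.uint64))
distance = hamming_distance(
    np.array([0b1010], dtype=np.uint8),
    np.array([0b0110], dtype=np.uint8),
)
print(counts, distance)
```

The table lookup makes an intermediate array with one entry per byte, which is part of why a dedicated ufunc, as proposed above, would be expected to do better.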
[Numpy-discussion] Re: Adding bit_count ufunc
On Fri, Nov 25, 2022 at 08:09:02PM +0100, Sebastian Berg wrote:
> Thanks for bringing this up again. The Python method exists and it
> seems like relatively basic functionality.
>
> Overall, I am slightly in favor of adding the ufunc. So if nobody
> voices an opinion that it doesn't seem a good fit for NumPy, I would be
> happy to move forward with it.
>
> - Sebastian
>
>
> PS: One of my main concern would be if we were to add many bitwise
> functions, in which case a `bitwise` namespace might be nice. But I am
> not convinced that should stop us here.

Technically speaking, bitwise_and, bitwise_or, bitwise_xor and bitwise_not already exist, and popcount is widespread: it already has its own compiler builtin under the name __builtin_popcount.
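
For reference, the four existing ufuncs named in this reply can be exercised as below. This is a small sketch with made-up input values; it shows only functions that already exist in NumPy, not the proposed `bit_count` ufunc.

```python
import numpy as np

a = np.array([0b1100, 0b1010], dtype=np.uint8)
b = np.array([0b1010, 0b0110], dtype=np.uint8)

# Element-wise bitwise operations that NumPy already provides as ufuncs.
and_result = np.bitwise_and(a, b)
or_result = np.bitwise_or(a, b)
xor_result = np.bitwise_xor(a, b)
not_result = np.bitwise_not(a)  # same operation as np.invert

print(and_result, or_result, xor_result, not_result)
```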
[Numpy-discussion] Questions on contributing
Hi. I would like to help contribute to the code. I would like to know where to get started. There are a couple of features in mind that I feel are missing and would be helpful.
[Numpy-discussion] Re: Questions on contributing
Thank you for your interest in contributing to NumPy, Kode!

Here is a quick overview of all the ways one can contribute to the project: https://numpy.org/contribute/

If you’d like to focus on the source code, have a look at the issues that are up to 12 months old in the issue tracker (https://github.com/numpy/numpy/issues) and see if there is anything you could fix. The ones labeled “documentation” and “bug” would be the best place to start. With bugs, reproducing a bug and documenting it in a comment (below the issue) would be of tremendous help.

For general Python development in NumPy, modules like `numpy.ma` and `numpy.polynomial` are pure Python and are fairly approachable.

On Fri, Nov 25, 2022 at 9:35 PM Kode D. Creer wrote:
> Hi. I would like to help contribute to the code. I would like to know
> where to get started. There are a couple of features in mind that I
> feel are missing and would be helpful.

--
Cheers,

Inessa

Inessa Pawson
Contributor Experience Lead | NumPy
https://numpy.org/
GitHub: inessapawson
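
When reproducing a bug and documenting it in a comment, as suggested above, it usually helps to state the environment alongside the reproducer. A minimal sketch, assuming nothing beyond the standard library and NumPy itself; the helper name `environment_report` is hypothetical and not part of any NumPy tooling.

```python
import sys

import numpy as np


def environment_report():
    """Collect the version details that make a bug report reproducible."""
    return {
        "numpy": np.__version__,
        "python": sys.version.split()[0],
        "platform": sys.platform,
    }


report = environment_report()
for name, version in report.items():
    print(f"{name}: {version}")
```

Pasting this output together with the shortest script that triggers the bug gives maintainers what they need to confirm the report.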