[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8
On Mon, Oct 7, 2024, at 06:04, Rohit Goswami wrote:
> I second Matti's comments about the validity of endorsing things we don't
> implement.

I don't think it is possible to make ecosystem-wide recommendations that will fit each project like a glove. At best, we can try to come together as a community, make sound recommendations, and accept that there will be exceptions depending on circumstances. And those exceptions may well apply to NumPy. E.g., being at the bottom of the stack, the NumPy project may recommend the drop schedules from SPEC 0 for other projects, but may implement a different strategy to ensure wider compatibility.

> Also, personally I really dislike the keys to castle spec, because I'm
> generally against having yearly check in reviews and such.

The SPECs are living documents, and are constructed based on input from the community. It would therefore be good to better understand your concern. Is it with the sentence "Review permissions regularly (say, every year) to maintain minimal permissions."? Having written that SPEC, to me that obviously feels like a fairly pragmatic, low-cost recommendation; but perhaps there are better ways to accomplish the same goal.

An issue on https://github.com/scientific-python/specs or the thread at https://discuss.scientific-python.org/t/spec-6-keys-to-the-castle/777/2 could be good venues for further discussion.

Best regards,
Stéfan
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com
[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8
On Tue, Oct 8, 2024 at 8:36 AM Nathan via NumPy-Discussion <numpy-discussion@python.org> wrote:
> Since the legacy RNG interface cannot be deprecated and we encourage
> downstream to use it in tests according to the text of NEP 19, I'm not sure
> about the text in SPEC 7 that talks about deprecating using legacy RNGs. Or
> are you saying that we have now reached the point where we can update NEP
> 19 to encourage moving away from the legacy interface?

We have already always encouraged people to move away from the legacy interface in their APIs. SPEC 7 recommends a principled way for downstream projects to implement that move.

NEP 19 acknowledged that sometimes one might still have a use case for creating a legacy RandomState object and calling it in their tests to generate test data (but not otherwise pass that RandomState object to the code under test), but that's not what SPEC 7 addresses. NEP 19 doesn't really actively recommend the use of RandomState for this purpose, just acknowledges that it's a valid use case that numpy will continue to support even while we push for the exclusive use of Generator inside of library/program code. NEP 19 doesn't need an update for us to endorse SPEC 7 (whether it needs one, separately, to clarify its intent is another question).

--
Robert Kern
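To make the distinction above concrete, here is a minimal sketch of library code that uses a Generator internally while a test still builds its input data from a legacy RandomState. The function `jitter` and its `rng` keyword are hypothetical names chosen for illustration; this is one reading of the pattern discussed in the thread, not text taken from SPEC 7 or NEP 19.

```python
import numpy as np


def jitter(values, scale=1.0, rng=None):
    """Hypothetical library function: add Gaussian noise to `values`."""
    # default_rng accepts None, an integer seed, or an existing Generator
    # and returns a Generator in each case, so callers can pass any of them.
    rng = np.random.default_rng(rng)
    values = np.asarray(values, dtype=float)
    return values + rng.normal(scale=scale, size=values.shape)


# Same integer seed gives the same noise.
a = jitter([0.0, 1.0, 2.0], rng=12345)
b = jitter([0.0, 1.0, 2.0], rng=12345)

# An existing Generator is passed through and used as-is.
shared = np.random.default_rng(12345)
c = jitter([0.0, 1.0, 2.0], rng=shared)

# In a test, a legacy RandomState may still generate the input data;
# only the data is handed to the code under test, not the RandomState.
test_data = np.random.RandomState(0).uniform(size=3)
d = jitter(test_data, rng=0)
```

Normalizing the argument through `np.random.default_rng` keeps the library code on Generator regardless of what the caller supplies.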
[Numpy-discussion] Expected behavior of np.array(..., copy=True)
Can anyone shed some light on the expected behavior of code using array(..., copy=True) with pandas objects? We ran into this in statsmodels and I think there are probably plenty of places where we explicitly call array(..., copy=True) and think we should have a totally independent copy of the data. One workaround is to use np.require(...,requirements="O") but it would help to understand the expected behavior.

Here is a simple example:

    import numpy as np
    import pandas as pd

    weeks = 2
    now = pd.to_datetime('2024-01-01')
    testdata = pd.DataFrame(columns=['dates', 'values'])
    rg = np.random.default_rng(0)
    testdata['dates'] = pd.date_range(start=now, periods=weeks * 7, freq='D')
    testdata['values'] = rg.integers(0, 100, size=(weeks * 7))

    values = testdata['values']
    print("*"*10, " Before ", "*"*10)
    print(values.head())
    arr = np.array(values, copy=True)
    arr.sort()
    print("*"*10, " After ", "*"*10)
    print(values.head())
    print("*"*10, " Flags ", "*"*10)
    print(arr.flags)

This produces

    **********  Before  **********
    0    85
    1    63
    2    51
    3    26
    4    30
    Name: values, dtype: int64
    **********  After  **********
    0     1
    1     4
    2     7
    3    17
    4    26
    Name: values, dtype: int64
    **********  Flags  **********
      C_CONTIGUOUS : True
      F_CONTIGUOUS : True
      OWNDATA : False
      WRITEABLE : True
      ALIGNED : True
      WRITEBACKIFCOPY : False

Thanks,
Kevin
[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8
Thanks for clarifying! In that case I think endorsing SPEC 7 makes sense.

On Tue, Oct 8, 2024 at 3:08 PM Robert Kern wrote:
> On Tue, Oct 8, 2024 at 8:36 AM Nathan via NumPy-Discussion <numpy-discussion@python.org> wrote:
>
>> Since the legacy RNG interface cannot be deprecated and we encourage
>> downstream to use it in tests according to the text of NEP 19, I'm not sure
>> about the text in SPEC 7 that talks about deprecating using legacy RNGs. Or
>> are you saying that we have now reached the point where we can update NEP
>> 19 to encourage moving away from the legacy interface?
>
> We have already always encouraged people to move away from the legacy
> interface in their APIs. SPEC 7 recommends a principled way for downstream
> projects to implement that move.
>
> NEP 19 acknowledged that sometimes one might still have a use case for
> creating a legacy RandomState object and calling it in their tests to
> generate test data (but not otherwise pass that RandomState object to the
> code under test), but that's not what SPEC 7 addresses. NEP 19 doesn't
> really actively recommend the use of RandomState for this purpose, just
> acknowledges that it's a valid use case that numpy will continue to support
> even while we push for the exclusive use of Generator inside of
> library/program code. NEP 19 doesn't need an update for us to endorse SPEC
> 7 (whether it needs one, separately, to clarify its intent is another
> question).
>
> --
> Robert Kern
[Numpy-discussion] Re: Expected behavior of np.array(..., copy=True)
On Tue, 2024-10-08 at 09:34 +0100, Kevin Sheppard via NumPy-Discussion wrote:
> Can anyone shed some light on the expected behavior of code using
> array(..., copy=True) with pandas objects? We ran into this in
> statsmodels and I think there are probably plenty of places where we
> explicitly call array(..., copy=True) and think we should have a totally
> independent copy of the data. One workaround is to use
> np.require(...,requirements="O") but it would help to understand the
> expected behavior.

This is a bug in pandas, which added but ignored `copy=` in their `__array__` methods:

https://github.com/pandas-dev/pandas/issues/57739

It probably makes sense to bump that issue. NumPy could assume it isn't correctly implemented, but that would be weird also... (Objects that did nothing would get correct behavior but with a DeprecationWarning.)

- Sebastian

> Here is a simple example:
>
>     import numpy as np
>     import pandas as pd
>
>     weeks = 2
>     now = pd.to_datetime('2024-01-01')
>     testdata = pd.DataFrame(columns=['dates', 'values'])
>     rg = np.random.default_rng(0)
>     testdata['dates'] = pd.date_range(start=now, periods=weeks * 7, freq='D')
>     testdata['values'] = rg.integers(0, 100, size=(weeks * 7))
>
>     values = testdata['values']
>     print("*"*10, " Before ", "*"*10)
>     print(values.head())
>     arr = np.array(values, copy=True)
>     arr.sort()
>     print("*"*10, " After ", "*"*10)
>     print(values.head())
>     print("*"*10, " Flags ", "*"*10)
>     print(arr.flags)
>
> This produces
>
>     **********  Before  **********
>     0    85
>     1    63
>     2    51
>     3    26
>     4    30
>     Name: values, dtype: int64
>     **********  After  **********
>     0     1
>     1     4
>     2     7
>     3    17
>     4    26
>     Name: values, dtype: int64
>     **********  Flags  **********
>       C_CONTIGUOUS : True
>       F_CONTIGUOUS : True
>       OWNDATA : False
>       WRITEABLE : True
>       ALIGNED : True
>       WRITEBACKIFCOPY : False
>
> Thanks,
> Kevin
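The failure mode described in this thread can be sketched without pandas. The class `IgnoresCopy` below is a hypothetical stand-in for an object whose `__array__` accepts `copy=` but ignores it, and `independent_copy` is a hypothetical defensive helper; neither is pandas or NumPy code. The helper always calls `.copy()` on the converted array, so it does not rely on the object honoring the `copy` keyword, whatever NumPy version is in use.

```python
import numpy as np


class IgnoresCopy:
    """Hypothetical container whose __array__ accepts copy= but ignores it."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        # The bug being illustrated: a view of the internal buffer is
        # returned even when the caller asked for copy=True.
        return self._data.view()


def independent_copy(obj):
    """Return an array that owns its data, regardless of obj.__array__."""
    # asarray avoids requesting a copy from the object; the explicit
    # .copy() afterwards is what guarantees a separate buffer.
    return np.asarray(obj).copy()


source = IgnoresCopy([85, 63, 51, 26, 30])
arr = independent_copy(source)
arr.sort()  # sorts only the copy; the source buffer keeps its order
```

The cost is that an object which already returned a fresh copy gets copied a second time, which may matter for large arrays.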
[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8
Regarding thread safety - that's not a problem. At least for Python 3.13, the GIL is temporarily re-enabled during imports. That won't necessarily be true in the future, but separately CPython also uses per-module locks on import, so there shouldn't be any issues with threads simultaneously importing submodules.

It looks like we already implement lazy-loading for e.g. linalg, fft, random, and other submodules. Does that lazy-loading mechanism conform to the SPEC? If not, should it?

The keys to the castle SPEC makes sense to me, I'm fine with endorsing it. I believe that all of NumPy's online accounts are already spread out over multiple maintainers, so presumably we don't actually need to do much here to implement it?

Since the legacy RNG interface cannot be deprecated and we encourage downstream to use it in tests according to the text of NEP 19, I'm not sure about the text in SPEC 7 that talks about deprecating using legacy RNGs. Or are you saying that we have now reached the point where we can update NEP 19 to encourage moving away from the legacy interface?

From the text of NEP 19 regarding the legacy RNG interface:

> This NEP does not propose that these requirements remain in perpetuity.
> After we have experience with the new PRNG subsystem, we can and should
> revisit these issues in future NEPs.

I don't have a problem with SPEC 8, although I suspect there might be a fair bit of work to get NumPy's CI to match the suggestions in the SPEC.

On Tue, Oct 8, 2024 at 2:08 PM Joren Hammudoglu via NumPy-Discussion <numpy-discussion@python.org> wrote:
> Is SPEC 1 thread-safe enough for py313+nogil?
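For readers unfamiliar with the lazy-loading mechanism being discussed, the following is a rough standard-library sketch of the general idea, using a module-level `__getattr__` (PEP 562). The helper `make_lazy_namespace` and the name `demo_namespace` are hypothetical; this is neither NumPy's implementation nor the one SPEC 1 recommends, and it adds no locking of its own beyond what the import system provides.

```python
import importlib
import types


def make_lazy_namespace(name, submodules):
    """Build a module object that imports its members on first access.

    `submodules` maps attribute names to importable module names.
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        if attr in submodules:
            loaded = importlib.import_module(submodules[attr])
            # Cache on the module so later lookups bypass __getattr__.
            setattr(module, attr, loaded)
            return loaded
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    module.__getattr__ = __getattr__
    return module


ns = make_lazy_namespace("demo_namespace", {"js": "json"})
before = "js" in vars(ns)        # nothing has been attached yet
dumped = ns.js.dumps([1, 2])     # first access triggers the import
after = "js" in vars(ns)         # the imported module is now cached
```

The actual import is delegated to `importlib.import_module`, so whatever guarantees the interpreter gives for concurrent imports apply to that step.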
[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8
Is SPEC 1 thread-safe enough for py313+nogil?