[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8

2024-10-08 Thread Stefan van der Walt via NumPy-Discussion
On Mon, Oct 7, 2024, at 06:04, Rohit Goswami wrote:
> I second Matti's comments about the validity of endorsing things we don't 
> implement. 

I don't think it is possible to make ecosystem-wide recommendations that will 
fit each project like a glove. At best, we can try to come together as a 
community, make sound recommendations, and accept that there will be exceptions 
depending on circumstances. And those exceptions may well apply to NumPy. E.g., 
being at the bottom of the stack, the NumPy project may recommend the drop 
schedules from SPEC0 for other projects, but may implement a different strategy 
to ensure wider compatibility.

> Also, personally I really dislike the keys to castle spec, because I'm 
> generally against having yearly check in reviews and such.

The SPECs are living documents, and are constructed based on input from the 
community. It would therefore be good to better understand your concern. Is it 
with the sentence "Review permissions regularly (say, every year) to maintain 
minimal permissions."? Having written that SPEC, to me that obviously feels 
like a fairly pragmatic, low-cost recommendation; but perhaps there are better 
ways to accomplish the same goal. An issue on 
https://github.com/scientific-python/specs or the thread at 
https://discuss.scientific-python.org/t/spec-6-keys-to-the-castle/777/2 could 
be good venues for further discussion.

Best regards,
Stéfan
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8

2024-10-08 Thread Robert Kern via NumPy-Discussion
On Tue, Oct 8, 2024 at 8:36 AM Nathan via NumPy-Discussion <
numpy-discussion@python.org> wrote:

>
> Since the legacy RNG interface cannot be deprecated and we encourage
> downstream to use it in tests according to the text of NEP 19, I'm not sure
> about the text in SPEC 7 that talks about deprecating using legacy RNGs. Or
> are you saying that we have now reached the point where we can update NEP
> 19 to encourage moving away from the legacy interface?
>

We have always encouraged people to move away from the legacy
interface in their APIs. SPEC 7 recommends a principled way for downstream
projects to implement that move.

NEP 19 acknowledged that sometimes one might still have a use case for
creating a legacy RandomState object and calling it in their tests to
generate test data (but not otherwise pass that RandomState object to the
code under test), but that's not what SPEC 7 addresses. NEP 19 doesn't
really actively recommend the use of RandomState for this purpose, just
acknowledges that it's a valid use case that numpy will continue to support
even while we push for the exclusive use of Generator inside of
library/program code. NEP 19 doesn't need an update for us to endorse SPEC
7 (whether it needs one, separately, to clarify its intent is another
question).
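
To make the distinction concrete, here is a minimal sketch of the split
described above. The function name `jitter` and its parameters are
hypothetical, and normalizing an `rng` argument through
`np.random.default_rng` is my reading of the pattern SPEC 7 recommends, not
text quoted from it. Library code only ever sees a Generator; the legacy
RandomState appears in the test solely to produce input data.

```python
import numpy as np


def jitter(x, scale=1.0, rng=None):
    """Hypothetical library function: add Gaussian noise to x."""
    # default_rng accepts None, an integer seed, or an existing Generator
    # (an existing Generator is returned unchanged).
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    return x + rng.normal(scale=scale, size=x.shape)


# In a test: RandomState generates the input data, but it is never
# passed into the code under test.
test_data = np.random.RandomState(12345).uniform(size=5)

# Two Generators built from the same seed give reproducible results.
out_a = jitter(test_data, rng=np.random.default_rng(0))
out_b = jitter(test_data, rng=np.random.default_rng(0))
```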

-- 
Robert Kern


[Numpy-discussion] Expected behavior of np.array(..., copy=True)

2024-10-08 Thread Kevin Sheppard via NumPy-Discussion
Can anyone shed some light on the expected behavior of code using
array(..., copy=True) with pandas objects? We ran into this in statsmodels
and I think there are probably plenty of places where we explicitly call
array(..., copy=True) and think we should have a totally independent copy
of the data. One workaround is to use np.require(...,requirements="O") but
it would help to understand the expected behavior.

Here is a simple example:

import numpy as np
import pandas as pd

weeks = 2
now = pd.to_datetime('2024-01-01')
testdata = pd.DataFrame(columns=['dates', 'values'])
rg = np.random.default_rng(0)
testdata['dates'] = pd.date_range(start=now, periods=weeks * 7, freq='D')
testdata['values']=rg.integers(0, 100, size=(weeks * 7))

values = testdata['values']
print("*"*10, " Before ", "*"*10)
print(values.head())
arr = np.array(values, copy=True)
arr.sort()
print("*"*10, " After ", "*"*10)
print(values.head())
print("*"*10, " Flags ", "*"*10)
print(arr.flags)

This produces

**********  Before  **********
0    85
1    63
2    51
3    26
4    30
Name: values, dtype: int64
**********  After  **********
0     1
1     4
2     7
3    17
4    26
Name: values, dtype: int64
**********  Flags  **********
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
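
For reference, a minimal sketch of a defensive copy that does not depend on
how the input object handles `copy=`. The class `SharesBuffer` and the helper
`independent_copy` are hypothetical names used only for illustration;
`SharesBuffer` is a stand-in for an object whose `__array__` hands back its
own buffer regardless of the `copy` argument, which is the behavior shown in
the output above. The sketch relies only on `ndarray.copy()`, which always
allocates new memory.

```python
import numpy as np


class SharesBuffer:
    """Hypothetical stand-in: __array__ ignores the copy argument."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        # Always returns the internal buffer, never a copy.
        return self._data


def independent_copy(obj):
    """Return an array that never shares memory with obj."""
    # asarray may return a view of obj's buffer; .copy() always
    # allocates a new one.
    return np.asarray(obj).copy()


obj = SharesBuffer([3, 1, 2])
arr = independent_copy(obj)
arr.sort()  # sorts only the copy; obj._data keeps its order
```

`np.shares_memory(arr, obj._data)` can be used as a check in tests.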

Thanks,
Kevin


[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8

2024-10-08 Thread Nathan via NumPy-Discussion
Thanks for clarifying! In that case I think endorsing SPEC 7 makes sense.

On Tue, Oct 8, 2024 at 3:08 PM Robert Kern wrote:

> On Tue, Oct 8, 2024 at 8:36 AM Nathan via NumPy-Discussion <
> numpy-discussion@python.org> wrote:
>
>>
>> Since the legacy RNG interface cannot be deprecated and we encourage
>> downstream to use it in tests according to the text of NEP 19, I'm not sure
>> about the text in SPEC 7 that talks about deprecating using legacy RNGs. Or
>> are you saying that we have now reached the point where we can update NEP
>> 19 to encourage moving away from the legacy interface?
>>
>
> We have always encouraged people to move away from the legacy
> interface in their APIs. SPEC 7 recommends a principled way for downstream
> projects to implement that move.
>
> NEP 19 acknowledged that sometimes one might still have a use case for
> creating a legacy RandomState object and calling it in their tests to
> generate test data (but not otherwise pass that RandomState object to the
> code under test), but that's not what SPEC 7 addresses. NEP 19 doesn't
> really actively recommend the use of RandomState for this purpose, just
> acknowledges that it's a valid use case that numpy will continue to support
> even while we push for the exclusive use of Generator inside of
> library/program code. NEP 19 doesn't need an update for us to endorse SPEC
> 7 (whether it needs one, separately, to clarify its intent is another
> question).
>
> --
> Robert Kern
>


[Numpy-discussion] Re: Expected behavior of np.array(..., copy=True)

2024-10-08 Thread Sebastian Berg
On Tue, 2024-10-08 at 09:34 +0100, Kevin Sheppard via NumPy-Discussion
wrote:
> Can anyone shed some light on the expected behavior of code using
> array(..., copy=True) with pandas objects? We ran into this in
> statsmodels
> and I think there are probably plenty of places where we explicitly
> call
> array(..., copy=True) and think we should have a totally independent
> copy
> of the data. One workaround is to use
> np.require(...,requirements="O") but
> it would help to understand the expected behavior.


This is a bug in pandas, which added but ignored `copy=` in their
`__array__` methods:
https://github.com/pandas-dev/pandas/issues/57739

It probably makes sense to bump that issue.  NumPy could assume that `copy=`
isn't correctly implemented, but that would be weird also...
(Objects that did nothing would get correct behavior but with a
DeprecationWarning.)
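
As an illustration of what honoring `copy=` could look like, here is a
minimal sketch. The `Column` class is hypothetical and is not the pandas
implementation; it only shows one way to handle the three values of `copy`
under the assumption that the container wraps a single ndarray.

```python
import numpy as np


class Column:
    """Hypothetical container, used only to illustrate the protocol."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        if copy is False:
            # Caller forbids copying: fail if a dtype conversion
            # would require one.
            if dtype is not None and np.dtype(dtype) != self._data.dtype:
                raise ValueError("cannot convert dtype without copying")
            return self._data
        if copy:
            # Caller asked for an independent buffer.
            return np.array(self._data, dtype=dtype, copy=True)
        # copy=None: copy only if the dtype conversion needs it.
        return np.asarray(self._data, dtype=dtype)


col = Column([3, 1, 2])
arr = np.array(col, copy=True)
arr.sort()  # sorts the copy; col._data keeps its original order
```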


- Sebastian


> 
> Here is a simple example:
> 
> import numpy as np
> import pandas as pd
> 
> weeks = 2
> now = pd.to_datetime('2024-01-01')
> testdata = pd.DataFrame(columns=['dates', 'values'])
> rg = np.random.default_rng(0)
> testdata['dates'] = pd.date_range(start=now, periods=weeks * 7,
> freq='D')
> testdata['values']=rg.integers(0, 100, size=(weeks * 7))
> 
> values = testdata['values']
> print("*"*10, " Before ", "*"*10)
> print(values.head())
> arr = np.array(values, copy=True)
> arr.sort()
> print("*"*10, " After ", "*"*10)
> print(values.head())
> print("*"*10, " Flags ", "*"*10)
> print(arr.flags)
> 
> This produces
> 
> **********  Before  **********
> 0    85
> 1    63
> 2    51
> 3    26
> 4    30
> Name: values, dtype: int64
> **********  After  **********
> 0     1
> 1     4
> 2     7
> 3    17
> 4    26
> Name: values, dtype: int64
> **********  Flags  **********
>   C_CONTIGUOUS : True
>   F_CONTIGUOUS : True
>   OWNDATA : False
>   WRITEABLE : True
>   ALIGNED : True
>   WRITEBACKIFCOPY : False
> 
> Thanks,
> Kevin


[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8

2024-10-08 Thread Nathan via NumPy-Discussion
Regarding thread safety - that's not a problem. At least for Python 3.13,
the GIL is temporarily re-enabled during imports. That won't necessarily be
true in the future, but separately CPython also uses per-module locks on
import, so there shouldn't be any issues with threads simultaneously
importing submodules.

It looks like we already implement lazy-loading for e.g. linalg, fft,
random, and other submodules. Does that lazy-loading mechanism conform to
the SPEC? If not, should it?
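
For readers unfamiliar with the general idea, here is a minimal sketch of
lazy attribute loading through a module-level `__getattr__` (PEP 562). It is
neither NumPy's actual code nor the mechanism SPEC 1 describes; the helper
`make_lazy_namespace` and the namespace `demo_ns` are hypothetical, and two
standard-library modules stand in for submodules.

```python
import importlib
import types


def make_lazy_namespace(name, lazy_map):
    """Build a module object whose listed attributes import on first use."""
    mod = types.ModuleType(name)

    def __getattr__(attr):
        # Only called when normal attribute lookup on the module fails.
        try:
            target = lazy_map[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        loaded = importlib.import_module(target)
        # Cache on the module so later lookups bypass __getattr__.
        setattr(mod, attr, loaded)
        return loaded

    mod.__getattr__ = __getattr__
    return mod


# Hypothetical namespace exposing two stdlib modules lazily.
ns = make_lazy_namespace("demo_ns", {"json": "json", "fractions": "fractions"})
```

Accessing `ns.json` triggers the import through the regular import system;
any other unknown attribute raises AttributeError as usual.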

The keys to the castle SPEC makes sense to me, I'm fine with endorsing it.
I believe that all of NumPy's online accounts are already spread out over
multiple maintainers, so presumably we don't actually need to do much here
to implement it?

Since the legacy RNG interface cannot be deprecated and we encourage
downstream to use it in tests according to the text of NEP 19, I'm not sure
about the text in SPEC 7 that talks about deprecating using legacy RNGs. Or
are you saying that we have now reached the point where we can update NEP
19 to encourage moving away from the legacy interface? From the text of NEP
19 regarding the legacy RNG interface:

> This NEP does not propose that these requirements remain in perpetuity.
> After we have experience with the new PRNG subsystem, we can and should
> revisit these issues in future NEPs.

I don't have a problem with SPEC 8, although I suspect there might be a
fair bit of work to get NumPy's CI to match the suggestions in the SPEC.



On Tue, Oct 8, 2024 at 2:08 PM Joren Hammudoglu via NumPy-Discussion <
numpy-discussion@python.org> wrote:

> Is SPEC 1 thread-safe enough for py313+nogil?


[Numpy-discussion] Re: Endorsing SPECs 1, 6, 7, and 8

2024-10-08 Thread Joren Hammudoglu via NumPy-Discussion
Is SPEC 1 thread-safe enough for py313+nogil?