[Numpy-discussion] Support for Multiple Interpreters (Subinterpreters) in numpy
Hi all,

CPython has supported multiple interpreters (in the same process) for a long time, but only through the C-API. I'm working on exposing that functionality to Python code (see PEP 554), aiming for 3.12. I expect that users will find the feature useful (particularly with a per-interpreter GIL--see PEP 684) and that it will be used a lot more over the coming years. This has the potential to impact extension module projects, especially large ones like numpy, which is why I'm reaching out to you.

Use of multiple interpreters depends on isolation between them. When an extension module is imported in multiple interpreters, it is loaded separately into a new module object in each. Extensions often store module data/state in C globals, which means the multiple instances end up sharing data. This causes problems, more so once we have one GIL per interpreter.

Over the years we have added machinery to help extensions get the necessary isolation, moving away from global variables. This includes PEPs 384, 3121, and 489. This has culminated in the guide you can find in PEP 630.

Note that nothing should change when only a single interpreter is in use (basically the status quo). With PEP 684, importing an incompatible extension outside the main (initial) interpreter will now be an ImportError. (Currently the behavior is undefined and too often results in hard-to-debug failures and crashes.) Thus extension module maintainers do have the option to *not* support multiple interpreters. Unfortunately, that doesn't mean their users won't pester them about adding support. We all recognize how that dynamic can be draining on a project.

The potential burden on maintainers is a serious factor for these upcoming changes. numpy is likely to be affected more than any other project. That's why I'm starting this thread. PEP 684 discusses all of the above.
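As a rough illustration of the shared-globals problem described above, here is a toy model in Python. It is not extension code: `_shared_cache` stands in for a C global, and `load_with_globals` and `load_with_module_state` are hypothetical helpers that mimic importing the same extension in two interpreters, first with state kept in a global and then with per-module state in the spirit of PEP 630.

```python
import types

# Stand-in for a C global: one copy per process, shared by every module instance.
_shared_cache = {}


def load_with_globals(name):
    """Hypothetical loader mimicking an extension that keeps state in C globals."""
    mod = types.ModuleType(name)
    mod.cache = _shared_cache  # every module object points at the same dict
    return mod


def load_with_module_state(name):
    """Hypothetical loader mimicking per-module state (PEP 630 style)."""
    mod = types.ModuleType(name)
    mod.cache = {}  # fresh state for each module object
    return mod


# Pretend each call is an import of the same extension in a different interpreter.
a = load_with_globals("ext")
b = load_with_globals("ext")
a.cache["dtype"] = "set by interpreter A"
leaked = "dtype" in b.cache  # B sees data that A stored

c = load_with_module_state("ext")
d = load_with_module_state("ext")
c.cache["dtype"] = "set by interpreter C"
isolated = "dtype" not in d.cache  # D is unaffected by C
```

The module objects are distinct in both cases; only the second loader keeps their data apart, which is the property that isolation asks for.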
What I'm after with this thread is:

* to make sure the numpy maintainers are clear on what interpreter isolation requires of the project
* a clear picture of what changes numpy would need (and how much work that would be)
* feedback on what the CPython team can do to minimize that work (incl. adding new C APIs)

I'm fine with having the discussion here, but I will probably create a new category on discuss.python.org for a variety of similar threads related to multiple interpreters and supporting them. Having our discussion there may lead to more participation from more CPython core devs than just me. Do you have any preference for or against any particular venue?

Thanks!

-eric
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com
[Numpy-discussion] Re: Support for Multiple Interpreters (Subinterpreters) in numpy
On Tue, Aug 23, 2022 at 3:47 AM Sebastian Berg wrote:
> What is the status of immortality? None of these seem forbidding on
> first sight, so long as we can get the state everywhere.
> Having immortal objects seems convenient, but probably not particularly
> necessary.

The current proposal for immortal objects (PEP 683) will be going to the steering council soon. However, it only applies to the CPython runtime (internally). We don't have plans right now for a public API to make an object immortal. (That would be a separate proposal.) If isolating the extension, a la PEP 630, isn't feasible in the short term, we would certainly be open to discussing alternatives (incl. immortal objects).

> One other thing I am not quite sure about right now is GIL grabbing.
> `PyGILState_Ensure()` will continue to work reliably?
> This used to be one of my main worries. It is also something we can
> fix-up (pass through additional information), but where a fallback
> seems needed.

Compatibility of the GIL state API with subinterpreters has been a long-standing bug. [1] That will be fixed. Otherwise, PyGILState_Ensure() should work correctly.

-eric

[1] https://github.com/python/cpython/issues/59956
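For reference, the usual pairing of the GIL state calls looks roughly like the sketch below. It drives the real C functions `PyGILState_Ensure()` and `PyGILState_Release()` through `ctypes.pythonapi` purely so the example can run from Python; real extension code would make these calls in C, around a callback arriving on a thread the interpreter did not create. `call_with_gil` is a hypothetical helper name. From a Python thread that already holds the GIL, as here, the Ensure call is only a nested acquire.

```python
import ctypes

api = ctypes.pythonapi

# PyGILState_STATE is a C enum, so it is treated as a plain int here.
api.PyGILState_Ensure.restype = ctypes.c_int
api.PyGILState_Ensure.argtypes = []
api.PyGILState_Release.restype = None
api.PyGILState_Release.argtypes = [ctypes.c_int]


def call_with_gil(fn):
    """Hypothetical helper: run fn() between a matched Ensure/Release pair."""
    state = api.PyGILState_Ensure()
    try:
        return fn()
    finally:
        # Every Ensure must be matched by a Release with the same state value.
        api.PyGILState_Release(state)


result = call_with_gil(lambda: sum(range(10)))
```

Neither call takes an interpreter argument, which is presumably what the "pass through additional information" remark in the quoted question is getting at.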
[Numpy-discussion] Re: Support for Multiple Interpreters (Subinterpreters) in numpy
On Tue, Aug 23, 2022 at 6:01 AM Petr Viktorin wrote:
> And if the CPython API is lacking, it would be best to solve that in
> CPython.

+1

In some ways, new CPython APIs would be the most important artifacts of this discussion. We want to minimize the effort it takes to support multiple interpreters. So we definitely want to know what we could provide that would help.

> Per-interpreter GIL is an *additional* step. I believe it will need its
> own opt-in mechanism. But subinterpreter support is a prerequisite for it.

Yeah, that is an evolving point of discussion in PEP 684.

-eric
[Numpy-discussion] Re: Support for Multiple Interpreters (Subinterpreters) in numpy
On Wed, Aug 24, 2022 at 4:42 AM Petr Viktorin wrote:
> On 23. 08. 22 16:19, Sebastian Berg wrote:
> > Our public C-API is currently exported as a single static struct into
> > the library loading NumPy. If types depend on the interpreter, it
> > would seem we need to redo the whole mechanism?
>
> Right, sounds like it needs to be a dynamically allocated struct.
> In the interim, one instance of the struct is static: that's the one
> used for anything that doesn't support multiple interpreters yet, and
> also as the module state in one “main” module object. (That would be the
> first module to be loaded, and until everything switches over, it'd get
> an unpaired incref to become “immortal” and leak at exit.)
>
> > Further, many of the functions would need to be adapted. We might be
> > able to hack that the API looks the same [1]. However, it cannot be
> > ABI compatible, so we would need a whole new API table/export mechanism
> > and some sort of shim to allow compiling against older NumPy versions
> > but using it with all versions (otherwise we need 2+ years of
> > patience).
>
> Having one static “main” module state in the interim would also help here.
>
> > Of course there might be a point in saying that most C-API use is
> > initially not subinterpreter ready, but it does seem like a pretty huge
> > limitation...
>
> A huge limitation, but it might be a good way to break up the work to
> make it more manageable :)

FWIW, in CPython there's a similar issue. We currently expose static pointers to all the builtin exceptions in the C-API. Even worse, we expose the object *values* for all the static types and the several singletons. On top of that, these are all exposed in the limited API (stable ABI). As a result, moving to one each per interpreter is messy. PEP 684 talks about the possible solutions. The simplest for us is to make all those objects immortal. However, in some cases we also have to do an interpreter-specific lookup internally.

I expect you would have to do something similar where/when compatibility remains essential.

-eric