[Cython] Python subinterpreters support problem in v0.29

2018-12-11 Thread Ricardo Dias
Hi Cython developers,

The recent Cython 0.29 release introduced a commit [1] that hinders the
use of Python subinterpreters.

I discovered this the hard way when suddenly a component I was working
on started to crash. The component in question is the ceph-mgr daemon
from the Ceph project [2].

Python subinterpreters are the basic building block of the
plugin/module architecture of ceph-mgr. Each "manager module" runs in
its own Python subinterpreter. Furthermore, all Python bindings for the
Ceph client libraries, such as librados, librbd, libcephfs, and
librgw, are implemented as Cython modules, and in the particular case of
librados, all ceph-mgr plugin modules import the rados Cython module
upon initialization.

In practice, with Cython 0.29 we can load only one manager module,
because every module after the first refuses to load.

After discovering this issue, we "temporarily" worked around it by
restricting the version of Cython we depend on [3]. But we don't want
to keep this restriction indefinitely and would prefer a fix on the
Cython side.

Do you think it's feasible to implement a flag to disable the safeguard
introduced in [1]? That way we could re-enable subinterpreters at our
own risk.


[1]
https://github.com/cython/cython/commit/7e27c7cd51a2f048cd6d3c246740cd977f8d2e50
[2] https://github.com/ceph/ceph
[3] https://github.com/ceph/ceph/pull/25328

-- 
Ricardo Dias
Senior Software Engineer - Storage Team
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284
(AG Nürnberg)





___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Python subinterpreters support problem in v0.29

2018-12-11 Thread Ricardo Dias


On 11/12/18 19:39, Stefan Behnel wrote:
> Ricardo Dias wrote on 10.12.18 at 14:42:
>> In the recent Cython 0.29 version was introduced a commit [1] that
>> hinders the usage of python subinterpreters.
>>
>> I discovered this the hard way when suddenly a component I was working
>> on started to crash. The component in question is the ceph-mgr daemon
>> from the Ceph project [2].
>>
>> Python subinterpreters are the basic building block for the
>> plugin/module architecture of ceph-mgr. Each "manager module" runs in
>> its own python subinterpreter. Furthermore, all python bindings for the
>> client libraries of Ceph, such as librados, librbd, libcephfs, and
>> librgw, are implemented as Cython modules, and in the particular case of
>> librados, all ceph-mgr plugin modules import the rados Cython module
>> upon initialization.
>>
>> In practice, with Cython 0.29 we can only load one module, because the
>> following modules will refuse to load.
>>
>> After discovering this issue, we "temporarily" prevent the issue by
>> restricting the version of Cython as a dependency [3]. But we don't want
>> to keep this restriction indefinitely and would prefer a fix from the
>> Cython side.
>>
>> Do you think it's feasible to implement a flag to disable the safe guard
>> introduced in [1]? That way we could re-enable subinterpreters at our
>> own risk.
>>
>> [1]
>> https://github.com/cython/cython/commit/7e27c7cd51a2f048cd6d3c246740cd977f8d2e50
>> [2] https://github.com/ceph/ceph
>> [3] https://github.com/ceph/ceph/pull/25328
> 
> My guess is that your modules just silently leaked object references and
> memory with the previous Cython versions. That is why we now inserted a
> guard that detects cases where the module init function is executed
> multiple times, which would overwrite the state of the previous run. The
> shared library of an extension module is only loaded once, so any global C
> state is shared for the entire process, regardless of how often CPython
> calls the module init function.

I assume that the problem with subinterpreters occurs when a Cython
module declares static/global variables, which might cause
undesirable side effects when the module is loaded in several
subinterpreters.

I believe the Cython modules that we develop in Ceph do not declare any
global state, and therefore they have been working well when
loaded by several subinterpreters.
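
To picture the concern raised in the quoted reply, here is a minimal
Python model of a shared library whose global state exists once per
process while its init function runs once per interpreter. It is only a
sketch, not Cython's generated code; SharedLibrary, module_init and
cached_object are hypothetical names chosen for illustration.

```python
# Simplified model, not Cython's generated code: all names are hypothetical.


class SharedLibrary:
    """Stands in for a loaded extension module: its globals exist once per process."""

    def __init__(self):
        self.cached_object = None  # models a process-wide C global
        self.overwritten = []      # values replaced without being released

    def module_init(self, interpreter_id):
        """Models the module init function, called once per interpreter."""
        if self.cached_object is not None:
            # A second run replaces the global; in C the old reference
            # would be lost without ever being released.
            self.overwritten.append(self.cached_object)
        self.cached_object = f"state created in interpreter {interpreter_id}"
        return self.cached_object


lib = SharedLibrary()
lib.module_init(1)
lib.module_init(2)
print(lib.cached_object)
print(lib.overwritten)
```

In this model the second call silently replaces what the first one set
up, which is the kind of overwrite the guard is meant to detect.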

> 
> I am surprised that your setup didn't crash in any way. Could you explain a
> bit more how you are using this feature? Are the different subinterpreters
> running in parallel or sequentially? The ceph repo looks huge. Any pointers
> where I should start looking?

The subinterpreters run in parallel. Basically, we have a single
process, the ceph-mgr daemon, which creates one subinterpreter for each
mgr plugin (a plugin is basically a pure Python module) that it finds in
a specific location.
All these plugins import the "rados" Cython module to be able to talk
to the Ceph cluster.
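
As a rough illustration of how this setup interacts with a
single-interpreter guard, the sketch below models one interpreter id per
plugin and a process-wide check that accepts only the first id it sees.
It is a simplified model written in plain Python, not ceph-mgr or Cython
code; the function names, the plugin names and the error message are
assumptions, and the loop is sequential although the real
subinterpreters run in parallel.

```python
# Simplified model of a single-interpreter guard; hypothetical names throughout.

_first_interpreter_id = None  # process-wide, like a static variable in C


def check_single_interpreter(interpreter_id):
    """Raise ImportError if the init code runs in a second interpreter."""
    global _first_interpreter_id
    if _first_interpreter_id is None:
        # First import in this process: remember which interpreter it was.
        _first_interpreter_id = interpreter_id
    elif _first_interpreter_id != interpreter_id:
        raise ImportError("module can only be loaded into one interpreter")


def load_plugins(plugin_names):
    """Give each plugin its own interpreter id and try to import the module."""
    loaded, failed = [], []
    for interpreter_id, name in enumerate(plugin_names, start=1):
        try:
            check_single_interpreter(interpreter_id)
            loaded.append(name)
        except ImportError:
            failed.append(name)
    return loaded, failed


loaded, failed = load_plugins(["plugin_a", "plugin_b", "plugin_c"])
print(loaded)
print(failed)
```

Only the first plugin ends up in the loaded list, which matches the
behaviour we see with Cython 0.29.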

The C++ code that manages the subinterpreters can be found at:

https://github.com/ceph/ceph/tree/master/src/mgr

More specifically, in the files PyModule.* and PyModuleRegistry.*:

https://github.com/ceph/ceph/blob/master/src/mgr/PyModule.cc#L324

> 
> I actually wonder if we could at least support sequential usages through
> the module cleanup mechanism. Once a module is cleaned up and all global
> objects freed, calling the module init function again should be ok.
>
> Apart from that, here is the feature ticket for module specific global state:
> 
> https://github.com/cython/cython/issues/2343
> 
> Stefan
> 

-- 
Ricardo Dias
Senior Software Engineer - Storage Team
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284
(AG Nürnberg)


