Re: [Cython] Surprising behaviour wrt. generated tp_clear and tp_dealloc functions
Hi Stefan, thank you for your extremely quick reply! On 04/22/2013 07:31 AM, Stefan Behnel wrote: >> But this did not give me the crucial hint: Currently, any references to >> Python objects *may* be invalid at the time dealloc is called. This is >> because Cython generates a tp_clear function for each extension type >> that calls Py_CLEAR for each object reference. > Correct, together with your complete analysis (thanks for writing it up). > For objects participating in a reference cycle, the cyclic garbage > collector will try to break the cycle at an arbitrary object, which may or > may not be yours. Thus the unpredictable segfaults. Perhaps I am misreading the CPython source code but it appears to me that each and every object in the reference cycle will get its tp_clear called. http://hg.python.org/cpython/file/9c0a677dbbc0/Modules/gcmodule.c#l782 Ah, okay, I get it. The loop will terminate when it reaches an object that actually breaks the cycle in tp_clear: The decref will cascade over all objects in the cycle. > Thanks for bringing this up. Your use case is not uncommon, and it's worth > making it easier to handle. I am glad that you see it this way. I also think that this will happen often, but of course I was not sure if you would agree. > I also agree that setting fields to None is probably worse for the innocent > user than just setting them to NULL. IIRC, we chose to set everything to > None because that's something users can handle in their code. You can test > an attribute for None in your __dealloc__ code, but you can't test it for > NULL. Hmm, okay, that's a good point that I have missed. Wouldn't it be possible to allow the special check "foo is NULL" for any Python object? Of course Cython usually shields the developer from the possibility that foo actually becomes NULL so why bother. :-) > OTOH, I'm not sure there are many use cases for gracefully handling a > cleared reference in __dealloc__(). If there really is something to clean > up, then being able to test for None won't usually help much. It will > prevent a crash, but also the proper cleanup. Agreed. > That would be a class decorator. Totally makes sense to me. In fact, a > decorator to disable GC for a type would also make sense. That would be a great feature. After all, traversing the Python objects in my example is of no use as they can not create a cycle. >> *Exempt a single attribute:* >> >> cdef class Slots: >> >> cdef noclear Context context > I would like to avoid adding a separate modifier here. This could also be > handled in a class decorator that lists the excluded attributes. My > intuition tells me that by far the most common use cases are to exclude > either the entire type or exactly one attribute (as in your case, and the > lxml package has a similar need). I believe your intuition matches reality. :-) A class decorator supporting a list of excluded attributes would be fine! > Cython could also adopt a policy of automatically excluding attributes > referencing types that are known to be safe, e.g. basic builtin types. No > big savings, but it may drop the need for a useless tp_clear() entirely in > some cases, and that might have an impact on the GC cleanup time, as the GC > might succeed in breaking the cycle more quickly. Reconsidering One could even think about building a graph of possible object relationships based on the type declaration for the Python attributes. In the example, Slot refers only to Slots and Slots only to Context, so these can't build a cycle. > Another possibility (which I think we considered back in the old days) > would be to automatically exclude fields from tp_clear() that are being > syntactically referenced in a corresponding __dealloc__() method. However, > that's obviously error prone and won't catch all cases (e.g. usage in > subtypes, calling cleanup functions from __dealloc__, "if DEBUG: > print(self.attr)", ...), so I guess we'd rather not go that route. Explicit > is definitely better than implicit here. But a good idea would be to warn if __dealloc__ actually references a Python attribute that tp_clear could have cleared, with a pointer to the class decorator that exempts the attribute/instance from tp_clear. Thanks and greetings, Torsten -- DYNAmore Gesellschaft fuer Ingenieurdienstleistungen mbH Torsten Landschoff Office Dresden Tel: +49-(0)351-4519587 Fax: +49-(0)351-4519561 mailto:torsten.landsch...@dynamore.de http://www.dynamore.de DYNAmore Gesellschaft für FEM Ingenieurdienstleistungen mbH Registration court: Stuttgart, HRB 733694 Managing director: Prof. Dr. Karl Schweizerhof, Dipl.-Math. Ulrich Franz ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Surprising behaviour wrt. generated tp_clear and tp_dealloc functions
Hi Stefan, On 04/22/2013 01:50 PM, Stefan Behnel wrote: > Exactly - an extremely special case. It's an inherent property of the > Cython language that user visible references are never NULL. (minus > evil C code, obviously.) Suits me. >> a good idea would be to warn if __dealloc__ actually references a >> Python attribute that tp_clear could have cleared, with a pointer to the >> class decorator that exempts the attribute/instance from tp_clear. > That's a good idea. > > Any help with this is appreciated. :) How can I help? If you want, I can attempt to create a patch and ask you if I don't make any progress. I do not have a good grip at Cython sourcecode yet, so if you can give me a head start by pointing out the locations where I can inject the new behaviour this would be most welcome. About the object graph analysis I do not know if that is easily added. I thought it might be because Cython already has some type inference in place!? Greetings, Torsten -- DYNAmore Gesellschaft fuer Ingenieurdienstleistungen mbH Torsten Landschoff Office Dresden Tel: +49-(0)351-4519587 Fax: +49-(0)351-4519561 mailto:torsten.landsch...@dynamore.de http://www.dynamore.de DYNAmore Gesellschaft für FEM Ingenieurdienstleistungen mbH Registration court: Stuttgart, HRB 733694 Managing director: Prof. Dr. Karl Schweizerhof, Dipl.-Math. Ulrich Franz ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Surprising behaviour wrt. generated tp_clear and tp_dealloc functions
On 04/23/2013 08:01 AM, Stefan Behnel wrote: > Greg Ewing, 23.04.2013 01:16: >> Only if subclassing of Slot and Context are forbidden. > Right. Subtypes of a non-GC type can happily add attributes and start > supporting cyclic garbage collection, including Python subclasses. So this > only applies to "final" types and builtins. Not entirely uncommon, > especially in the "keep this private thing alive until all referrers have > died" use case, but I's say this restriction drops the priority a bit. > Does Cython have an equivalent of the "final class DoNoExtend { ... }" of the Java world? In any case, Greg has a good point. I seriously did not think about it just because I did not plan to derive from those classes. Greetings, Torsten -- DYNAmore Gesellschaft fuer Ingenieurdienstleistungen mbH Torsten Landschoff Office Dresden Tel: +49-(0)351-4519587 Fax: +49-(0)351-4519561 mailto:torsten.landsch...@dynamore.de http://www.dynamore.de DYNAmore Gesellschaft für FEM Ingenieurdienstleistungen mbH Registration court: Stuttgart, HRB 733694 Managing director: Prof. Dr. Karl Schweizerhof, Dipl.-Math. Ulrich Franz ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
[Cython] Unit testing Cython extensions
p.linux-x86_64-2.7/foo.o -L/opt/dynasdk/loco2-precise/lib -lpython2.7 -o /home/torsten/foo/foo.so torsten@sharokan:~/foo$ PYTHONPATH=`pwd` py.test test session starts platform linux2 -- Python 2.7.3 -- pytest-2.3.4 plugins: cov, capturelog collected 1 items tests/test_foo.py F = FAILURES == _ test_foo __ def test_foo(): pprint.pprint(foo.__file__) > assert foo.it_works() E assert () E+ where = foo.it_works tests/test_foo.py:5: AssertionError -- Captured stdout -- '/opt/dynasdk/loco2-precise/lib/python2.7/site-packages/foo-0.0-py2.7-linux-x86_64.egg/foo.so' = 1 failed in 0.02 seconds == Unfortunately, it doesn't. The unit tests actually uses the installed version of the library, which we only want to replace after our tests pass. Let's try it another way: By creating a virtualenv just for our tests, we should be fine: torsten@sharokan:~/foo$ virtualenv --system-site-packages fooenv New python executable in fooenv/bin/python Please make sure you remove any previous custom paths from your /home/torsten/.pydistutils.cfg file. Installing setuptoolsdone. Installing pip...done. torsten@sharokan:~/foo$ . fooenv/bin/activate (fooenv)torsten@sharokan:~/foo$ pip install --upgrade . Unpacking /home/torsten/foo Running setup.py egg_info for package from file:///home/torsten/foo Downloading/unpacking Cython from http://pypi.python.org/packages/source/C/Cython/Cython-0.19.tar.gz#md5=76989337dee4cf7afdcb5cde514423f8 (from foo==0.0) Downloading Cython-0.19.tar.gz (1.4MB): 270kB downloaded ... Good start, but I don't want to recreate the whole thing inside the virtualenv fooenv. This would pull Cython, numpy, scipy, paramiko and more. Any hint how to ensure that we are testing the local version of that extension instead of the installed one? Thanks, Torsten -- DYNAmore Gesellschaft fuer Ingenieurdienstleistungen mbH Torsten Landschoff Office Dresden Tel: +49-(0)351-4519587 Fax: +49-(0)351-4519561 mailto:torsten.landsch...@dynamore.de http://www.dynamore.de DYNAmore Gesellschaft für FEM Ingenieurdienstleistungen mbH Registration court: Stuttgart, HRB 733694 Managing director: Prof. Dr. Karl Schweizerhof, Dipl.-Math. Ulrich Franz ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Surprising behaviour wrt. generated tp_clear and tp_dealloc functions
Hi Stefan, sorry for the delay, I was in the US for parental leave. I had hoped to find some time to work on the promised patch over there, but I fell short. On 04/22/2013 02:28 PM, Stefan Behnel wrote: > Please do. Just ask back on this list if there's anything that's not > clear to you. I attached my current (trivial) patch. Currently I only support a decorator @cython.noclear cdef class ... to inhibit generation of tp_clear. Before I continue with this approach I am wondering about the API. Is noclear usable as a name? I think not because nobody will know what it is talking about. But I do not know how to press the information "do not generate the tp_clear slot which will clear references to break reference cycles during GC" into a short name. Perhaps something along the lines of "@cython.gc_keep_references" or "@cython.exempt_from_gc_cycle_breaker"!? How should the decorator to completely disable GC for a class be called? @cython.nogc? @cython.refcounted (because the class will only support cleanup via reference counting)? Any input appreciated. Greetings, Torsten -- DYNAmore Gesellschaft fuer Ingenieurdienstleistungen mbH Torsten Landschoff Office Dresden Tel: +49-(0)351-4519587 Fax: +49-(0)351-4519561 mailto:torsten.landsch...@dynamore.de http://www.dynamore.de DYNAmore Gesellschaft für FEM Ingenieurdienstleistungen mbH Registration court: Stuttgart, HRB 733694 Managing director: Prof. Dr. Karl Schweizerhof, Dipl.-Math. Ulrich Franz >From 48cc65e658ac8f831661c5a880cf43154c919011 Mon Sep 17 00:00:00 2001 From: Torsten Landschoff Date: Wed, 10 Jul 2013 23:58:08 +0200 Subject: [PATCH] Inhibit tp_clear generation via class decorator. --- Cython/Compiler/ModuleNode.py |3 +- Cython/Compiler/Options.py|3 ++ Cython/Compiler/TypeSlots.py |9 - tests/run/noclear.h | 64 tests/run/noclear.pyx | 82 + 5 files changed, 159 insertions(+), 2 deletions(-) create mode 100644 tests/run/noclear.h create mode 100644 tests/run/noclear.pyx diff --git a/Cython/Compiler/ModuleNode.py b/Cython/Compiler/ModuleNode.py index ce6c304..791fc8e 100644 --- a/Cython/Compiler/ModuleNode.py +++ b/Cython/Compiler/ModuleNode.py @@ -1009,7 +1009,8 @@ class ModuleNode(Nodes.Node, Nodes.BlockNode): self.generate_dealloc_function(scope, code) if scope.needs_gc(): self.generate_traverse_function(scope, code, entry) -self.generate_clear_function(scope, code, entry) +if not scope.directives.get('noclear', False): +self.generate_clear_function(scope, code, entry) if scope.defines_any(["__getitem__"]): self.generate_getitem_int_function(scope, code) if scope.defines_any(["__setitem__", "__delitem__"]): diff --git a/Cython/Compiler/Options.py b/Cython/Compiler/Options.py index 795ae86..f09e3e3 100644 --- a/Cython/Compiler/Options.py +++ b/Cython/Compiler/Options.py @@ -96,6 +96,7 @@ directive_defaults = { 'final' : False, 'internal' : False, 'profile': False, +'noclear': False, 'linetrace': False, 'infer_types': None, 'infer_types.verbose': False, @@ -201,6 +202,7 @@ directive_types = { 'ccall' : None, 'cclass' : None, 'returns' : type, +'noclear': bool, 'set_initial_path': str, 'freelist': int, 'c_string_type': one_of('bytes', 'str', 'unicode'), @@ -214,6 +216,7 @@ for key, val in directive_defaults.items(): directive_scopes = { # defaults to available everywhere # 'module', 'function', 'class', 'with statement' 'final' : ('cclass', 'function'), +'noclear' : ('cclass',), 'internal' : ('cclass',), 'autotestdict' : ('module',), 'autotestdict.all' : ('module',), diff --git a/Cython/Compiler/TypeSlots.py b/Cython/Compiler/TypeSlots.py index 0413e4a..7bc9fb8 100644 --- a/Cython/Compiler/TypeSlots.py +++ b/Cython/Compiler/TypeSlots.py @@ -331,6 +331,13 @@ class GCDependentSlot(InternalMethodSlot): return InternalMethodSlot.slot_code(self, scope) +class GCClearReferencesSlot(GCDependentSlot): +def slot_code(self, scope): +if scope.directives.get('noclear', False): +return "0" +return GCDependentSlot.slot_code(self, scope) + + class ConstructorSlot(InternalMethodSlot): # Descriptor for tp_new and tp_dealloc. @@ -753,7 +760,7 @@
Re: [Cython] Surprising behaviour wrt. generated tp_clear and tp_dealloc functions
Hi Stefan, On 07/14/2013 10:39 AM, Stefan Behnel wrote: > Torsten Landschoff, 11.07.2013 00:10: >> I attached my current (trivial) patch. Currently I only support a decorator >> >> @cython.noclear >> cdef class ... >> >> to inhibit generation of tp_clear. > Thanks, looks ok to me. Please open a pull request on github for it. I'd like to remove some code duplication I introduced in my changes and add the feature to exclude specific attributes from clearing. >> Before I continue with this approach I am wondering about the API. Is >> noclear usable as a name? I think not because nobody will know what it >> is talking about. > I think that name is as good as any. It could be "no_gc_clear", if you find > that clearer. It hints at GC, at least. I think that should be part of the > name somehow. Agreed. So there will be the following decorator: @cython.no_gc_clear class ... to disable generation of the clear method and @cython.no_gc class ... to disable GC entirely. Okay for you? I'd also like to support @cython.no_gc_clear("keepthis", "keepthat") to omit specific attributes from clear handling. I am not sure though how to declare that decorator in directive_types. Would something like this work: directive_types = { 'final' : bool, # final cdef classes and methods # ... 'no_gc_clear': one_of('bool', 'list'), # ... } ?? How would I tell the parser that it should accept the strings as *args of the decorator (if you know what I mean)? > Please also document it somewhere in the extension types doc section (in > the user guide): > > https://github.com/cython/cython/blob/master/docs/src/userguide/extension_types.rst Thanks for the pointer, I was wondering about the right documentation set to change. Greetings, Torsten -- DYNAmore Gesellschaft fuer Ingenieurdienstleistungen mbH Torsten Landschoff Office Dresden Tel: +49-(0)351-4519587 Fax: +49-(0)351-4519561 mailto:torsten.landsch...@dynamore.de http://www.dynamore.de DYNAmore Gesellschaft für FEM Ingenieurdienstleistungen mbH Registration court: Stuttgart, HRB 733694 Managing director: Prof. Dr. Karl Schweizerhof, Dipl.-Math. Ulrich Franz ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Surprising behaviour wrt. generated tp_clear and tp_dealloc functions
Hi again, On 07/14/2013 10:39 AM, Stefan Behnel wrote: > Torsten Landschoff, 11.07.2013 00:10: >> I attached my current (trivial) patch. Currently I only support a decorator >> >> @cython.noclear >> cdef class ... >> >> to inhibit generation of tp_clear. > Thanks, looks ok to me. Please open a pull request on github for it. I reworked it a bit, pull request is here: https://github.com/cython/cython/pull/248 Sorry it took so long, real life takes it's toll ;-) > I think that name is as good as any. It could be "no_gc_clear", if you > find that clearer. It hints at GC, at least. That's what I did. > I think that should be part of the name somehow. Please also document > it somewhere in the extension types doc section (in the user guide): > https://github.com/cython/cython/blob/master/docs/src/userguide/extension_types.rst Done. Please advise if you think it needs improvement - I have a hard time to describe complex facts in concise form. >> How should the decorator to completely disable GC for a class be called? >> @cython.nogc? > +1 I did not yet implement this. Greetings, Torsten -- DYNAmore Gesellschaft fuer Ingenieurdienstleistungen mbH Torsten Landschoff Office Dresden Tel: +49-(0)351-312002-10 Fax: +49-(0)351-312002-29 mailto:torsten.landsch...@dynamore.de http://www.dynamore.de DYNAmore Gesellschaft für FEM Ingenieurdienstleistungen mbH Registration court: Stuttgart, HRB 733694 Managing director: Prof. Dr. Karl Schweizerhof, Dipl.-Math. Ulrich Franz ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] PEP 442: safe object finalisation
Hi Stefan, On 08/03/2013 06:24 PM, Stefan Behnel wrote: > CPython 3.4 will have a new way to handle object finalisation. > > http://www.python.org/dev/peps/pep-0442/ > > I think we should switch the current deallocation code to make use of it. > That should fix the problem we currently have with user provided > __dealloc__() code that touches Python objects. It would be Py3.4+ > specific, though. I read the PEP and was wondering about the performance of this part: 1. *The CI is traversed again to determine if it is still isolated. If it is determined that at least one object in CI is now reachable from outside the CI, this collection is aborted and the whole CI is resurrected. Otherwise, proceed.* AFAICT it will not suffice to traverse the CI here. Instead, all active objects must be traversed which seems to be quite expensive. Does this mean that you will not pull my gc_no_clear changes from https://github.com/cython/cython/pull/248 in favor of this better fix? Or should I update it so that gc_no_clear only has an affect on Python < 3.4? Greetings, Torsten -- DYNAmore Gesellschaft fuer Ingenieurdienstleistungen mbH Torsten Landschoff Office Dresden Tel: +49-(0)351-312002-10 Fax: +49-(0)351-312002-29 mailto:torsten.landsch...@dynamore.de http://www.dynamore.de DYNAmore Gesellschaft für FEM Ingenieurdienstleistungen mbH Registration court: Stuttgart, HRB 733694 Managing director: Prof. Dr. Karl Schweizerhof, Dipl.-Math. Ulrich Franz ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel