[Cython] Hooking tp_clear()

2018-09-06 Thread Jeroen Demeyer

Hello,

Cython's __dealloc__ special method is meant to deal with cleaning up 
instances of cdef classes. However, this hooks tp_dealloc() and does not 
have meaningful access to Python attributes, since those might have been 
cleared by tp_clear().


I have a concrete use case where I want something like __dealloc__ but 
*before* Python attributes are cleared. So this really belongs in 
tp_clear().


Using a PyObject* attribute in the cdef class with manual reference 
counting is not a solution since this attribute could genuinely occur in 
a reference cycle.


So I would suggest supporting a __clear__ special method, which would 
then be called both by tp_clear() and tp_dealloc(). It's important to 
note that this should be idempotent: it will be called at least once 
before Python attributes are cleared but it may also be called later.
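
To make the idempotency requirement concrete, here is a minimal plain-Python sketch. It is not Cython API: the class Handle, the method name clear_hook and the released flag are hypothetical names chosen for illustration, and setting the attribute to None only simulates what tp_clear() would do to a Python attribute.

```python
class Handle:
    def __init__(self, registry):
        self.registry = registry   # Python attribute that tp_clear() would reset
        self.released = False      # would be a plain C-level flag in a cdef class

    def clear_hook(self):
        # Hypothetical stand-in for the proposed __clear__ method.
        if self.released:
            return                 # later calls must be harmless no-ops
        self.released = True
        self.registry.append("released")


registry = []
h = Handle(registry)
h.clear_hook()       # first call, as from tp_clear(): attributes still valid
h.registry = None    # simulate tp_clear() having cleared the attribute
h.clear_hook()       # second call, as from tp_dealloc(): does nothing
print(registry)
```

The guard flag is what makes the second call safe even though the Python attribute is gone by then.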


PS: I never really understood the technical difference between 
tp_clear() and tp_dealloc(). It seems to me that these serve a very 
similar purpose: why can't the garbage collector just call tp_dealloc()?



Jeroen.
___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Hooking tp_clear()

2018-09-06 Thread Stefan Behnel
Jeroen Demeyer wrote on 06.09.2018 at 22:54:
> Cython's __dealloc__ special method is meant to deal with cleaning up
> instances of cdef classes. However, this hooks tp_dealloc() and does not
> have meaningful access to Python attributes, since those might have been
> cleared by tp_clear().
> 
> I have a concrete use case where I want something like __dealloc__ but
> *before* Python attributes are cleared. So this really belongs in tp_clear().
> 
> Using a PyObject* attribute in the cdef class with manual reference
> counting is not a solution since this attribute could genuinely occur in a
> reference cycle.
> 
> So I would suggest supporting a __clear__ special method, which would then
> be called both by tp_clear() and tp_dealloc(). It's important to note that
> this should be idempotent: it will be called at least once before Python
> attributes are cleared but it may also be called later.

Maybe you actually want "tp_finalize"?

https://www.python.org/dev/peps/pep-0442/

Cython moves "__del__" methods there in Py3.4+.
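
As a rough illustration of what PEP 442 gives you, the following sketch uses a plain Python class (not a cdef class) on CPython 3.4 or later. The class Resource and its attribute names are made up for the example. It records what each finalizer can still see when the object is collected as part of a cycle.

```python
import gc

seen = []


class Resource:
    def __init__(self, label):
        self.label = label
        self.partner = None

    def __del__(self):
        # Finalizers run before the collector clears references,
        # so the partner attribute is still intact here.
        partner_label = self.partner.label if self.partner is not None else None
        seen.append((self.label, partner_label))


def build_cycle():
    x = Resource("x")
    y = Resource("y")
    x.partner = y
    y.partner = x


gc.collect()     # start from a clean state
build_cycle()    # the cycle is unreachable once the function returns
gc.collect()     # finalizers run, then the cycle is broken
print(sorted(seen))
```

Both finalizers observe their partner, which is the "before Python attributes are cleared" behaviour asked for above.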


> PS: I never really understood the technical difference between tp_clear()
> and tp_dealloc(). It seems to me that these serve a very similar purpose:
> why can't the garbage collector just call tp_dealloc()?

The problem is reference cycles, in which there is definitely a live
reference to the object *somewhere* else. Thus, the GC cannot simply
deallocate the object; it must try to delete the references instead. This
is what "tp_clear" is used for: it clears all references that an object
inside a reference cycle holds to other objects (or at least those
that can participate in that cycle). This will (hopefully) trigger a
cascade of deallocations along the cycle. If that isn't enough, and there
is still a cycle, then the clearing needs to be repeated until all
references to the last object in the cycle are cleared.

AFAIR, tp_clear() is *only* called by the cyclic garbage collector and not
during normal refcounting deallocation. The GC process is: tp_traverse() to
detect cycles, tp_clear() to break them. tp_dealloc() is then only called
indirectly by the normal refcounting cleanup, not directly by the GC.
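
The effect can be observed from pure Python. This is only a sketch on CPython with a made-up Node class: with the collector disabled, a two-object cycle survives even though nothing outside refers to it, and it goes away once the cyclic collector runs.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.other = None


def make_cycle():
    a = Node("a")
    b = Node("b")
    a.other = b
    b.other = a
    # Only weak references escape, so the cycle is unreachable afterwards.
    return weakref.ref(a), weakref.ref(b)


gc.disable()     # keep automatic collections out of the picture
gc.collect()     # start from a clean state
ref_a, ref_b = make_cycle()

# Refcounting alone cannot free the pair: each keeps the other alive.
alive_before = ref_a() is not None and ref_b() is not None

unreachable = gc.collect()   # traverse, find the cycle, clear it

alive_after = ref_a() is not None or ref_b() is not None
gc.enable()
print(alive_before, alive_after)
```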

Stefan


Re: [Cython] Hooking tp_clear()

2018-09-06 Thread Greg Ewing

Jeroen Demeyer wrote:
> I have a concrete use case where I want something like __dealloc__ but
> *before* Python attributes are cleared. So this really belongs in tp_clear().

Are you sure you can't do it in __del__?  From what I gather,
the presence of __del__ no longer prevents cyclic garbage
collection.
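
A quick way to check this on a given interpreter, sketched here for CPython 3.4 or later with a made-up class WithDel: build a cycle of objects that define __del__, run the collector, and see whether anything is left stranded in gc.garbage.

```python
import gc
import weakref


class WithDel:
    def __del__(self):
        pass  # the mere presence of a finalizer is what matters here


def make_cycle():
    a = WithDel()
    b = WithDel()
    a.peer = b
    b.peer = a
    return weakref.ref(a)


gc.collect()                        # start from a clean state
garbage_before = len(gc.garbage)
ref = make_cycle()
gc.collect()

collected = ref() is None           # the object really went away
left_in_garbage = len(gc.garbage) - garbage_before
print(collected, left_in_garbage)
```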

> I never really understood the technical difference between
> tp_clear() and tp_dealloc(). It seems to me that these serve a very
> similar purpose: why can't the garbage collector just call tp_dealloc()?


tp_dealloc is the inverse of tp_alloc -- its purpose is to
free the memory occupied by the object. This must not be done
until there are no more references to the object.

tp_clear is used to break reference cycles. After calling it,
there may still be references to the object from other objects
in the cycle, so tp_dealloc can't be done at that point.
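
The distinction can be imitated by hand in Python. In this sketch the Node class and its clear() method are hypothetical stand-ins for an object and its tp_clear slot. The collector is disabled so that only reference counting does the freeing.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None

    def clear(self):
        # Stand-in for tp_clear(): drop the references this object holds.
        self.other = None


gc.disable()     # only reference counting is at work below
x = Node()
y = Node()
x.other = y
y.other = x
ref_x = weakref.ref(x)
ref_y = weakref.ref(y)

del y            # y now survives only through x.other
x.clear()        # breaks the cycle; y's refcount drops to zero

y_freed = ref_y() is None        # y was deallocated by refcounting
x_alive = ref_x() is not None    # x is still referenced, so it must stay

del x            # last reference gone, x is deallocated as well
x_freed = ref_x() is None
gc.enable()
print(y_freed, x_alive, x_freed)
```

Clearing x does not free x itself. It only releases what x points to, and deallocation follows separately once each refcount reaches zero.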

--
Greg