On 10/25/2011 09:33 AM, Stefan Behnel wrote:
mark florisson, 24.10.2011 21:50:
This is in response to
http://groups.google.com/group/cython-users/browse_thread/thread/bcbc5fe0e329224f

and http://trac.cython.org/cython_trac/ticket/498 , and some of the
previous discussion on cython.parallel.

Basically I think we should have something more powerful than 'cdef
borrowed CdefClass obj', something that also doesn't rely on new
syntax.

We will still need borrowed reference support in the compiler
eventually, whether we make it a language feature or not.


What if we support acquisition counting for every instance of a cdef
class? In Python and Cython GIL mode you use reference counting, and
in Cython nogil mode and for struct attributes, array dtypes etc you
use acquisition counting. This allows you to pass around cdef objects
without the GIL and use their nogil methods. If the acquisition count
is greater than 0, the acquisition count owns a reference to the
object. If it reaches 0 you discard your owned reference (you can
simply acquire the GIL if you don't have it) and when you increment
from zero you obtain it. Perhaps something like libatomic could be
used to efficiently implement this.

Where would you store that count? In the object struct? That would
increase the size of each instance.
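
A minimal C sketch of the counting scheme described above, assuming the
count is stored in the object struct and updated with C11 <stdatomic.h>
operations (libatomic would be an alternative). The names Obj,
obj_acquire and obj_release are hypothetical, and refcnt is a plain int
standing in for the Python reference count; real code would need the GIL
around the reference count change and would have to handle races between
concurrent 0 <-> 1 transitions.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical object layout: one extra word per instance for the count. */
typedef struct {
    int refcnt;            /* stand-in for ob_refcnt (GIL-protected in reality) */
    atomic_int acq_count;  /* acquisition count, usable without the GIL */
} Obj;

/* Increment the acquisition count; on the 0 -> 1 transition the
 * acquisition count obtains its single owned reference. */
static void obj_acquire(Obj *o)
{
    if (atomic_fetch_add(&o->acq_count, 1) == 0) {
        o->refcnt++;  /* real code: acquire the GIL if needed, then incref */
    }
}

/* Decrement the acquisition count; on the 1 -> 0 transition the owned
 * reference is discarded. */
static void obj_release(Obj *o)
{
    int prev = atomic_fetch_sub(&o->acq_count, 1);
    assert(prev > 0);  /* releasing more often than acquiring is a bug */
    if (prev == 1) {
        o->refcnt--;  /* real code: acquire the GIL if needed, then decref */
    }
}
```

With this layout, any number of nogil acquisitions costs only one real
reference, at the price of one extra integer per instance.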


The advantages are:

1) allow users to pass around cdef typed objects in nogil mode
2) allow cdef typed objects as struct attributes or array elements
3) make it easy to implement things like memoryviews (already done but
would have been a lot easier), cython.parallel.async/future objects,
cython.parallel.mutex objects and possibly other things in the future

Would it really be easier? You can already call cdef methods in nogil
mode, AFAIR.


We should then allow a syntax like

    with mycdefobject:
        ...

to lock the object in GIL or nogil mode (like Java's 'synchronized').
For objects that already have __enter__ and __exit__ you could support
something like 'with cython.synchronized(mycdefobject): ...' instead.
Or perhaps you should always require cython.synchronized (or
cython.parallel.synchronized).

The latter, I sure hope.
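
As a rough illustration of what a 'with cython.synchronized(obj)' block
might lower to in C, here is a sketch that assumes a per-object lock
built from a C11 atomic_flag spinlock. SyncObj, sync_lock, sync_try_lock,
sync_unlock and synchronized_add are hypothetical names; the thread does
not say which lock type would actually be used.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object carrying its own lock, usable with or without the GIL. */
typedef struct {
    atomic_flag lock;  /* clear = unlocked, set = locked */
    long value;        /* some state the lock protects */
} SyncObj;

/* Try once to take the lock; returns 1 on success, 0 if already held. */
static int sync_try_lock(SyncObj *o)
{
    return !atomic_flag_test_and_set_explicit(&o->lock, memory_order_acquire);
}

/* Spin until the lock is taken (what entering the block would do). */
static void sync_lock(SyncObj *o)
{
    while (!sync_try_lock(o)) {
        /* busy-wait; a real implementation would likely block instead */
    }
}

/* Release the lock (what leaving the block would do). */
static void sync_unlock(SyncObj *o)
{
    atomic_flag_clear_explicit(&o->lock, memory_order_release);
}

/* Body of a synchronized block: mutate the object under its lock. */
static void synchronized_add(SyncObj *o, long delta)
{
    assert(o != NULL);
    sync_lock(o);
    o->value += delta;
    sync_unlock(o);
}
```

An object would be initialized as 'SyncObj o = { ATOMIC_FLAG_INIT, 0 };'
before any thread calls synchronized_add on it.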


In addition to nogil methods a user may provide special cdef nogil
methods, i.e.

    cdef int __len__(self) nogil:
        ...

which would provide a Cython as well as a Python implementation for
the function (with automatic cpdef behaviour), so you could use it in
both contexts.

That can already be done for final types, simply by adding cpdef
behaviour to all special methods. That would also fix ticket #3, for
example.

Note that the DefNode refactoring is still pending, it would help here.


There are two options for assignment semantics to a struct attribute
or array element:
- decref the old value (this implies always initializing the
pointers to NULL first)
- don't decref the old value (the user has to manually use 'del')

I think 1) is definitely more consistent with how everything else works.

Yes.


All of this functionality should also get a sane C API (to be provided
by cython.h). You'd get a Cy_INCREF(obj, have_gil)/Cy_DECREF() etc.
Every class using this functionality is a subclass of CythonObject
(that contains a PyObject + an acquisition count + a lock). Perhaps if
the user is subclassing something other than object we could allow the
user to specify custom __cython_(un)lock__ and
__cython_acquisition_count__ methods and fields.
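
One possible shape for that C API, as a sketch only: Cy_INCREF(obj,
have_gil) is named in the proposal, but the Cy_DECREF signature, the
struct field names, FakePyObject and the sketch_* helpers below are
assumptions made so the example compiles without Python.h. With the GIL
held the reference count is touched directly; without it, only the
acquisition count changes, and the GIL is taken just on the 0 <-> 1
transitions.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for PyObject so the sketch does not depend on Python.h. */
typedef struct { long ob_refcnt; } FakePyObject;

/* Assumed layout: a PyObject + an acquisition count + a lock. */
typedef struct {
    FakePyObject base;
    atomic_long acquisition_count;
    atomic_flag lock;  /* per-object lock; not exercised in this sketch */
} CythonObject;

/* Counts how often the sketch would have had to take the GIL. */
static int sketch_gil_acquisitions = 0;

/* Hypothetical helper: take the GIL, adjust the refcount, release it. */
static void sketch_refcount_with_gil(CythonObject *o, long delta)
{
    sketch_gil_acquisitions++;  /* real code: acquire and release the GIL here */
    o->base.ob_refcnt += delta;
}

static void Cy_INCREF(CythonObject *o, int have_gil)
{
    if (have_gil) {
        o->base.ob_refcnt++;  /* ordinary reference counting */
    } else if (atomic_fetch_add(&o->acquisition_count, 1) == 0) {
        sketch_refcount_with_gil(o, +1);  /* first acquisition owns a reference */
    }
}

static void Cy_DECREF(CythonObject *o, int have_gil)
{
    if (have_gil) {
        o->base.ob_refcnt--;
    } else if (atomic_fetch_sub(&o->acquisition_count, 1) == 1) {
        sketch_refcount_with_gil(o, -1);  /* last release drops the reference */
    }
}
```

The sketch omits deallocation when the reference count reaches zero.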

Now, building on top of this functionality, Cython could provide
built-in nogil-compatible types, like lists, dicts and maybe tuples
(as a start). These will by default not lock for operations to allow
e.g. one thread to iterate over the list and another thread to index
it without lock contention and other general overhead. If one thread
is somehow changing the size of the list, or writing to indices that
another thread is reading from/writing to, the results will of course
be undefined unless the user synchronizes on the object. So it would
be the user's responsibility. The acquisition counting itself will
always be thread-safe (i.e., it will be atomic if possible, otherwise
it will lock).

It's probably best to not enable this functionality by default as it
would be more expensive to instantiate objects, but it could be
supported through a cdef class decorator and a general directive.

It's well known that this would be expensive. One of the approaches that
tried to get rid of the GIL in CPython introduced fine grained locking,
and it turned out to be substantially slower, AFAIR by a factor of two.

I'd gladly take a factor two (or even four) slowdown of CPython code any
day to get rid of the GIL :-). The thing is, sometimes one has 48 cores
and considers a 10x speedup better than nothing...

Dag Sverre
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
