[Cython] Buffer interface to boolean arrays with cast=True on Python 2.5 failing

2011-10-24 Thread Wes McKinney
I've been using

ndarray[uint8_t, cast=True] bool_arr

to work with dtype=bool arrays in Cython lately. When testing using
Python 2.5 / NumPy 1.6.1 on Windows, I'm getting "unknown dtype code
in numpy.pxd (0)". Everything works fine with Python 2.6/2.7 and NumPy
1.6.1. This is with Cython 0.15.1.

Any advice or do I have to (very unhappily) work around this?

thanks,
Wes
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Buffer interface to boolean arrays with cast=True on Python 2.5 failing

2011-10-24 Thread Dag Sverre Seljebotn

On 10/24/2011 09:26 PM, Wes McKinney wrote:

I've been using

ndarray[uint8_t, cast=True] bool_arr

to work with dtype=bool arrays in Cython lately. When testing using
Python 2.5 / NumPy 1.6.1 on Windows, I'm getting "unknown dtype code
in numpy.pxd (0)". Everything works fine with Python 2.6/2.7 and NumPy
1.6.1. This is with Cython 0.15.1.

Any advice or do I have to (very unhappily) work around this?


Is this a recent bug in Cython? Try to bisect the Cython release
(and, if it turns out to be Cython, possibly the commit).


Dag Sverre


Re: [Cython] Buffer interface to boolean arrays with cast=True on Python 2.5 failing

2011-10-24 Thread Wes McKinney
On Mon, Oct 24, 2011 at 3:37 PM, Dag Sverre Seljebotn
 wrote:
> On 10/24/2011 09:26 PM, Wes McKinney wrote:
>>
>> I've been using
>>
>> ndarray[uint8_t, cast=True] bool_arr
>>
>> to work with dtype=bool arrays in Cython lately. When testing using
>> Python 2.5 / NumPy 1.6.1 on Windows, I'm getting "unknown dtype code
>> in numpy.pxd (0)". Everything works fine with Python 2.6/2.7 and NumPy
>> 1.6.1. This is with Cython 0.15.1.
>>
>> Any advice or do I have to (very unhappily) work around this?
>
> Is this a recent bug in Cython? Try to bisect the Cython release (and, if
> it turns out to be Cython, possibly the commit).
>
> Dag Sverre

I'll check the HEAD revision and bisect if I can, don't have a lot of
time-- it's just strange that it's Python 2.5 only.


Re: [Cython] Buffer interface to boolean arrays with cast=True on Python 2.5 failing

2011-10-24 Thread Wes McKinney
On Mon, Oct 24, 2011 at 2:40 PM, Wes McKinney  wrote:
> On Mon, Oct 24, 2011 at 3:37 PM, Dag Sverre Seljebotn
>  wrote:
>> On 10/24/2011 09:26 PM, Wes McKinney wrote:
>>>
>>> I've been using
>>>
>>> ndarray[uint8_t, cast=True] bool_arr
>>>
>>> to work with dtype=bool arrays in Cython lately. When testing using
>>> Python 2.5 / NumPy 1.6.1 on Windows, I'm getting "unknown dtype code
>>> in numpy.pxd (0)". Everything works fine with Python 2.6/2.7 and NumPy
>>> 1.6.1. This is with Cython 0.15.1.
>>>
>>> Any advice or do I have to (very unhappily) work around this?
>>
>> Is this a recent bug in Cython? Try to bisect the Cython release (and, if
>> it turns out to be Cython, possibly the commit).
>>
>> Dag Sverre
>
> I'll check the HEAD revision and bisect if I can, don't have a lot of
> time-- it's just strange that it's Python 2.5 only.
>

For some reason I can't build Cython (0.15.1 or git HEAD) with mingw32:

C:\cython>python setup.py install
running install
running build
running build_py
running build_ext
building 'Cython.Compiler.Scanning' extension
C:\MinGW\bin\gcc.exe -mno-cygwin -mdll -O -Wall -IC:\Python25\include -IC:\Pytho
n25\PC -c Cython\Compiler\Scanning.c -o build\temp.win32-2.5\Release\cython\comp
iler\scanning.o
Cython\Compiler\Scanning.c:13340: error: initializer element is not constant
Cython\Compiler\Scanning.c:13340: error: (near initialization for `__pyx_CyFunct
ionType_type.tp_call')
error: command 'gcc' failed with exit status 1

C:\cython>

I've half a mind to drop Python 2.5 support in pandas over this...


[Cython] Acquisition counted cdef classes

2011-10-24 Thread mark florisson
Hey,

This is in response to
http://groups.google.com/group/cython-users/browse_thread/thread/bcbc5fe0e329224f
and http://trac.cython.org/cython_trac/ticket/498 , and some of the
previous discussion on cython.parallel.

Basically I think we should have something more powerful than 'cdef
borrowed CdefClass obj', something that also doesn't rely on new
syntax.

What if we support acquisition counting for every instance of a cdef
class? In Python and Cython GIL mode you use reference counting, and
in Cython nogil mode and for struct attributes, array dtypes, etc. you
use acquisition counting. This allows you to pass around cdef objects
without the GIL and use their nogil methods. If the acquisition count
is at least 1, the acquisition count owns a reference to the
object. If it reaches 0 you discard your owned reference (you can
simply acquire the GIL if you don't have it) and when you increment
from zero you obtain it. Perhaps something like libatomic could be
used to efficiently implement this.
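
For illustration, here is a minimal pure-Python sketch of the counting rule described above. It is not Cython's implementation: the class name, the `holds_reference` flag and the use of a `threading.Lock` in place of atomic operations are all assumptions made for the example. The only thing it models is that the 0 -> 1 transition obtains the single owned reference and the 1 -> 0 transition discards it.

```python
import threading


class AcquisitionCounted:
    """Toy model of an acquisition-counted object (all names hypothetical)."""

    def __init__(self, payload):
        self.payload = payload
        self._lock = threading.Lock()   # stands in for an atomic counter
        self._acquisitions = 0
        self.holds_reference = False    # models the one owned Python reference

    def acquire(self):
        with self._lock:
            if self._acquisitions == 0:
                # 0 -> 1: a real scheme would take the GIL if needed and
                # increment the Python reference count once.
                self.holds_reference = True
            self._acquisitions += 1

    def release(self):
        with self._lock:
            if self._acquisitions == 0:
                raise RuntimeError("release() without matching acquire()")
            self._acquisitions -= 1
            if self._acquisitions == 0:
                # 1 -> 0: a real scheme would take the GIL and drop the
                # owned Python reference.
                self.holds_reference = False


obj = AcquisitionCounted("data")


def worker():
    # Each thread acquires and releases many times without coordination.
    for _ in range(1000):
        obj.acquire()
        obj.release()


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After all threads have finished, the count is back at zero and the modelled reference has been given up.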

The advantages are:

1) allow users to pass around cdef typed objects in nogil mode
2) allow cdef typed objects as struct attributes or array elements
3) make it easy to implement things like memoryviews (already done but
would have been a lot easier), cython.parallel.async/future objects,
cython.parallel.mutex objects and possibly other things in the future

We should then allow a syntax like

with mycdefobject:
    ...

to lock the object in GIL or nogil mode (like java's 'synchronized').
For objects that already have __enter__ and __exit__ you could support
something like 'with cython.synchronized(mycdefobject): ...' instead.
Or perhaps you should always require cython.synchronized (or
cython.parallel.synchronized).
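
As a rough Python analogue of the proposed 'with cython.synchronized(obj)' form: the sketch below assumes a hypothetical base class that gives every instance its own re-entrant lock, and a hypothetical `synchronized` helper. Neither exists in Cython; they only show the intended usage pattern.

```python
import threading
from contextlib import contextmanager


class Lockable:
    """Hypothetical base class: every instance carries its own lock."""

    def __init__(self):
        # Re-entrant, so nested synchronized blocks on the same object
        # in one thread do not deadlock.
        self._cy_lock = threading.RLock()


@contextmanager
def synchronized(obj):
    """Hold obj's lock for the duration of the with-block."""
    with obj._cy_lock:
        yield obj


class Counter(Lockable):
    def __init__(self):
        super().__init__()
        self.value = 0


counter = Counter()


def bump(n):
    for _ in range(n):
        with synchronized(counter):
            counter.value += 1   # read-modify-write done under the lock


threads = [threading.Thread(target=bump, args=(5000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Using a separate helper rather than the object's own __enter__/__exit__ keeps the locking form available for classes that already define a context manager.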

In addition to nogil methods a user may provide special cdef nogil methods, i.e.

cdef int __len__(self) nogil:
    ...

which would provide a Cython as well as a Python implementation for
the function (with automatic cpdef behaviour), so you could use it in
both contexts.

There are two options for assignment semantics to a struct attribute
or array element:
- decref the old value (this implies always initializing the
pointers to NULL first)
- don't decref the old value (the user has to manually use 'del')

I think the first option (decref the old value) is definitely more
consistent with how everything else works.
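
A small Python sketch of the first option, assuming a hypothetical `Slot` that models one struct attribute or array element and a bare counter in place of real acquisition counting. The new value is acquired before the old one is released, so assigning an object to a slot that already holds it never lets its count drop to zero in between. The sketch is single-threaded and makes no claim about how Cython would generate this.

```python
class Counted:
    """Minimal stand-in for an acquisition-counted object (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.count = 0


def acquire(obj):
    if obj is not None:
        obj.count += 1


def release(obj):
    if obj is not None:
        obj.count -= 1


class Slot:
    """Models one struct field or array element holding a counted object."""

    def __init__(self):
        self.value = None            # the "initialize pointers to NULL" step

    def assign(self, new):
        acquire(new)                 # take the new value first ...
        old, self.value = self.value, new
        release(old)                 # ... then drop the old one


a, b = Counted("a"), Counted("b")
slot = Slot()
slot.assign(a)
slot.assign(a)     # self-assignment keeps a's count at 1
slot.assign(b)     # a is released, b is now held
```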

All of this functionality should also get a sane C API (to be provided
by cython.h). You'd get a Cy_INCREF(obj, have_gil)/Cy_DECREF() etc.
Every class using this functionality is a subclass of CythonObject
(that contains a PyObject + an acquisition count + a lock). Perhaps if
the user is subclassing something other than object we could allow the
user to specify custom __cython_(un)lock__ and
__cython_acquisition_count__ methods and fields.

Now, building on top of this functionality, Cython could provide
built-in nogil-compatible types, like lists, dicts and maybe tuples
(as a start). These will by default not lock for operations to allow
e.g. one thread to iterate over the list and another thread to index
it without lock contention and other general overhead. If one thread
is somehow changing the size of the list, or writing to indices that
another thread is reading from/writing to, the results will of course
be undefined unless the user synchronizes on the object. So it would
be the user's responsibility. The acquisition counting itself will
always be thread-safe (i.e., it will be atomic if possible, otherwise
it will lock).

It's probably best to not enable this functionality by default as it
would be more expensive to instantiate objects, but it could be
supported through a cdef class decorator and a general directive.

Of course one may still use non-cdef borrowed objects, by simply
casting to a PyObject *.

Thoughts?

Mark


Re: [Cython] Buffer interface to boolean arrays with cast=True on Python 2.5 failing

2011-10-24 Thread Dag Sverre Seljebotn

On 10/24/2011 09:40 PM, Wes McKinney wrote:

On Mon, Oct 24, 2011 at 3:37 PM, Dag Sverre Seljebotn
  wrote:

On 10/24/2011 09:26 PM, Wes McKinney wrote:


I've been using

ndarray[uint8_t, cast=True] bool_arr

to work with dtype=bool arrays in Cython lately. When testing using
Python 2.5 / NumPy 1.6.1 on Windows, I'm getting "unknown dtype code
in numpy.pxd (0)". Everything works fine with Python 2.6/2.7 and NumPy
1.6.1. This is with Cython 0.15.1.

Any advice or do I have to (very unhappily) work around this?


Is this a recent bug in Cython? Try to bisect the Cython release (and, if
it turns out to be Cython, possibly the commit).

Dag Sverre



I'll check the HEAD revision and bisect if I can, don't have a lot of
time-- it's just strange that it's Python 2.5 only.


So the difference between Python 2.5 and 2.6 is that in 2.5 the 
__getbuffer__ in numpy.pxd will be called, whereas in Python 2.6, NumPy 
is able to do the job itself. (PEP 3118)
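
To make the PEP 3118 side concrete, the following Python snippet (run on a modern Python with NumPy, not on the 2.5 setup from this thread) shows the buffer format codes that NumPy itself exports. A dtype=bool array reports '?' while a uint8 array reports 'B'; both have an item size of one byte, which, as far as I understand it, is the property that cast=True relies on. The variable names are only for illustration.

```python
import struct

import numpy as np

bool_arr = np.array([True, False, True])

# NumPy exports the PEP 3118 buffer itself; the format string is what a
# buffer consumer gets to inspect.
bool_format = memoryview(bool_arr).format
uint8_format = memoryview(bool_arr.view(np.uint8)).format

# The two formats differ, but each element occupies the same number of bytes.
same_item_size = struct.calcsize(bool_format) == struct.calcsize(uint8_format)
```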


Which means that there's likely a bug in __getbuffer__ in numpy.pxd.


If you do have time, that's the place to start inserting print 
statements etc. to debug this.


It's difficult to say more without a copy&paste directly from your terminal.

Dag Sverre


Re: [Cython] Acquisition counted cdef classes

2011-10-24 Thread Greg Ewing

mark florisson wrote:

These will by default not lock for operations to allow
e.g. one thread to iterate over the list and another thread to index
it without lock contention and other general overhead.


I don't think that's safe. You can't say "I'm not modifying
this, so I don't need to lock it" because there may be another
thread that *is* in the midst of modifying it.

--
Greg


Re: [Cython] Buffer interface to boolean arrays with cast=True on Python 2.5 failing

2011-10-24 Thread Wes McKinney
On Mon, Oct 24, 2011 at 4:09 PM, Dag Sverre Seljebotn
 wrote:
> On 10/24/2011 09:40 PM, Wes McKinney wrote:
>>
>> On Mon, Oct 24, 2011 at 3:37 PM, Dag Sverre Seljebotn
>>   wrote:
>>>
>>> On 10/24/2011 09:26 PM, Wes McKinney wrote:

>>>> I've been using
>>>>
>>>> ndarray[uint8_t, cast=True] bool_arr
>>>>
>>>> to work with dtype=bool arrays in Cython lately. When testing using
>>>> Python 2.5 / NumPy 1.6.1 on Windows, I'm getting "unknown dtype code
>>>> in numpy.pxd (0)". Everything works fine with Python 2.6/2.7 and NumPy
>>>> 1.6.1. This is with Cython 0.15.1.
>>>>
>>>> Any advice or do I have to (very unhappily) work around this?
>>>
>>> Is this a recent bug in Cython? Try to bisect the Cython release (and,
>>> if it turns out to be Cython, possibly the commit).
>>>
>>> Dag Sverre
>>
>> I'll check the HEAD revision and bisect if I can, don't have a lot of
>> time-- it's just strange that it's Python 2.5 only.
>
> So the difference between Python 2.5 and 2.6 is that in 2.5 the
> __getbuffer__ in numpy.pxd will be called, whereas in Python 2.6, NumPy is
> able to do the job itself. (PEP 3118)
>
> Which means that there's likely a bug in __getbuffer__ in numpy.pxd.
>
> If you do have time, that's the place to start inserting print statements
> etc. to debug this.
>
> It's difficult to say more without a copy&paste directly from your terminal.
>
> Dag Sverre

I need pandas to build off of a released version of Cython so I am
just going to have to work around this by taking views of
boolean arrays as np.uint8. I wouldn't mind dropping Python 2.5
support altogether but some people might not like that.
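
A minimal sketch of that workaround on the Python side, assuming a hypothetical helper name `as_uint8_view`. The view shares memory with the boolean array, so nothing is copied and writes through the view show up in the original mask; the result can then be passed to a function typed as ndarray[uint8_t] without cast=True.

```python
import numpy as np


def as_uint8_view(mask):
    """Return a zero-copy uint8 view of a dtype=bool array (hypothetical helper)."""
    if mask.dtype != np.bool_:
        raise TypeError("expected a dtype=bool array, got %s" % mask.dtype)
    return mask.view(np.uint8)


mask = np.array([True, False, True, True])
raw = as_uint8_view(mask)

# The view aliases the same memory, so this also flips mask[1] to True.
raw[1] = 1
```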


Re: [Cython] Acquisition counted cdef classes

2011-10-24 Thread mark florisson
On 24 October 2011 22:03, Greg Ewing  wrote:
> mark florisson wrote:
>>
>> These will by default not lock for operations to allow
>> e.g. one thread to iterate over the list and another thread to index
>> it without lock contention and other general overhead.
>
> I don't think that's safe. You can't say "I'm not modifying
> this, so I don't need to lock it" because there may be another
> thread that *is* in the midst of modifying it.

Oh yes you're definitely right, that was silly of me. I suppose every
operation needs to lock. This can still be useful though, to allow
more fine-grained parallelism. Then it would be more efficient to use
arrays or memoryviews with acquisition counted objects, and the
dicts/lists/tuples etc for cases where you just need more fine-grained
locking and can deal with that overhead.
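
To illustrate the "every operation needs to lock" variant, here is a Python sketch of a list wrapper with one lock per container. The class name and its `snapshot` method are made up for the example; the proposed Cython built-ins do not exist. Readers pay the locking overhead too, which is the cost mentioned above.

```python
import threading


class LockedList:
    """Hypothetical list wrapper in which every operation takes the lock."""

    def __init__(self, items=()):
        self._items = list(items)
        self._lock = threading.Lock()

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def __getitem__(self, index):
        with self._lock:
            return self._items[index]

    def __len__(self):
        with self._lock:
            return len(self._items)

    def snapshot(self):
        # Copy under the lock; callers iterate over the copy so the lock
        # is not held for the whole loop.
        with self._lock:
            return list(self._items)


shared = LockedList()


def producer(start):
    for i in range(start, start + 1000):
        shared.append(i)


threads = [threading.Thread(target=producer, args=(k * 1000,)) for k in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```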

> --
> Greg


Re: [Cython] Acquisition counted cdef classes

2011-10-24 Thread mark florisson
On 24 October 2011 22:03, Greg Ewing  wrote:
> mark florisson wrote:
>>
>> These will by default not lock for operations to allow
>> e.g. one thread to iterate over the list and another thread to index
>> it without lock contention and other general overhead.
>
> I don't think that's safe. You can't say "I'm not modifying
> this, so I don't need to lock it" because there may be another
> thread that *is* in the midst of modifying it.

I was really thinking of the case where you instantiate it in Cython
and then do some parallel work, in which case you're the only user.
But you can't assume that in general.

> --
> Greg


Re: [Cython] Acquisition counted cdef classes

2011-10-24 Thread Robert Bradshaw
On Mon, Oct 24, 2011 at 2:52 PM, mark florisson
 wrote:
> On 24 October 2011 22:03, Greg Ewing  wrote:
>> mark florisson wrote:
>>>
>>> These will by default not lock for operations to allow
>>> e.g. one thread to iterate over the list and another thread to index
>>> it without lock contention and other general overhead.
>>
>> I don't think that's safe. You can't say "I'm not modifying
>> this, so I don't need to lock it" because there may be another
>> thread that *is* in the midst of modifying it.
>
> I was really thinking of the case where you instantiate it in Cython
> and then do some parallel work, in which case you're the only user.
> But you can't assume that in general.

It could be useful to assert for a chunk of code that a given object
is read-only and will not be mutated for the duration of the context
(programmer error and strange crash/data corruption if it is). E.g.

with nogil, assert_frozen(my_dict):
    a = (my_dict[key]).c_attribute
    [...]

All references obtained could be borrowed. Perhaps we could even
enforce this for cdef classes (but perhaps not consistently enough,
and perhaps that would make things even more confusing). Just a
thought.
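
For comparison, a checked Python variant of the same idea. Note the difference from the proposal above, where a violation is an unchecked programmer error: this sketch raises instead. `FreezableDict` and this `assert_frozen` are hypothetical, and only item assignment and deletion are guarded (methods such as update() or pop() are not).

```python
from contextlib import contextmanager


class FreezableDict(dict):
    """Hypothetical dict that refuses item assignment/deletion while frozen."""

    _frozen = 0

    def __setitem__(self, key, value):
        if self._frozen:
            raise RuntimeError("dict mutated inside assert_frozen block")
        super().__setitem__(key, value)

    def __delitem__(self, key):
        if self._frozen:
            raise RuntimeError("dict mutated inside assert_frozen block")
        super().__delitem__(key)


@contextmanager
def assert_frozen(d):
    # A counter rather than a flag, so nested blocks on the same dict work.
    d._frozen += 1
    try:
        yield d
    finally:
        d._frozen -= 1


my_dict = FreezableDict(a=1, b=2)

with assert_frozen(my_dict):
    total = my_dict["a"] + my_dict["b"]     # reads are fine
    try:
        my_dict["c"] = 3                    # mutation is rejected here
        mutation_blocked = False
    except RuntimeError:
        mutation_blocked = True

my_dict["c"] = 3                            # allowed again outside the block
```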

- Robert