John Ehresman, 26.05.2011 22:02:
On 5/26/11 3:27 AM, Stefan Behnel wrote:
I think this means that the current unicode optimizations aren't used
when variables are declared as str and a Python 3 runtime is used. Should
all unicode optimizations support str eventually?

Yes.

Well, minus those that are not portable. For example, the return type of
indexing and iteration is the C type "Py_UCS4" for unicode, but the
Python type "str" (i.e. bytes/unicode) for "str". I also didn't take a
thorough look through the C-API functions for the str type in Py2 and
Py3. Things certainly become uglier when trying to optimise Python
code into C for both platforms than when leaving things at the Python
type level.

Would it work for these methods to return Py_UCS4 in all 3 cases (unicode,
bytes, str)?

There are two sides to this: what the C compiler eventually sees and what Cython makes of the types internally. Letting Cython assume that the result is Py_UCS4 is incorrect in the Py2 case. Amongst other problems, it would make the value turn into a unicode string when coercing to a Python object.
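
The coercion problem can be sketched in plain Python 3. This is only an illustration, not Cython's actual coercion code: Python 3 "bytes" stands in for the Python 2 "str" type, and the helper names coerce_as_ucs4 and coerce_as_byte are made up for the example.

```python
# Sketch only: Python 3 "bytes" stands in for the Python 2 "str" type.

def coerce_as_ucs4(code):
    """Mimic coercing a Py_UCS4 value to a Python object:
    the result is a 1-character text (unicode) string."""
    return chr(code)


def coerce_as_byte(code):
    """Mimic what indexing a byte string is expected to give
    back at the Python level: a 1-byte string."""
    return bytes([code])


code = ord("A")
text_item = coerce_as_ucs4(code)   # text string
byte_item = coerce_as_byte(code)   # byte string

# Same integer value, two different Python types after coercion.
print(type(text_item).__name__, type(byte_item).__name__)
```

The same integer ends up as two different object types depending on which C type the compiler assumed, which is why the assumption cannot be shared between the byte string and unicode string cases.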

In the bytes case, the multibyte int would simply be cast to char if
that was what it was assigned to, but the value wouldn't be above 255 in
any case.

Sure it could. "str" is unicode in Py3, so you get a Unicode string that can hold any possible value, e.g. when using unicode escapes.
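
A quick way to check this under a Python 3 runtime, as a minimal sketch (the helper name fits_in_char is made up for the example):

```python
def fits_in_char(text):
    """Return True if every item of the string fits into an unsigned char."""
    return all(ord(ch) <= 255 for ch in text)


plain = "abc"
escaped = "\u20ac"  # EURO SIGN written as a unicode escape, code point 0x20AC

print(fits_in_char(plain))    # every code point is below 256
print(fits_in_char(escaped))  # 0x20AC does not fit into a char
```

Casting such a value to char would silently truncate it, so the 255 limit only holds for real byte strings.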


The case I worry about is losing optimizations with a Python 3 runtime if
str is used rather than unicode.

You should expect that. If you want optimised code, use a suitable type.

Stefan
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
