[Cython] Cython+PyPy benchmarks
Hi,

I set up a Jenkins job to run a couple of (simple) benchmarks comparing
Cython's current performance under CPython and PyPy. Note that these are
C-API intensive benchmarks by design.

https://sage.math.washington.edu:8091/hudson/job/cython-devel-cybenchmarks-pypy/lastSuccessfulBuild/artifact/bench_chart.html

Basically, PyPy's cpyext is currently about 100-200x slower than CPython's
native C-API for these kinds of benchmarks. That's because it hasn't been
optimised in any way; correctness and completeness are still the main goals
in its development (and they're not there yet).

The one major performance issue in cpyext is currently the creation and
deallocation of the PyObject representation for each object, which obviously
has a huge impact on everything. I profiled the nbody benchmark and it showed
that almost 80% of the runtime is currently spent in creating and discarding
PyObject instances. Here's the call graph:

http://cython.org/callgrind-pypy-nbody.png

The upside of this is that there is likely a lot of low-hanging fruit in
cpyext (plus some more tweaks in Cython), given that no optimisation at all
has been done so far. It shouldn't be too hard to drop the factor
substantially.

I also think we should add a couple more C-ish benchmarks to see how much
overhead there really is for less C-API intensive code.

Stefan
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] [cython-users] C++: how to handle failures of 'new'?
On 3 July 2012 20:15, Robert Bradshaw wrote:
> On Tue, Jul 3, 2012 at 11:43 AM, Dag Sverre Seljebotn wrote:
>> On 07/03/2012 08:23 PM, Robert Bradshaw wrote:
>>> On Tue, Jul 3, 2012 at 11:11 AM, Stefan Behnel wrote:
>>>> Robert Bradshaw, 03.07.2012 19:58:
>>>>> On Tue, Jul 3, 2012 at 9:38 AM, Stefan Behnel wrote:
>>>>>> Dag Sverre Seljebotn, 03.07.2012 18:11:
>>>>>>> On 07/03/2012 09:14 AM, Stefan Behnel wrote:
>>>>>>>> I don't know what happens if a C++ exception is not being caught,
>>>>>>>> but I guess it would simply crash the application. That's a bit
>>>>>>>> more visible than
>>>>>>>
>>>>>>> Yep.
>>>>>>>
>>>>>>>> just printing a warning when a Python exception is being ignored
>>>>>>>> due to a missing declaration. It's really unfortunate that our
>>>>>>>> documentation didn't even mention the need for this, because it's
>>>>>>>> not immediately obvious that Cython won't handle errors in "new",
>>>>>>>> and testing for memory errors isn't quite what people commonly do
>>>>>>>> in their test suites.
>>>>>>>>
>>>>>>>> Apart from that, I agree, users have to take care to properly
>>>>>>>> declare the API they are using.
>>>>>>>
>>>>>>> Is there any time you do NOT want a "catch (...) {}" block? I can't
>>>>>>> see a C++ exception propagating to Python-land doing anything
>>>>>>> useful ever.
>>>>>>
>>>>>> That would have been my intuition, too.
>>>>>
>>>>> If it's actually embedded, with the main driver in C++, one might want
>>>>> it to propagate up.
>>>>
>>>> But what kind of a propagation would that be? On the way out, it could
>>>> induce anything, from side effects to resource leaks to crashes,
>>>> depending on what the state of the surrounding code is. It would leave
>>>> the whole system in an unpredictable state. I cannot imagine anyone
>>>> really wanting this.
>>>>
>>>>>>> So shouldn't we just make --cplus turn *all* external functions and
>>>>>>> methods (whether C-like or C++-like) into "except +"? (Or keep
>>>>>>> except+ for manual translation, but always have a catch(...)".
>>>>>>>
>>>>>>> Performance overhead is the only reason I can think of to not do
>>>>>>> this, although IIRC C++ catch blocks are only dealt with during
>>>>>>> stack unwinds and doesn't cost anything/much (?) when they're not
>>>>>>> triggered.
>>>>>>>
>>>>>>> "except -1" should then actually mean both; "except + except -1".
>>>>>>> So it's more a question of just adding catch(...) *everywhere*,
>>>>>>> than making "except +" the default.
>>>>>>
>>>>>> I have no idea if there is a performance impact, but if there isn't,
>>>>>> always catching all exceptions sounds like a reasonable thing to do.
>>>>>> After all, we have no support for catching C++ exceptions on user
>>>>>> side.
>>>>>
>>>>> This is a bit like following every C call with "except *" (though the
>>>>> performance ratios are unclear). It just seems a lot to wrap every
>>>>> single line of a non-trivial C++ using function with try..catch
>>>>> blocks.
>>
>> It seems "a lot" of just what exactly? Generated code? Binary size? Time
>> spent in GCC parser?
>
> All of the above. And we should take a look at the runtime overhead
> (which is hopefully nil, but who knows.)
>
>> Though I guess one might want to try to pull out the try-catch to at least
>> only one per code line rather than one per SimpleCallNode.
>
> Or even higher, if possible. It's still a lot.

Why would you have to do that? Can't you just insert a try/catch per
try/except or try/finally block, or if absent, the function body. That
will still work with the way temporaries are cleaned up. (It should
also be implemented for parallel/prange sections).

>> "except *" only has a point when calling functions using the CPython API,
>> but most external C functions are pure C, not CPython-API-using-functions.
>> OTOH, all external C++ functions are C++ :-)
>
> Fair point.
>
>> (Also, if we wrote Cython from scratch now I'm pretty sure the "except *"
>> defaults would be a tad different.)
>
> For sure.
>
>>>> But if users are correct about their declarations, we'd end up with the
>>>> same thing. I think it's worth a try.
>>>
>>> Most C++ code (that I've ever run into) doesn't use exceptions,
>>> because exception handling is so broken in C++ anyways.
>>
>> Except for the fact that any code touching "new" could be raising
>> exceptions? That propagates.
>
> I would guess most of the time people don't bother catching these and
> let the program die, as there's often no sane recovery (the same as
> MemoryErrors in Python, though I guess C++ is less often used from an
> event loop).
>
>> There is a lot of C++ code out there using exceptions. I'd guess that both
>> mathematical code and Google-written code is unlike most C++ code out there
>> :-) Many C++ programmers go on and on about RAII and auto_ptrs and so
Re: [Cython] [cython-users] C++: how to handle failures of 'new'?
mark florisson, 05.07.2012 20:47:
> On 3 July 2012 20:15, Robert Bradshaw wrote:
>> On Tue, Jul 3, 2012 at 11:43 AM, Dag Sverre Seljebotn wrote:
>>> Though I guess one might want to try to pull out the try-catch to at least
>>> only one per code line rather than one per SimpleCallNode.
>>
>> Or even higher, if possible. It's still a lot.
>
> Why would you have to do that? Can't you just insert a try/catch per
> try/except or try/finally block, or if absent, the function body. That
> will still work with the way temporaries are cleaned up. (It should
> also be implemented for parallel/prange sections).

My first reaction was, "sure, smart idea". It certainly sounds like a good
idea to unify the exception handling between C++ and Python into the same
syntactic structures.

But does it allow handling different declarations for multiple C++ functions
that get called? E.g. "except +" for one and "except +MemoryError" for
another, but both called in the same try-whatever block? That would just lead
to nested try-except blocks, I guess, thus making the outer exception clauses
mostly a fallback for exceptions that users forgot to declare or couldn't
properly handle for some reason. I think it's worth a try to see if it works.

BTW, is there a reason we can't allow users to declare C++ exceptions in
their .pxd files, and then support catching them in Python try-except syntax?
Just translating them verbatim to the C++ structures, based on the type of
the exception that gets caught?

(Although, given the discussion so far, maybe try-finally is more important
than try-catch, and the former can't know when it needs to be mapped into C++
code and when not ...)

Stefan
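
Editorial sketch (not part of the original message): the nesting Stefan describes can be modelled in plain Python. Everything here is a hypothetical stand-in: `CppBadAlloc` and `CppOther` play the role of C++ exceptions, the inner handler plays the role of a per-call "except +MemoryError" declaration, and the outer handler plays the role of a body-level fallback. It only illustrates the control flow under those assumptions and says nothing about the code Cython actually generates.

```python
class CppBadAlloc(Exception):
    """Hypothetical stand-in for a C++ allocation failure."""


class CppOther(Exception):
    """Hypothetical stand-in for any other C++ exception."""


def declared_call():
    # Models a call whose declaration maps failures to MemoryError.
    raise CppBadAlloc("allocation failed")


def undeclared_call():
    # Models a call whose exception the user forgot to declare.
    raise CppOther("unexpected failure")


def run(call):
    try:  # outer block: fallback for anything not handled further in
        try:  # inner block: the handler specific to this call's declaration
            call()
        except CppBadAlloc as exc:
            raise MemoryError(str(exc)) from exc
    except (CppBadAlloc, CppOther) as exc:
        raise RuntimeError(str(exc)) from exc


def outcome(call):
    """Return the name of the Python exception that escapes run(call)."""
    try:
        run(call)
    except Exception as exc:
        return type(exc).__name__
    return None
```

In this model the declared call surfaces as MemoryError and the undeclared one falls through to the outer clause, which matches the "outer clauses are mostly a fallback" reading above.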
Re: [Cython] [cython-users] C++: how to handle failures of 'new'?
mark florisson wrote:
> On 3 July 2012 20:15, Robert Bradshaw wrote:
>> On Tue, Jul 3, 2012 at 11:43 AM, Dag Sverre Seljebotn wrote:
>>> Though I guess one might want to try to pull out the try-catch to at
>>> least only one per code line rather than one per SimpleCallNode.
>>
>> Or even higher, if possible. It's still a lot.
>
> Why would you have to do that? Can't you just insert a try/catch per
> try/except or try/finally block, or if absent, the function body. That
> will still work with the way temporaries are cleaned up. (It should
> also be implemented for parallel/prange sections).

One disadvantage is that you don't get the source code line for the .pyx file
in the stack trace. Which is often exactly the information you are looking
for (even worse, since the C++ stack isn't in the stack trace, the lineno for
what seems like the 'ultimate cause' is not there). Having to surround
statements with try/except just to pinpoint which one is raising the
exception would be incredibly irritating.

Dag
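
Editorial sketch (not part of the original message): the trade-off Dag points out can be modelled in plain Python. `CppError` and the two `guarded_*` helpers are hypothetical names. One helper wraps each statement separately, the other wraps the whole body once, and the only difference shown is how much the handler knows about where the failure happened.

```python
class CppError(Exception):
    """Hypothetical stand-in for an exception thrown by wrapped C++ code."""


def ok():
    return None


def fails():
    raise CppError("boom")


def guarded_per_statement(statements):
    # One handler per statement: the handler knows which statement failed.
    for position, statement in enumerate(statements, start=1):
        try:
            statement()
        except CppError as exc:
            raise RuntimeError(f"statement {position}: {exc}") from exc


def guarded_per_body(statements):
    # One handler for the whole body: the failing statement is not known.
    try:
        for statement in statements:
            statement()
    except CppError as exc:
        raise RuntimeError(f"somewhere in the body: {exc}") from exc


def message(guard, statements):
    """Return the text of the RuntimeError raised by guard, if any."""
    try:
        guard(statements)
    except RuntimeError as exc:
        return str(exc)
    return None
```

With three statements where the second one fails, the per-statement variant can name the position, while the per-body variant can only say that something in the body failed.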
Re: [Cython] [cython-users] C++: how to handle failures of 'new'?
On 5 July 2012 21:46, Dag Sverre Seljebotn wrote:
> mark florisson wrote:
>> Why would you have to do that? Can't you just insert a try/catch per
>> try/except or try/finally block, or if absent, the function body. That
>> will still work with the way temporaries are cleaned up. (It should
>> also be implemented for parallel/prange sections).
>
> One disadvantage is that you don't get the source code line for the .pyx
> file in the stack trace. Which is often exactly the information you are
> looking for (even worse, since the C++ stack isn't in the stack trace, the
> lineno for what seems like the 'ultimate cause' is not there). Having to
> surround statements with try/except just to pinpoint which one is raising
> the exception would be incredibly irritating.
>
> Dag

Oh yeah, good point. Maybe we could use these zero-cost exceptions for
cdef functions in Cython though, instead of error checks (if it
appears to make any significant difference). Basically instead of the
'error' argument in CEP 526. It'd need to version that ABI as well...
[Cython] Odd behavior with std::string and .decode()
I'm currently exploring using Cython to provide new Python 3 bindings for Xapian. I'm pretty much a Cython n00b but the documentation is great, and I was able to pretty quickly get something really simple working. I'm using Cython 0.15 on Ubuntu 12.04 with Python 3.2 and Xapian 1.2.12. I've pushed my current branch to github: https://github.com/warsaw/xapian/tree/py3/xapian-bindings/python3 There you'll see my xapianlib.pxd and xapian.pyx files. Where I'm seeing some odd behavior is in trying to expose the Xapian::TermGenerator.get_description() method. This returns a std::string and I'm trying to create a `description` property that coerces this to unicode before returning it to Python. Here's the relevant code: -snip snip- cdef class TermGenerator: cdef xapianlib.TermGenerator * _this def __cinit__(self): self._this = new xapianlib.TermGenerator() def __dealloc__(self): del self._this property description: def __get__(self): as_bytes = self._this.get_description().c_str() #return as_bytes return as_bytes.decode('utf-8') -snip snip- I'm sure I'm doing something naive or stupid, but the problem is that as written above, .description is returning nonsense. % python Python 3.2.3 (default, May 3 2012, 15:51:42) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import xapian >>> tg = xapian.TermGenerator() >>> tg.description '\x00\x00\x00\x00_\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' If instead, I return just the bytes object (i.e. what .get_description().c_str() returns), then I get more like what I expect. 
% python Python 3.2.3 (default, May 3 2012, 15:51:42) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import xapian >>> tg = xapian.TermGenerator() >>> tg.description b'Xapian::TermGenerator(stem=Xapian::Stem(none), doc=Document(Xapian::Document::Internal()), termpos=0)' >>> tg.description.decode('utf-8') 'Xapian::TermGenerator(stem=Xapian::Stem(none), doc=Document(Xapian::Document::Internal()), termpos=0)' I looked at the generated code in the first example, but didn't really see anything obvious. There are no NULs in the char* description afaict. I haven't yet tested Cython 0.16 or 0.17 to see if this behaves differently. Is this a bug or am I doing something stupid? Cheers, -Barry signature.asc Description: PGP signature ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] [cython-users] C++: how to handle failures of 'new'?
mark florisson wrote:
> On 5 July 2012 21:46, Dag Sverre Seljebotn wrote:
>> mark florisson wrote:
>>> Why would you have to do that? Can't you just insert a try/catch per
>>> try/except or try/finally block, or if absent, the function body. That
>>> will still work with the way temporaries are cleaned up. (It should
>>> also be implemented for parallel/prange sections).
>>
>> One disadvantage is that you don't get the source code line for the .pyx
>> file in the stack trace. Which is often exactly the information you are
>> looking for (even worse, since the C++ stack isn't in the stack trace, the
>> lineno for what seems like the 'ultimate cause' is not there). Having to
>> surround statements with try/except just to pinpoint which one is raising
>> the exception would be incredibly irritating.
>>
>> Dag
>
> Oh yeah, good point. Maybe we could use these zero-cost exceptions for
> cdef functions in Cython though, instead of error checks (if it
> appears to make any significant difference). Basically instead of the
> 'error' argument in CEP 526. It'
Re: [Cython] Odd behavior with std::string and .decode()
Hi Barry, Barry Warsaw, 06.07.2012 00:29: > I'm currently exploring using Cython to provide new Python 3 bindings for > Xapian. I'm pretty much a Cython n00b but the documentation is great, and I > was able to pretty quickly get something really simple working. I'm using > Cython 0.15 on Ubuntu 12.04 with Python 3.2 and Xapian 1.2.12. I've pushed my > current branch to github: > > https://github.com/warsaw/xapian/tree/py3/xapian-bindings/python3 > > There you'll see my xapianlib.pxd and xapian.pyx files. > > Where I'm seeing some odd behavior is in trying to expose the > Xapian::TermGenerator.get_description() method. This returns a std::string > and I'm trying to create a `description` property that coerces this to unicode > before returning it to Python. Here's the relevant code: > > -snip snip- > cdef class TermGenerator: > cdef xapianlib.TermGenerator * _this > > def __cinit__(self): > self._this = new xapianlib.TermGenerator() > > def __dealloc__(self): > del self._this > > property description: > def __get__(self): > as_bytes = self._this.get_description().c_str() > #return as_bytes > return as_bytes.decode('utf-8') > -snip snip- > > I'm sure I'm doing something naive or stupid, but the problem is that > as written above, .description is returning nonsense. > > % python > Python 3.2.3 (default, May 3 2012, 15:51:42) > [GCC 4.6.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import xapian > >>> tg = xapian.TermGenerator() > >>> tg.description > '\x00\x00\x00\x00_\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' > > If instead, I return just the bytes object (i.e. 
what > .get_description().c_str() returns), then I get more like what I expect. > > % python > Python 3.2.3 (default, May 3 2012, 15:51:42) > [GCC 4.6.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import xapian > >>> tg = xapian.TermGenerator() > >>> tg.description > b'Xapian::TermGenerator(stem=Xapian::Stem(none), > doc=Document(Xapian::Document::Internal()), termpos=0)' > >>> tg.description.decode('utf-8') > 'Xapian::TermGenerator(stem=Xapian::Stem(none), > doc=Document(Xapian::Document::Internal()), termpos=0)' This is very weird behaviour indeed. I wouldn't know why that should happen. What "return as_bytes.decode('utf-8')" does is that is calls strlen() to see how long the string is, then it calls the UTF-8 decode C-API function with that. The string that get_description() returns is allocated internally in the C++ object, right? So it can't suddenly die or something? One thing I would generally suggest is to do this: descr = self._this.get_description() return descr.data()[:descr.size()].decode('utf-8') Avoids the call to strlen() by explicitly slicing the pointer. Also avoids needing to make sure the C string is 0-terminated. > I looked at the generated code in the first example, but didn't really see > anything obvious. There are no NULs in the char* description afaict. I > haven't yet tested Cython 0.16 or 0.17 to see if this behaves differently. I wouldn't know any differences out of the top of my head, except that 0.17 has generally better support for STL containers and std:string (but that's unrelated to this failure). I'm planning to enable direct support for cpp_string.decode(...) as well, but that's not implemented yet. It would basically generate the verbose code above automatically. > Is this a bug or am I doing something stupid? Definitely not doing something stupid, but I have no idea why this should go wrong. 
Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
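
Editorial sketch (not part of the original message): the difference between a strlen()-based decode and a decode with an explicit length can be shown in plain Python with ctypes. The payload and helper names are hypothetical, and the buffer merely stands in for a char* owned by a C++ string. This illustrates Stefan's general suggestion only; it is not a diagnosis of the garbage output Barry reported.

```python
import ctypes

# Hypothetical payload: valid UTF-8 with an embedded NUL byte in the middle.
payload = "stem=none\x00termpos=0".encode("utf-8")

# create_string_buffer copies the bytes and appends a terminating NUL,
# which stands in here for the memory behind a C string pointer.
buf = ctypes.create_string_buffer(payload)
address = ctypes.addressof(buf)


def decode_like_strlen(addr):
    """Decode up to the first NUL byte, as a strlen()-based decode would."""
    return ctypes.string_at(addr).decode("utf-8")


def decode_with_size(addr, size):
    """Decode exactly `size` bytes, like slicing data()[:size()]."""
    return ctypes.string_at(addr, size).decode("utf-8")


short = decode_like_strlen(address)
full = decode_with_size(address, len(payload))
```

The strlen-style variant stops at the first NUL byte, while the explicit-length variant returns the whole payload, which is why slicing by size() does not depend on the data being 0-terminated or NUL-free.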