Emmanuel Gil Peyrot, 28.05.2014 13:25: > I was testing my cython codebase on top of pypy/cpyext, and I found a > memory corruption. After investigating it with the pypy guys (thanks > arigato!), we identified it as a cython bug: > > cdef void test(char *string): > print(string) > > def run(array): > test(array[0]) > > This code works fine under cpython, but looking at the generated code > for the last line, the call to test is done with a pointer (__pyx_t_2) > to some potentially deallocated string (__pyx_t_1), which isn’t freed > just yet on cpython but is on pypy: > > __pyx_t_1 = __Pyx_GetItemInt(__pyx_v_array, 0, …); > __Pyx_GOTREF(__pyx_t_1); > __pyx_t_2 = __Pyx_PyObject_AsString(__pyx_t_1); > __Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0; > __pyx_f_1a_test(__pyx_t_2); > > > The obvious solution is to swap the last two lines, it then works fine > on pypy (although not necessarily if the test function stores the > pointer somewhere, but it’s not cython’s fault then). > > This issue can also happen with an explicit cast: > > pointer = <char *>array[0] > test(pointer)
I pushed a fix for both cases here: https://github.com/cython/cython/commit/37e4a20615c323b695bd2ec5db0dc20a25a4417f There are still a couple of tests failing, but that should be fixable. The change moves the compile time error to assignment time and postpones the temp reference cleanup for char* inside of expressions. This makes temporary char* handling generally safer, including passing it into external functions (which, as you said, is safe as long as the user takes the obvious bit of care to not store it beyond the call scope). The only drawback is that this may keep temporary Python strings alive longer than strictly necessary (and longer than before), but if that really becomes a problem, it should be easy to work around by splitting up the expressions. > I’m not sure if it should be addressed the same way, because that would > mean keeping a reference to array[0] for all the lifetime of the > current scope, but it could still prevent obscure bugs like the memory > corruption I had. The generally assumption, and the reason why indexing is considered safer than e.g. concatenation, is that indexing often returns a duplicated reference to an object stored in a container. That does not necessarily apply to PyPy's cpyext, but the change above should at least fix the problem you found there. Stefan _______________________________________________ cython-devel mailing list cython-devel@python.org https://mail.python.org/mailman/listinfo/cython-devel