[Cython] Cython and PyPy
I am trying to get cython to work better with PyPy. I am sort of documenting my progress and failures at https://bitbucket.org/pypy/pypy/wiki/edit/cpyext_2_-_cython

So far I can run the test suite on a nightly PyPy2 (http://buildbot.pypy.org/nightly/trunk), but only when using --no-refnanny. On the missing-tp_new branch of PyPy (trying to fix the datetime problems), running only the c backend I get something like

    Ran 4632 tests in 931.486s
    FAILED (failures=80, errors=18, skipped=1)

This mail is a heads-up, I will hopefully issue some pull requests soon.

Also, I have some questions, mainly about the test runner:

- Shouldn't the "skipped" field include the number of tests in pypy_bugs.txt?
- How can I get pdb to work during a single test run to try to work out the internals of cython? In nose or pytest I can add the -s option, I could not find an equivalent.
- Is there a marker for test start/test stop in the test report? I would like to use awk or grep to try to analyse the multiple failures into groups.
- The XML backend seems to miss some of the stdout/stderr messages.

Is there more documentation of test running options somewhere?

Thanks
Matti
___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel
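On the question of sorting failures into groups: a minimal Python sketch, assuming the report uses the standard unittest text layout (a rule of "=" characters, a "FAIL: ..." or "ERROR: ..." header, a rule of "-" characters, then a traceback). That layout is an assumption here, not something confirmed for Cython's test runner, and the names `group_failures`, `HEADER`, `RULE` and `SUMMARY` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed unittest-style markers; adjust if the real report differs.
HEADER = re.compile(r"^(FAIL|ERROR): (.+)$")   # start of one failure block
RULE = re.compile(r"^(={10,}|-{10,})$")        # separator rules
SUMMARY = re.compile(r"^Ran \d+ tests? in ")   # end of all failure blocks


def group_failures(report_text):
    """Group failing test names by the last line of their traceback."""
    groups = defaultdict(list)
    name = None   # test whose block is currently being read
    last = None   # most recent non-blank, non-rule line of that block
    for line in report_text.splitlines():
        header = HEADER.match(line)
        if header or SUMMARY.match(line):
            # A new block (or the summary) closes the previous block.
            if name is not None:
                groups[last or "<no traceback>"].append(name)
            name = header.group(2) if header else None
            last = None
        elif name is not None and line.strip() and not RULE.match(line):
            last = line.strip()
    if name is not None:
        groups[last or "<no traceback>"].append(name)
    return dict(groups)
```

Grouping by the final traceback line (usually the exception type and message) is only one possible key; the same loop could key on the exception type alone.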
[Cython] trying to understand why PyString_GET_SIZE cannot be cimport-ed
I am working on a branch of PyPy 2.7 to support Pandas (default PyPy is missing some CAPI support that enables the parts of cython used in Pandas). Pandas code has these lines (in lib.pyx)

    try:
        from cpython cimport PyString_GET_SIZE
    except ImportError:
        from cpython cimport PyUnicode_GET_SIZE as PyString_GET_SIZE

For some reason, PyPy fails to cimport PyString_GET_SIZE, but successfully cimports PyUnicode_GET_SIZE. The substitution causes problems for PyPy, I could solve those in a different way, but I would like to understand what is going on. My cython Foo is improving but still too weak to understand why the cimport fails, could someone help me out with a hint?

Matti
Re: [Cython] trying to understand why PyString_GET_SIZE cannot be cimport-ed
On 07/01/17 19:25, Matti Picus wrote:
> I am working on a branch of PyPy 2.7 to support Pandas (default PyPy is missing some CAPI support that enables the parts of cython used in Pandas). Pandas code has these lines (in lib.pyx)
>
>     try:
>         from cpython cimport PyString_GET_SIZE
>     except ImportError:
>         from cpython cimport PyUnicode_GET_SIZE as PyString_GET_SIZE
>
> For some reason, PyPy fails to cimport PyString_GET_SIZE, but successfully cimports PyUnicode_GET_SIZE. The substitution causes problems for PyPy, I could solve those in a different way, but I would like to understand what is going on. My cython Foo is improving but still too weak to understand why the cimport fails, could someone help me out with a hint?
>
> Matti

Sorry for the noise, this would appear to be more appropriate for cython-users and also is just wrong, I will file a bug report with pandas.

Matti
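For contrast, the try/except ImportError idiom that the quoted snippet imitates is a run-time pattern in plain Python. A cimport is processed when Cython compiles the module, not when the module runs, so a run-time except clause is not a reliable fallback for it; that may be what "just wrong" refers to, although the thread does not spell it out. A minimal sketch of the run-time form, using standard-library names chosen only for illustration:

```python
# Run-time fallback: the except branch executes only if the first
# import actually fails while this module is being executed.
try:
    from string import maketrans   # module-level function on Python 2 only
except ImportError:
    maketrans = str.maketrans      # Python 3 spelling of the same helper

# Both spellings build a table usable with str.translate.
table = maketrans("ab", "xy")
translated = "abc".translate(table)
```

Whichever branch runs, `maketrans` ends up bound to a callable with the same two-argument use, which is the property the pandas snippet was trying to get for `PyString_GET_SIZE`.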
[Cython] Enhancing "ctypedef class numpy.ndarray" with getter properties
To solve issue #2498, I did some experiments https://github.com/cython/cython/issues/2498#issuecomment-414543549 with hiding direct field access in an external extension type (documented here https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types). The idea is to write `a.ndims` in cython (in plain python code), and in C magically get the attribute lookup converted into a `PyArray_NDIMS(a)` getter, which could be a macro or a c-function.

The experiments proved fruitful, and garnered some positive feedback so I am pushing forward.

I would like to get some feedback on syntax before I progress too far. Should the syntax be extended to support

    ctypedef class numpy.ndarray [object PyArrayObject]:
        cdef:
            # Convert python __getattr__ access to c functions.
            int ndims PyArray_NDIMS

or perhaps a decorator, like Python

    ctypedef class numpy.ndarray [object PyArrayObject]:
        cdef:
            # Convert python __getattr__ access to c functions.
            @property
            cdef int ndims(self):
                return PyArray_NDIMS(self)

or something else? The second seems more wordy but more explicit. I don't know which would be easier to implement or require more effort to test and maintain.

Matti
Re: [Cython] Enhancing "ctypedef class numpy.ndarray" with getter properties
On 27/09/18 22:50, Robert Bradshaw wrote:
> On Thu, Sep 27, 2018 at 10:38 AM Matti Picus <matti.pi...@gmail.com> wrote:
>> To solve issue #2498, I did some experiments https://github.com/cython/cython/issues/2498#issuecomment-414543549 with hiding direct field access in an external extension type (documented here https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types). The idea is to write `a.ndims` in cython (in plain python code), and in C magically get the attribute lookup converted into a `PyArray_NDIMS(a)` getter, which could be a macro or a c-function.
>>
>> The experiments proved fruitful, and garnered some positive feedback so I am pushing forward.
>>
>> I would like to get some feedback on syntax before I progress too far. Should the syntax be extended to support
>>
>>     ctypedef class numpy.ndarray [object PyArrayObject]:
>>         cdef:
>>             # Convert python __getattr__ access to c functions.
>>             int ndims PyArray_NDIMS
>>
>> or perhaps a decorator, like Python
>>
>>     ctypedef class numpy.ndarray [object PyArrayObject]:
>>         cdef:
>>             # Convert python __getattr__ access to c functions.
>>             @property
>>             cdef int ndims(self):
>>                 return PyArray_NDIMS(self)
>>
>> or something else? The second seems more wordy but more explicit. I don't know which would be easier to implement or require more effort to test and maintain.
>>
>> Matti
>
> Thanks for looking into this!
>
> My preference would be to use the @property syntax, as this will be immediately understandable to any Cython user and could contain arbitrary code, rather than just a macro call.
>
> There are, however, a couple of downsides. The first is that it may not be clear when accessing an attribute that a full function call may be invoked. (Arguably this is the same issue one has with Python, but there attribute access is already expensive. The function could be inline as well if desired.) The second is that this means that this attribute is no longer an lvalue. The last is that it's a bit special to be defining methods on an extern class. Maybe it would have to be inline if it's in the pxd?
>
> If we're going to be defining a special syntax, I might prefer something like
>
>     cdef extern class ...:
>         int ndims "PyArray_NDIMS(*)"
>
> which more resembles
>
>     int ndims "nd"
>
> Open to bikeshedding on what the "self" placeholder should be. As before, should the ndims lose its lvalue status in this case, or not (in case the accessor is really a macro intended to be used like this)?

Sorry about the formatting mess-up, the original proposal was supposed to be (this time using double spacing to make sure it works):

    cdef extern class ...:

        @property

        cdef int ndims(self):

            return PyArray_NDIMS(self)

vs

    cdef extern class ...:

        cdef int ndims PyArray_NDIMS

The proposal is for a getter via a C function or a macro. NumPy's current public API uses a mix. Currently I am interested in getters that would not allow lvalue at all. Maybe in the future we will have fast rvalue setter functions in NumPy, but the current API does not support them. It remains to be seen how much slowdown we see in real-life benchmarks when calling a small C function from a different shared object to access attributes rather than directly accessing them via struct fields.

As I point out in the "experiment" comment referenced above, pandas has code that needs lvalue access to ndarray data, so they would be stuck with the old API which is deprecated but still works for now. Scipy has no such code and could move forward to the newer API.

As far as bikeshedding the "self" parameter, I would propose doing without, and indeed I successfully hacked Cython to use the second proposal with no self argument and no quotations.

Matti
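The trade-off raised in this message (an attribute that reads like a field but runs a getter, and so stops being an lvalue) has a close analogue in plain Python. The following is only an illustrative sketch in Python, not Cython syntax and not NumPy's API: `FakeArray` and `get_ndims` are hypothetical stand-ins for the extern class and for a C getter such as `PyArray_NDIMS`.

```python
def get_ndims(arr):
    """Hypothetical stand-in for a C getter such as PyArray_NDIMS(arr)."""
    return len(arr._shape)


class FakeArray:
    """Hypothetical stand-in for an extern extension type."""

    def __init__(self, shape):
        self._shape = tuple(shape)

    @property
    def ndims(self):
        # Reads of `a.ndims` are routed through the getter function.
        # No setter is defined, so `a.ndims = ...` raises AttributeError:
        # the attribute is readable but cannot be assigned to.
        return get_ndims(self)


a = FakeArray((2, 3, 4))
number_of_dims = a.ndims
```

In this analogue, code that needs to assign to the attribute would have to reach the underlying storage some other way, which mirrors the pandas concern about lvalue access to ndarray data.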
[Cython] Mitigating performance impact of NumPy API change
Breaking this into a number of sub-discussions, since we seem to be branching. The original topic was
Re: [Cython] Enhancing "ctypedef class numpy.ndarray" with getter properties

On 28/09/18 01:20, Robert Bradshaw wrote:
> Hmm...so in this case upgrading Cython would cause an unconditional switch from direct access to a function call without any code change (or choice) for users of numpy.pxd. I am curious what kind of a slowdown this would represent (though would assume this kind of analysis was done by the NumPy folks when choosing macro vs. function for the public API).
>
>> As I point out in the "experiment" comment referenced above, pandas has code that needs lvalue access to ndarray data, so they would be stuck with the old API which is deprecated but still works for now. Scipy has no such code and could move forward to the newer API.
>
> But if we upgraded Cython, how would they access the old API? I suppose they could create a setter macro of their own to use in the (presumably few) cases where they needed an lvalue.
>
> - Robert

NumPy changed its recommended API to an opaque one via inline getter functions in 2011, in this PR https://github.com/numpy/numpy/pull/116. I could not find a discussion on performance impact, perhaps since the functions are in the header files and marked inline. Hopefully the compilers will properly deal with making them fast.

However, it is true that when people update to a new version of a library things change. In this case, there are backward-compatibility macros that revert the post-1.7 functions into pre-1.7 macros with the same name.

Thus for the experiment I used a new numpy.pxd, defined the pre-1.7 api in the pandas build (experimental changeset https://github.com/mattip/pandas/commit/9113bf7e55e1eddece3544c1ad3ef2a761b5210a), and was still able to access ndarray.data as an lvalue.

Matti
Re: [Cython] Enhancing "ctypedef class numpy.ndarray" with getter properties
On 28/09/18 01:20, Robert Bradshaw wrote:
> On Thu, Sep 27, 2018 at 11:36 PM Matti Picus <matti.pi...@gmail.com> wrote:
>
> The problem is that when one reads
>
>     cdef int aaa
>
> there's no indication as to the meaning of this. We also want to be sure to disallow this syntax everywhere but this one context. On the other hand the quotation syntax
>
>     cdef int aaa "bbb"
>
> already has (widespread) meaning of establishing a C alias of the name in question which is essentially what we're trying to do here.
>
> I'm still, however, leaning towards the @property syntax (which we could allow for non-extern cdef classes as well).
>
> - Robert

Using "PyArray_DIMS" with quotes but without parentheses would indeed be confusing to users and difficult to implement, so "PyArray_DIMS(*)" where the * is TBD seems nicer.

It sounds like the jury is still out. In order to compare the solutions, I will move forward with the @property decorator syntax, but to keep it simple I will start small: only getters and specifically for CFuncDefNodes. Then if you still want to look at the other option I will turn my "experiment" into a PR.

Matti
Re: [Cython] Mitigating performance impact of NumPy API change
On 28/09/18 10:25, Matti Picus wrote:
> Breaking this into a number of sub-discussions, since we seem to be branching. The original topic was
> Re: [Cython] Enhancing "ctypedef class numpy.ndarray" with getter properties
>
> On 28/09/18 01:20, Robert Bradshaw wrote:
>> Hmm...so in this case upgrading Cython would cause an unconditional switch from direct access to a function call without any code change (or choice) for users of numpy.pxd. I am curious what kind of a slowdown this would represent (though would assume this kind of analysis was done by the NumPy folks when choosing macro vs. function for the public API).
>>
>>> As I point out in the "experiment" comment referenced above, pandas has code that needs lvalue access to ndarray data, so they would be stuck with the old API which is deprecated but still works for now. Scipy has no such code and could move forward to the newer API.
>>
>> But if we upgraded Cython, how would they access the old API? I suppose they could create a setter macro of their own to use in the (presumably few) cases where they needed an lvalue.
>>
>> - Robert
>
> NumPy changed its recommended API to an opaque one via inline getter functions in 2011, in this PR https://github.com/numpy/numpy/pull/116. I could not find a discussion on performance impact, perhaps since the functions are in the header files and marked inline. Hopefully the compilers will properly deal with making them fast.
>
> However, it is true that when people update to a new version of a library things change. In this case, there are backward-compatibility macros that revert the post-1.7 functions into pre-1.7 macros with the same name.
>
> Thus for the experiment I used a new numpy.pxd, defined the pre-1.7 api in the pandas build (experimental changeset https://github.com/mattip/pandas/commit/9113bf7e55e1eddece3544c1ad3ef2a761b5210a), and was still able to access ndarray.data as an lvalue.
>
> Matti

This means cython/numpy could provide an integration path based on numpy starting to ship its own numpy.pxd:

- Cython would define the macro (if not already defined) to use the pre-1.7 NumPy API in the numpy.pxd it ships. This would still work (lvalues would be allowed) after direct access is replaced with the getter properties, since they are macros.
- NumPy would define the macro to use the post-1.7 API (if not already defined) in the numpy.pxd it ships, which as I understand it would take precedence over Cython's.

Then projects like pandas could freely upgrade Cython without changing their codebase, but would encounter errors when updating NumPy.

Matti
[Cython] How best to contribute
Hi. I would like to start contributing in a meaningful way to Cython on the order of ~1 day a week, within the framework of the time allocated to me by my employer (Quansight Labs) toward open source contributions. Over time, my goal is to push for an HPy[0] backend for Cython, but I also want to help the project move forward toward an official 3.0 release and in general to help out where needed.

So working backwards:

What are the immediate pain points that I could help out with in general?

How can I help with the 3.0 release (there are issues marked as "blockers", are all of them truly critical)?

And, in the long term, is there a way to start designing a new backend for HPy?

I would also like to suggest that Cython hold monthly open "developer calls" where people can meet face-to-face (on zoom) to work out priorities and triage particularly difficult issues or PRs. I would be happy to try to set this up (mainly to negotiate a time that would work) if you-all think it is a good idea and would contribute more than it would distract.

Thanks,
Matti

[0] HPy [https://hpy.readthedocs.io/en/latest]
Re: [Cython] Welcome David Woods as a Cython core developer
As someone watching from the sidelines, it is nice to see the Cython team grow, especially with such a talented and committed contributor.

Matti

On 31/7/22 12:15, Stefan Behnel wrote:
> Hi everyone,
>
> with the release of the first 3.0 alpha that supports Python 3.11 (aptly named "alpha 11"), I'm happy to announce that David Woods has been promoted to a Cython core developer.
>
> David has shown an extraordinary commitment and dedication over the last years. His first merged commits were already back in 2015, mostly related to the C++ support. But within the last two years, he voluntarily took over more and more responsibility for bugs and issues and developed several major new features for the project. This includes the Walrus operator (PEP 572), cdef dataclasses (modelled after PEP 557), internal "std::move()" usage in C++ mode or support for Unicode identifiers and module names, all of which form a major part of the 3.0 feature set.
>
> David has more than deserved a place in the circle of present and prior core devs.
>
> David, thank you for your impressive work on Cython, and welcome to the core team!
>
> Stefan
[Cython] Düsseldorf HPy/PyPy/GraalPy sprint September 19-23rd 2022
There will be a sprint in Düsseldorf Sept 19 - 23. The sprint is open to anyone.

The announcement on the HPy blog:
https://hpyproject.org/blog/posts/2022/07/dusseldorf-sprint-2022/

The announcement on the PyPy blog, with pointers about "registration" and how to find accommodation:
https://www.pypy.org/posts/2022/07/ddorf-sprint-sep-2022.html

Matti