[Cython] Cython and PyPy

2016-11-24 Thread Matti Picus

  
  
I am trying to get cython to work better with PyPy. I am sort of
documenting my progress and failures at https://bitbucket.org/pypy/pypy/wiki/edit/cpyext_2_-_cython


So far I can run the test suite on a nightly PyPy2 http://buildbot.pypy.org/nightly/trunk,
but only when using --no-refnanny. On the missing-tp_new branch of
PyPy (trying to fix the datetime problems), running only the C
backend, I get something like


Ran 4632 tests in 931.486s

FAILED (failures=80, errors=18, skipped=1)


This mail is a heads-up; I will hopefully issue some pull requests
soon. Also, I have some questions, mainly about the test runner:


- Shouldn't the "skipped" field include the number of tests in
pypy_bugs.txt?


- How can I get pdb to work during a single test run to try to work
out the internals of cython? In nose or pytest I can add the -s
option, I could not find an equivalent.


- Is there a marker for test start/test stop in the test report? I
would like to use awk or grep to try to analyse the multiple
failures into groups.


- The XML backend seems to miss some of the stdout/stderr messages.
Is there more documentation of test running options somewhere?
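
One way to approach the grouping question, sketched in Python rather than awk. It assumes the runner emits the standard unittest text report (a line of `=`, then `FAIL:` or `ERROR:` plus the test description, a line of `-`, then the traceback). Whether the Cython test runner's output matches this layout exactly is an assumption, and the name `group_failures` is hypothetical.

```python
import re
from collections import defaultdict

# Header line that opens each failure block in a unittest-style text report.
HEADER = re.compile(r"^(FAIL|ERROR): (.+)$")


def group_failures(report_text):
    """Bucket failing tests by the exception type on the last traceback line."""
    groups = defaultdict(list)
    current = None   # description of the test whose block is being read
    last_line = ""   # most recent non-empty line seen inside that block

    for line in report_text.splitlines():
        stripped = line.strip()
        match = HEADER.match(line)
        if match:
            current, last_line = match.group(2), ""
        elif len(stripped) >= 70 and set(stripped) <= {"=", "-"}:
            # A separator after some traceback text closes the current block.
            if current is not None and last_line:
                groups[last_line.split(":", 1)[0]].append(current)
                current = None
        elif current is not None and stripped:
            last_line = stripped

    # Close a block that runs to the end of the text without a separator.
    if current is not None and last_line:
        groups[last_line.split(":", 1)[0]].append(current)
    return dict(groups)
```

Calling `group_failures(open("report.txt").read())` (the file name is a placeholder) returns a dict mapping exception names such as `AssertionError` to the list of test descriptions that ended with that exception.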


Thanks

Matti
  

___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel


[Cython] trying to understand why PyString_GET_SIZE cannot be cimport-ed

2017-01-07 Thread Matti Picus
I am working on a branch of PyPy 2.7 to support Pandas (default PyPy is 
missing some CAPI support that enables the parts of cython used in Pandas).

Pandas code has these lines (in lib.pyx)

try:
    from cpython cimport PyString_GET_SIZE
except ImportError:
    from cpython cimport PyUnicode_GET_SIZE as PyString_GET_SIZE


For some reason, PyPy fails to cimport PyString_GET_SIZE, but 
successfully cimports PyUnicode_GET_SIZE.
The substitution causes problems for PyPy. I could solve those in a 
different way, but I would like to understand what is going on.


My Cython foo is improving but is still too weak to understand why the 
cimport fails. Could someone help me out with a hint?


Matti


Re: [Cython] trying to understand why PyString_GET_SIZE cannot be cimport-ed

2017-01-07 Thread Matti Picus

On 07/01/17 19:25, Matti Picus wrote:

I am working on a branch of PyPy 2.7 to support Pandas (default PyPy 
is missing some CAPI support that enables the parts of cython used in 
Pandas).

Pandas code has these lines (in lib.pyx)

try:
    from cpython cimport PyString_GET_SIZE
except ImportError:
    from cpython cimport PyUnicode_GET_SIZE as PyString_GET_SIZE


For some reason, PyPy fails to cimport PyString_GET_SIZE, but 
successfully cimports PyUnicode_GET_SIZE.
The substitution causes problems for PyPy, I could solve those in a 
different way, but I would like to understand what is going on.


My cython Foo is improving but still too weak to understand why the 
cimport fails, could someone help me out with a hint?


Matti

Sorry for the noise; this would appear to be more appropriate for 
cython-users and is also just wrong. I will file a bug report with pandas.

Matti


[Cython] Enhancing "ctypedef class numpy.ndarray" with getter properties

2018-09-27 Thread Matti Picus
To solve issue #2498, I did some experiments 
https://github.com/cython/cython/issues/2498#issuecomment-414543549 with 
hiding direct field access in an external extension type (documented 
here 
https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types). 
The idea is to write `a.ndims` in cython (in plain python code), and in 
C magically get the attribute lookup converted into a `PyArray_NDIMS(a)` 
getter, which could be a macro or a c-function.


The experiments proved fruitful, and garnered some positive feedback so 
I am pushing forward.


I would like to get some feedback on syntax before I progress too far. 
Should the syntax be extended to support


ctypedef class numpy.ndarray [object PyArrayObject]:
    cdef:
        # Convert python __getattr__ access to c functions.
        int ndims PyArray_NDIMS



or perhaps a decorator, like Python

ctypedef class numpy.ndarray [object PyArrayObject]:
    cdef:
        # Convert python __getattr__ access to c functions.
        @property
        cdef int ndims(self):
            return PyArray_NDIMS(self)

or something else? The second seems more wordy but more explicit. I don't 
know which would be easier to implement or require more effort to test 
and maintain.

Matti




Re: [Cython] Enhancing "ctypedef class numpy.ndarray" with getter properties

2018-09-27 Thread Matti Picus

On 27/09/18 22:50, Robert Bradshaw wrote:


On Thu, Sep 27, 2018 at 10:38 AM Matti Picus <matti.pi...@gmail.com> wrote:
To solve issue #2498, I did some experiments
https://github.com/cython/cython/issues/2498#issuecomment-414543549
with hiding direct field access in an external extension type
(documented here
https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#external-extension-types).
The idea is to write `a.ndims` in cython (in plain python code), and in
C magically get the attribute lookup converted into a `PyArray_NDIMS(a)`
getter, which could be a macro or a c-function.

The experiments proved fruitful, and garnered some positive feedback so
I am pushing forward.

I would like to get some feedback on syntax before I progress too far.
Should the syntax be extended to support

ctypedef class numpy.ndarray [object PyArrayObject]:
    cdef:
        # Convert python __getattr__ access to c functions.
        int ndims PyArray_NDIMS

or perhaps a decorator, like Python

ctypedef class numpy.ndarray [object PyArrayObject]:
    cdef:
        # Convert python __getattr__ access to c functions.
        @property
        cdef int ndims(self):
            return PyArray_NDIMS(self)

or something else? The second seems more wordy but more explicit. I don't
know which would be easier to implement or require more effort to test
and maintain.

Matti

Thanks for looking into this!

My preference would be to use the @property syntax, as this will be 
immediately understandable to any Cython user and could contain 
arbitrary code, rather than just a macro call.


There are, however, a few downsides. The first is that it may 
not be clear when accessing an attribute that a full function call may 
be invoked. (Arguably this is the same issue one has with Python, but 
there attribute access is already expensive. The function could be 
inline as well if desired.) The second is that this means that this 
attribute is no longer an lvalue. The last is that it's a bit special 
to be defining methods on an extern class. Maybe it would have to be 
inline if it's in the pxd?


If we're going to be defining a special syntax, I might prefer 
something like


cdef extern class ...:
    int ndims "PyArray_NDIMS(*)"

which more resembles

    int ndims "nd"

Open to bikeshedding on what the "self" placeholder should be. As 
before, should the ndims lose its lvalue status in this case, or not 
(in case the accessor is really a macro intended to be used like this)?



Sorry about the formatting mess-up. The original proposal was supposed to 
be (this time using double spacing to make sure it works):


-

cdef extern class ...:

    @property

    cdef int ndims(self):

        return PyArray_NDIMS(self)

--

vs



cdef extern class ...:

    cdef int ndims PyArray_NDIMS



The proposal is for a getter via a C function or a macro. NumPy's 
current public API uses a mix. Currently I am interested in getters that 
would not allow lvalue at all. Maybe in the future we will have fast 
rvalue setter functions in NumPy, but the current API does not support 
them. It remains to be seen how much slowdown we see in real-life 
benchmarks when calling a small C function from a different shared 
object to access attributes rather than directly accessing them via 
struct fields.
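
As a rough plain-Python analogy of the getter-only semantics described above (not Cython syntax, and not the proposed implementation): a property with no setter routes reads through an accessor function and rejects assignment, which is the "no lvalue" behaviour. The class `NDArrayView` and the function `get_ndims` are hypothetical stand-ins; `get_ndims` plays the role of a C getter such as `PyArray_NDIMS`.

```python
def get_ndims(arr):
    """Hypothetical accessor standing in for a C getter like PyArray_NDIMS."""
    return len(arr._shape)


class NDArrayView:
    """Stand-in for an extension type whose struct fields are hidden."""

    def __init__(self, shape):
        self._shape = tuple(shape)  # plays the role of the private struct field

    @property
    def ndims(self):
        # Getter only: every read goes through the accessor function.
        return get_ndims(self)


a = NDArrayView((2, 3))
print(a.ndims)       # read access goes through get_ndims

try:
    a.ndims = 5      # no setter is defined, so this is not an lvalue
except AttributeError as exc:
    print("assignment rejected:", type(exc).__name__)
```

The cost question raised above does not carry over to this analogy: in Python the property call is always a function call, whereas in C the getter may be a macro or an inlined function.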


As I point out in the "experiment" comment referenced above, pandas has 
code that needs lvalue access to ndarray data, so they would be stuck 
with the old API which is deprecated but still works for now. Scipy has 
no such code and could move forward to the newer API.


As far as bikeshedding the "self" parameter, I would propose doing 
without, and indeed I successfully hacked Cython to use the second 
proposal with no self argument and no quotations.


Matti


[Cython] Mitigating performance impact of NumPy API change

2018-09-28 Thread Matti Picus
Breaking this into a number of sub-discussions, since we seem to be 
branching. The original topic was


Re: [Cython] Enhancing "ctypedef class numpy.ndarray" with getter properties

On 28/09/18 01:20, Robert Bradshaw wrote:


Hmm...so in this case upgrading Cython would cause an unconditional 
switch from direct access to a function call without any code change 
(or choice) for users of numpy.pxd. I am curious what kind of a 
slowdown this would represent (though would assume this kind of 
analysis was done by the NumPy folks when choosing macro vs. function 
for the public API).


As I point out in the "experiment" comment referenced above, pandas has
code that needs lvalue access to ndarray data, so they would be stuck
with the old API which is deprecated but still works for now. Scipy has
no such code and could move forward to the newer API.


But if we upgraded Cython, how would they access the old API? I 
suppose they could create a setter macro of their own to use in the 
(presumably few) cases where they needed an lvalue.


- Robert




NumPy changed its recommended API to an opaque one via inline getter 
functions in 2011, in this PR https://github.com/numpy/numpy/pull/116. I 
could not find a discussion on performance impact, perhaps since the 
functions are in the header files and marked inline. Hopefully the 
compilers will properly deal with making them fast. However, it is true 
that when people update to a new version of a library things change. In 
this case, there are backward-compatibility macros that revert the 
post-1.7 functions into pre-1.7 macros with the same name.


Thus for the experiment I used a new numpy.pxd, defined the pre-1.7 API 
in the pandas build (experimental changeset 
https://github.com/mattip/pandas/commit/9113bf7e55e1eddece3544c1ad3ef2a761b5210a), 
and was still able to access ndarray.data as an lvalue.


Matti


Re: [Cython] Enhancing "ctypedef class numpy.ndarray" with getter properties

2018-09-28 Thread Matti Picus

On 28/09/18 01:20, Robert Bradshaw wrote:
On Thu, Sep 27, 2018 at 11:36 PM Matti Picus <matti.pi...@gmail.com> wrote:


The problem is that when one reads

    cdef int aaa 

there's no indication as to the meaning of this. We also want to be 
sure to disallow this syntax everywhere but this one context. On the 
other hand the quotation syntax


    cdef int aaa "bbb"

already has (widespread) meaning of establishing a C alias of the name 
in question which is essentially what we're trying to do here.


I'm still, however, leaning towards the @property syntax (which we 
could allow for non-extern cdef classes as well).


- Robert


Using "PyArray_DIMS" with quotes but without parentheses would indeed be 
confusing to users and difficult to implement, so "PyArray_DIMS(*)" 
where the * is TBD seems nicer.


It sounds like the jury is still out. In order to compare the solutions, 
I will move forward with the @property decorator syntax, but to keep it 
simple I will start small: only getters and specifically for 
CFuncDefNodes. Then if you still want to look at the other option I will 
turn my "experiment" into a PR.


Matti


Re: [Cython] Mitigating performance impact of NumPy API change

2018-09-28 Thread Matti Picus

On 28/09/18 10:25, Matti Picus wrote:
Breaking this into a number of sub-discussions, since we seem to be 
branching. The original topic was


Re: [Cython] Enhancing "ctypedef class numpy.ndarray" with getter 
properties


On 28/09/18 01:20, Robert Bradshaw wrote:


Hmm...so in this case upgrading Cython would cause an 
unconditional switch from direct access to a function call without 
any code change (or choice) for users of numpy.pxd. I am curious what 
kind of a slowdown this would represent (though would assume this 
kind of analysis was done by the NumPy folks when choosing macro vs. 
function for the public API).


    As I point out in the "experiment" comment referenced above,
    pandas has code that needs lvalue access to ndarray data, so they
    would be stuck with the old API which is deprecated but still works
    for now. Scipy has no such code and could move forward to the newer
    API.


But if we upgraded Cython, how would they access the old API? I 
suppose they could create a setter macro of their own to use in the 
(presumably few) cases where they needed an lvalue.


- Robert




NumPy changed its recommended API to an opaque one via inline getter 
functions in 2011, in this PR https://github.com/numpy/numpy/pull/116. 
I could not find a discussion on performance impact, perhaps since the 
functions are in the header files and marked inline. Hopefully the 
compilers will properly deal with making them fast. However, it is 
true that when people update to a new version of a library things 
change. In this case, there are backward-compatibility macros that 
revert the post-1.7 functions into pre-1.7 macros with the same name.


Thus for the experiment I used a new numpy.pxd, defined the pre-1.7 
API in the pandas build (experimental changeset 
https://github.com/mattip/pandas/commit/9113bf7e55e1eddece3544c1ad3ef2a761b5210a), 
and was still able to access ndarray.data as an lvalue.


Matti


This means cython/numpy could provide an integration path based on numpy 
starting to ship its own numpy.pxd:


- Cython would define the macro (if not already defined) to use the 
pre-1.7 Numpy API in the numpy.pxd it ships. This would still work 
(lvalues would be allowed) after direct access is replaced with the 
getter properties, since they are macros


- NumPy would define the macro to use post-1.7 API (if not already 
defined) in the numpy.pxd it ships, which as I understand would take 
precedence over cython's. Then projects like pandas could freely upgrade 
Cython without changing their codebase, but would encounter errors when 
updating NumPy.


Matti


[Cython] How best to contribute

2021-02-26 Thread Matti Picus

Hi.

I would like to start contributing in a meaningful way to Cython on the 
order of ~1 day a week, within the framework of the time allocated to me 
from my employer (Quansight Labs) toward open source contributions. Over 
time, my goal is to push for an HPy[0] backend for Cython, but I also want 
to help the project move forward toward an official 3.0 release and in 
general to help out where needed.



So working backwards:

What are the immediate pain points that I could help out with in general?

How can I help with the 3.0 release (there are issues marked as 
"blockers", are all of them truly critical)?


And, in the long term, is there a way to start designing a new backend 
for HPy?



I would also like to suggest that Cython hold monthly open "developer 
calls" where people can meet face-to-face (on zoom) to work out 
priorities and triage particularly difficult issues or PRs. I would be 
happy to try to set this up (mainly to negotiate a time that would work) 
if you-all think it is a good idea and would contribute more than it 
would distract.



Thanks,

Matti


[0] HPy [https://hpy.readthedocs.io/en/latest]



Re: [Cython] Welcome David Woods as a Cython core developer

2022-07-31 Thread Matti Picus
As someone watching from the sidelines, it is nice to see the Cython 
team grow, especially with such a talented and committed contributor.


Matti


On 31/7/22 12:15, Stefan Behnel wrote:

Hi everyone,

with the release of the first 3.0 alpha that supports Python 3.11 
(aptly named "alpha 11"), I'm happy to announce that David Woods has 
been promoted to a Cython core developer.


David has shown an extraordinary commitment and dedication over the 
last years. His first merged commits were already back in 2015, mostly 
related to the C++ support. But within the last two years, he 
voluntarily took over more and more responsibility for bugs and issues 
and developed several major new features for the project. This 
includes the Walrus operator (PEP 572), cdef dataclasses (modelled 
after PEP 557), internal "std::move()" usage in C++ mode or support 
for Unicode identifiers and module names, all of which form a major 
part of the 3.0 feature set. David has more than deserved a place in 
the circle of present and prior core devs.


David, thank you for your impressive work on Cython,
and welcome to the core team!

Stefan


[Cython] Düsseldorf HPy/PyPy/GraalPy sprint September 19-23rd 2022

2022-08-04 Thread Matti Picus

  
  
There will be a sprint in Düsseldorf Sept 19 - 23. The sprint is
  open to anyone.



The announcement on the HPy blog 

https://hpyproject.org/blog/posts/2022/07/dusseldorf-sprint-2022/


The announcement on the PyPy blog with pointers about
  "registration", and how to find accommodation
https://www.pypy.org/posts/2022/07/ddorf-sprint-sep-2022.html


Matti
  
