[Cython] Multidimensional indexing of C++ objects

2015-07-03 Thread Ian Henriksen
Hi everyone,
I'm a GSoC student working to make a Cython API for DyND. DyND is a
relatively new n-dimensional array library in C++ that is based on NumPy.
A full set of Python bindings (created using Cython) is provided as a
separate package. The goal of my
project is to make it so that DyND arrays can be used easily within Cython
so that an n-dimensional array object can be used without any of the
corresponding Python overhead.

Currently, there isn't a good way to assign to multidimensional slices
within Cython. Since the indexing operator in C++ is limited to a single
argument, we use the call operator to represent multidimensional indexing,
and then use a proxy class to perform assignment to a slice.
Currently, in C++, assigning to a slice along the second axis of a DyND
array looks like this:

a(irange(), 1).vals() = 0;
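
For readers unfamiliar with the pattern, here is a minimal sketch of how a
call operator can hand back a proxy that accepts assignment. This is not
DyND's implementation: Grid, ColumnProxy, all_t and demo_zero_column are
hypothetical names, all_t only stands in for a placeholder such as irange(),
and the sketch assigns to the proxy directly instead of going through a
vals() accessor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical placeholder meaning "every index along this axis".
struct all_t {};

class Grid;  // hypothetical 2-D array with row-major storage

// Proxy that refers to one column of a Grid. Assigning a scalar to it
// writes that value into every row of the column.
class ColumnProxy {
public:
    ColumnProxy(Grid& g, std::size_t col) : grid_(g), col_(col) {}
    ColumnProxy& operator=(double value);
private:
    Grid& grid_;
    std::size_t col_;
};

class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 1.0) {}

    // Element access: both indices are passed through the call operator.
    double& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }

    // Slice along the second axis: returns a proxy instead of a value.
    ColumnProxy operator()(all_t, std::size_t c) { return ColumnProxy(*this, c); }

    std::size_t rows() const { return rows_; }
private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};

inline ColumnProxy& ColumnProxy::operator=(double value) {
    for (std::size_t r = 0; r < grid_.rows(); ++r) grid_(r, col_) = value;
    return *this;
}

// Usage: a 2x3 grid filled with ones has column 1 set to zero, then summed.
inline double demo_zero_column() {
    Grid a(2, 3);
    a(all_t{}, 1) = 0.0;  // the assignment is applied to the temporary proxy
    assert(a(0, 1) == 0.0);
    double sum = 0.0;
    for (std::size_t r = 0; r < 2; ++r)
        for (std::size_t c = 0; c < 3; ++c) sum += a(r, c);
    return sum;
}
```

The point of the sketch is that the left-hand side of the assignment is the
result of a call expression, which is exactly the form discussed next.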

Unfortunately, in Cython, only the index operator can be used for
assignment, so following the C++ syntax isn't currently possible. Does
anyone know of a good way to address this? I'm willing to spend some time
implementing a new feature if we can reach a consensus on a good way to
deal with this. Here are some possible solutions I've thought of:

1. We could allow assignment to C++ method and function calls that return
references. This has the advantage that it matches the existing syntax in
C++ for dealing with C++ objects. Though Cython is a Python-like language,
the ability to manipulate C++ objects directly is a key part of its feature
set. Since the native way to do things like multidimensional indexing in
C++ is via the call operator, it seems sensible to allow assignment to
C++-level call operations in Cython as well. This could be enabled via a
Cython compiler directive and be disabled by default. Using a compiler
directive like this would result in an interface similar to the one already
used for cdivision, wrap-around indexing, and index bounds checking. The
user would avoid unexpected results by default, but be able to get the
needed functionality simply by enabling it.

2. We could recommend that all assignment operations of this nature be
wrapped in a fake method that embeds the assignment in its C-level name.
This has the advantage that it works in current and past versions of
Cython, but it is a rather unusual hack. For example, something like the
following would work right now:

# declared as a method in a pxd file:
void assign "vals() = "(int value) except +

# used in a pyx file to perform assignment to a slice of an array a:
a(irange(), 1).assign(0)

For DyND, at least for now, this would be a workable solution since the
difference lies primarily in the placement of the parentheses and the
presence of the assignment operator. The syntax is less clear than it could
be, but it would work. On the other hand, other libraries may not be so
lucky since this involves replacing assignment to a slice with a method
call. For example, the expression template libraries Eigen and Blaze-lib
would encounter incompatibility to varying degrees if someone were to try
using them within Cython. This method also has the disadvantage that it
creates an interface that is fundamentally different from both the Python
and C++ interfaces.
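
To make the first option concrete, the following is a rough sketch of the
C++ idiom it would expose: a call that returns a reference is an lvalue, so
it can appear on the left of an assignment. Matrix, at and
demo_assign_to_call are hypothetical names chosen for illustration and are
not taken from DyND, Eigen or Blaze-lib.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D container with row-major storage.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols, 0) {}

    // The call operator returns a reference, so the call itself is an lvalue.
    int& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }

    // A named method that returns a reference behaves the same way.
    int& at(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
private:
    std::size_t cols_;
    std::vector<int> data_;
};

// Usage: assign through both kinds of call and read the values back.
inline int demo_assign_to_call() {
    Matrix m(2, 2);
    m(0, 1) = 5;     // assignment to the result of the call operator
    m.at(1, 0) = 7;  // assignment to the result of a method call
    assert(m(0, 0) == 0);  // untouched elements keep their initial value
    return m(0, 1) + m(1, 0);
}
```

Both assignments in the sketch are ordinary C++, which is why the first
option amounts to letting Cython emit the same expressions unchanged.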

I have also considered writing a proxy class that can serve as an
effective temporary value while a multidimensional index is constructed
from a series of calls to operator[]. This is a reasonable approach, but it
leads to unnecessary code bloat. It also complicates the interface exposed
to users, since operator[] would be needed for left-hand values and
operator() would be needed for right-hand values. This would also make it
so that users that want to use these C++ classes in Cython would have to
include and link against another set of headers and libraries to be able to
use the proxy class. The burden of maintainability for Python bindings
created in this way would be greater as well. This also isn't a viable
approach for using any C++ class that overloads both operators.
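
As a rough sketch of what such a proxy might look like (Table, RowProxy and
demo_chained_index are hypothetical names, and a real n-dimensional version
would need one proxy level per axis), chained calls to operator[] can build
up the index while operator() stays available for reads:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D container with row-major storage.
class Table {
public:
    // Temporary produced by the first operator[]; it remembers the row.
    class RowProxy {
    public:
        RowProxy(Table& t, std::size_t r) : table_(t), row_(r) {}
        // The second operator[] completes the index and yields an lvalue.
        double& operator[](std::size_t c) {
            return table_.data_[row_ * table_.cols_ + c];
        }
    private:
        Table& table_;
        std::size_t row_;
    };

    Table(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols, 0.0) {}

    // Used on the left-hand side: t[r][c] = value.
    RowProxy operator[](std::size_t r) { return RowProxy(*this, r); }

    // Used on the right-hand side: value = t(r, c).
    double operator()(std::size_t r, std::size_t c) const {
        return data_[r * cols_ + c];
    }
private:
    std::size_t cols_;
    std::vector<double> data_;
};

// Usage: write through chained operator[] and read back through operator().
inline double demo_chained_index() {
    Table t(2, 3);
    t[1][2] = 4.5;
    assert(t(0, 0) == 0.0);  // other elements are left untouched
    return t(1, 2);
}
```

Even in this small form the split between operator[] for writes and
operator() for reads is visible, which is the interface inconsistency
described above.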

Another option I have considered is allowing Cython's indexing operator to
dispatch to a different function. Currently, user-defined cname entries for
overloaded operators are not used. If this were changed for the indexing
operator, indexing could be performed at the C++ level using some other
method. This doesn't look like a viable approach though, since, for this to
really work, users would need some way to call different methods when a C++
object is being indexed and when it is being assigned to. Using operator[]
for left-hand values and operator() for right-hand values is a possible
solution, but that isn't a very consistent interface. Doing this would also
increase the complexity of the existing code for indexing in the Cython
compiler and could lead to name collisions for classes that overload both
operator[] and operator().

Are any of these acceptable ways to go forward? Does anyone have any better
ideas? My preference would definitely b

Re: [Cython] Multidimensional indexing of C++ objects

2015-07-03 Thread Stefan Behnel
Hi Ian!

Ian Henriksen wrote on 04.07.2015 at 00:43:
> I'm a GSOC student working to make a Cython API for DyND. DyND is a
> relatively new n-dimensional array library in C++ that is based on NumPy.
> A full set of Python bindings (created using Cython) are provided as a
> separate package. The goal of my
> project is to make it so that DyND arrays can be used easily within Cython
> so that an n-dimensional array object can be used without any of the
> corresponding Python overhead.
> 
> Currently, there isn't a good way to assign to multidimensional slices
> within Cython. Since the indexing operator in C++ is limited to a single
> argument, we use the call operator to represent multidimensional indexing,
> and then use a proxy class to perform assignment to a slice.
> Currently, in C++, assigning to a slice along the second axis of a DyND
> array looks like this:
> 
> a(irange(), 1).vals() = 0;
> 
> Unfortunately, in Cython, only the index operator can be used for
> assignment, so following the C++ syntax isn't currently possible. Does
> anyone know of a good way to address this?

Just an idea, don't know how feasible this is, but we could allow inline
special methods in C++ class declarations that implement Python protocols.
Example:

cdef extern from ...:
    cppclass Array2D:
        int operator[] except +
        int getItemAt(ssize_t x, ssize_t y) except +

        cdef inline __getitem__(self, Py_ssize_t x, Py_ssize_t y):
            return self.getItemAt(x, y)

def test():
    cdef Array2D a
    return a[1, 2]

Cython could then translate an item access on an Array2D instance into the
corresponding special "method" call.

Drawbacks:

1) The example above would conflict with the C++ [] operator, so it would
be ambiguous which one is being used in Cython code. Not sure if there's a
use case for making both available to Cython code, but that would be
difficult to achieve if the need arises.

2) It doesn't solve the general problem of assigning to C++ expressions,
especially because it does not extend the syntax allowed by Cython, which
would still limit what you can do in these fake special methods.

Regarding your proposals, I'd be happy if we could avoid adding syntax
support for assigning to function calls. And I agree that the cname
assignment hack is really just a big hack. It shouldn't be relied on.

Stefan

___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel