On Tue, Sep 1, 2015 at 8:16 AM, Nathaniel Smith <n...@pobox.com> wrote:
> On Sun, Aug 30, 2015 at 2:44 PM, David Cournapeau <courn...@gmail.com> wrote:
> > Hi there,
> >
> > Reading Nathaniel's summary from the numpy dev meeting, it looks like
> > there is a consensus on using cython in numpy for the Python-C interfaces.
> >
> > This has been on my radar for a long time: that was one of my rationales
> > for splitting multiarray into multiple "independent" .c files half a
> > decade ago. I took the opportunity of EuroScipy sprints to look back into
> > this, but before looking more into it, I'd like to make sure I am not
> > going astray:
> >
> > 1. The transition has to be gradual
>
> Yes, definitely.
>
> > 2. The obvious way I can think of allowing cython in multiarray is
> > modifying multiarray so that cython "owns" the PyMODINIT_FUNC and the
> > module PyModuleDef table.
>
> That seems like a plausible place to start.
>
> In the longer run, I think we'll need to figure out a strategy to have
> source code divided over multiple .pyx files (for the same reason we
> want multiple .c files -- it'll just be impossible to work with
> otherwise). And this will be difficult for annoying technical reasons,
> since we definitely do *not* want to increase the API surface exposed
> by multiarray.so, so we will need to compile these multiple .pyx and
> .c files into a single module, and have them talk to each other via
> internal interfaces. But Cython is currently very insistent that every
> .pyx file should be its own extension module, and the interface
> between different files should be via public APIs.
>
> I spent some time poking at this, and I think it's possible but will
> take a few kluges at least initially. IIRC the tricky points I noticed
> are:
>
> - For everything except the top-level .pyx file, we'd need to call the
> generated module initialization functions "by hand", and have a bit of
> utility code to let us access the symbol tables for the resulting
> modules
>
> - We'd need some preprocessor hack (or something?) to prevent the
> non-main module initialization functions from being exposed at the .so
> level (like 'cdef extern from "foo.h"', where 'foo.h' re-#defines
> PyMODINIT_FUNC to remove the visibility declaration)
>
> - By default 'cdef' functions are name-mangled, which is annoying if
> you want to be able to do direct C calls between different .pyx and .c
> files. You can fix this by adding a 'public' declaration to your cdef
> function. But 'public' also adds dllexport stuff which would need to
> be hacked out as per above.
>
> I think the best strategy for this is to do whatever horrible things
> are necessary to get an initial version working (on a branch, of
> course), and then once that's done assess what changes we want to ask
> the cython folks for to let us eliminate the gross parts.

Agreed.

Regarding multiple cython .pyx files and symbol pollution, I think it would
be fine to have an internal API with the required prefix (say `_npy_cpy_`)
in a core library, and control the exported symbols at the .so level. This
is how many large libraries work in practice (e.g. MKL), and is a model well
understood by library users.

I will start the cythonize process without caring about any of that though:
one large .pyx file, and everything built together by putting everything in
one .so. That will avoid having to fight both cython and distutils at the
same time :)

David

> (Insisting on compiling everything into the same .so will probably
> also help at some point in avoiding Cython-Related Binary Size Blowup
> Syndrome (CRBSBS), because the masses of boilerplate could in
> principle be shared between the different files. I think some modern
> linkers are even clever enough to eliminate this kind of duplicate
> code automatically, since C++ suffers from a similar problem.)
>
> > 3. We start using cython for the parts that are mostly menial refcount
> > work. Things like functions in calculation.c are obvious candidates.
> >
> > Step 2 should not be disruptive, and does not look like a lot of work:
> > there are < 60 methods in the table, and most of them should be fairly
> > straightforward to cythonize. At worst, we could just keep them as is
> > outside cython and just "export" them in cython.
> >
> > Does that sound like an acceptable plan ?
> >
> > If so, I will start working on a PR to work on 2.
>
> Makes sense to me!
>
> -n
>
> --
> Nathaniel J. Smith -- http://vorpus.org
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
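To make the "control the exported symbols at the .so level" idea a bit more concrete, here is a rough sketch, not something taken from the numpy build. It assumes a GNU-style linker that accepts a version script (for example via `-Wl,--version-script=...`), and that the only symbol we want visible is the module init function (`PyInit_<name>` on Python 3, `init<name>` on Python 2). The helper names `version_script` and `unexpected_exports`, and the symbol `_npy_cpy_add`, are made up for illustration.

```python
def version_script(module_name):
    """Build a GNU ld version script exporting only the module init symbol.

    Hypothetical helper: everything not listed under 'global' is made local,
    so internal helpers (e.g. with a _npy_cpy_ prefix) stay out of the
    dynamic symbol table.
    """
    lines = [
        "{",
        "  global:",
        "    PyInit_%s;" % module_name,  # Python 3 entry point
        "    init%s;" % module_name,     # Python 2 entry point
        "  local:",
        "    *;",                        # hide every other symbol
        "};",
    ]
    return "\n".join(lines) + "\n"


def unexpected_exports(symbols, module_name):
    """Return exported symbol names other than the module init function.

    'symbols' is assumed to be a list of names collected from the built .so
    by some external tool (for instance the output of `nm -D --defined-only`,
    already parsed into names).
    """
    allowed = {"PyInit_" + module_name, "init" + module_name}
    return sorted(set(symbols) - allowed)


if __name__ == "__main__":
    print(version_script("multiarray"))
    # _npy_cpy_add stands in for an internal helper that leaked out.
    print(unexpected_exports(["PyInit_multiarray", "_npy_cpy_add"], "multiarray"))
```

A check like `unexpected_exports` could run after the build to flag internal helpers that ended up exported. Whether a version script or a visibility attribute in the C sources is the better mechanism is left open here.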