On 5 October 2011 14:54, mark florisson <markflorisso...@gmail.com> wrote:
> On 5 October 2011 08:38, Robert Bradshaw <rober...@math.washington.edu> wrote:
>> On Wed, Oct 5, 2011 at 12:16 AM, Stefan Behnel <stefan...@behnel.de> wrote:
>>> mark florisson, 04.10.2011 23:19:
>>>>
>>>> So I propose that after fused types get merged we try to move as many
>>>> utility codes as possible to their utility code files (unless they are
>>>> used in pending pull requests or other branches). Preferably this will
>>>> be done in one or a few commits. How should we split up the work
>>>
>>> I would propose that new utility code gets moved out into utility files
>>> right away (if doable, given the current state of the infrastructure), and
>>> that existing utility code gets moved when it gets modified or when someone
>>> feels like it. Until we really get to the point of wanting to create a
>>> separate shared library etc., there's no need to hurry with the move.
>>>
>>>
>>>> We could actually move things before fused types get merged, as long
>>>> as we don't touch binding_cfunc_utility_code.
>>>
>>> Another reason not to hurry, right?
>>>
>>>
>>>> Before we go there, Stefan, do we still want to implement the header
>>>> .ini style which can list dependencies and such?
>>>
>>> I think we'll eventually need that, but that also depends a bit on the
>>> question whether we want to (or can) build a shared library or not. See
>>> below.
>>>
>>>
>>>> Another issue is that Cython compile time is increasing with the
>>>> addition of control flow and cython utilities. If you use fused types
>>>> you're also going to combinatorially add more compile time.
>>>
>>> I don't see that locally - a compiled Cython is hugely fast for me. In
>>> comparison, the C compiler literally takes ages to compile the result. An
>>> external shared library may or may not help with both - in particular, it is
>>> not clear to me what makes the C compiler slow. If the compile time is
>>> dominated by the number of inlined functions (which is not unlikely), a
>>> shared library + header file will not make a difference.
>>>
>>>
>>>> I'm sure
>>>> this came up earlier, but I really think we should have a libcython
>>>> and a cython.h. libcython (a shared library) should contain any common
>>>> Cython-specific code not meant to be inlined, and cython.h any types,
>>>> macros and inline functions etc.
>>>
>>> This has a couple of implications though. In order to support this on the
>>> user side, we have to build one shared library per installed package in
>>> order to avoid any Cython versioning issues. Just installing a versioned
>>> "libcython_x.y.z.so" globally isn't enough, especially during development,
>>> but also at deployment time. Different packages may use different CFLAGS or
>>> Cython options, which may have an impact on the result. Encoding all
>>> possible factors in the file name will be cumbersome and may mean that we
>>> still end up with a number of installed Cython libraries that correlates
>>> with the number of installed Cython based packages.
>>
>> That's a good point. Perhaps an easier first target is to have one
>> "libcython" per package (with a randomized or project-specific name).
>> Longer-term, I think the goal of one libcython per version is a
>> reasonable one, for deployment at least. Exceptional packages (e.g.
>> that require a special set of CFLAGS rather than the ones Python was
>> built with) can either bundle their own or forgo any sharing of code
>> as it is done now, and features that can't be easily normalized across
>> (cython and c) compilation options would remain in project-specific
>> generated .c files.
>>
>>> Next, we may not know at build time which set of Cython modules is in the
>>> package. This may be less of an issue if we rely on "cythonize()" in
>>> setup.py to compile all modules beforehand (assuming that the user doesn't
>>> call it twice, once for *.pyx, once for *.py, for example), but even if we
>>> know all modules, we'd still have to figure out the complete set of utility
>>> code used by all modules in order to build an adapted library with only the
>>> necessary code used in the package. So we'd always end up with a complete
>>> library with all utility code, which is only really interesting for larger
>>> packages with several Cython modules.
>>
>> Yes, I'm thinking we would create relatively complete libraries,
>> though if we did things on a per package level perhaps we could do
>> some pruning. We could still conditionally put some of the utility
>> code (especially the rarely used or shared stuff) into each module.
>
> Yeah that would be nice. I actually think we shouldn't do anything on
> a per-package level, only a bunch of modules with related stuff
> (conversion utilities/exception raising etc in one module,
> buffer/memoryview utilities in another etc). We've been living with
> huge files until now, I don't think we suddenly need to actively start
> pruning for a little bit of memory.
>
> I think the module approach would also be easy to implement, as the
> infrastructure for external cdef functions/classes importing/exporting
> is already there.
>
>>> I agree with Robert that a CEP would be needed for this, both for clearing
>>> up the implications and actual use cases (I know that Sage is a reasonable
>>> use case, but it's also a rather special case).
>>>
>>>
>>>> This will decrease Cython and C
>>>> compile time, and will also make executables smaller.
>>>
>>> I don't see how this actually impacts executables. However, a self-contained
>>> executable is a value in itself.
>>
>> As an example, we're starting to have full utility types, e.g. for
>> generators and/or CyFunction. Lots of the utility code (e.g. loading
>> modules, raising exceptions, etc.) could be shared as well. For
>> something like Sage that could be a significant savings, and it could
>> be a big boon for cython.inline as well.
>>
>>>> This could be
>>>> enabled using a command line option to Cython, as well as with
>>>> distutils, eventually we may decide to make it the default (let's
>>>> figure that out later). Preferably libcython.so would be installed
>>>> alongside libpython.so and cython.h inside the Python include
>>>> directory.
>>>
>>> I don't see this happening. It's easy for Python (there is only one Python
>>> running at a time, with one libpython loaded), but it's a lot less safe for
>>> different versions of a Cython library that are used by different modules
>>> inside of the running Python. For example, we'd have to version all visible
>>> symbols in operating systems with flat namespaces, in order to support
>>> loading multiple versions of the library.
>>
>> Which is another advantage to "linking" via the cimport mechanisms.
>>
>>>> Lastly, I think we also should figure out a way to serialize Entry
>>>> objects from CythonUtilities, which could easily and swiftly be loaded
>>>> when creating the cython scope. It's quite a pain to declare all
>>>> entries for utilities you write manually
>>>
>>> Why would you declare them manually? I thought everything would be moved out
>>> into the utility code files?
>>>
>>>
>>>> so what I mostly did was
>>>> parse the utility up to and including AnalyseDeclarationsTransform,
>>>> and then retrieve the entries from there.
>>>
>>> Sounds like a drawback regarding the processing time, but may still be a
>>> reasonable way to do it. I would expect that it won't be hard to pickle the
>>> resulting dict of entries into a cache file and rebuild it only when one of
>>> the utility files changes.
>>
>> +1
>>
>> It'd be great to be able to do this for the many .pxd files in Sage as
>> well. Parsing .pxd files is a huge portion of the compilation of the
>> Sage library.
>>
>> - Robert
>> _______________________________________________
>> cython-devel mailing list
>> cython-devel@python.org
>> http://mail.python.org/mailman/listinfo/cython-devel
>>
>
I expect caching the parsed declarations will also speed up the test
runner quite a lot. It currently takes forever, as there are lots of
small doctests.

<offtopic>On an unrelated note, it'd be great if we could run individual
doctests in parallel. I know py.test can do that, and maybe nosetests as
well. It'd be great to have a plugin that supports Cython (as well as C
extension modules) and can run them, plus an additional plugin that makes
it work with our various test modes and directives.</offtopic>
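
To make the cache idea from the quoted discussion concrete, here is a rough sketch of pickling the parsed entries and rebuilding them only when one of the source files changes. This is not how Cython does it. The names `files_digest`, `load_cached_entries` and `parse_entries` are hypothetical, and `parse_entries` just stands in for the expensive step (parsing up to and including AnalyseDeclarationsTransform). The sketch also assumes that whatever is returned can be pickled, which may not hold for real Entry objects without extra work.

```python
import hashlib
import os
import pickle


def files_digest(paths):
    """Return a hex digest covering the names and contents of the given files."""
    h = hashlib.sha256()
    for path in sorted(paths):
        h.update(os.path.basename(path).encode("utf-8"))
        h.update(b"\0")
        with open(path, "rb") as f:
            h.update(f.read())
        h.update(b"\0")
    return h.hexdigest()


def load_cached_entries(paths, cache_file, parse_entries):
    """Return the entries for `paths`, re-parsing only if a file changed.

    `parse_entries` is a hypothetical callable (an assumption of this
    sketch) that does the expensive parsing and returns a picklable object,
    e.g. a dict of plain data.
    """
    digest = files_digest(paths)
    try:
        with open(cache_file, "rb") as f:
            cached_digest, entries = pickle.load(f)
        if cached_digest == digest:
            return entries  # cache hit: nothing changed, skip parsing
    except (OSError, EOFError, pickle.UnpicklingError, TypeError, ValueError):
        pass  # missing or unreadable cache: fall through and rebuild

    entries = parse_entries(paths)
    tmp_file = cache_file + ".tmp"
    with open(tmp_file, "wb") as f:
        pickle.dump((digest, entries), f, protocol=pickle.HIGHEST_PROTOCOL)
    os.replace(tmp_file, cache_file)  # rename so readers never see a partial file
    return entries
```

Hashing the file contents rather than comparing mtimes is a design choice of this sketch. It costs a read of every file but avoids stale caches after a checkout that rewrites timestamps. The same scheme could in principle be applied to the .pxd files Robert mentions.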