Re: [Cython] Utilities, cython.h, libcython

2011-10-05 Thread Stefan Behnel

mark florisson, 04.10.2011 23:19:

> So I propose that after fused types gets merged we try to move as many
> utility codes as possible to their utility code files (unless they are
> used in pending pull requests or other branches). Preferably this will
> be done in one or a few commits. How should we split up the work


I would propose that new utility code gets moved out into utility files 
right away (if doable, given the current state of the infrastructure), and 
that existing utility code gets moved when it gets modified or when someone
feels like it. Until we really get to the point of wanting to create a 
separate shared library etc., there's no need to hurry with the move.




> We could actually move things before fused types get merged, as long
> as we don't touch binding_cfunc_utility_code.


Another reason not to hurry, right?



> Before we go there, Stefan, do we still want to implement the header
> .ini style which can list dependencies and such?


I think we'll eventually need that, but that also depends a bit on the 
question whether we want to (or can) build a shared library or not. See below.
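
To make the idea a bit more concrete, here is a minimal sketch of what reading such an .ini-style header could look like with Python's configparser. The section names and the "requires" key are invented for this example; no format has been agreed on.

```python
import configparser

# Hypothetical header text. The section names and the "requires" key are
# assumptions made for this sketch, not an agreed format.
HEADER = """
[RaiseException]
requires = PyErrFetchRestore

[PyErrFetchRestore]
requires =

[GetBuiltinName]
requires = RaiseException, PyErrFetchRestore
"""

def load_dependencies(text):
    """Map each utility name to the list of utilities it depends on."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    deps = {}
    for name in parser.sections():
        raw = parser.get(name, "requires", fallback="")
        # Split the comma-separated value and drop empty entries.
        deps[name] = [item.strip() for item in raw.split(",") if item.strip()]
    return deps

deps = load_dependencies(HEADER)
```

A loader along these lines would give the compiler the dependency list without having to parse the utility code itself.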




> Another issue is that Cython compile time is increasing with the
> addition of control flow and cython utilities. If you use fused types
> you're also going to combinatorially add more compile time.


I don't see that locally - a compiled Cython is hugely fast for me. In 
comparison, the C compiler literally takes ages to compile the result. An 
external shared library may or may not help with both - in particular, it 
is not clear to me what makes the C compiler slow. If the compile time is 
dominated by the number of inlined functions (which is not unlikely), a 
shared library + header file will not make a difference.




> I'm sure
> this came up earlier, but I really think we should have a libcython
> and a cython.h. libcython (a shared library) should contain any common
> Cython-specific code not meant to be inlined, and cython.h any types,
> macros and inline functions etc.


This has a couple of implications though. In order to support this on the 
user side, we have to build one shared library per installed package in 
order to avoid any Cython versioning issues. Just installing a versioned 
"libcython_x.y.z.so" globally isn't enough, especially during development, 
but also at deployment time. Different packages may use different CFLAGS or 
Cython options, which may have an impact on the result. Encoding all 
possible factors in the file name will be cumbersome and may mean that we 
still end up with a number of installed Cython libraries that correlates 
with the number of installed Cython based packages.
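
A rough sketch of what "encoding the factors in the file name" might amount to, assuming we simply hash them. The function name, the naming scheme and the choice of factors are all hypothetical, and sorting the flags is a simplification (flag order can matter to a compiler).

```python
import hashlib

def library_name(cython_version, cflags, cython_options):
    """Build a hypothetical library file name from the build factors."""
    factors = [cython_version]
    # Sort so that the same set of flags always gives the same digest.
    factors += sorted(cflags)
    factors += sorted("%s=%s" % item for item in cython_options.items())
    digest = hashlib.sha1("\n".join(factors).encode("utf-8")).hexdigest()[:8]
    return "libcython_%s_%s.so" % (cython_version, digest)

# Two packages built with the same Cython but different CFLAGS.
name_a = library_name("0.15.1", ["-O2", "-g"], {"boundscheck": False})
name_b = library_name("0.15.1", ["-O3"], {"boundscheck": False})
```

The two names differ, so each package still ends up with its own library, which is exactly the correlation described above.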


Next, we may not know at build time which set of Cython modules is in the 
package. This may be less of an issue if we rely on "cythonize()" in 
setup.py to compile all modules beforehand (assuming that the user doesn't
call it twice, once for *.pyx, once for *.py, for example), but even if we 
know all modules, we'd still have to figure out the complete set of utility 
code used by all modules in order to build an adapted library with only the 
necessary code used in the package. So we'd always end up with a complete 
library with all utility code, which is only really interesting for larger 
packages with several Cython modules.
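
For illustration, figuring out the complete set could look roughly like the sketch below, assuming each module reports the utility names it uses and that dependencies between utilities are known. All names and the data layout are made up for the example.

```python
def required_utilities(modules, dependencies):
    """Return every utility needed by any module, including dependencies."""
    needed = set()
    pending = [name for used in modules.values() for name in used]
    while pending:
        name = pending.pop()
        if name in needed:
            continue
        needed.add(name)
        # Follow the dependencies of this utility as well.
        pending.extend(dependencies.get(name, ()))
    return needed

# Hypothetical package with two modules.
modules = {
    "pkg.a": ["RaiseException"],
    "pkg.b": ["GetBuiltinName", "MemviewSlice"],
}
dependencies = {
    "RaiseException": ["PyErrFetchRestore"],
    "GetBuiltinName": ["RaiseException"],
}
needed = required_utilities(modules, dependencies)
```

The catch is that this only works once all modules of the package have been compiled, which is the build-time problem mentioned above.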


I agree with Robert that a CEP would be needed for this, both for clearing 
up the implications and actual use cases (I know that Sage is a reasonable 
use case, but it's also a rather special case).




> This will decrease Cython and C
> compile time, and will also make executables smaller.


I don't see how this actually impacts executables. However, a 
self-contained executable is a value in itself.




> This could be
> enabled using a command line option to Cython, as well as with
> distutils, eventually we may decide to make it the default (let's
> figure that out later). Preferably libcython.so would be installed
> alongside libpython.so and cython.h inside the Python include
> directory.


I don't see this happening. It's easy for Python (there is only one Python 
running at a time, with one libpython loaded), but it's a lot less safe for 
different versions of a Cython library that are used by different modules 
inside of the running Python. For example, we'd have to version all visible 
symbols in operating systems with flat namespaces, in order to support 
loading multiple versions of the library.




> Lastly, I think we also should figure out a way to serialize Entry
> objects from CythonUtilities, which could easily and swiftly be loaded
> when creating the cython scope. It's quite a pain to declare all
> entries for utilities you write manually


Why would you declare them manually? I thought everything would be moved 
out into the utility code files?




> so what I mostly did was
> parse the utility up to and including AnalyseDeclarationsTransform,
> and then retrieve the entries from there.


Sounds like a drawback regarding the processing time, but may still be a 
reasonable way to do it. I

Re: [Cython] Utilities, cython.h, libcython

2011-10-05 Thread Robert Bradshaw
On Wed, Oct 5, 2011 at 12:16 AM, Stefan Behnel wrote:
> mark florisson, 04.10.2011 23:19:
>>
>> So I propose that after fused types gets merged we try to move as many
>> utility codes as possible to their utility code files (unless they are
>> used in pending pull requests or other branches). Preferably this will
>> be done in one or a few commits. How should we split up the work
>
> I would propose that new utility code gets moved out into utility files
> right away (if doable, given the current state of the infrastructure), and
> that existing utility code gets moved when it gets modified or when someone
> feels like it. Until we really get to the point of wanting to create a
> separate shared library etc., there's no need to hurry with the move.
>
>
>> We could actually move things before fused types get merged, as long
>> as we don't touch binding_cfunc_utility_code.
>
> Another reason not to hurry, right?
>
>
>> Before we go there, Stefan, do we still want to implement the header
>> .ini style which can list dependencies and such?
>
> I think we'll eventually need that, but that also depends a bit on the
> question whether we want to (or can) build a shared library or not. See
> below.
>
>
>> Another issue is that Cython compile time is increasing with the
>> addition of control flow and cython utilities. If you use fused types
>> you're also going to combinatorially add more compile time.
>
> I don't see that locally - a compiled Cython is hugely fast for me. In
> comparison, the C compiler literally takes ages to compile the result. An
> external shared library may or may not help with both - in particular, it is
> not clear to me what makes the C compiler slow. If the compile time is
> dominated by the number of inlined functions (which is not unlikely), a
> shared library + header file will not make a difference.
>
>
>> I'm sure
>> this came up earlier, but I really think we should have a libcython
>> and a cython.h. libcython (a shared library) should contain any common
>> Cython-specific code not meant to be inlined, and cython.h any types,
>> macros and inline functions etc.
>
> This has a couple of implications though. In order to support this on the
> user side, we have to build one shared library per installed package in
> order to avoid any Cython versioning issues. Just installing a versioned
> "libcython_x.y.z.so" globally isn't enough, especially during development,
> but also at deployment time. Different packages may use different CFLAGS or
> Cython options, which may have an impact on the result. Encoding all
> possible factors in the file name will be cumbersome and may mean that we
> still end up with a number of installed Cython libraries that correlates
> with the number of installed Cython based packages.

That's a good point. Perhaps an easier first target is to have one
"libcython" per package (with a randomized or project-specific name).
Longer-term, I think the goal of one libcython per version is a
reasonable one, for deployment at least. Exceptional packages (e.g.
that require a special set of CFLAGS rather than the ones Python was
built with) can either bundle their own or forgo any sharing of code
as it is done now, and features that can't be easily normalized across
(cython and c) compilation options would remain in project-specific
generated .c files.

> Next, we may not know at build time which set of Cython modules is in the
> package. This may be less of an issue if we rely on "cythonize()" in
> setup.py to compile all modules beforehand (assuming that the user doesn't
> call it twice, once for *.pyx, once for *.py, for example), but even if we
> know all modules, we'd still have to figure out the complete set of utility
> code used by all modules in order to build an adapted library with only the
> necessary code used in the package. So we'd always end up with a complete
> library with all utility code, which is only really interesting for larger
> packages with several Cython modules.

Yes, I'm thinking we would create relatively complete libraries,
though if we did things on a per package level perhaps we could do
some pruning. We could still conditionally put some of the utility
code (especially the rarely used or shared stuff) into each module.
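
A possible way to do that kind of split, sketched under the assumption that we know which utilities each module uses: anything used by at least a given number of modules goes into the shared library, the rest stays in the module that needs it. The function name, the threshold and the utility names are hypothetical.

```python
from collections import Counter

def split_utilities(modules, min_users=2):
    """Split utilities into a shared set and per-module leftovers."""
    # Count in how many modules each utility is used.
    counts = Counter(name for used in modules.values() for name in set(used))
    shared = {name for name, n in counts.items() if n >= min_users}
    per_module = {
        mod: sorted(set(used) - shared) for mod, used in modules.items()
    }
    return shared, per_module

# Hypothetical usage data for a three-module package.
modules = {
    "pkg.a": ["RaiseException", "Generator"],
    "pkg.b": ["RaiseException"],
    "pkg.c": ["RaiseException", "CyFunction"],
}
shared, per_module = split_utilities(modules)
```

Here only RaiseException would be shared, while Generator and CyFunction stay in the single module that uses them.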

> I agree with Robert that a CEP would be needed for this, both for clearing
> up the implications and actual use cases (I know that Sage is a reasonable
> use case, but it's also a rather special case).
>
>
>> This will decrease Cython and C
>> compile time, and will also make executables smaller.
>
> I don't see how this actually impacts executables. However, a self-contained
> executable is a value in itself.

As an example, we're starting to have full utility types, e.g. for
generators and/or CyFunction. Lots of the utility code (e.g. loading
modules, raising exceptions, etc.) could be shared as well. For
something like Sage that could be a significant savings, and it could
be a big boon for cython.inline as well.

Re: [Cython] [cython-users] Re: callback function pointer problem

2011-10-05 Thread Robert Bradshaw
On Fri, Sep 30, 2011 at 2:14 PM, Dag Sverre Seljebotn wrote:
> Are you saying that when coercing a struct to an object, one would copy
> scalar fields by value but reference array fields? -1, that would be
> confusing. Either the whole struct through a view, or copy it all.

+1
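
The mixed behaviour being rejected here can be seen in Python's ctypes, used purely as an analogy (this is not how Cython coerces structs, and the Record type is made up): reading a scalar field gives a copy, while reading an array field gives a wrapper that shares the struct's memory.

```python
import ctypes

class Record(ctypes.Structure):
    # One scalar field and one array field, as in the case discussed above.
    _fields_ = [("n", ctypes.c_int), ("data", ctypes.c_int * 3)]

rec = Record(1, (ctypes.c_int * 3)(10, 20, 30))

n = rec.n        # plain Python int: a copy of the scalar field
data = rec.data  # ctypes array wrapper: shares memory with rec

# Modify the struct after the two reads.
rec.n = 2
rec.data[0] = 99

n_after = n              # the copied scalar does not follow the change
data0_after = data[0]    # the array wrapper does
```

After the modification the scalar read earlier is stale while the array reflects the change, which is the kind of surprise a uniform "all view" or "all copy" rule avoids.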

> It bothers me that structs are passed by value in Cython, but it seems
> impossible to change that now. (i.e, once upon a time one could have
> required the use of a copy method to do a struct assignment and give a
> syntax error otherwise, which would have worked nicer with Python
> semantics).

Of course, to do otherwise would have resulted in "pure C" code
behaving very differently from C and messy issues like "cdef int
f(struct_type a)" either meaning different things in an extern block
or not mapping to the "obvious" C signature.

On this note, eventually I would like to coerce structs (and unions,
enums) to auto-generated wrapper classes, visible in the Python module
namespace if one declares them as "cpdef struct ..." (even if they're
extern).

- Robert
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Utilities, cython.h, libcython

2011-10-05 Thread mark florisson
On 5 October 2011 08:16, Stefan Behnel wrote:
> mark florisson, 04.10.2011 23:19:
>>
>> So I propose that after fused types gets merged we try to move as many
>> utility codes as possible to their utility code files (unless they are
>> used in pending pull requests or other branches). Preferably this will
>> be done in one or a few commits. How should we split up the work
>
> I would propose that new utility code gets moved out into utility files
> right away (if doable, given the current state of the infrastructure), and
> that existing utility code gets moved when it gets modified or when someone
> feels like it. Until we really get to the point of wanting to create a
> separate shared library etc., there's no need to hurry with the move.
>
>
>> We could actually move things before fused types get merged, as long
>> as we don't touch binding_cfunc_utility_code.
>
> Another reason not to hurry, right?
>
>
>> Before we go there, Stefan, do we still want to implement the header
>> .ini style which can list dependencies and such?
>
> I think we'll eventually need that, but that also depends a bit on the
> question whether we want to (or can) build a shared library or not. See
> below.
>
>
>> Another issue is that Cython compile time is increasing with the
>> addition of control flow and cython utilities. If you use fused types
>> you're also going to combinatorially add more compile time.
>
> I don't see that locally - a compiled Cython is hugely fast for me. In
> comparison, the C compiler literally takes ages to compile the result. An
> external shared library may or may not help with both - in particular, it is
> not clear to me what makes the C compiler slow. If the compile time is
> dominated by the number of inlined functions (which is not unlikely), a
> shared library + header file will not make a difference.
>

Have you tried with the memoryviews merged? e.g. if I have this code:

from libc.stdlib cimport malloc
cdef int[:] slice = <int[:10]> <int *> malloc(sizeof(int) * 10)

[0] [14:45] ~  ➤ time cython test.pyx
cython test.pyx  2.61s user 0.08s system 99% cpu 2.695 total
[0] [14:45] ~  ➤ time zsh compile
zsh compile  1.88s user 0.06s system 99% cpu 1.946 total

where 'compile' is the script that invokes the same gcc command
distutils uses. As you can see, Cython took more than 2.5 seconds to
compile this code (simply because the memoryview utilities get
included). The C compiler does it quite a lot faster here. This
obviously depends largely on your code; you can probably have it the
other way around as well.
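
For anyone who wants to repeat the comparison without relying on the shell's time builtin, a small sketch follows. The time_command helper is made up for this example, and the command timed here is only a stand-in so that the snippet runs anywhere.

```python
import subprocess
import sys
import time

def time_command(argv):
    """Run a command and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.check_call(argv)
    return time.perf_counter() - start

# Stand-in command. Replace it with ["cython", "test.pyx"] and with the
# C compiler invocation to compare the two steps on your own code.
elapsed = time_command([sys.executable, "-c", "pass"])
print("%.2fs" % elapsed)
```

Note that this measures wall-clock time only, not the user/system split shown above.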

>> I'm sure
>> this came up earlier, but I really think we should have a libcython
>> and a cython.h. libcython (a shared library) should contain any common
>> Cython-specific code not meant to be inlined, and cython.h any types,
>> macros and inline functions etc.
>
> This has a couple of implications though. In order to support this on the
> user side, we have to build one shared library per installed package in
> order to avoid any Cython versioning issues. Just installing a versioned
> "libcython_x.y.z.so" globally isn't enough, especially during development,
> but also at deployment time. Different packages may use different CFLAGS or
> Cython options, which may have an impact on the result. Encoding all
> possible factors in the file name will be cumbersome and may mean that we
> still end up with a number of installed Cython libraries that correlates
> with the number of installed Cython based packages.

Hm, I think the CFLAGS are important so long as they are compatible
with Python. When the user compiles a Cython extension module with
extra CFLAGS, this doesn't affect libpython. Similarly, the Cython
utilities are really not the user's responsibility, so libcython
doesn't need to be compiled with the same flags as the extension
module. If still wanted, the user could either recompile python with
different CFLAGS (which means libcython will get those as well), or
not use libcython at all. CFLAGS should really only pertain to user
code, not to the Cython library, which the user shouldn't be concerned
about.

> Next, we may not know at build time which set of Cython modules is in the
> package. This may be less of an issue if we rely on "cythonize()" in
> setup.py to compile all modules beforehand (assuming that the user doesn't
> call it twice, once for *.pyx, once for *.py, for example), but even if we
> know all modules, we'd still have to figure out the complete set of utility
> code used by all modules in order to build an adapted library with only the
> necessary code used in the package. So we'd always end up with a complete
> library with all utility code, which is only really interesting for larger
> packages with several Cython modules.
> I agree with Robert that a CEP would be needed for this, both for clearing
> up the implications and actual use cases (I know that Sage is a reasonable
> use case, but it's also a rather special case).
>
>
>> This will decrease Cython and C
>> compile time, and 

Re: [Cython] Utilities, cython.h, libcython

2011-10-05 Thread mark florisson
On 5 October 2011 08:38, Robert Bradshaw wrote:
> On Wed, Oct 5, 2011 at 12:16 AM, Stefan Behnel wrote:
>> mark florisson, 04.10.2011 23:19:
>>>
>>> So I propose that after fused types gets merged we try to move as many
>>> utility codes as possible to their utility code files (unless they are
>>> used in pending pull requests or other branches). Preferably this will
>>> be done in one or a few commits. How should we split up the work
>>
>> I would propose that new utility code gets moved out into utility files
>> right away (if doable, given the current state of the infrastructure), and
>> that existing utility code gets moved when it gets modified or when someone
>> feels like it. Until we really get to the point of wanting to create a
>> separate shared library etc., there's no need to hurry with the move.
>>
>>
>>> We could actually move things before fused types get merged, as long
>>> as we don't touch binding_cfunc_utility_code.
>>
>> Another reason not to hurry, right?
>>
>>
>>> Before we go there, Stefan, do we still want to implement the header
>>> .ini style which can list dependencies and such?
>>
>> I think we'll eventually need that, but that also depends a bit on the
>> question whether we want to (or can) build a shared library or not. See
>> below.
>>
>>
>>> Another issue is that Cython compile time is increasing with the
>>> addition of control flow and cython utilities. If you use fused types
>>> you're also going to combinatorially add more compile time.
>>
>> I don't see that locally - a compiled Cython is hugely fast for me. In
>> comparison, the C compiler literally takes ages to compile the result. An
>> external shared library may or may not help with both - in particular, it is
>> not clear to me what makes the C compiler slow. If the compile time is
>> dominated by the number of inlined functions (which is not unlikely), a
>> shared library + header file will not make a difference.
>>
>>
>>> I'm sure
>>> this came up earlier, but I really think we should have a libcython
>>> and a cython.h. libcython (a shared library) should contain any common
>>> Cython-specific code not meant to be inlined, and cython.h any types,
>>> macros and inline functions etc.
>>
>> This has a couple of implications though. In order to support this on the
>> user side, we have to build one shared library per installed package in
>> order to avoid any Cython versioning issues. Just installing a versioned
>> "libcython_x.y.z.so" globally isn't enough, especially during development,
>> but also at deployment time. Different packages may use different CFLAGS or
>> Cython options, which may have an impact on the result. Encoding all
>> possible factors in the file name will be cumbersome and may mean that we
>> still end up with a number of installed Cython libraries that correlates
>> with the number of installed Cython based packages.
>
> That's a good point. Perhaps an easier first target is to have one
> "libcython" per package (with a randomized or project-specific name).
> Longer-term, I think the goal of one libcython per version is a
> reasonable one, for deployment at least. Exceptional packages (e.g.
> that require a special set of CFLAGS rather than the ones Python was
> built with) can either bundle their own or forgo any sharing of code
> as it is done now, and features that can't be easily normalized across
> (cython and c) compilation options would remain in project-specific
> generated .c files.
>
>> Next, we may not know at build time which set of Cython modules is in the
>> package. This may be less of an issue if we rely on "cythonize()" in
>> setup.py to compile all modules beforehand (assuming that the user doesn't
>> call it twice, once for *.pyx, once for *.py, for example), but even if we
>> know all modules, we'd still have to figure out the complete set of utility
>> code used by all modules in order to build an adapted library with only the
>> necessary code used in the package. So we'd always end up with a complete
>> library with all utility code, which is only really interesting for larger
>> packages with several Cython modules.
>
> Yes, I'm thinking we would create relatively complete libraries,
> though if we did things on a per package level perhaps we could do
> some pruning. We could still conditionally put some of the utility
> code (especially the rarely used or shared stuff) into each module.

Yeah that would be nice. I actually think we shouldn't do anything on
a per-package level, only a bunch of modules with related stuff
(conversion utilities/exception raising etc in one module,
buffer/memoryview utilities in another etc). We've been living with
huge files until now, I don't think we suddenly need to actively start
pruning for a little bit of memory.

I think the module approach would also be easy to implement, as the
infrastructure for external cdef functions/classes importing/exporting
is already there.

>> I agree with Robert that a CEP would be needed for thi

Re: [Cython] Utilities, cython.h, libcython

2011-10-05 Thread mark florisson
On 5 October 2011 01:46, Robert Bradshaw wrote:
> On Tue, Oct 4, 2011 at 2:19 PM, mark florisson wrote:
>> Hey,
>>
>> I briefly mentioned something about this in a pull request, but maybe
>> it deserves some actual discussion on the ML.
>>
>> So I propose that after fused types gets merged we try to move as many
>> utility codes as possible to their utility code files (unless they are
>> used in pending pull requests or other branches). Preferably this will
>> be done in one or a few commits. How should we split up the work, any
>> volunteers? Perhaps people who wrote certain utilities also want to
>> move them? In that case, we should start a new branch and then merge
>> that into master when it's done.
>> We could actually move things before fused types get merged, as long
>> as we don't touch binding_cfunc_utility_code.
>
> +1 to moving towards this, but I don't see the urgency or need to do
> it all at once (though if there's going to be a big push, let's
> coordinate on a wiki or trac).

Hm, perhaps there is no strict need to hurry, as long as we take care
not to modify utilities after they have been moved. The wiki could be
great for that, but I personally don't keep track of everyone's
branches, so I don't know which utility is modified by whom (if at
all), so strictly speaking (to avoid painful merges) I'd have to ask
everyone each time I wanted to move something, or dig through
everyone's branches.

>> Before we go there, Stefan, do we still want to implement the header
>> .ini style which can list dependencies and such? I personally don't
>> care very much about it, but memoryviews and the utility loaders are
>> merged so if someone wants to take up that job, it'd be good to do
>> before moving the utilities.
>>
>> Another issue is that Cython compile time is increasing with the
>> addition of control flow and cython utilities. If you use fused types
>> you're also going to combinatorially add more compile time.
>
> Yeah, this was especially obvious with, e.g. cython.compile(...). (In
> particular, some utility code was being parsed before it could even
> figure out whether it needed to do a full re-compile...)
>
>> I'm sure
>> this came up earlier, but I really think we should have a libcython
>> and a cython.h. libcython (a shared library) should contain any common
>> Cython-specific code not meant to be inlined, and cython.h any types,
>> macros and inline functions etc. This will decrease Cython and C
>> compile time, and will also make executables smaller.
>
> +1. Yes, we talked about this earlier, but nothing concrete was
> planned. It's probably worth a CEP, if anything to have a concrete
> plan recorded somewhere other than a series of mailing list threads
> (though discussion tends to work best here).
>
>> This could be
>> enabled using a command line option to Cython, as well as with
>> distutils, eventually we may decide to make it the default (let's
>> figure that out later). Preferably libcython.so would be installed
>> alongside libpython.so and cython.h inside the Python include
>> directory. Assuming multiple versions of Cython and multiple Python
>> installations, we'd need to come up with a versioning scheme for
>> either.
>
> I would propose a cython.h file that sits in Cython/Compiler/Include
> (or similar), as a first step. The .pyx -> .c pass could be configured
> to copy this to a specific location (for shipping just the generated
> .c files).

That would be fine as well. It might be convenient for users in that
case if we could provide a cython.get_include() in addition to the
distutils hooks, and a cython-config script.

> One option is to build the shared library as a companion
> _cython_x_y_z.so module which, while not as efficient as linking at
> the C level, would probably be much simpler to implement in a
> cross-platform way. (This perhaps merits some benchmarks, but the main
> contents is likely to be things like shared classes and objects.)
> Actually linking .so files from modules that cimport each other would
> be a nice feature down the road anyways. Again, the associated .c file
> could be (optionally) generated/copied during the .pyx -> .c step.
> Installation would determine if the required module exists, and if not
> build and install it.

Hm, that's a really good idea. I think the only overhead would be the
capsule unpacking and pointer duplication, but that shouldn't suddenly
be an issue. That means we don't have to do any versioning of the
libraries and the symbols to avoid clashes in flat namespaces, as
Stefan mentioned.
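
The "determine if the required module exists" step could be as simple as the sketch below. The _cython_x_y_z naming follows the suggestion quoted above, but the helper functions are hypothetical, and actually building and installing the module is left out.

```python
import importlib.util

def companion_module_name(version):
    """Turn a version string like "0.15.1" into "_cython_0_15_1"."""
    return "_cython_" + version.replace(".", "_")

def needs_build(version):
    """Return True if the companion module cannot be found on sys.path."""
    name = companion_module_name(version)
    return importlib.util.find_spec(name) is None

name = companion_module_name("0.15.1")
missing = needs_build("0.15.1")
```

Since each Cython version gets its own module name, several versions can be installed side by side and Python's import machinery takes care of picking the right one.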

>> We could also provide a static library there, for users who want to
>> link and ship a compiled and statically linked version of their code.
>> For a local Cython that isn't built, we can ignore the header and
>> shared library option and issue a warning or some such.
>>
>> Lastly, I think we also should figure out a way to serialize Entry
>> objects from CythonUtilities, which could easily and swiftly be loaded
>> when creating 

Re: [Cython] Utilities, cython.h, libcython

2011-10-05 Thread mark florisson
On 5 October 2011 14:54, mark florisson wrote:
> On 5 October 2011 08:38, Robert Bradshaw wrote:
>> On Wed, Oct 5, 2011 at 12:16 AM, Stefan Behnel wrote:
>>> mark florisson, 04.10.2011 23:19:

>>>> So I propose that after fused types gets merged we try to move as many
>>>> utility codes as possible to their utility code files (unless they are
>>>> used in pending pull requests or other branches). Preferably this will
>>>> be done in one or a few commits. How should we split up the work
>>>
>>> I would propose that new utility code gets moved out into utility files
>>> right away (if doable, given the current state of the infrastructure), and
>>> that existing utility code gets moved when it gets modified or when someone
>>> feels like it. Until we really get to the point of wanting to create a
>>> separate shared library etc., there's no need to hurry with the move.
>>>
>>>
>>>> We could actually move things before fused types get merged, as long
>>>> as we don't touch binding_cfunc_utility_code.
>>>
>>> Another reason not to hurry, right?
>>>
>>>
>>>> Before we go there, Stefan, do we still want to implement the header
>>>> .ini style which can list dependencies and such?
>>>
>>> I think we'll eventually need that, but that also depends a bit on the
>>> question whether we want to (or can) build a shared library or not. See
>>> below.
>>>
>>>
>>>> Another issue is that Cython compile time is increasing with the
>>>> addition of control flow and cython utilities. If you use fused types
>>>> you're also going to combinatorially add more compile time.
>>>
>>> I don't see that locally - a compiled Cython is hugely fast for me. In
>>> comparison, the C compiler literally takes ages to compile the result. An
>>> external shared library may or may not help with both - in particular, it is
>>> not clear to me what makes the C compiler slow. If the compile time is
>>> dominated by the number of inlined functions (which is not unlikely), a
>>> shared library + header file will not make a difference.
>>>
>>>
>>>> I'm sure
>>>> this came up earlier, but I really think we should have a libcython
>>>> and a cython.h. libcython (a shared library) should contain any common
>>>> Cython-specific code not meant to be inlined, and cython.h any types,
>>>> macros and inline functions etc.
>>>
>>> This has a couple of implications though. In order to support this on the
>>> user side, we have to build one shared library per installed package in
>>> order to avoid any Cython versioning issues. Just installing a versioned
>>> "libcython_x.y.z.so" globally isn't enough, especially during development,
>>> but also at deployment time. Different packages may use different CFLAGS or
>>> Cython options, which may have an impact on the result. Encoding all
>>> possible factors in the file name will be cumbersome and may mean that we
>>> still end up with a number of installed Cython libraries that correlates
>>> with the number of installed Cython based packages.
>>
>> That's a good point. Perhaps an easier first target is to have one
>> "libcython" per package (with a randomized or project-specific name).
>> Longer-term, I think the goal of one libcython per version is a
>> reasonable one, for deployment at least. Exceptional packages (e.g.
>> that require a special set of CFLAGS rather than the ones Python was
>> built with) can either bundle their own or forgo any sharing of code
>> as it is done now, and features that can't be easily normalized across
>> (cython and c) compilation options would remain in project-specific
>> generated .c files.
>>
>>> Next, we may not know at build time which set of Cython modules is in the
>>> package. This may be less of an issue if we rely on "cythonize()" in
>>> setup.py to compile all modules beforehand (assuming that the user doesn't
>>> call it twice, once for *.pyx, once for *.py, for example), but even if we
>>> know all modules, we'd still have to figure out the complete set of utility
>>> code used by all modules in order to build an adapted library with only the
>>> necessary code used in the package. So we'd always end up with a complete
>>> library with all utility code, which is only really interesting for larger
>>> packages with several Cython modules.
>>
>> Yes, I'm thinking we would create relatively complete libraries,
>> though if we did things on a per package level perhaps we could do
>> some pruning. We could still conditionally put some of the utility
>> code (especially the rarely used or shared stuff) into each module.
>
> Yeah that would be nice. I actually think we shouldn't do anything on
> a per-package level, only a bunch of modules with related stuff
> (conversion utilities/exception raising etc in one module,
> buffer/memoryview utilities in another etc). We've been living with
> huge files until now, I don't think we suddenly need to actively start
> pruning for a little bit of memory.
>
> I think the module approach would also be easy to implement, as the

[Cython] scons support

2011-10-05 Thread Neal Becker
I have no idea why this doesn't work for me.

Looking at
http://www.mail-archive.com/cython-dev@codespeak.net/msg09540.html

 scons --version
SCons by Steven Knight et al.:
script: v2.1.0.r5357[MODIFIED], 2011/09/09 21:31:03, by bdeegan on ubuntu
engine: v2.1.0.r5357[MODIFIED], 2011/09/09 21:31:03, by bdeegan on ubuntu
engine path: ['/usr/lib/scons/SCons']



cyenv = Environment(PYEXT_USE_DISTUTILS=True)
cyenv.Tool("pyext")
cyenv.Tool("cython")
import numpy

cyenv.Append(PYEXTINCPATH=[numpy.get_include()])
cyenv.Replace(CYTHONFLAGS=['--cplus'])
#cyenv.Replace(CXXFILESUFFIX='.cpp')
#cyenv.Replace(CYTHONCFILESUFFIX='.cpp')

cyenv.PythonExtension ('trellis_enc', ['trellis_enc.py'])
-

gives:
cython --cplus -o trellis_enc.c trellis_enc.pyx
gcc -pthread -o trellis_enc.os -c -fPIC -fno-strict-aliasing -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4
-m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall
-Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4
-m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -I/usr/include/python2.7
-I/usr/lib64/python2.7/site-packages/numpy/core/include trellis_enc.c
gcc -pthread -shared -o trellis_enc.so trellis_enc.os

Which is OK, except it used '.c' instead of '.cpp'

but if I try:

cyenv = Environment(PYEXT_USE_DISTUTILS=True)
cyenv.Tool("pyext")
cyenv.Tool("cython")
import numpy

cyenv.Append(PYEXTINCPATH=[numpy.get_include()])
cyenv.Replace(CYTHONFLAGS=['--cplus'])
cyenv.Replace(CXXFILESUFFIX='.cpp')
cyenv.Replace(CYTHONCFILESUFFIX='.cpp')

cyenv.PythonExtension ('trellis_enc', ['trellis_enc.py'])
-
cython --cplus -o trellis_enc.cpp trellis_enc.pyx
o trellis_enc.os -c -I/usr/include/python2.7
-I/usr/lib64/python2.7/site-packages/numpy/core/include trellis_enc.cpp
sh: o: command not found

The 'gcc' command got completely mangled.

???



___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] [cython-users] Re: callback function pointer problem

2011-10-05 Thread Greg Ewing

Robert Bradshaw wrote:


On this note, eventually I would like to coerce structs (and unions,
enums) to auto-generated wrapper classes, visible in the Python module
namespace if one declares them as "cpdef struct ..."


Would these wrapper classes contain a copy of the struct,
or would they reference the struct? If they reference it,
there would be issues with the lifetime of the referenced
data.
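
To make the copy-versus-reference question concrete, here is a minimal
sketch in plain Python using ctypes rather than Cython. The Point struct
and the PointCopy / PointRef wrapper names are hypothetical and chosen
only for illustration; nothing here reflects how Cython would actually
generate such wrappers.

```python
import ctypes


class Point(ctypes.Structure):
    """Stand-in for a C struct (hypothetical example type)."""
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


class PointCopy:
    """Wrapper that owns a private copy of the struct's bytes."""

    def __init__(self, src):
        self._data = Point.from_buffer_copy(src)

    @property
    def x(self):
        return self._data.x


class PointRef:
    """Wrapper that aliases the original struct's memory."""

    def __init__(self, src):
        # Hold a strong reference so the aliased memory cannot be freed
        # while this wrapper is still alive.
        self._owner = src
        self._data = Point.from_buffer(src)

    @property
    def x(self):
        return self._data.x


original = Point(1, 2)
copied = PointCopy(original)
aliased = PointRef(original)
original.x = 10  # write through the original after wrapping
print(copied.x, aliased.x)
```

The copying wrapper is unaffected by later writes, while the aliasing
wrapper sees them and is only safe because it keeps its owner alive,
which is exactly the lifetime issue raised above.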

--
Greg


Re: [Cython] Utilities, cython.h, libcython

2011-10-05 Thread Robert Bradshaw
On Wednesday, October 5, 2011, mark florisson wrote:

> On 5 October 2011 01:46, Robert Bradshaw wrote:
> > On Tue, Oct 4, 2011 at 2:19 PM, mark florisson wrote:
> >> Hey,
> >>
> >> I briefly mentioned something about this in a pull request, but maybe
> >> it deserves some actual discussion on the ML.
> >>
> >> So I propose that after fused types gets merged we try to move as many
> >> utility codes as possible to their utility code files (unless they are
> >> used in pending pull requests or other branches). Preferably this will
> >> be done in one or a few commits. How should we split up the work, any
> >> volunteers? Perhaps people who wrote certain utilities also want to
> >> move them? In that case, we should start a new branch and then merge
> >> that into master when it's done.
> >> We could actually move things before fused types get merged, as long
> >> as we don't touch binding_cfunc_utility_code.
> >
> > +1 to moving towards this, but I don't see the urgency or need to do
> > it all at once (though if there's going to be a big push, let's
> > coordinate on a wiki or trac).
>
> Hm, perhaps there is no strict need to hurry, as long as we take care
> not to modify utilities after they have been moved. The wiki could be
> great for that, but I personally don't keep track of everyone's
> branches, so I don't know which utility is modified by whom (if at
> all), so strictly speaking (to avoid painful merges) I'd have to ask
> everyone each time I wanted to move something, or dig through
> everyone's branches.
>

I was proposing that everyone lists the utility code sections that are
likely to cause merge conflicts on a wiki page, and the rest are fair game.
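
As a rough sketch of that bookkeeping (not an existing tool), the "fair
game" set is just the set difference between all utility code sections
and the ones anyone has claimed. Only binding_cfunc_utility_code is a
name taken from this thread; the other utility names and the developer
label are hypothetical placeholders.

```python
def fair_game(all_utilities, claimed_by_developer):
    """Return the utilities nobody has listed as conflict-prone."""
    claimed = set()
    for names in claimed_by_developer.values():
        claimed.update(names)
    return sorted(set(all_utilities) - claimed)


all_utilities = [
    "binding_cfunc_utility_code",  # named in the thread
    "example_conversion_utility",  # hypothetical
    "example_exception_utility",   # hypothetical
]
# What one developer might list on the wiki page (hypothetical entry).
claims = {"developer_a": ["binding_cfunc_utility_code"]}

print(fair_game(all_utilities, claims))
```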


> >> Before we go there, Stefan, do we still want to implement the header
> >> .ini style which can list dependencies and such? I personally don't
> >> care very much about it, but memoryviews and the utility loaders are
> >> merged so if someone wants to take up that job, it'd be good to do
> >> before moving the utilities.
> >>
> >> Another issue is that Cython compile time is increasing with the
> >> addition of control flow and cython utilities. If you use fused types
> >> you're also going to combinatorially add more compile time.
> >
> > Yeah, this was especially obvious with, e.g. cython.compile(...). (In
> > particular, some utility code was being parsed before it could even
> > figure out whether it needed to do a full re-compile...)
> >
> >> I'm sure
> >> this came up earlier, but I really think we should have a libcython
> >> and a cython.h. libcython (a shared library) should contain any common
> >> Cython-specific code not meant to be inlined, and cython.h any types,
> >> macros and inline functions etc. This will decrease Cython and C
> >> compile time, and will also make executables smaller.
> >
> > +1. Yes, we talked about this earlier, but nothing concrete was
> > planned. It's probably worth a CEP, if anything to have a concrete
> > plan recorded somewhere other than a series of mailing list threads
> > (though discussion tends to work best here).
> >
> >> This could be
> >> enabled using a command line option to Cython, as well as with
> >> distutils; eventually we may decide to make it the default (let's
> >> figure that out later). Preferably libcython.so would be installed
> >> alongside libpython.so and cython.h inside the Python include
> >> directory. Assuming multiple versions of Cython and multiple Python
> >> installations, we'd need to come up with a versioning scheme for
> >> either.
> >
> > I would propose a cython.h file that sits in Cython/Compiler/Include
> > (or similar), as a first step. The .pyx -> .c pass could be configured
> > to copy this to a specific location (for shipping just the generated
> > .c files).
>
> That would be fine as well. It might be convenient for users in that
> case if we could provide a cython.get_include() in addition to the
> distutils hooks, and a cython-config script.
>

For sure. We could also have a cython.get_shared_library() (common_code?
cython_module?) which would return an Extension object to build.
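
A minimal sketch of what such a helper could look like, modelled on
numpy.get_include(). This is not an existing Cython API: the function
names, the package_dir argument and the Compiler/Include layout are
assumptions based only on the directory suggested earlier in this thread.

```python
import os

HEADER_NAME = "cython.h"  # header name proposed in this thread


def get_include(package_dir):
    """Return the directory that would hold cython.h (assumed layout)."""
    return os.path.join(os.path.abspath(package_dir), "Compiler", "Include")


def has_header(package_dir):
    """Report whether the header is actually present in that directory."""
    return os.path.isfile(os.path.join(get_include(package_dir), HEADER_NAME))


# A build script would append this to its include directories.
include_dirs = [get_include("Cython")]
print(include_dirs)
```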


> > One option is to build the shared library as a companion
> > _cython_x_y_z.so module which, while not as efficient as linking at
> > the C level, would probably be much simpler to implement in a
> > cross-platform way. (This perhaps merits some benchmarks, but the main
> > contents is likely to be things like shared classes and objects.)
> > Actually linking .so files from modules that cimport each other would
> > be a nice feature down the road anyways. Again, the associated .c file
> > could be (optionally) generated/copied during the .pyx -> .c step.
> > Installation would determine if the required module exists, and if not
> > build and install it.
>
> Hm, that's a really good idea. I think the only overhead would be the
> capsule unpacking and pointer duplication, but that shouldn't suddenly
> be an issue. That means we don't have to do any versioning of the

Re: [Cython] Utilities, cython.h, libcython

2011-10-05 Thread Robert Bradshaw
On Wednesday, October 5, 2011, mark florisson wrote:

> On 5 October 2011 08:16, Stefan Behnel wrote:
> > mark florisson, 04.10.2011 23:19:
> >>
> >> So I propose that after fused types gets merged we try to move as many
> >> utility codes as possible to their utility code files (unless they are
> >> used in pending pull requests or other branches). Preferably this will
> >> be done in one or a few commits. How should we split up the work
> >
> > I would propose that new utility code gets moved out into utility files
> > right away (if doable, given the current state of the infrastructure), and
> > that existing utility code gets moved when it gets modified or when someone
> > feels like it. Until we really get to the point of wanting to create a
> > separate shared library etc., there's no need to hurry with the move.
> >
> >
> >> We could actually move things before fused types get merged, as long
> >> as we don't touch binding_cfunc_utility_code.
> >
> > Another reason not to hurry, right?
> >
> >
> >> Before we go there, Stefan, do we still want to implement the header
> >> .ini style which can list dependencies and such?
> >
> > I think we'll eventually need that, but that also depends a bit on the
> > question whether we want to (or can) build a shared library or not. See
> > below.
> >
> >
> >> Another issue is that Cython compile time is increasing with the
> >> addition of control flow and cython utilities. If you use fused types
> >> you're also going to combinatorially add more compile time.
> >
> > I don't see that locally - a compiled Cython is hugely fast for me. In
> > comparison, the C compiler literally takes ages to compile the result. An
> > external shared library may or may not help with both - in particular, it is
> > not clear to me what makes the C compiler slow. If the compile time is
> > dominated by the number of inlined functions (which is not unlikely), a
> > shared library + header file will not make a difference.
> >
>
> Have you tried with the memoryviews merged? e.g. if I have this code:
>
> from libc.stdlib cimport malloc
> cdef int[:] slice = <int[:10]> <int *> malloc(sizeof(int) * 10)
>
> [0] [14:45] ~  ➤ time cython test.pyx
> cython test.pyx  2.61s user 0.08s system 99% cpu 2.695 total
> [0] [14:45] ~  ➤ time zsh compile
> zsh compile  1.88s user 0.06s system 99% cpu 1.946 total
>
> where 'compile' is the script that invoked the same gcc command
> distutils uses. As you can see it took more than 2.5 seconds to
> compile this code (simply because the memoryview utilities get
> included). The C compiler does it quite a lot faster here. This
> obviously depends largely on your code, you can probably have it the
> other way around as well.
>

Anything we can do to cache/dedupe things here would be great.
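
One possible shape for such caching, as a sketch only: key the parsed
result on a hash of the utility source text so identical code is parsed
once per process. The load_parsed name and the parse callback are
hypothetical and do not correspond to Cython's actual utility loader.

```python
import hashlib

_parsed_cache = {}


def load_parsed(source_text, parse):
    """Parse source_text once; later identical requests reuse the result."""
    key = hashlib.sha256(source_text.encode("utf-8")).hexdigest()
    if key not in _parsed_cache:
        _parsed_cache[key] = parse(source_text)
    return _parsed_cache[key]


calls = []


def fake_parse(text):
    """Stand-in for an expensive parse step; records each invocation."""
    calls.append(text)
    return text.upper()


first = load_parsed("cdef int x", fake_parse)
second = load_parsed("cdef int x", fake_parse)
print(len(calls))
```

Hashing the content rather than the file name also dedupes identical
utility code that is reachable under different names.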


> >> I'm sure
> >> this came up earlier, but I really think we should have a libcython
> >> and a cython.h. libcython (a shared library) should contain any common
> >> Cython-specific code not meant to be inlined, and cython.h any types,
> >> macros and inline functions etc.
> >
> > This has a couple of implications though. In order to support this on the
> > user side, we have to build one shared library per installed package in
> > order to avoid any Cython versioning issues. Just installing a versioned
> > "libcython_x.y.z.so" globally isn't enough, especially during development,
> > but also at deployment time. Different packages may use different CFLAGS or
> > Cython options, which may have an impact on the result. Encoding all
> > possible factors in the file name will be cumbersome and may mean that we
> > still end up with a number of installed Cython libraries that correlates
> > with the number of installed Cython based packages.
>
> Hm, I think the CFLAGS are important so long as they are compatible
> with Python. When the user compiles a Cython extension module with
> extra CFLAGS, this doesn't affect libpython. Similarly, the Cython
> utilities are really not the user's responsibility, so libcython
> doesn't need to be compiled with the same flags as the extension
> module. If still wanted, the user could either recompile python with
> different CFLAGS (which means libcython will get those as well), or
> not use libcython at all. CFLAGS should really only pertain to user
> code, not to the Cython library, which the user shouldn't be concerned
> about.
>
> > Next, we may not know at build time which set of Cython modules is in the
> > package. This may be less of an issue if we rely on "cythonize()" in
> > setup.py to compile all modules before hand (assuming that the user doesn't
> > call it twice, once for *.pyx, once for *.py, for example), but even if we
> > know all modules, we'd still have to figure out the complete set of utility
> > code used by all modules in order to build an adapted library with only the
> > necessary code used in the package. So we'd always end up with a complete
> > library with all utility code, which is only really interes

Re: [Cython] Utilities, cython.h, libcython

2011-10-05 Thread Stefan Behnel

mark florisson, 05.10.2011 15:53:

On 5 October 2011 08:16, Stefan Behnel wrote:

mark florisson, 04.10.2011 23:19:

Another issue is that Cython compile time is increasing with the
addition of control flow and cython utilities. If you use fused types
you're also going to combinatorially add more compile time.


I don't see that locally - a compiled Cython is hugely fast for me. In
comparison, the C compiler literally takes ages to compile the result. An
external shared library may or may not help with both - in particular, it is
not clear to me what makes the C compiler slow. If the compile time is
dominated by the number of inlined functions (which is not unlikely), a
shared library + header file will not make a difference.


Have you tried with the memoryviews merged?


No. I didn't expect the difference to be quite that large.



e.g. if I have this code:

from libc.stdlib cimport malloc
cdef int[:] slice = <int[:10]> <int *> malloc(sizeof(int) * 10)

[0] [14:45] ~  ➤ time cython test.pyx
cython test.pyx  2.61s user 0.08s system 99% cpu 2.695 total
[0] [14:45] ~  ➤ time zsh compile
zsh compile  1.88s user 0.06s system 99% cpu 1.946 total

where 'compile' is the script that invoked the same gcc command
distutils uses.  As you can see it took more than 2.5 seconds to
compile this code (simply because the memoryview utilities get
included).


Ok, that hints at serious performance problems. Could you profile it to see 
where the issues are? Is it more that the code is loaded from an external 
file? Or the fact that more utility code is parsed than necessary?
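
A generic way to gather such a profile with the standard library is
sketched below. The profile_call helper and the busy_work stand-in are
hypothetical; in practice one would pass whatever callable drives the
Cython compilation of test.pyx, which is left as an assumption here.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the slowest call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def busy_work(n):
    """Stand-in workload (hypothetical) in place of a real compile step."""
    return sum(i * i for i in range(n))


result, report = profile_call(busy_work, 1000)
print(report)
```

Sorting by cumulative time should show whether the cost sits in loading
the utility file or in parsing more utility code than necessary.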


It's certainly not obvious why the inclusion of static code, even from an 
external file, should make any difference.


That being said, it's not as if we were lacking the infrastructure for making
Python code run faster ...




I'm sure
this came up earlier, but I really think we should have a libcython
and a cython.h. libcython (a shared library) should contain any common
Cython-specific code not meant to be inlined, and cython.h any types,
macros and inline functions etc.


This has a couple of implications though. In order to support this on the
user side, we have to build one shared library per installed package in
order to avoid any Cython versioning issues. Just installing a versioned
"libcython_x.y.z.so" globally isn't enough, especially during development,
but also at deployment time. Different packages may use different CFLAGS or
Cython options, which may have an impact on the result. Encoding all
possible factors in the file name will be cumbersome and may mean that we
still end up with a number of installed Cython libraries that correlates
with the number of installed Cython based packages.
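
To illustrate why encoding the factors in the file name scales with the
number of packages, here is a sketch that derives a library name from
the Cython version plus a short hash of CFLAGS and compiler options. The
library_name function and the naming scheme are purely hypothetical; no
such scheme exists in Cython.

```python
import hashlib


def library_name(cython_version, cflags, options):
    """Build a file name that differs whenever any build factor differs."""
    payload = "\n".join([
        cython_version,
        " ".join(cflags),               # flag order is kept as given
        repr(sorted(options.items())),  # options in a stable order
    ])
    digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()[:8]
    return "libcython_%s_%s.so" % (cython_version, digest)


plain = library_name("x.y.z", ["-O2"], {})
debug = library_name("x.y.z", ["-O0", "-g"], {})
print(plain)
print(debug)
```

Two packages built with different flags get different library names
here, so the number of installed libraries tends to track the number of
distinct build configurations.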


Hm, I think the CFLAGS are important so long as they are compatible
with Python. When the user compiles a Cython extension module with
extra CFLAGS, this doesn't affect libpython. Similarly, the Cython
utilities are really not the user's responsibility, so libcython
doesn't need to be compiled with the same flags as the extension
module. If still wanted, the user could either recompile python with
different CFLAGS (which means libcython will get those as well), or
not use libcython at all. CFLAGS should really only pertain to user
code, not to the Cython library, which the user shouldn't be concerned
about.


Well, it's either the user or the OS distribution that installs (and 
potentially builds) the libraries. That already makes it two responsible 
entities for many systems that have to agree on what gets installed in what 
way. I'm just saying, don't underestimate the details in worldwide 
deployments.


Stefan