Re: [Cython] Status
> The mystery to me is why MacOSX introduced .dylib instead of
> sticking with .so.

There were *.so files and hacks to load them. But the structure of a dylib is different, and it uses a slightly different loader, dyld. I guess they wanted to make a distinction. They had some kind of Obj-C dynamic plugin things as well.

Software history is full of regrets.

--
John Skaller
skal...@internode.on.net

_______________________________________________
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Status
> On 31 Jan 2020, at 16:51, Greg Ewing wrote:
>
> On 31/01/20 9:47 am, John Skaller2 wrote:
>>>> 2. pyport is plain wrong. It contains conflicting C typedefs.
>>>
>>> PRs welcome.
>> Is this your preferred method (pull request)?
>
> I'm sure PRs are very welcome, but at the least you could
> give us some idea of what these conflicting typedefs are!

The file is small:

    cdef extern from "Python.h":
        ctypedef int int32_t
        ctypedef int int64_t
        ctypedef unsigned int uint32_t
        ctypedef unsigned int uint64_t

Obviously this is an incorrect translation of the original source. One of each pair may well be correct, but it's impossible that both are. Defining a symbol that is already defined in the C99 standard seems like a bad idea.

Python's pyport.h actually says:

    #include ..
    #define PY_UINT32_T uint32_t
    #define PY_UINT64_T uint64_t
    /* Signed variants of the above */
    #define PY_INT32_T int32_t
    #define PY_INT64_T int64_t
    …

--
John Skaller
skal...@internode.on.net
Re: [Cython] Adding GPU support to cython
Hi,
I opened a feature ticket: https://github.com/cython/cython/issues/3342
It describes my current prototype based on OpenMP.

Any feedback?

Also, I would like to do some more advanced analysis to improve the map-clauses. I do not want to go into a complex index analysis or the like, but a simple access analysis should cover many cases. All I would like to figure out is whether a given variable (memview) was used (other than instantiated) before and/or after the device/parallel block and, ideally, whether a use was definitely read-only. Any suggestions/hints on how to do that?

Thanks

frank

-----Original Message-----
From: cython-devel On Behalf Of Schlimbach, Frank
Sent: Friday, January 24, 2020 12:55 PM
To: Core developer mailing list of the Cython compiler
Subject: Re: [Cython] Adding GPU support to cython

Hi Stefan,
thanks for your response. Good to hear this is still of interest.

Yes, I realized these are rather old CEPs. I spent some time looking into the Cython code and concluded that the most consistent (and simplest) approach would be to stick with OpenMP and use its offload pragmas (e.g. 'target' introduced in 4.5). Given a properly set up compiler, this would in theory require only one or two compiler flags to enable offloading. I even have a first prototype which generates code that existing compilers seem to swallow. It's not ready for a PR, since I have not been able to get it linked and run on a GPU and I wanted to get some general feedback first. You can find the code on my offload branch https://github.com/fschlimb/cython/tree/offload (it's wip, so please excuse that not all comments have been updated yet to reflect my changes).
Here's what it does:
- accept a new 'with' directive 'device' which marks a region/block to be offloaded to a device (OpenMP target)
  - I also considered extending 'gil' or 'parallel' to accept an optional 'device' argument, but an extra directive seemed more general/flexible, as it also allows non-parallel code
  - I don't believe we should try to automate offloading right now. Once we have something that works on explicit demand, we can still think about a performance model and auto-enable offloading.
- the DeviceWithBlockNode is added to the 'parallel stack' and can occur only as the outermost parallel directive
- a 'with device()' requires 'nogil'
- a 'with device()' will create a new scope annotated with a '#pragma omp target'
  - all variables which get assigned within the 'with device()' block are currently mapped as 'tofrom'
  - all other variables used are mapped as 'to'
  - identifying 'from' candidates is harder and not yet done (need to know that there is required allocation but no assignment before the 'with device()' block)
  - identifying 'alloc' candidates would also need additional analysis (e.g. not used outside the 'device()' block)
- all object mode stuff (like exceptions for error handling) is currently disabled in a 'with device()' block

Example:

    def f(int[:,::1] X):
        cdef int v = 1
        cdef int i
        with nogil, device(), parallel():
            for i in prange(4):
                X[i] = v

The 'with device' block becomes something like (simplified)

    {
        size_t __pyx_v_X__count = __pyx_v_X.shape[0]*__pyx_v_X.shape[1];
        #pragma omp target map(to: __pyx_v_v) map(tofrom: __pyx_v_i, __pyx_v_X.data[0:__pyx_v_X__count], __pyx_v_X.memview, __pyx_v_X.shape, __pyx_v_X.strides, __pyx_v_X.suboffsets)
        {
            #pragma omp parallel
            #pragma omp for firstprivate(__pyx_v_i) lastprivate(__pyx_v_i)
            for (__pyx_v_i=0; __pyx_v_i<4; ++__pyx_v_i) {
                __pyx_v_X[__pyx_v_i] = __pyx_v_v;
            }
        }
    }

There are lots of things to be added and improved. In particular, I am currently adding an optional argument 'map' to 'device()' which allows manually setting the map-clauses for each variable. This is necessary to allow not only optimizations but also sending only partial array data to/from the device (e.g. when the device memory cannot hold an entire array, the developer would block the computation). We can probably add some magic for simple cases, but there is probably no solution for the general problem of determining the accessed index-space.

Among others, things to also look at include
- non-contiguous arrays/memviews
- overlapping arrays/memviews
- keeping data on the device between 'with device()' blocks (USM (unified shared memory) or omp target data?)
- error handling
- tests
- docu/comments

I found that the functionality I needed to touch is somewhat scattered around the compiler pipeline. It might be worth thinking about restructuring a few things to make the whole OpenMP/parallel/offload stuff more maintainable. Of course you might see other solutions than mine which make this simpler.
Any thoughts/feedback/use cases appreciated

frank

-----Original Message-----
From: cython-devel On Behalf Of Stefan Behnel
Sent: Friday, January 24, 2020 11:22 AM
To: cython-devel@python.org
Subject: Re: [C
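The mapping scheme described in this message can be tried out in plain C without Cython. Below is a minimal sketch, assuming a contiguous two-dimensional int buffer; 'fill_on_device' and its parameters are hypothetical names chosen for illustration, not output of the prototype. The buffer is mapped 'tofrom' because it is assigned inside the region, and the scalars are mapped 'to', mirroring the rules in the list above. If the compiler is run without OpenMP support, the pragmas are ignored and the loop simply runs on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not generated by Cython): fill a contiguous
   rows x cols int buffer with the value v inside an OpenMP target region. */
void fill_on_device(int *data, size_t rows, size_t cols, int v)
{
    size_t count = rows * cols;  /* number of elements to map */
    assert(data != NULL);

    /* Read-only scalars are mapped 'to'; the buffer is assigned inside
       the region, so it is mapped 'tofrom' as in the prototype's rule. */
    #pragma omp target map(to: v, count) map(tofrom: data[0:count])
    {
        #pragma omp parallel for
        for (size_t i = 0; i < count; ++i) {
            data[i] = v;
        }
    }
}
```

Since the buffer is only written here, a 'from' mapping would be enough; recognising that automatically is the kind of access analysis asked about at the top of the message.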
Re: [Cython] Status
>>>> ob should be PyObject*
>>>
>>> No, the declaration looks correct to me. The input is an object.
>> I don't understand. ob isn't a type, is it? A type is required.
>
> It's a (dummy) parameter name. Cython defaults to "object" when a
> type isn't specified.
>
> Looking at the other declarations in that file, it was probably
> *meant* to say "object ob", but it's not wrong -- it still works
> that way.

Ok, but now the syntax is made very context sensitive. To interpret it correctly, you have to know "ob" is not a type. And the Python docs make exactly the same mistake. In C this would not work because there is no default type, so the Python docs are wrong, because they're supposedly documenting C. [The only case in which it could be correct would be if the symbol were a macro.]

And my translator script got fooled, because it assumes any single identifier used as a parameter is a type, and if two words are used, the first is a type and the second can be discarded, except in the special case "unsigned int".

Note, I'm just trying to help by bringing up inconsistencies, which are things my simplistic translator script can't handle.

--
John Skaller
skal...@internode.on.net
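The heuristic described for the translator script can be written down in a few lines of C. This is only a guess at its behaviour, based on the description in this message; 'treats_whole_text_as_type' is a hypothetical name, and the real script is not shown anywhere in the thread.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reconstruction of the described rule.
   Returns 1 when the whole parameter text would be taken as a type name,
   0 when the first word would be the type and the second word discarded. */
int treats_whole_text_as_type(const char *param)
{
    assert(param != NULL);
    if (strcmp(param, "unsigned int") == 0) {
        return 1;                       /* the one two-word special case */
    }
    return strchr(param, ' ') == NULL;  /* a single identifier is assumed to be a type */
}
```

Under this rule a bare 'ob' is classified as a type, which is the misreading reported in the message, while 'object ob' is split into the type 'object' and a discarded name.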
Re: [Cython] Status
On 1/02/20 12:25 am, John Skaller2 wrote:
> cdef extern from "Python.h":
>     ctypedef int int32_t
>     ctypedef int int64_t
>     ctypedef unsigned int uint32_t
>     ctypedef unsigned int uint64_t

These work because Cython doesn't need to know the exact sizes of these types. All it needs to know is that they're some kind of integer so that its type checks will pass. The typedef names end up in the generated C code, and the C compiler figures out their actual sizes.

> Obviously this is an incorrect translation of the original source.

Extern declarations in Cython are not meant to be exact translations. They only need to tell Cython enough about the thing being declared so that it can cope.

--
Greg
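The point about the C compiler working out the real sizes can be checked directly in C. This is a small sketch, assuming a hosted C99 compiler that provides the exact-width types in <stdint.h>; the function names are made up for illustration. Whatever the Cython extern block says, the widths below are the ones that end up in the compiled code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Bit widths as the C compiler sees them, taken from <stdint.h>. */
size_t bits_int32(void) { return sizeof(int32_t) * CHAR_BIT; }
size_t bits_int64(void) { return sizeof(int64_t) * CHAR_BIT; }

/* A value that only fits because int64_t really is 64 bits wide,
   even though the Cython declaration called it 'int'. */
int64_t two_to_the_40(void)
{
    assert(bits_int64() == 64);
    return (int64_t)1 << 40;
}
```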
Re: [Cython] Status
On 1/02/20 12:34 am, John Skaller2 wrote:
> Ok, but now the syntax is made very context sensitive.
> To interpret it correctly, you have to know "ob" is not a type.

Yes, Cython mostly follows C declaration syntax, and C also has this property.

> In C this would not work because there is no default type,

Yes, there is -- the default type in C is int. This is a valid function definition in C:

    f(x) { }

It's equivalent to

    int f(int x) { }

> And my translator script got fooled, because it assumes any single
> identifier used as a parameter is a type,

Then it's making an incorrect assumption.

--
Greg
Re: [Cython] Status
> On 1 Feb 2020, at 00:36, Greg Ewing wrote:
>
> On 1/02/20 12:25 am, John Skaller2 wrote:
>> cdef extern from "Python.h":
>>     ctypedef int int32_t
>>     ctypedef int int64_t
>>     ctypedef unsigned int uint32_t
>>     ctypedef unsigned int uint64_t
>
> These work because Cython doesn't need to know the exact
> sizes of these types. All it needs to know is that they're
> some kind of integer so that its type checks will pass.
> The typedef names end up in the generated C code, and the
> C compiler figures out their actual sizes.

Ah. I see. That makes sense. So this is some kind of hack way of getting something a bit like Haskell type classes: you're basically saying int32_t and int64_t are of class "Integer".

This also explains the conflict for me, because Felix is the opposite: it aims to make the types of things more precise (and has actual type classes for generalisation).

--
John Skaller
skal...@internode.on.net
Re: [Cython] Status
> Yes, there is -- the default type in C is int.

I don't think that is true in C99, but I'm not sure. It's definitely not allowed in C++. I know because I actually moved the motion on the C++ ISO committee to disallow it :-)

In any case it's a bad idea in an interface specification, even if it's legal.

--
John Skaller
skal...@internode.on.net
[Cython] Size of output
When I ran Cython on a two line Python function I got this from wc:

    4276 13798 161338 oldtest.c

It took a while to actually find the implementation of the function.

A lot of the emitted code appeared to be run time and compile time support code which wasn't actually used. Eliminating stuff that isn't required with dependency tracking is nontrivial and not much use, whereas a single self-contained compilable C file is very useful.

Is there an option to use an #include for the standard stuff?

--
John Skaller
skal...@internode.on.net
Re: [Cython] Status
On 1/02/20 3:03 am, John Skaller2 wrote:
> So this is some kind of hack way
> of getting something a bit like Haskell type classes,
> you're basically saying int32_t and int64_t are of class "Integer".

I suppose you could think of it that way, but it's really not that formal.

> This also explains the conflict for me, because Felix is the opposite:
> it aims to make the types of things more precise (and has actual
> type classes for generalisation).

To define them any more precisely, Cython would need to know how things vary depending on the platform, which would mean conditional compilation, etc. It's much easier to leave all that up to the C compiler and system headers. It also ensures that there can't be any mismatch between the two.

--
Greg
Re: [Cython] Status
On 1/02/20 3:08 am, John Skaller2 wrote:
> I don't think that is true in C99 but I'm not sure.

You may be right, I haven't been keeping up with all the twists and turns of recent C standards. The gcc I just tried it on allowed it, but warned about it.

> In any case its a bad idea in an interface specification even if it's legal.

Perhaps in C, but I think it makes sense for types in Cython to default to object, because it deals with objects so much. It means that functions that take and return objects exclusively look like Python.

--
Greg
Re: [Cython] Size of output
On 1/02/20 3:17 am, John Skaller2 wrote:
> When I ran Cython on a two line Python function I got this from wc:
>
> 4276 13798 161338 oldtest.c

That seems a bit excessive.

> A lot of the emitted code appeared to be run time and compile
> time support code which wasn't actually used.

Not sure what's going on there. Pyrex made efforts to only include support code that was actually used, but Cython has changed a lot since then and I haven't been following its development closely. Either it's slipped on that, or the support code has become more bloated.

Can you remove any of it and still have it compile? If so, filing a bug report might be useful.

> Is there an option to use an #include for the standard stuff?

There are upsides and downsides to that as well. The way things are, the generated file is self-contained, and can be shipped without worrying about it becoming disconnected from a compatible version of the include file. This is important when details of the support code can change without notice between Cython releases.

--
Greg
Re: [Cython] Size of output
On Fri, Jan 31, 2020 at 3:17 PM Greg Ewing wrote:
>
> On 1/02/20 3:17 am, John Skaller2 wrote:
> > When I ran Cython on a two line Python function I got this from wc:
> >
> > 4276 13798 161338 oldtest.c
>
> That seems a bit excessive.
>
> > A lot of the emitted code appeared to be run time and compile
> > time support code which wasn't actually used.
>
> Not sure what's going on there. Pyrex made efforts to only
> include support code that was actually used, but Cython
> has changed a lot since then and I haven't been following
> its development closely. Either it's slipped on that, or
> the support code has become more bloated.

Cython attempts to do the same. Taking a quick glance at an auto-generated file for an empty .pyx, we have

~200 lines of macros normalizing various C compiler issues
~500 lines defining macros to normalize across Python 2.7-3.9
~200 lines of providing defaults for various CYTHON_ macros
~300 lines of macros for optional optimizations for CPython details (vs. using more public/pypy compatible, ... APIs)
~300 lines module setup code. Even for trivial modules, we still declare and call functions for creating globals, preparing types, etc. even if we don't have any globals, types, etc.
~300 lines exception handling and traceback creation
~700 lines conversion for basic int and string types (which we assume to be available in various utilities).

Much of this is macro-heavy code, to allow maximum flexibility at C compile time, but much would get elided by the preprocessor for any particular environment.

Extra utility code is inserted on an as-needed basis, e.g. function creation, various datatype optimizations, other type conversions, etc. These are re-used within a module. A two line function could add a lot (e.g. just defining a function and its wrapper is a good chunk of code, plus whatever the function does, of course).

I agree there's some fat that could be trimmed there, but I'm not sure it'd be worth the effort.
> Can you remove any of it and still have it compile? If
> so, filing a bug report might be useful.

+1

> > Is there an option to use an #include for the standard stuff?
>
> There are upsides and downsides to that as well. The way
> things are, the generated file is self-contained, and can
> be shipped without worrying about it becoming disconnected
> from a compatible version of the include file. This is
> important when details of the support code can change
> without notice between Cython releases.

+1

We have an option "common_utility_include_dir" that would create a shared utility folder into which the compiler could write (versioned) #includable files, possibly to be shared across many modules, but it was never completely finished (and in particular was difficult to reconcile with cycache, which is like ccache for Cython, due to the outside references a Cython artifact could then produce).

We've thought of going even further and providing a shared runtime library, but that has some of the same issues (plus more, though in some cases we use the pattern where every module declares type X, but before creating its own looks to see if one was already loaded, to let modules share the same internal type at runtime).
Re: [Cython] Status
> On 1 Feb 2020, at 09:49, Greg Ewing wrote:
>
> On 1/02/20 3:03 am, John Skaller2 wrote:
>> So this is some kind of hack way
>> of getting something a bit like Haskell type classes,
>> you're basically saying int32_t and int64_t are of class "Integer".
>
> I suppose you could think of it that way, but it's really
> not that formal.
>
>> This also explains the conflict for me, because Felix is the opposite:
>> it aims to make the types of things more precise (and has actual
>> type classes for generalisation).
>
> To define them any more precisely, Cython would need to
> know how things vary depending on the platform, which would
> mean conditional compilation, etc. It's much easier to leave
> all that up to the C compiler and system headers. It also
> ensures that there can't be any mismatch between the two.

But then all hell breaks loose for pointers. Your hack only works for rvalues. Of course you probably know this doesn't occur.

--
John Skaller
skal...@internode.on.net
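A short C sketch of the distinction being raised here, on one reading of it: converting values between integer types of different widths is handled by the C compiler, while an access through a pointer depends on the real width of the pointed-to type. The function names are hypothetical, and the sketch says nothing about how Cython itself treats such pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value (rvalue) use: the C compiler widens using the real sizes. */
int64_t widen(int32_t x) { return x; }

/* Pointer use: a full int64_t is written through the pointer, so the
   object behind it must really be int64_t-sized, not int-sized. */
void store_big(int64_t *out)
{
    assert(out != NULL);
    *out = (int64_t)1 << 40;
}

/* Distance in bytes between consecutive elements of an int64_t array. */
size_t int64_stride(void)
{
    int64_t pair[2] = {0, 0};
    return (size_t)((char *)&pair[1] - (char *)&pair[0]);
}
```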
Re: [Cython] Size of output
>> Is there an option to use an #include for the standard stuff?
>
> There are upsides and downsides to that as well.

Hence an option. But it could be work to implement, so I'm just exploring at the moment.

> The way
> things are, the generated file is self-contained, and can
> be shipped without worrying about it becoming disconnected
> from a compatible version of the include file. This is
> important when details of the support code can change
> without notice between Cython releases.

Yes, and in this case an include file may actually be better, because it will upgrade with Cython. YMMV I guess. But the main reason is to remove a lot of useless clutter.

--
John Skaller
skal...@internode.on.net
Re: [Cython] Size of output
> I agree there's some fat that could be trimmed there, but not sure
> it'd be worth the effort.

You're probably right. It's a problem writing a compiler in a language wholly unsuited for the job. Even with a more suitable language, emitting code, in the right order, with just the things actually required, is difficult. I use a multi-pass predictive system and a multi-pass code generator, and I find bugs all the time because it isn't run by actual dependencies but by predicted ones.

--
John Skaller
skal...@internode.on.net
[Cython] linkage
OMG. Python is moving backwards. Some people just have no understanding of tech.

As of 3.8, extensions must not be dynamically linked to libpython.

This doesn't apply to Windows or MacOS, because that's the only way on those platforms. But Debian/Ubuntu was always wrong, and now the error is being made canonical.

If anyone here knows a way on Linux to fix this, with some sort of stub loader for example, I'd be interested. All my code is linked with visibility=default, and all dynamic loads use two-level namespaces, i.e. the symbol table of a shared library being imported is only visible to the importer.

There may be some impact on Cython, since its primary job is building Python extensions.

BTW: it's all due to a stupid bug in ld, which links shared libraries without bothering to check that external references are satisfiable. Until load time, maybe.. :-)

--
John Skaller
skal...@internode.on.net