Re: [Cython] Status

2020-01-31 Thread John Skaller2

> The mystery to me is why MacOSX introduced .dylib instead of
> sticking with .so.

There were *.so files and hacks to load them. But the structure of dylib
is different and uses a slightly different loader, dyld. I guess they wanted
to make a distinction. They had some kind of Obj C dynamic plugin things
as well. Software history is full of regrets.


—
John Skaller
skal...@internode.on.net





___
cython-devel mailing list
cython-devel@python.org
https://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Status

2020-01-31 Thread John Skaller2


> On 31 Jan 2020, at 16:51, Greg Ewing  wrote:
> 
> On 31/01/20 9:47 am, John Skaller2 wrote:
> 
>>>> 2. pyport is plain wrong. It contains conflicting C typedefs.
>>>
>>> PRs welcome.
>> Is this your preferred method (pull request)?
> 
> I'm sure PRs are very welcome, but at the least you could
> give us some idea of what these conflicting typedefs are!


The file is small:

cdef extern from "Python.h":
    ctypedef int int32_t
    ctypedef int int64_t
    ctypedef unsigned int uint32_t
    ctypedef unsigned int uint64_t

Obviously this is an incorrect translation of the original source.
One of each pair may well be correct, but it’s impossible that both are.
Redefining a symbol that the C99 standard already defines seems like a bad idea.

Python’s pyport.h actually says:

#include 
..
#define PY_UINT32_T uint32_t
#define PY_UINT64_T uint64_t

/* Signed variants of the above */
#define PY_INT32_T int32_t
#define PY_INT64_T int64_t

…
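
The size conflict can be checked without a C compiler. The sketch below is an illustration only: it uses Python's ctypes module (not anything from pyport.h or Cython) to compare the fixed-width types with plain C int on the machine it runs on. The names fixed_width, sizes and same_size_as_int are made up for the example.

```python
import ctypes

# Fixed-width types from <stdint.h>, as exposed by ctypes.
fixed_width = {
    "int32_t": ctypes.c_int32,
    "int64_t": ctypes.c_int64,
    "uint32_t": ctypes.c_uint32,
    "uint64_t": ctypes.c_uint64,
}

# Size in bytes of each fixed-width type, and of plain C int on this platform.
sizes = {name: ctypes.sizeof(ctype) for name, ctype in fixed_width.items()}
int_size = ctypes.sizeof(ctypes.c_int)

# Names whose size matches plain int. The 32-bit and 64-bit members of a
# pair can never both appear here, because their sizes differ.
same_size_as_int = sorted(name for name, size in sizes.items() if size == int_size)

print(sizes)
print("plain int:", int_size, "bytes; same size as:", same_size_as_int)
```

Whatever the width of int turns out to be, it can match at most one member of each signed or unsigned pair, which is the inconsistency being pointed out.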


—
John Skaller
skal...@internode.on.net







Re: [Cython] Adding GPU support to cython

2020-01-31 Thread Schlimbach, Frank
Hi,
I opened a feature ticket: https://github.com/cython/cython/issues/3342
It describes my current prototype based on OpenMP.

Any feedback?

Also, I would like to do some more advanced analysis to improve the
map-clauses. I do not want to go as far as a complex index analysis or the
like, but a simple access analysis should cover many cases. All I would like
to figure out is whether a given variable (memview) was used (other than
instantiated) before and/or after the device/parallel block, and ideally
whether a use was definitely read-only. Any suggestions or hints on how to
do that?

Thanks

frank

-Original Message-
From: cython-devel  
On Behalf Of Schlimbach, Frank
Sent: Friday, January 24, 2020 12:55 PM
To: Core developer mailing list of the Cython compiler 
Subject: Re: [Cython] Adding GPU support to cython

Hi Stefan,
thanks for your response. Good to hear this is still of interest.

Yes, I realized these are rather old CEPs. I spent some time looking into
the Cython code and concluded that it'd be the most consistent (and simplest)
approach to stick with OpenMP and use its offload pragmas (e.g. 'target'
introduced in 4.5). Given a properly set up compiler, this would in theory only
require one or two compiler flags to enable offloading. I even have a first
prototype which generates code that existing compilers seem to swallow. It's
not ready for a PR since I have not been able to get it linked and run on a GPU,
and I wanted to get some general feedback first. You can find the code on my
offload branch https://github.com/fschlimb/cython/tree/offload (it's WIP, so
please excuse that not all comments have been updated yet to reflect my
changes).

Here's what it does:
- accept a new 'with' directive 'device' which marks a region/block to be
  offloaded to a device (OpenMP target)
  - I also considered extending 'gil' or 'parallel' to accept an optional
    'device' argument, but an extra directive seemed more general/flexible
    and also allows non-parallel code
  - I don't believe we should try to automate offloading right now. Once we
    have something that works on explicit demand we can still think about a
    performance model and auto-enable offloading.
- the DeviceWithBlockNode is added to the 'parallel stack' and can occur only
  as the outermost parallel directive
- a 'with device()' requires 'nogil'
- a 'with device()' will create a new scope annotated with a '#pragma omp
  target'
  - all variables which get assigned within the 'with device()' block are
    currently mapped as 'tofrom'
  - all other variables used are mapped as 'to'
  - identifying 'from' candidates is harder and not yet done (need to know
    that there is required allocation but no assignment before the
    'with device()' block)
  - identifying 'alloc' candidates would also need additional analysis (e.g.
    not used outside the 'device()' block)
- all object mode stuff (like exceptions for error handling) is currently
  disabled in a 'with device()' block

Example:

def f(int[:,::1] X):
    cdef int v = 1
    cdef int i
    with nogil, device(), parallel():
        for i in prange(4):
            X[i] = v

the 'with device' block becomes something like (simplified)

{
    size_t __pyx_v_X__count = __pyx_v_X.shape[0]*__pyx_v_X.shape[1];
    #pragma omp target map(to: __pyx_v_v) map(tofrom: __pyx_v_i, \
        __pyx_v_X.data[0:__pyx_v_X__count], __pyx_v_X.memview, __pyx_v_X.shape, \
        __pyx_v_X.strides, __pyx_v_X.suboffsets)
    {
        #pragma omp parallel
        #pragma omp for firstprivate(__pyx_v_i) lastprivate(__pyx_v_i)
        for (__pyx_v_i=0; __pyx_v_i<4; ++__pyx_v_i) {
            __pyx_v_X[__pyx_v_i] = __pyx_v_v;
        }
    }
}

There are lots of things to be added and improved, in particular I am currently 
adding an optional argument 'map' to 'device()' which allows manually setting 
the map-clauses for each variable. This is necessary to allow not only 
optimizations but also sending only partial array data to/from the device (like 
when the device memory cannot hold an entire array the developer would block 
the computation). We can probably add some magic for simple cases but there is 
probably no solution for the general problem of determining the accessed 
index-space.

Among others, things to also look at include
- non-contiguous arrays/memviews
- overlapping arrays/memviews
- keeping data on the device between 'with device()' blocks (USM (unified 
shared memory) or omp target data?)
- error handling
- tests
- docu/comments

I found that the functionality I needed to touch is somewhat scattered around 
the compiler pipeline. It might be worth thinking about restructuring a few 
things to make the whole OpenMP/parallel/offload stuff more maintainable. Of 
course you might see other solutions than mine which make this simpler.

Any thoughts/feedback/usecases appreciated

frank


Re: [Cython] Status

2020-01-31 Thread John Skaller2

>>>> ob should be PyObject*
>>> 
>>> No, the declaration looks correct to me. The input is an object.
>> I don’t understand. ob isn’t a type, is it? A type is required.
> 
> It's a (dummy) parameter name. Cython defaults to "object" when a
> type isn't specified.
> 
> Looking at the other declarations in that file, it was probably
> *meant* to say "object ob", but it's not wrong -- it still works
> that way.

Ok, but now the syntax is made very context sensitive.
To interpret it correctly, you have to know “ob” is not a type.
And the Python docs make exactly the same mistake.
In C this would not work because there is no default type,
so the Python docs are wrong because they’re supposedly
documenting C. [The only case it could be correct would
be if the symbol were a macro]

And my translator script got fooled, because it assumes
any single identifier used as a parameter is a type,
and if two words are used, the first is a type and the second
can be discarded, except in the special case “unsigned int”.

Note, I’m just trying to help by bringing up inconsistencies,
which are things my simplistic translator script can’t handle.

—
John Skaller
skal...@internode.on.net







Re: [Cython] Status

2020-01-31 Thread Greg Ewing

On 1/02/20 12:25 am, John Skaller2 wrote:

> cdef extern from "Python.h":
>     ctypedef int int32_t
>     ctypedef int int64_t
>     ctypedef unsigned int uint32_t
>     ctypedef unsigned int uint64_t


These work because Cython doesn't need to know the exact
sizes of these types. All it needs to know is that they're
some kind of integer so that its type checks will pass.
The typedef names end up in the generated C code, and the
C compiler figures out their actual sizes.


> Obviously this is an incorrect translation of the original source.


Extern declarations in Cython are not meant to be exact
translations. They only need to tell Cython enough about
the thing being declared so that it can cope.

--
Greg


Re: [Cython] Status

2020-01-31 Thread Greg Ewing

On 1/02/20 12:34 am, John Skaller2 wrote:


> Ok, but now the syntax is made very context sensitive.
> To interpret it correctly, you have to know “ob” is not a type.


Yes, Cython mostly follows C declaration syntax, and C also has
this property.


> In C this would not work because there is no default type,


Yes, there is -- the default type in C is int. This is
a valid function definition in C:

  f(x) {
  }

It's equivalent to

  int f(int x) {
  }


> And my translator script got fooled, because it assumes
> any single identifier used as a parameter is a type,


Then it's making an incorrect assumption.

--
Greg


Re: [Cython] Status

2020-01-31 Thread John Skaller2


> On 1 Feb 2020, at 00:36, Greg Ewing  wrote:
> 
> On 1/02/20 12:25 am, John Skaller2 wrote:
>> cdef extern from "Python.h":
>>     ctypedef int int32_t
>>     ctypedef int int64_t
>>     ctypedef unsigned int uint32_t
>>     ctypedef unsigned int uint64_t
> 
> These work because Cython doesn't need to know the exact
> sizes of these types. All it needs to know is that they're
> some kind of integer so that its type checks will pass.
> The typedef names end up in the generated C code, and the
> C compiler figures out their actual sizes.

Ah. I see. That makes sense. So this is some kind of hack way
of getting something a bit like Haskell type classes,
you’re basically saying int32_t and int64_t are of class “Integer”.

This also explains the conflict for me, because Felix is the opposite:
it aims to make the types of things more precise (and has actual
type classes for generalisation).
—
John Skaller
skal...@internode.on.net







Re: [Cython] Status

2020-01-31 Thread John Skaller2


> 
> Yes, there is -- the default type in C is int.

I don’t think that is true in C99 but I’m not sure.

It’s definitely not allowed in C++.
I know because I actually moved the motion on the C++ ISO committee
to disallow it :-)

In any case it’s a bad idea in an interface specification even if it’s legal.

—
John Skaller
skal...@internode.on.net







[Cython] Size of output

2020-01-31 Thread John Skaller2
When I ran Cython on a two line Python function I got this from wc:

4276   13798  161338 oldtest.c

It took a while to actually find the implementation of the function.

A lot of the emitted code appeared to be run time and compile
time support code which wasn’t actually used.

Eliminating stuff that isn’t required by using dependency tracking is nontrivial
and not much use, whereas a single self-contained compilable C file
is very useful. Is there an option to use an #include for the standard stuff?

—
John Skaller
skal...@internode.on.net







Re: [Cython] Status

2020-01-31 Thread Greg Ewing

On 1/02/20 3:03 am, John Skaller2 wrote:


> So this is some kind of hack way
> of getting something a bit like Haskell type classes,
> you’re basically saying int32_t and int64_t are of class “Integer”.


I suppose you could think of it that way, but it's really
not that formal.


> This also explains the conflict for me, because Felix is the opposite:
> it aims to make the types of things more precise (and has actual
> type classes for generalisation).


To define them any more precisely, Cython would need to
know how things vary depending on the platform, which would
mean conditional compilation, etc. It's much easier to leave
all that up to the C compiler and system headers. It also
ensures that there can't be any mismatch between the two.
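
To make the platform dependence concrete, here is a small sketch, again using Python's ctypes purely as an illustration. It reports which basic C integer types have the same width as int32_t and int64_t on the machine it runs on. The helper matching_basic_types and the basic_types table are hypothetical names for this example, and the result differs between platforms (for instance, long is 8 bytes on typical 64-bit Linux and 4 bytes on 64-bit Windows).

```python
import ctypes

# Basic C integer types whose width the C standard leaves to the platform.
basic_types = {
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
}


def matching_basic_types(fixed_type):
    """Return the basic C type names with the same size as fixed_type here."""
    target = ctypes.sizeof(fixed_type)
    return [name for name, ctype in basic_types.items()
            if ctypes.sizeof(ctype) == target]


print("int32_t matches:", matching_basic_types(ctypes.c_int32))
print("int64_t matches:", matching_basic_types(ctypes.c_int64))
```

An exact declaration would have to pick from these lists per platform, which is the conditional compilation that is being avoided by leaving the choice to the C headers.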

--
Greg



Re: [Cython] Status

2020-01-31 Thread Greg Ewing

On 1/02/20 3:08 am, John Skaller2 wrote:


> I don’t think that is true in C99 but I’m not sure.


You may be right, I haven't been keeping up with all the
twists and turns of recent C standards. The gcc I just
tried it on allowed it, but warned about it.


> In any case it’s a bad idea in an interface specification even if it’s legal.


Perhaps in C, but I think it makes sense for types in
Cython to default to object, because it deals with objects
so much. It means that functions that take and return
objects exclusively look like Python.

--
Greg


Re: [Cython] Size of output

2020-01-31 Thread Greg Ewing

On 1/02/20 3:17 am, John Skaller2 wrote:

> When I ran Cython on a two line Python function I got this from wc:
>
>  4276   13798  161338 oldtest.c


That seems a bit excessive.


> A lot of the emitted code appeared to be run time and compile
> time support code which wasn’t actually used.


Not sure what's going on there. Pyrex made efforts to only
include support code that was actually used, but Cython
has changed a lot since then and I haven't been following
its development closely. Either it's slipped on that, or
the support code has become more bloated.

Can you remove any of it and still have it compile? If
so, filing a bug report might be useful.


> Is there an option to use an #include for the standard stuff?


There are upsides and downsides to that as well. The way
things are, the generated file is self-contained, and can
be shipped without worrying about it becoming disconnected
from a compatible version of the include file. This is
important when details of the support code can change
without notice between Cython releases.

--
Greg


Re: [Cython] Size of output

2020-01-31 Thread Robert Bradshaw
On Fri, Jan 31, 2020 at 3:17 PM Greg Ewing  wrote:
>
> On 1/02/20 3:17 am, John Skaller2 wrote:
> > When I ran Cython on a two line Python function I got this from wc:
> >
> >  4276   13798  161338 oldtest.c
>
> That seems a bit excessive.
>
> > A lot of the emitted code appeared to be run time and compile
> > time support code which wasn’t actually used.
>
> Not sure what's going on there. Pyrex made efforts to only
> include support code that was actually used, but Cython
> has changed a lot since then and I haven't been following
> its development closely. Either it's slipped on that, or
> the support code has become more bloated.

Cython attempts to do the same.

Taking a quick glance at an auto-generated file for an empty .pyx, we have

~200 lines of macros normalizing various C compiler issues
~500 lines defining macros to normalize across Python 2.7-3.9
~200 lines of providing defaults for various CYTHON_ macros
~300 lines of macros for optional optimizations for CPython details
(vs. using more public/pypy compatible, ... APIs)
~300 lines module setup code. Even for trivial modules, we still
declare and call functions for creating globals, preparing types, etc.
even if we don't have any globals, types, etc.
~300 lines exception handling and traceback creation
~700 lines conversion for basic int and string types (which we assume
to be available in various utilities).

Much of this is macro-heavy code, to allow maximum flexibility at C
compile time, but much would get elided by the preprocessor for any
particular environment.

Extra utility code is inserted on an as-needed basis, e.g. function
creation, various datatype optimizations, other type conversions,
etc. These are re-used within a module. A two line function could add
a lot (e.g. just defining a function and its wrapper is a good chunk
of code, and whatever the function does of course).
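
As a rough cross-check of these figures, the sketch below adds up the approximate per-category line counts listed above and compares them with the 4276 lines reported by wc earlier in the thread. It assumes the empty-.pyx baseline and the two line function were generated by comparable Cython versions, which the thread does not state. The dictionary keys are shortened labels chosen for the example.

```python
# Approximate line counts quoted above for an empty .pyx (all figures are rough).
baseline = {
    "C compiler normalization macros": 200,
    "Python 2.7-3.9 compatibility macros": 500,
    "CYTHON_ macro defaults": 200,
    "optional CPython optimizations": 300,
    "module setup code": 300,
    "exception handling and tracebacks": 300,
    "basic int and string conversions": 700,
}

reported_total = 4276  # line count of oldtest.c from the wc output

baseline_total = sum(baseline.values())
remainder = reported_total - baseline_total

print("baseline for an empty module: ~%d lines" % baseline_total)
print("left for the function, its wrapper and extra utility code: ~%d lines" % remainder)
```

Under that assumption, a little over half of the generated file is the fixed baseline, and the rest is the as-needed utility code plus the function itself.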

I agree there's some fat that could be trimmed there, but not sure
it'd be worth the effort.

> Can you remove any of it and still have it compile? If
> so, filing a bug report might be useful.

+1

> > Is there an option to use an #include for the standard stuff?
>
> There are upsides and downsides to that as well. The way
> things are, the generated file is self-contained, and can
> be shipped without worrying about it becoming disconnected
> from a compatible version of the include file. This is
> important when details of the support code can change
> without notice between Cython releases.

+1

We have an option "common_utility_include_dir" that would create a
shared utility folder into which the compiler could create (versioned)
#includable files to possibly be shared across many modules, but it
was never completely finished (and in particular was difficult to
reconcile with cycache, which is like ccache for Cython, due to the
outside references a cython artifact could then produce). We've
thought of going even further and providing a shared runtime library,
but that has some of the same issues (plus more, though in some cases
we use the pattern where every module declares type X, but before
creating its own looks to see if one was already loaded to let modules
share the same internal type at runtime).


Re: [Cython] Status

2020-01-31 Thread John Skaller2


> On 1 Feb 2020, at 09:49, Greg Ewing  wrote:
> 
> On 1/02/20 3:03 am, John Skaller2 wrote:
>> So this is some kind of hack way
>> of getting something a bit like Haskell type classes,
>> you’re basically saying int32_t and int64_t are of class “Integer”.
> 
> I suppose you could think of it that way, but it's really
> not that formal.
> 
>> This also explains the conflict for me, because Felix is the opposite:
>> it aims to make the types of things more precise (and has actual
>> type classes for generalisation).
> 
> To define them any more precisely, Cython would need to
> know how things vary depending on the platform, which would
> mean conditional compilation, etc. It's much easier to leave
> all that up to the C compiler and system headers. It also
> ensures that there can't be any mismatch between the two.

But then all hell breaks loose for pointers. Your hack only
works for rvalues. Of course you probably know this doesn’t occur.
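
One way to picture the pointer concern is the following sketch, which uses Python's ctypes as a stand-in for C and is not taken from Cython's generated code. It assumes a 64-bit integer is read once through a pointer of the right type and once through a pointer declared for a 32-bit integer, which is the kind of mismatch an imprecise typedef could hide.

```python
import ctypes
import sys

# A 64-bit value whose upper and lower 32-bit halves differ (high = 1, low = 5).
value = ctypes.c_int64(2**32 + 5)

# Correct view: a pointer to int64_t reads all eight bytes.
as_int64 = ctypes.pointer(value).contents.value

# Mismatched view: the same address treated as a pointer to a 32-bit integer
# reads only the first four bytes, so the result depends on byte order.
mismatched = ctypes.cast(ctypes.pointer(value), ctypes.POINTER(ctypes.c_int32))
as_int32 = mismatched.contents.value

print("read through int64_t*:", as_int64)
print("read through int32_t* (%s-endian):" % sys.byteorder, as_int32)
```

Passing values around (rvalues) converts cleanly between integer widths, but reading or writing through a pointer of the wrong width does not, which is why the distinction only matters once pointers are involved.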

—
John Skaller
skal...@internode.on.net







Re: [Cython] Size of output

2020-01-31 Thread John Skaller2
> 
>> Is there an option to use an #include for the standard stuff?
> 
> There are upsides and downsides to that as well.

Hence an option. But it could be work to implement so I’m just
exploring at the moment.

> The way
> things are, the generated file is self-contained, and can
> be shipped without worrying about it becoming disconnected
> from a compatible version of the include file. This is
> important when details of the support code can change
> without notice between Cython releases.

Yes, and in this case an include file may actually be better
because it will upgrade with Cython. YMMV I guess.

But the main reason is to remove a lot of useless clutter.

—
John Skaller
skal...@internode.on.net







Re: [Cython] Size of output

2020-01-31 Thread John Skaller2

> I agree there's some fat that could be trimmed there, but not sure
> it'd be worth the effort.

You’re probably right. It’s a problem writing a compiler in a language
wholly unsuited for the job. Even with a more suitable language,
emitting code, in the right order, with just the things actually required,
is difficult. I use a multi-pass predictive system and a multi-pass
code generator and I find bugs all the time because it isn’t run by
actual dependencies but predicted ones.



—
John Skaller
skal...@internode.on.net







[Cython] linkage

2020-01-31 Thread John Skaller2
OMG. Python is moving backwards. Some people just have no understanding
of tech. As of 3.8, extensions must not be dynamically linked to libpython.
This doesn’t apply to Windows or MacOS because that’s the only way on those 
platforms. 

But Debian/Ubuntu was always wrong and now the error is being made canonical.

If anyone here knows a way on Linux to fix this, with some sort of stub loader
for example, I’d be interested. All my code is linked with visibility=default,
and all dynamic loads use two level namespaces, i.e, the symbol table of a
shared library being imported is only visible to the importer.

There may be some impact on Cython, since its primary job is building
Python extensions.

BTW: it’s all due to a stupid bug in ld which links shared libraries
without bothering to check external references are satisfiable.
Until load time, maybe.. :-)

—
John Skaller
skal...@internode.on.net




