Re: [Cython] OpenMP support

2011-03-11 Thread Dag Sverre Seljebotn

On 03/11/2011 08:20 AM, Stefan Behnel wrote:

Robert Bradshaw, 11.03.2011 01:46:
On Tue, Mar 8, 2011 at 11:16 AM, Francesc Alted wrote:

On Tuesday 08 March 2011 18:50:15 Stefan Behnel wrote:

mark florisson, 08.03.2011 18:00:

What I meant was that the
wrapper returned by the decorator would have to call the closure
for every iteration, which introduces function call overhead.

[...]

I guess we just have to establish what we want to do: do we
want to support code with Python objects (and exceptions etc), or
just C code written in Cython?


I like the approach that Sturla mentioned: using closures to
implement worker threads. I think that's very pythonic. You could do
something like this, for example:

  def worker():
      for item in queue:
          with nogil:
              do_stuff(item)

  queue.extend(work_items)
  start_threads(worker, count)

Note that the queue is only needed to tell the thread what to work
on. A lot of things can be shared over the closure. So the queue may
not even be required in many cases.


I like this approach too.  I suppose that you will need to annotate the
items so that they are not Python objects, no?  Something like:

  def worker():
      cdef int item  # tell that item is not a Python object!
      for item in queue:
          with nogil:
              do_stuff(item)

  queue.extend(work_items)
  start_threads(worker, count)


On a slightly higher level, are we just trying to use OpenMP from
Cython, or are we trying to build it into the language? If the former,
it may make sense to stick closer than one might otherwise be tempted
in terms of API to the underlying C to leverage the existing
documentation. A library with a more Pythonic interface could perhaps
be written on top of that. Alternatively, if we're building it into
Cython itself, I'd say it might be worth modeling it after the
multiprocessing module (though I understand it would be implemented
with threads), which I think is a decent enough model for managing
embarrassingly parallel operations.


+1



The above code is similar to that,
though I'd prefer the for loop implicit rather than as part of the
worker method (or at least as an argument).


It provides a simple way to write per-thread initialisation code, 
though. And it's likely easier to make looping fast than to speed up 
the call into a closure. However, eventually, both ways will need to 
be supported anyway.




If we went this route,
what are the advantages of using OpenMP over, say, pthreads in the
background? (And could the latter be done with just a library + some
fancy GIL specifications?)


In the above example, basically everything is explicit and nothing 
more than a simplified threading setup is needed. Even the 
implementation of "start_threads()" could be done in a couple of lines 
of Python code, including the collection of results and errors. If 
someone thinks we need more than that, I'd like to see a couple of 
concrete use cases and code examples first.
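
A minimal sketch of what such a "start_threads()" helper might look like in plain Python, including the collection of results and errors. Only the names start_threads, worker and queue come from the thread; the deque-based queue, the (results, errors) return value and the worker body are assumptions made for illustration.

```python
import threading
from collections import deque

def start_threads(worker, count):
    """Run worker() in `count` threads and return (results, errors)."""
    results, errors = [], []
    lock = threading.Lock()

    def run():
        try:
            value = worker()
        except Exception as exc:      # collect errors instead of losing them
            with lock:
                errors.append(exc)
        else:
            with lock:
                results.append(value)

    threads = [threading.Thread(target=run) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, errors

# Hypothetical work queue shared with the worker through the closure.
queue = deque(range(100))

def worker():
    total = 0
    while True:
        try:
            item = queue.popleft()    # deque.popleft() is thread-safe
        except IndexError:            # queue drained: this thread is done
            return total
        total += item * item
```

Calling start_threads(worker, 4) returns one partial sum per thread plus any exceptions raised inside the workers.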




One thing that's nice about OpenMP as
implemented in C is that the serial code looks almost exactly like the
parallel code; the code at http://wiki.cython.org/enhancements/openmp
has this property too.


Writing it with a closure isn't really that much different. You can 
put the inner function right where it would normally get executed and 
add a bit of calling/load distributing code below it. Not that bad IMO.


It may be worth providing some ready-to-use decorators to do the load 
balancing, but I don't really like the idea of having a decorator 
magically invoke the function in-place that it decorates.




Also, I like the idea of being able to hold the GIL by the invoking
thread and having the "sharing" threads do the appropriate locking
among themselves when needed if possible, e.g. for exception raising.


I like the explicit "with nogil" block in my example above. It makes 
it easy to use normal Python setup code, to synchronise based on the 
GIL if desired (e.g. to use a normal Python queue for communication), 
and it's simple enough not to get in the way.


I'm supporting Robert here. Basically, I'm +1 to anything that can make 
me pretend the GIL doesn't exist, even if it comes with a 2x performance 
hit: Because that will make me write parallel code (which I can't be 
bothered to do in Cython currently), and I have 4 cores on the laptop I 
use for debugging, so I'd still get a 2x speedup.


Perhaps the long-term solution is something like an "autogil" mode, 
where Cython automatically releases the GIL on blocks where it can 
(such as a typed for-loop), and acquires it back when needed (an 
exception-raising if-block within said for-loop). And when doing 
multi-threading, GIL-requiring calls are dispatched to a master 
GIL-holding thread (which would not be a worker thread, i.e. on 4 cores 
you'd have 4 workers + 1 GIL-holding support thread). So the advice for 
speeding up code is simply "make sure your co

Re: [Cython] OpenMP support

2011-03-11 Thread Matej Laitl
> On a slightly higher level, are we just trying to use OpenMP from
> Cython, or are we trying to build it into the language? If the former,
> it may make sense to stick closer than one might otherwise be tempted
> in terms of API to the underlying C to leverage the existing
> documentation. A library with a more Pythonic interface could perhaps
> be written on top of that. Alternatively, if we're building it into
> Cython itself, I'd say it might be worth modeling it after the
> multiprocessing module (though I understand it would be implemented
> with threads), which I think is a decent enough model for managing
> embarrassingly parallel operations. The above code is similar to that,
> though I'd prefer the for loop implicit rather than as part of the
> worker method (or at least as an argument). If we went this route,
> what are the advantages of using OpenMP over, say, pthreads in the
> background? (And could the latter be done with just a library + some
> fancy GIL specifications?) One thing that's nice about OpenMP as
> implemented in C is that the serial code looks almost exactly like the
> parallel code; the code at http://wiki.cython.org/enhancements/openmp
> has this property too.

+1.

I'm strongly for implementing thin and low-level support for OpenMP in the 
first place instead of (ab?)using it to implement high-level threading API.

Also, code like that would have an advantage (significant for my project[1]) of 
being compilable by older cython / interpretable by python with no cython at 
all ("pure" pure-python mode):

#pragma omp parallel for private(var1) reduction(+:var2) schedule(guided)
for i in range(n):
    do_work(i)

[1] http://github.com/strohel/PyBayes

Also, the implementation should be straightforward: I usually patch the generated 
.c manually and only python → c variable name translation is needed, + perhaps 
taking care of temp variables that need to be thread-local.
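
To illustrate the "pure" pure-Python point: the pragma line is an ordinary comment to CPython, so annotated code runs serially and unchanged without Cython. The sketch below uses the comment syntax proposed in this message (a proposal, not a confirmed Cython feature); the function name sum_of_squares is hypothetical.

```python
def sum_of_squares(n):
    total = 0
    # CPython ignores the next line; a pragma-aware compiler could use it.
    #pragma omp parallel for reduction(+:total) schedule(guided)
    for i in range(n):
        total += i * i
    return total

print(sum_of_squares(10))
```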

My five cents,
  Matěj Laitl
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] OpenMP support

2011-03-11 Thread Sturla Molden

On 11.03.2011 01:46, Robert Bradshaw wrote:

On a slightly higher level, are we just trying to use OpenMP from
Cython, or are we trying to build it into the language?


OpenMP is a specification, not a particular implementation. 
Implementation for Cython should either be compiler pragmas or a library.


I'd like it to be a library, as it should also be usable from Python. I 
have made some progress on the library route, depending on Cython's 
closures.


nogil makes things feel a bit awkward, though.

We could for example imagine code like this:

with openmp.ordered(i):


Context managers are forbidden in nogil AFAIK.  So we end up with ugly 
hacks like this:


with nogil:
   if openmp._ordered(i): # always returns 1, but will synchronize


Would it be possible to:

- Make context managers that are allowed without the GIL? We don't need 
to worry about exceptions, but it should be possible to short-circuit 
from __enter__ to __exit__.


- Have cpdefs that are callable without the GIL?

This would certainly make OpenMP syntax look cleaner.


Sturla


Re: [Cython] OpenMP support

2011-03-11 Thread Stefan Behnel

Dag Sverre Seljebotn, 11.03.2011 08:56:

Basically, I'm +1 to anything that can make me
pretend the GIL doesn't exist, even if it comes with a 2x performance hit:
Because that will make me write parallel code (which I can't be bothered
to do in Cython currently), and I have 4 cores on the laptop I use for
debugging, so I'd still get a 2x speedup.

Perhaps the long-term solution is something like an "autogil" mode,
where Cython automatically releases the GIL on blocks where it can
(such as a typed for-loop), and acquires it back when needed (an
exception-raising if-block within said for-loop).


I assume you mean this to become a decorator or other option written into 
the code.




And when doing
multi-threading, GIL-requiring calls are dispatched to a master GIL-holding
thread (which would not be a worker thread, i.e. on 4 cores you'd have 4
workers + 1 GIL-holding support thread). So the advice for speeding up code
is simply "make sure your code is all typed", just like before, but people
can follow that advice without even having to learn about the GIL.


The GIL does not only protect the interpreter core. It also protects C 
level data structures in user code and keeps threaded code from running 
amok. Releasing and acquiring it doesn't come for free either, so besides 
likely breaking code that was not specifically written to be reentrant, 
releasing it automatically may also introduce a performance penalty for 
many users.


I'm very happy the GIL exists, and I'm against anything that tries to 
disable it automatically. Threading is an extremely dangerous programming 
model. The GIL has its gotchas, too, but it still simplifies it quite a 
bit. Actually, threading is so complex and easy to get wrong, that any 
threaded code should always be written specifically to support threading. 
Explicitly acquiring and releasing the GIL is really just a minor issue on 
that path.


Stefan


Re: [Cython] OpenMP support

2011-03-11 Thread Stefan Behnel

Sturla Molden, 11.03.2011 12:13:

OpenMP is a specification, not a particular implementation. Implementation
for Cython should either be compiler pragmas or a library.

I'd like it to be a library, as it should also be usable from Python. I
have made some progress on the library route, depending on Cython's closures.

nogil makes things feel a bit awkward, though.

We could for example imagine code like this:

with openmp.ordered(i):


Context managers are forbidden in nogil AFAIK. So we end up with ugly hacks
like this:

with nogil:
    if openmp._ordered(i): # always returns 1, but will synchronize


Would it be possible to:

- Make context managers that are allowed without the GIL? We don't need to
worry about exceptions, but it should be possible to short-circuit from
__enter__ to __exit__.


The two special methods are Python methods. There is no way to call them 
without holding the GIL. If we wanted to enable something like a context 
manager inside of a nogil block, it would necessarily be a different protocol.




- Have cpdefs that are callable without the GIL?


"cpdef" functions are meant to be overridable from Python code, so, no, you 
cannot call them from within a nogil block as they may actually have been 
overridden.


What's your actual use case for this?

Stefan


Re: [Cython] OpenMP support

2011-03-11 Thread Dag Sverre Seljebotn

On 03/11/2011 12:37 PM, Stefan Behnel wrote:

Dag Sverre Seljebotn, 11.03.2011 08:56:

Basically, I'm +1 to anything that can make me
pretend the GIL doesn't exist, even if it comes with a 2x performance hit:
Because that will make me write parallel code (which I can't be bothered
to do in Cython currently), and I have 4 cores on the laptop I use for
debugging, so I'd still get a 2x speedup.

Perhaps the long-term solution is something like an "autogil" mode,
where Cython automatically releases the GIL on blocks where it can
(such as a typed for-loop), and acquires it back when needed (an
exception-raising if-block within said for-loop).


I assume you mean this to become a decorator or other option written 
into the code.




And when doing
multi-threading, GIL-requiring calls are dispatched to a master GIL-holding
thread (which would not be a worker thread, i.e. on 4 cores you'd have 4
workers + 1 GIL-holding support thread). So the advice for speeding up code
is simply "make sure your code is all typed", just like before, but people
can follow that advice without even having to learn about the GIL.


The GIL does not only protect the interpreter core. It also protects C 
level data structures in user code and keeps threaded code from 
running amok. Releasing and acquiring it doesn't come for free either, 
so besides likely breaking code that was not specifically written to 
be reentrant, releasing it automatically may also introduce a 
performance penalty for many users.


The intention was that the GIL would be acquired in exceptional 
circumstances (doesn't matter for overall performance) or during 
debugging (again don't care about performance). But I agree the idea 
needs more thought on the possible pitfalls.





I'm very happy the GIL exists, and I'm against anything that tries to 
disable it automatically. Threading is an extremely dangerous 
programming model. The GIL has its gotchas, too, but it still 
simplifies it quite a bit. Actually, threading is so complex and easy 
to get wrong, that any threaded code should always be written 
specifically to support threading. Explicitly acquiring and releasing 
the GIL is really just a minor issue on that path.


I guess the point is that OpenMP takes that "extremely dangerous 
programming model" and makes it tractable, at least for a class of 
trivial problems (not necessarily SIMD, but almost).


BTW, threading is often used simply because of how array data is laid 
out in memory. A typical use case is that every thread writes to different 
non-overlapping blocks of the same array (and reads from the same input 
arrays that are not changed). Then you move on to step B, which does the 
same, but perhaps blocks the arrays in a different way between threads. 
Then step C blocks the data in yet another way, etc. But at each step 
it's just "input arrays, non-overlapping blocks in output arrays", 
global parameters, local loop counters.
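
The pattern described above can be sketched in plain Python as follows. This is only an illustration of the data layout: under CPython's GIL it demonstrates the structure, not a speedup. The helper name run_step and the list-based "arrays" are assumptions, not anything proposed in the thread.

```python
import threading

def run_step(func, src, dst, num_threads):
    """Apply func elementwise; each thread owns one contiguous block of dst."""
    n = len(src)
    block = max(1, -(-n // num_threads))  # ceil(n / num_threads), at least 1

    def work(start, stop):
        # Writes go only to dst[start:stop]; src is read-only in this step.
        for i in range(start, stop):
            dst[i] = func(src[i])

    threads = [
        threading.Thread(target=work, args=(start, min(start + block, n)))
        for start in range(0, n, block)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

a = list(range(10))
b = [0] * len(a)
c = [0] * len(a)
run_step(lambda x: x * 2, a, b, num_threads=4)  # step A: blocks of 3
run_step(lambda x: x + 1, b, c, num_threads=2)  # step B: blocks of 5
```

Because no two threads ever write the same element within a step, no locking is needed; the join at the end of each step acts as the barrier between step A and step B.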


(One doesn't need to use threads, there was another thread on 
multiprocessing + shared memory arrays.)


Just saying that not all use of threads is "extremely dangerous", and 
OpenMP exists explicitly to dumb threading down for those cases.


Dag Sverre


[Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

2011-03-11 Thread Stefan Behnel

Hi,

ticket 654 describes a code generation problem where the arguments to a 
cdef function are not being evaluated in the order they are written down in 
the code.


http://trac.cython.org/cython_trac/ticket/654

This introduces problems when the arguments have side effects or are not 
simple, e.g. they are function calls themselves or are taken from object 
attributes or live in a closure. For example,


f(g(a), a.x, h(a))

will produce different results depending on the evaluation order if g or h 
change the value of a.x.


However, apparently, the order of evaluation is only guaranteed by Python, 
not by C. Now, the question is: what are the right semantics for Cython 
here: follow Python or C?
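
A small Python sketch of why the order matters. Python evaluates call arguments left to right; the second call below simulates, by hand, one of the other orders a C compiler would be free to choose. The class A and the bodies of f, g and h are hypothetical stand-ins for the names in the example above.

```python
class A:
    def __init__(self):
        self.x = 0

def g(a):
    a.x += 1       # side effect on a.x
    return "g"

def h(a):
    a.x += 10      # side effect on a.x
    return "h"

def f(*args):
    return args

# Python semantics: g(a), then a.x, then h(a).
a = A()
left_to_right = f(g(a), a.x, h(a))

# One order C permits, written out explicitly: h(b), then b.x, then g(b).
b = A()
arg3 = h(b)
arg2 = b.x
arg1 = g(b)
right_to_left = f(arg1, arg2, arg3)

print(left_to_right)   # ('g', 1, 'h')
print(right_to_left)   # ('g', 10, 'h')
```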


Personally, I think it would be nice to keep up Python's semantics, but 
when I implemented this I broke quite some code in Sage (you may have 
noticed that the sage-build project in Hudson has been red for a while). 
There are things in C and especially in C++ that cannot be easily copied 
into a temporary variable in order to make sure they are evaluated before 
the following arguments. This is not a problem for Python function calls 
where all arguments end up being copied (and often converted) anyway. It is 
a problem for C function calls, though.


What do you think about this?

Stefan


Re: [Cython] OpenMP support

2011-03-11 Thread Sturla Molden

On 11.03.2011 11:42, Matej Laitl wrote:

#pragma omp parallel for private(var1) reduction(+:var2) schedule(guided)
for i in range(n):
    do_work(i)



I do like this, as it is valid Python and can be turned on/off with a 
compiler flag to Cython.


Issues to warn about:

- We cannot jump out of a parallel block from C/C++/Fortran (goto, 
longjmp, C++ exception). That applies to Python exceptions as well, and 
the generated Cython code.


- GIL issue: The CPython interpreter actually calls GetCurrentThreadId(), 
e.g. in thread_nt.h. So the OpenMP thread using the Python C API must be 
the OS thread holding the GIL. It is not sufficient that the master 
thread holds it.


- Remember that NumPy arrays are unboxed. Those local variables should 
be silently passed as firstprivate.


- Refcounting with Python objects and private variables.

None of the above applies if we go with the library approach. But then 
it would look less like OpenMP in C.


Also, do we want this?

   #pragma omp parallel
   if 1:


It is a consequence of just re-using the C syntax for OpenMP, as 
indentation matters in Cython. There are no anonymous blocks similar to 
C in Cython:


   #pragma omp parallel
   {

   }




Sturla


Re: [Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

2011-03-11 Thread Vitja Makarov
2011/3/11 Stefan Behnel :
> Hi,
>
> ticket 654 describes a code generation problem where the arguments to a cdef
> function are not being evaluated in the order they are written down in the
> code.
>
> http://trac.cython.org/cython_trac/ticket/654
>
> This introduces problems when the arguments have side effects or are not
> simple, e.g. they are function calls themselves or are taken from object
> attributes or live in a closure. For example,
>
>    f(g(a), a.x, h(a))
>
> will produce different results depending on the evaluation order if g or h
> change the value of a.x.
>
> However, apparently, the order of evaluation is only guaranteed by Python,
> not by C. Now, the question is: what are the right semantics for Cython
> here: follow Python or C?
>

+1 for Python; Cython is more Python than C. By the way, I don't think that
it makes much sense.

> Personally, I think it would be nice to keep up Python's semantics, but when
> I implemented this I broke quite some code in Sage (you may have noticed
> that the sage-build project in Hudson has been red for a while). There are
> things in C and especially in C++ that cannot be easily copied into a
> temporary variable in order to make sure they are evaluated before the
> following arguments. This is not a problem for Python function calls where
> all arguments end up being copied (and often converted) anyway. It is a
> problem for C function calls, though.
>
> What do you think about this?
>


>    f(g(a), a.x, h(a))

Why could not this be translated into:

tmp1 = g(a)
tmp2 = a.x
tmp3 = h(a)

f(tmp1, tmp2, tmp3)


-- 
vitja.


Re: [Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

2011-03-11 Thread Stefan Behnel

Vitja Makarov, 11.03.2011 15:04:

2011/3/11 Stefan Behnel:

Personally, I think it would be nice to keep up Python's semantics, but when
I implemented this I broke quite some code in Sage (you may have noticed
that the sage-build project in Hudson has been red for a while). There are
things in C and especially in C++ that cannot be easily copied into a
temporary variable in order to make sure they are evaluated before the
following arguments. This is not a problem for Python function calls where
all arguments end up being copied (and often converted) anyway. It is a
problem for C function calls, though.



f(g(a), a.x, h(a))


Why could not this be translated into:

tmp1 = g(a)
tmp2 = a.x
tmp3 = h(a)

f(tmp1, tmp2, tmp3)


See above.

Stefan


Re: [Cython] OpenMP support

2011-03-11 Thread Sturla Molden

On 11.03.2011 12:43, Stefan Behnel wrote:


What's your actual use case for this?



Just avoid different syntax inside and outside nogil-blocks. I like this 
style


with openmp.critical:


better than what is currently legal with nogil:

openmp.critical()
if 1:

openmp.end_critical()
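
For comparison, here are the two styles in plain Python, where the GIL is held and a context manager is therefore allowed. This is only a sketch of the syntax difference: `critical` is a hypothetical stand-in built on threading.Lock, not an OpenMP binding, and acquire()/release() play the role of critical()/end_critical().

```python
import threading

critical = threading.Lock()   # hypothetical stand-in for openmp.critical
counter = 0

def add_with_block(n):
    """Context-manager style: the critical section is the with-body."""
    global counter
    for _ in range(n):
        with critical:
            counter += 1

def add_with_calls(n):
    """Explicit style: paired begin/end calls around the section."""
    global counter
    for _ in range(n):
        critical.acquire()
        try:
            counter += 1
        finally:
            critical.release()

def run(target, n, num_threads):
    """Run target(n) in num_threads threads; return the counter afterwards."""
    threads = [threading.Thread(target=target, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

The with-block keeps the begin and end of the section visually paired, which is the readability gain being asked for here.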


Sturla


Re: [Cython] OpenMP support

2011-03-11 Thread Stefan Behnel

Sturla Molden, 11.03.2011 15:19:

On 11.03.2011 12:43, Stefan Behnel wrote:


What's your actual use case for this?



Just avoid different syntax inside and outside nogil-blocks. I like this style

with openmp.critical:


better than what is currently legal with nogil:

openmp.critical()
if 1:

openmp.end_critical()


No, I meant, what is your specific need to "cpdef" in this context?

Stefan


Re: [Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

2011-03-11 Thread Lisandro Dalcin
On 11 March 2011 09:45, Stefan Behnel  wrote:
> Hi,
>
> ticket 654 describes a code generation problem where the arguments to a cdef
> function are not being evaluated in the order they are written down in the
> code.
>
> http://trac.cython.org/cython_trac/ticket/654
>
> This introduces problems when the arguments have side effects or are not
> simple, e.g. they are function calls themselves or are taken from object
> attributes or live in a closure. For example,
>
>    f(g(a), a.x, h(a))
>
> will produce different results depending on the evaluation order if g or h
> change the value of a.x.
>
> However, apparently, the order of evaluation is only guaranteed by Python,
> not by C. Now, the question is: what are the right semantics for Cython
> here: follow Python or C?
>
> Personally, I think it would be nice to keep up Python's semantics, but when
> I implemented this I broke quite some code in Sage (you may have noticed
> that the sage-build project in Hudson has been red for a while). There are
> things in C and especially in C++ that cannot be easily copied into a
> temporary variable in order to make sure they are evaluated before the
> following arguments. This is not a problem for Python function calls where
> all arguments end up being copied (and often converted) anyway. It is a
> problem for C function calls, though.
>
> What do you think about this?
>

Given our previous history, I would go for Python semantics...

Perhaps you could add a compiler directive for these rare cases where
you need/want C/C++ semantics?


-- 
Lisandro Dalcin
---
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
3000 Santa Fe, Argentina
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169


Re: [Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

2011-03-11 Thread Stefan Behnel

Stefan Behnel, 11.03.2011 15:08:

Vitja Makarov, 11.03.2011 15:04:

2011/3/11 Stefan Behnel:

Personally, I think it would be nice to keep up Python's semantics, but when
I implemented this I broke quite some code in Sage (you may have noticed
that the sage-build project in Hudson has been red for a while). There are
things in C and especially in C++ that cannot be easily copied into a
temporary variable in order to make sure they are evaluated before the
following arguments. This is not a problem for Python function calls where
all arguments end up being copied (and often converted) anyway. It is a
problem for C function calls, though.



f(g(a), a.x, h(a))


Why could not this be translated into:

tmp1 = g(a)
tmp2 = a.x
tmp3 = h(a)

f(tmp1, tmp2, tmp3)


See above.


To be a little clearer here, it's a problem in C for example with struct 
values. Copying them by value into a temp variable can be expensive, 
potentially twice as expensive as simply passing them into the function 
normally.


Not sure what kind of additional devilry C++ provides here, but I'd expect 
that object values can exhibit bizarre behaviour when being assigned. Maybe 
others can enlighten me here.


I have no idea how many cases there actually are that we can't handle or 
that may lead to a performance degradation when using temps, but the 
problem is that they exist at all.


Stefan


Re: [Cython] OpenMP support

2011-03-11 Thread Francesc Alted
On Friday 11 March 2011 11:42:26 Matej Laitl wrote:
> I'm strongly for implementing thin and low-level support for OpenMP
> at the first place instead of (ab?)using it to implement high-level
> threading API.

My opinion on this continues to be -1.  I'm afraid that difficult access 
to OpenMP on Windows platforms (lack of OpenMP in MSVC Express is, as I 
see it, a major showstopper, although perhaps GCC 4.x on Win is stable 
enough already, I don't know) would prevent *true* portability of 
OpenMP-powered Cython programs.

IMHO, going to the native Python threads + possibly new Cython syntax is 
a better avenue.  But I'd like to be shown that the problem for Win is 
not that grave...

-- 
Francesc Alted


Re: [Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

2011-03-11 Thread Robert Bradshaw
On Fri, Mar 11, 2011 at 9:36 AM, Stefan Behnel  wrote:
> Stefan Behnel, 11.03.2011 15:08:
>>
>> Vitja Makarov, 11.03.2011 15:04:
>>>
>>> 2011/3/11 Stefan Behnel:

>>>> Personally, I think it would be nice to keep up Python's semantics, but
>>>> when I implemented this I broke quite some code in Sage (you may have
>>>> noticed that the sage-build project in Hudson has been red for a while).
>>>> There are things in C and especially in C++ that cannot be easily copied
>>>> into a temporary variable in order to make sure they are evaluated before
>>>> the following arguments. This is not a problem for Python function calls
>>>> where all arguments end up being copied (and often converted) anyway. It
>>>> is a problem for C function calls, though.
>>>
>>>>    f(g(a), a.x, h(a))
>>>
>>> Why could not this be translated into:
>>>
>>> tmp1 = g(a)
>>> tmp2 = a.x
>>> tmp3 = h(a)
>>>
>>> f(tmp1, tmp2, tmp3)
>>
>> See above.
>
> To be a little clearer here, it's a problem in C for example with struct
> values. Copying them by value into a temp variable can be expensive,
> potentially twice as expensive as simply passing them into the function
> normally.

Yep, and some types (e.g. array types) can't be assigned to at all.
FWIW, the issues with Sage is that many libraries use the "stack
allocated, pass-by-reference" trick

    typedef foo_struct foo_t[1];

but we have declared them to be of type "void*" because we don't care
about or want to muck with the internals. Cleaning this up is
something we should do, but is taking a while, and aside from that it
makes Cython even more dependent on correct type declarations (and
backwards incompatible in this regard). Sage is a great test for the
buildbot, so keeping it red for so long is not a good thing either.

> Not sure what kind of additional devilry C++ provides here, but I'd expect
> that object values can exhibit bizarre behaviour when being assigned. Maybe
> others can enlighten me here.

Yes, C++ allows overloading of the assignment operator, so assigning
may lead to arbitrary code (as well as probably an expensive copy, as
with structs, and structs in C++ are really just classes with a
different visibility).

> I have no idea how many cases there actually are that we can't handle or
> that may lead to a performance degradation when using temps, but the problem
> is that they exist at all.

Note that this applies not just to function arguments, but a host of
other places, e.g. the order of evaluation of "f() + g()" in C is
unspecified. Fleshing this out completely will lead to a lot more
temps and verbose C code. And then we'd just cross our fingers that
the C compiler was able to optimize all these temps away (though still
possibly producing inferior code if it were able to switch order of
execution).

Whatever we do, we need a flag/directive. Perhaps there's a way to
guarantee correct ordering for all valid Python code, even in the
presence of (function and variable) type inference, but allow some
leeway for explicitly declared C functions.

- Robert


Re: [Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

2011-03-11 Thread Greg Ewing

Stefan Behnel wrote:

This introduces problems when the arguments have side effects or are not 
simple, e.g.


f(g(a), a.x, h(a))


What do you think about this?


I think it's a bad idea to write code that relies on the order
of evaluation like this. If the order matters, it's better to
be explicit about it:

   arg1 = g(a)
   arg2 = h(a)
   f(arg1, a.x, arg2)

So I would say it's not something worth worrying about overly
much.

--
Greg


Re: [Cython] Out of order side effects of argument evaluation in function calls (ticket #654)

2011-03-11 Thread Greg Ewing

Stefan Behnel wrote:

To be a little clearer here, it's a problem in C for example with struct 
values. Copying them by value into a temp variable can be expensive, 
potentially twice as expensive as simply passing them into the function 
normally.


What are you actually proposing to do here? Anything that
Cython does in the generated code to guarantee evaluation
order is going to involve using temp variables, and thus
incur the same overhead.

--
Greg


Re: [Cython] OpenMP support

2011-03-11 Thread Sturla Molden
The free C/C++ compiler in Windows SDK supports OpenMP. This is the
system C compiler on Windows.


Visual C++ Express is an IDE for beginners and hobbyists.

OpenMP on GCC is the same on Windows as on any other platform.

Sturla




On Friday 11 March 2011 11:42:26 Matej Laitl wrote:

I'm strongly for implementing thin and low-level support for OpenMP
at the first place instead of (ab?)using it to implement high-level
threading API.


My opinion on this continues to be -1.  I'm afraid that difficult access
to OpenMP on Windows platforms (lack of OpenMP in MSVC Express is, as I
see it, a major showstopper, although perhaps GCC 4.x on Win is stable
enough already, I don't know) would prevent *true* portability of
OpenMP-powered Cython programs.

IMHO, going to the native Python threads + possibly new Cython syntax is
a better avenue.  But I'd like to be shown that the problem for Win is
not that grave...

--
Francesc Alted