On 10/09/2011 02:18 PM, Dag Sverre Seljebotn wrote:
On 10/09/2011 02:11 PM, mark florisson wrote:
Hey,

So far people have been enthusiastic about the cython.parallel features,
so I think we should introduce some new ones. I propose the following,

Great!!

I only have time for very brief feedback now; perhaps more will follow.

assume parallel has been imported from cython:

with parallel.master():
    this is executed in the master thread in a parallel (non-prange)
    section

with parallel.single():
    same as master, except any thread may do the execution

An optional keyword argument 'nowait' specifies whether there will be a
barrier at the end. The default is to wait.
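
For example (a sketch of how the proposed keyword would be used;
read_input is just a placeholder):

with parallel.single(nowait=True):
    read_input()   # one thread runs this; with nowait the others continue
                   # past the block instead of waiting at its end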

I like

if parallel.is_master():
    ...
explicit_barrier_somehow() # see below

better as a Pythonization. One could easily allow is_master to be used in other contexts as well, simply by setting a status flag in the master block.
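
For instance (a sketch only; is_master is the proposed name, and
do_work/log_progress are just placeholders):

with parallel.parallel():
    am_master = parallel.is_master()   # usable anywhere in the section,
                                       # backed by a per-thread status flag
    do_work()                          # every thread runs this
    if am_master:
        log_progress()                 # only the master thread runs this
    explicit_barrier_somehow()         # see below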

Using an if-test flows much better with Python, I feel, but that naturally leads to making the barrier explicit. But I like the barrier always being explicit, rather than having it as a predicate on all the different constructs like in OpenMP...

I'm less sure about single, since making it a function indicates one could use it in other contexts and the whole thing becomes too magic (since it's tied to the position of invocation). I'm tempted to suggest

for _ in prange(1):
    ...

as our syntax for single.


with parallel.task():
    create a task to be executed by some thread in the team
    once a thread takes up the task it shall only be executed by that
    thread and no other thread (so the task will be tied to the thread)

    C variables will be firstprivate
    Python objects will be shared

parallel.taskwait() # wait on any direct descendant tasks to finish

Regarding tasks, I think this is mapping OpenMP too closely onto Python.
Closures are excellent for the notion of a task, so I think something
based on the futures API would work better. I realize that makes the
mapping to OpenMP and implementation a bit more difficult, but I think
it is worth it in the long run.
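
A rough sketch of the shape I have in mind, using the existing
concurrent.futures module purely as a model for the API (I'm not
suggesting we literally call into that module from nogil code):

from concurrent.futures import ThreadPoolExecutor

def work(i):
    # the function/closure body is the task
    return i * i

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(work, i) for i in range(10)]  # spawn tasks
    results = [f.result() for f in futures]              # the "taskwait"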


with parallel.critical():
    this section of code is mutually exclusive with other critical sections
    optional keyword argument 'name' specifies a name for the critical
    section, which means all sections with that name will exclude each
    other, but not critical sections with different names

Note: all threads that encounter the section will execute it, just
not at the same time
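
For example (a sketch of the proposed keyword; the names and the bodies
are just placeholders):

with parallel.critical(name='update_counts'):
    total += local_sum   # excludes only other 'update_counts' sections

with parallel.critical(name='logging'):
    write_log(msg)       # independent of the 'update_counts' sections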

Yes, this works well as a with-statement...

...except that it is slightly magic in that it binds to the call position (unlike anything in Python). I.e., this would be more "correct", or at least more Pythonic:

with parallel.critical(__file__, __line__):
    ...



with parallel.barrier():
    all threads wait until everyone has reached the barrier
    either no one or everyone should encounter the barrier
    shared variables are flushed

I have problems with requiring a no-op with-block...

I'd much rather write

parallel.barrier()

However, that ties a function call to the place of invocation, and suggests that one could do

if rand() > .5:
    barrier()
else:
    i += 3
    barrier()

and have the same barrier in each case. Again,

barrier(__file__, __line__)

gets us purity at the cost of practicality. Another way is the pthreads approach (although one may have to use pthreads rather than OpenMP to get it, unless there are named barriers?):

barrier_a = parallel.barrier()
barrier_b = parallel.barrier()
with parallel:
    barrier_a.wait()
    if rand() > .5:
        barrier_b.wait()
    else:
        i += 3
        barrier_b.wait()


I'm really not sure here.


Unfortunately, gcc again manages to horribly break master and single
constructs in loops (versions 4.2 through 4.6), so I suppose I'll
first file a bug report. Other (better) compilers like Portland (and I'm
sure Intel) work fine. I suppose a warning in the documentation will
suffice there.

If we at some point implement vector/SIMD operations, we could also try
out the Fortran OpenMP workshare construct.

I'm starting to teach myself OpenCL as part of a course. It's very neat
for some kinds of parallelism. What I'm saying is that, at least in the
case of SIMD, we should not lock ourselves into Fortran+OpenMP thinking
too early, but also look forward to coming architectures (e.g., AMD's
GPU-and-CPU-on-the-same-die design).

Dag Sverre
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
