On 10/09/2011 02:18 PM, Dag Sverre Seljebotn wrote:
On 10/09/2011 02:11 PM, mark florisson wrote:
Hey,

So far people have been enthusiastic about the cython.parallel features,
so I think we should introduce some new ones. I propose the following,

Great!!

I only have time for very brief feedback now; perhaps more will follow.

assume parallel has been imported from cython:

with parallel.master():
    this is executed in the master thread in a parallel (non-prange)
    section

with parallel.single():
    same as master, except any thread may do the execution

An optional keyword argument 'nowait' specifies whether there will be a
barrier at the end. The default is to wait.
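
For example (a sketch of how the proposed keyword would be used;
read_input is just a placeholder):

with parallel.single(nowait=True):
    read_input()   # one thread runs this; with nowait the others continue
                   # past the block instead of waiting at its end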

I like

if parallel.is_master():
    ...
explicit_barrier_somehow() # see below

better as a Pythonization. One could easily allow is_master to be used in other contexts as well, simply by setting a status flag in the master block.
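
For instance (a sketch only; is_master is the proposed name, and
do_work/log_progress are just placeholders):

with parallel.parallel():
    am_master = parallel.is_master()   # usable anywhere in the section,
                                       # backed by a per-thread status flag
    do_work()                          # every thread runs this
    if am_master:
        log_progress()                 # only the master thread runs this
    explicit_barrier_somehow()         # see below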

Using an if-test flows much better with Python, I feel, but that naturally leads to making the barrier explicit. But I like the barrier always being explicit, rather than having it as a predicate on all the different constructs like in OpenMP...

I'm less sure about single, since making it a function indicates one could use it in other contexts and the whole thing becomes too magic (since it's tied to the position of invocation). I'm tempted to suggest

for _ in prange(1):
    ...

as our syntax for single.


with parallel.task():
    create a task to be executed by some thread in the team
    once a thread takes up the task it shall only be executed by that
    thread and no other thread (so the task will be tied to the thread)

    C variables will be firstprivate
    Python objects will be shared

parallel.taskwait() # wait on any direct descendant tasks to finish

Regarding tasks, I think this is mapping OpenMP too closely onto Python.
Closures are excellent for the notion of a task, so I think something
based on the futures API would work better. I realize that makes the
mapping to OpenMP and implementation a bit more difficult, but I think
it is worth it in the long run.
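
A rough sketch of the shape I have in mind, using the existing
concurrent.futures module purely as a model for the API (I'm not
suggesting we literally call into that module from nogil code):

from concurrent.futures import ThreadPoolExecutor

def work(i):
    # the function/closure body is the task
    return i * i

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(work, i) for i in range(10)]  # spawn tasks
    results = [f.result() for f in futures]              # the "taskwait"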


with parallel.critical():
    this section of code is mutually exclusive with other critical sections
    optional keyword argument 'name' specifies a name for the critical
    section, which means all sections with that name will exclude each
    other, but not critical sections with different names

Note: all threads that encounter the section will execute it, just
not at the same time
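
For example (a sketch of the proposed keyword; the names and the bodies
are just placeholders):

with parallel.critical(name='update_counts'):
    total += local_sum   # excludes only other 'update_counts' sections

with parallel.critical(name='logging'):
    write_log(msg)       # independent of the 'update_counts' sections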

Yes, this works well as a with-statement...

...except that it is slightly magic in that it binds to the call position (unlike anything in Python). I.e., this would be more "correct", or at least more Pythonic:

with parallel.critical(__file__, __line__):
    ...



with parallel.barrier():
    all threads wait until everyone has reached the barrier
    either no one or everyone should encounter the barrier
    shared variables are flushed

I have problems with requiring a no-op with-block...

I'd much rather write

parallel.barrier()

However, that ties a function call to the place of invocation, and suggests that one could do

if rand() > .5:
    barrier()
else:
    i += 3
    barrier()

and have the same barrier in each case. Again,

barrier(__file__, __line__)

gets us purity at the cost of practicality. Another way is the pthreads approach (although one may have to use pthreads rather than OpenMP to get it, unless there are named barriers?):

barrier_a = parallel.barrier()
barrier_b = parallel.barrier()
with parallel:
    barrier_a.wait()
    if rand() > .5:
        barrier_b.wait()
    else:
        i += 3
        barrier_b.wait()


I'm really not sure here.


Unfortunately, gcc again manages to horribly break master and single
constructs in loops (versions 4.2 through 4.6), so I suppose I'll
first file a bug report. Other (better) compilers like Portland (and I'm
sure Intel) work fine. I suppose a warning in the documentation will
suffice there.

If we at some point implement vector/SIMD operations, we could also try
out the Fortran OpenMP workshare construct.

I'm starting to teach myself OpenCL as part of a course. It's very neat
for some kinds of parallelism. What I'm saying is that, at least in the
case of SIMD, we should not lock ourselves into Fortran+OpenMP thinking
too early, but also look forward to coming architectures (e.g., AMD's
GPU-and-CPU-on-the-same-die design).

Dag Sverre
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
