On 4 April 2011 13:53, Dag Sverre Seljebotn <d.s.seljeb...@astro.uio.no> wrote:
> On 04/04/2011 01:23 PM, Stefan Behnel wrote:
>>
>> Dag Sverre Seljebotn, 04.04.2011 12:17:
>>>
>>> CEP up at http://wiki.cython.org/enhancements/prange
>>
>> """
>> Variable handling
>>
>> Rather than explicit declaration of shared/private variables we rely on
>> conventions:
>>
>> * Thread-shared: Variables that are only read and not written in the
>> loop body are shared across threads. Variables that are only used in the
>> else block are considered shared as well.
>>
>> * Thread-private: Variables that are assigned to in the loop body are
>> thread-private. Obviously, the iteration counter is thread-private as well.
>>
>> * Reduction: Variables that only used on the LHS of an inplace
>> operator, such as s above, are marked as targets for reduction. If the
>> variable is also used in other ways (LHS of assignment or in an expression)
>> it does instead turn into a thread-private variable. Note: This means that
>> if one, e.g., inserts printf(... s) above, s is turned into a thread-local
>> variable. OTOH, there is simply no way to correctly emulate the effect
>> printf(... s) would have in a sequential loop, so such code must be
>> discouraged anyway.
>> """
>>
>> What about simply (ab-)using Python semantics and creating a new inner
>> scope for the prange loop body? That would basically make the loop behave
>> like a closure function, but with the looping header at the 'right' place
>> rather than after the closure.
>
> I'm not quite sure what the concrete changes to the CEP this would lead to
> (assuming you mean this as a proposal for alternative semantics, and not an
> implementation detail).
>
> How would we treat reduction variables? They need to be supported, and
> there's nothing in Python semantics to support reduction variables, they are
> a rather special case everywhere. I suppose keeping the reduction clause
> above, or use the "nonlocal" keyword in the loop body...
>
> Also there's the else:-block, although we could make that part of the scope.
> And the "lastprivate" functionality, although that could be dropped without
> much loss.
>
>>
>> Also, in the example, the local variable declaration of "tmp" outside of
>> the loop looks somewhat misplaced, although it's precedented by
>> comprehensions (which also have their own local scope in Cython).
>
> Well, depending on the decision of lastprivate, the declaration would need
> to be outside; I really like the idea of moving "cdef", and am prepared to
> drop lastprivate for this.
>
> Being explicit about thread-local variables does make things a lot safer to
> use.
>
> (One problem is that switching between serial and parallel one needs to move
> variable declarations. But that only happens once, and one can use
> "nthreads=1" to disable parallel after that.)
>
> An example would then be:
>
> def f(np.ndarray[double] x, double alpha):
>     cdef double s = 0, globtmp
>     with nogil:
>         for i in prange(x.shape[0]):
>             cdef double tmp # thread-private
>             tmp = alpha * i # alpha available from global scope
>             s += x[i] * tmp # still automatic reduction for inplace operators
>             # printf(...s) -> now leads to error, since s is not declared
>             # thread-private but is read
>         else:
>             # tmp still available here...looks a bit strange, but useful
>             s += tmp * 10
>             globtmp = tmp # we save tmp for later
>         # tmp not available here, globtmp is
>     return s
>
> Or, we just drop support for the else block on these loops.

I think that since we are disallowing break (for now), we shouldn't
support the else clause. Basically, I think we can make the CEP a tad
simpler.

I think we could declare everything outside of the prange body. Then,
in the prange loop body:

    if a variable is assigned to anywhere -> make it lastprivate
        - if a variable is read before it is assigned to -> make it
          firstprivate in addition to lastprivate (raise a compiler
          error if the variable is not initialized outside of the
          loop body)

    if a variable is only ever read -> make it shared (the default
    for OpenMP)

    if a variable has an inplace operator -> make it a reduction

There is really no reason to disallow reading of the reduction
variable (in e.g. a printf). The reduction variable should also be
initialized outside of the prange body.

Then prange() could be implemented in pure mode as simply the
sequential version, i.e. range() with some more arguments.

For any scratch space buffers etc., I'd prefer something like

    with cython.parallel:
        cdef char *buf = malloc(100)

        for i in prange(n):
            use buf

        free(buf)

At least it fits my brain pretty well :) (this code does, however,
assume that malloc is thread-safe).

Anyway, I'm not sure I covered all cases, but what do you think?

> Dag Sverre
> _______________________________________________
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
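
To make the pure-mode point a bit more concrete, here is a minimal
sketch in plain Python. It assumes a stand-in prange() that accepts and
ignores any parallel keyword options; the names parallel_options and
weighted_sum are made up for illustration, and this is not meant to be
how Cython actually implements it. The comments mark how each variable
would be classified under the rules proposed in this mail.

```python
def prange(start, stop=None, step=1, **parallel_options):
    """Sequential stand-in for prange (hypothetical signature).

    Any keyword options (e.g. nthreads=1) are accepted and ignored,
    so the loop simply runs as an ordinary range().
    """
    if stop is None:
        start, stop = 0, start
    return range(start, stop, step)


def weighted_sum(x, alpha):
    s = 0.0    # in-place operator in the body -> would be a reduction
    tmp = 0.0  # assigned in the body -> would be lastprivate
    for i in prange(len(x)):
        tmp = alpha * i   # alpha is only ever read -> would be shared
        s += x[i] * tmp   # reduction, initialized outside the loop
    # Run sequentially, tmp keeps the value from the last iteration,
    # which is the behaviour lastprivate is meant to reproduce.
    return s, tmp
```

Calling weighted_sum([1.0, 2.0, 3.0], 2.0) runs the loop for i = 0, 1, 2
and returns both the accumulated sum and the last value of tmp, so the
sequential result can serve as the reference for a parallel run.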