Re: [Cython] Fused Types

2011-05-04 Thread mark florisson
On 4 May 2011 01:07, Greg Ewing  wrote:
> mark florisson wrote:
>
>> cdef func(floating x, floating y):
>>    ...
>>
>> you get a "float, float" version, and a "double, double" version, but
>> not "float, double" or "double, float".
>
> It's hard to draw conclusions from this example because
> it's degenerate. You don't really need multiple versions of a
> function like that, because of float <-> double coercions.

It's only degenerate if you want a real-world example, rather than one
that provides a simple answer to your original question...

> A more telling example might be
>
>  cdef double dot_product(floating *u, floating *v, int length)
>
> By your current rules, this would give you one version that
> takes two float vectors, and another that takes two double
> vectors.
>
> But if you want to find the dot product of a float vector and
> a double vector, you're out of luck.

Sure, so you can create two fused types. I do however somewhat like
your proposal with the indexing in the definition.

> --
> Greg


Re: [Cython] Fused Types

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 01:07 AM, Greg Ewing wrote:

mark florisson wrote:


cdef func(floating x, floating y):
...

you get a "float, float" version, and a "double, double" version, but
not "float, double" or "double, float".


It's hard to draw conclusions from this example because
it's degenerate. You don't really need multiple versions of a
function like that, because of float <-> double coercions.

A more telling example might be

cdef double dot_product(floating *u, floating *v, int length)

By your current rules, this would give you one version that
takes two float vectors, and another that takes two double
vectors.

But if you want to find the dot product of a float vector and
a double vector, you're out of luck.


First, I'm open for your proposed syntax too...But in the interest of 
seeing how we got here:


The argument to the above goes that you *should* be out of luck. For 
instance, talking about dot products, BLAS itself has float-float and 
double-double, but not float-double AFAIK.


What you are saying is that this does not have the full power of C++ 
templates. And the answer is: yes, it does not have the full power 
of C++ templates.


At the same time as we discussed this, we also discussed better support for 
string-based templating languages (so that, e.g., compilation error 
messages could refer to the template file). The two are complementary.


Going back to Greg's syntax: What I don't like is that it makes the 
simple unambiguous cases, where this would actually be used in real 
life, less readable.


Would it be too complicated to have both? For instance:

 i) You are allowed to use a *single* fused_type on a *function* 
without declaration.


def f(floating x, floating *y): # ok

Turns into

def f[floating T](T x, T *y):

This is NOT ok:

def f(floating x, integral y):
# ERROR: Please explicitly declare fused types inside []

 ii) If you use more than one fused type, or use one on a cdef class or 
struct, you need to use the [] declaration.
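
For concreteness, a sketch of what rule (ii) might look like with the 
proposed [] declaration (this is the syntax under discussion in this 
thread, not something Cython currently accepts; the names S and T are 
arbitrary):

# Under rule (ii), mixing fused types requires naming them explicitly:
def g[floating S, integral T](S x, T y):
    ...   # S and T specialize independently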



Finally: It is a bit uncomfortable that we seem to be hashing things out 
even as Mark is implementing this. Would it be feasible to have a Skype 
session sometime this week where everybody interested in the outcome of 
this can come together for an hour and actually decide on something?


Mark: How much does this discussion of syntax impact your development? 
Are you able to treat them just as polish on top and work on the 
"engine" undisturbed by this?


Dag Sverre


Re: [Cython] Fused Types

2011-05-04 Thread mark florisson
On 4 May 2011 10:24, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 01:07 AM, Greg Ewing wrote:
>>
>> mark florisson wrote:
>>
>>> cdef func(floating x, floating y):
>>> ...
>>>
>>> you get a "float, float" version, and a "double, double" version, but
>>> not "float, double" or "double, float".
>>
>> It's hard to draw conclusions from this example because
>> it's degenerate. You don't really need multiple versions of a
>> function like that, because of float <-> double coercions.
>>
>> A more telling example might be
>>
>> cdef double dot_product(floating *u, floating *v, int length)
>>
>> By your current rules, this would give you one version that
>> takes two float vectors, and another that takes two double
>> vectors.
>>
>> But if you want to find the dot product of a float vector and
>> a double vector, you're out of luck.
>
> First, I'm open for your proposed syntax too...But in the interest of seeing
> how we got here:
>
> The argument to the above goes that you *should* be out of luck. For
> instance, talking about dot products, BLAS itself has float-float and
> double-double, but not float-double AFAIK.
>
> What you are saying is that this does not have the full power of C++ templates.
> And the answer is that yes, this does not have the full power of C++
> templates.
>
> At the same time we discussed this, we also discussed better support for
> string-based templating languages (so that, e.g., compilation error messages
> could refer to the template file). The two are complementary.
>
> Going back to Greg's syntax: What I don't like is that it makes the simple
> unambiguous cases, where this would actually be used in real life, less
> readable.
>
> Would it be too complicated to have both? For instance;
>
>  i) You are allowed to use a *single* fused_type on a *function* without
> declaration.
>
> def f(floating x, floating *y): # ok
>
> Turns into
>
> def f[floating T](T x, T *y):
>
> This is NOT ok:
>
> def f(floating x, integral y):
> # ERROR: Please explicitly declare fused types inside []
>
>  ii) Using more than one fused type, or using it on a cdef class or struct,
> you need to use the [] declaration.
>

I don't think it would be too complicated, but as you mention it's
probably not a very likely case, and if the user does need it, a new
(equivalent) fused type can be created. The current way reads a lot
nicer than the indexed one in my opinion. So I'd be fine with
implementing it, but I find the current way more elegant.

> Finally: It is a bit uncomfortable that we seem to be hashing things out
> even as Mark is implementing this. Would it be feasible to have a Skype
> session sometimes this week where everybody interested in the outcome of
> this come together for an hour and actually decide on something?
>
> Mark: How much does this discussion of syntax impact your development? Are
> you able to treat them just as polish on top and work on the "engine"
> undisturbed by this?

Thanks for your consideration, I admit it feels a bit uncomfortable
:) But at least this change shouldn't have such a big impact on the
code, it would mean some changes in a select few places, so it's
definitely polish. In any event, before we settle on this, I'd like to
do the cpdef support first and work on indexing from Python space, so
I think we have enough time to settle this argument on the ML.
Before that, I'm just going to finish up a pull request for the
OpenMP branch; I'd like to see if I can get rid of some warnings.

> Dag Sverre


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 21 April 2011 20:13, Dag Sverre Seljebotn  wrote:
> On 04/21/2011 10:37 AM, Robert Bradshaw wrote:
>>
>> On Mon, Apr 18, 2011 at 7:51 AM, mark florisson
>>   wrote:
>>>
>>> On 18 April 2011 16:41, Dag Sverre Seljebotn
>>>  wrote:

 Excellent! Sounds great! (as I won't have my laptop for some days I
 can't
 have a look yet but I will later)

 You're right about (the current) buffers and the gil. A testcase
 explicitly
 for them would be good.

 Firstprivate etc: i think it'd be nice myself, but it is probably better
 to
 take a break from it at this point so that we can think more about that
 and
 not do anything rash; perhaps open up a specific thread on them and ask
 for
 more general input. Perhaps you want to take a break or task-switch to
 something else (fused types?) until I can get around to review and merge
 what you have so far? You'll know best what works for you though. If you
 decide to implement explicit threadprivate variables because you've got
 the
 flow I certainly won't object myself.

>>>  Ok, cool, I'll move on :) I already included a test with a prange and
>>> a numpy buffer with indexing.
>>
>> Wow, you're just plowing away at this. Very cool.
>>
>> +1 to disallowing nested prange, that seems to get really messy with
>> little benefit.
>>
>> In terms of the CEP, I'm still unconvinced that firstprivate is not
>> safe to infer, but lets leave the initial values undefined rather than
>> specifying them to be NaNs (we can do that as an implementation if you
>> want), which will give us flexibility to change later once we've had a
>> chance to play around with it.
>
> I don't see any technical issues with inferring firstprivate, the question
> is whether we want to. I suggest not inferring it in order to make this
> safer: One should be able to just try to change a loop from "range" to
> "prange", and either a) have things fail very hard, or b) just work
> correctly and be able to trust the results.
>
> Note that when I suggest using NaN, it is as initial values for EACH
> ITERATION, not per-thread initialization. It is not about "firstprivate" or
> not, but about disabling thread-private variables entirely in favor of
> "per-iteration" variables.
>
> I believe that by talking about "readonly" and "per-iteration" variables,
> rather than "thread-shared" and "thread-private" variables, this can be used
> much more safely and with virtually no knowledge of the details of
> threading. Again, what's in my mind are scientific programmers with (too)
> little training.
>
> In the end it's a matter of taste and what is most convenient to more users.
> But I believe the case of needing real thread-private variables that
> preserves per-thread values across iterations (and thus also can possibly
> benefit from firstprivate) is seldomly enough used that an explicit
> declaration is OK, in particular when it buys us so much in safety in the
> common case.
>
> To be very precise,
>
> cdef double x, z
> for i in prange(n):
>    x = f(x)
>    z = f(i)
>    ...
>
> goes to
>
> cdef double x, z
> for i in prange(n):
>    x = z = nan
>    x = f(x)
>    z = f(i)
>    ...
>
> and we leave it to the C compiler to (trivially) optimize away "z = nan".
> And, yes, it is a stopgap solution until we've got control flow analysis so
> that we can outright disallow such uses of x (without threadprivate
> declaration, which also gives firstprivate behaviour).
>

I think the preliminary OpenMP support is ready for review. It
supports 'with cython.parallel.parallel:' and 'for i in
cython.parallel.prange(...):'. It works in generators and closures and
the docs are updated. Support for break/continue/with gil isn't there
yet.

There are two remaining issues. The first is warnings about potentially
uninitialized variables for prange(). When you do

for i in prange(start, stop, step): ...

it generates code like

nsteps = (stop - start) / step;
#pragma omp parallel for lastprivate(i)
for (temp = 0; temp < nsteps; temp++) {
    i = start + temp * step;
    ...
}

So here it will complain about 'i' being potentially uninitialized, as
it might not be assigned to in the loop. However, simply assigning 0
to 'i' can't work either, as you expect zero iterations not to touch
it. So for now, we have a bunch of warnings, as I don't see an
__attribute__ to suppress them selectively.

The second is NaN-ing private variables; NaN isn't part of C. For gcc,
the docs ( http://www.delorie.com/gnu/docs/glibc/libc_407.html ) have
the following to say:

"You can use `#ifdef NAN' to test whether the machine supports NaN.
(Of course, you must arrange for GNU extensions to be visible, such as
by defining _GNU_SOURCE, and then you must include `math.h'.)"

So I'm thinking that if NaN is not available (or the compiler is not
GCC), we can use FLT_MAX, DBL_MAX and LDBL_MAX instead from float.h.
Would this be the proper way to handle this?
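
A rough sketch of how that fallback could look in the generated C (the
__PYX_PRIVATE_INIT_* macro names are invented for illustration; whether
the *_MAX constants are an acceptable stand-in for NaN is exactly the
open question):

#include <math.h>    /* defines NAN when the implementation supports it */
#include <float.h>   /* FLT_MAX, DBL_MAX, LDBL_MAX */

#ifdef NAN
  #define __PYX_PRIVATE_INIT_FLOAT    ((float) NAN)
  #define __PYX_PRIVATE_INIT_DOUBLE   ((double) NAN)
  #define __PYX_PRIVATE_INIT_LDOUBLE  ((long double) NAN)
#else
  #define __PYX_PRIVATE_INIT_FLOAT    FLT_MAX
  #define __PYX_PRIVATE_INIT_DOUBLE   DBL_MAX
  #define __PYX_PRIVATE_INIT_LDOUBLE  LDBL_MAX
#endif

/* per-iteration poisoning of a private double x would then read:
   x = __PYX_PRIVATE_INIT_DOUBLE; */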

Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 12:00 PM, mark florisson wrote:

On 21 April 2011 20:13, Dag Sverre Seljebotn  wrote:

On 04/21/2011 10:37 AM, Robert Bradshaw wrote:


On Mon, Apr 18, 2011 at 7:51 AM, mark florisson
wrote:


On 18 April 2011 16:41, Dag Sverre Seljebotn
  wrote:


Excellent! Sounds great! (as I won't have my laptop for some days I
can't
have a look yet but I will later)

You're right about (the current) buffers and the gil. A testcase
explicitly
for them would be good.

Firstprivate etc: i think it'd be nice myself, but it is probably better
to
take a break from it at this point so that we can think more about that
and
not do anything rash; perhaps open up a specific thread on them and ask
for
more general input. Perhaps you want to take a break or task-switch to
something else (fused types?) until I can get around to review and merge
what you have so far? You'll know best what works for you though. If you
decide to implement explicit threadprivate variables because you've got
the
flow I certainly won't object myself.


  Ok, cool, I'll move on :) I already included a test with a prange and
a numpy buffer with indexing.


Wow, you're just plowing away at this. Very cool.

+1 to disallowing nested prange, that seems to get really messy with
little benefit.

In terms of the CEP, I'm still unconvinced that firstprivate is not
safe to infer, but lets leave the initial values undefined rather than
specifying them to be NaNs (we can do that as an implementation if you
want), which will give us flexibility to change later once we've had a
chance to play around with it.


I don't see any technical issues with inferring firstprivate, the question
is whether we want to. I suggest not inferring it in order to make this
safer: One should be able to just try to change a loop from "range" to
"prange", and either a) have things fail very hard, or b) just work
correctly and be able to trust the results.

Note that when I suggest using NaN, it is as initial values for EACH
ITERATION, not per-thread initialization. It is not about "firstprivate" or
not, but about disabling thread-private variables entirely in favor of
"per-iteration" variables.

I believe that by talking about "readonly" and "per-iteration" variables,
rather than "thread-shared" and "thread-private" variables, this can be used
much more safely and with virtually no knowledge of the details of
threading. Again, what's in my mind are scientific programmers with (too)
little training.

In the end it's a matter of taste and what is most convenient to more users.
But I believe the case of needing real thread-private variables that
preserves per-thread values across iterations (and thus also can possibly
benefit from firstprivate) is seldomly enough used that an explicit
declaration is OK, in particular when it buys us so much in safety in the
common case.

To be very precise,

cdef double x, z
for i in prange(n):
x = f(x)
z = f(i)
...

goes to

cdef double x, z
for i in prange(n):
x = z = nan
x = f(x)
z = f(i)
...

and we leave it to the C compiler to (trivially) optimize away "z = nan".
And, yes, it is a stopgap solution until we've got control flow analysis so
that we can outright disallow such uses of x (without threadprivate
declaration, which also gives firstprivate behaviour).



I think the preliminary OpenMP support is ready for review. It
supports 'with cython.parallel.parallel:' and 'for i in
cython.parallel.prange(...):'. It works in generators and closures and
the docs are updated. Support for break/continue/with gil isn't there
yet.

There are two remaining issue. The first is warnings for potentially
uninitialized variables for prange(). When you do

for i in prange(start, stop, step): ...

it generates code like

nsteps = (stop - start) / step;
#pragma omp parallel for lastprivate(i)
for (temp = 0; temp<  nsteps; temp++) {
 i = start + temp * step;
 ...
}

So here it will complain about 'i' being potentially uninitialized, as
it might not be assigned to in the loop. However, simply assigning 0
to 'i' can't work either, as you expect zero iterations not to touch
it. So for now, we have a bunch of warnings, as I don't see a
__attribute__ to suppress it selectively.


Isn't this orthogonal to OpenMP -- even if it said "range", your 
testcase could get such a warning? If so, the fix is simply to 
initialize i in your testcase code.



The second is NaN-ing private variables, NaN isn't part of C. For gcc,
the docs ( http://www.delorie.com/gnu/docs/glibc/libc_407.html ) have
the following to say:

"You can use `#ifdef NAN' to test whether the machine supports NaN.
(Of course, you must arrange for GNU extensions to be visible, such as
by defining _GNU_SOURCE, and then you must include `math.h'.)"

So I'm thinking that if NaN is not available (or the compiler is not
GCC), we can use FLT_MAX, DBL_MAX and LDBL_MAX instead from float.h.
Would this be the proper way to handle this?


I think it is sufficient. A relati

Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 12:45, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 12:00 PM, mark florisson wrote:
>>
>> On 21 April 2011 20:13, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> On 04/21/2011 10:37 AM, Robert Bradshaw wrote:

 On Mon, Apr 18, 2011 at 7:51 AM, mark florisson
     wrote:
>
> On 18 April 2011 16:41, Dag Sverre
> Seljebotn
>  wrote:
>>
>> Excellent! Sounds great! (as I won't have my laptop for some days I
>> can't
>> have a look yet but I will later)
>>
>> You're right about (the current) buffers and the gil. A testcase
>> explicitly
>> for them would be good.
>>
>> Firstprivate etc: i think it'd be nice myself, but it is probably
>> better
>> to
>> take a break from it at this point so that we can think more about
>> that
>> and
>> not do anything rash; perhaps open up a specific thread on them and
>> ask
>> for
>> more general input. Perhaps you want to take a break or task-switch to
>> something else (fused types?) until I can get around to review and
>> merge
>> what you have so far? You'll know best what works for you though. If
>> you
>> decide to implement explicit threadprivate variables because you've
>> got
>> the
>> flow I certainly won't object myself.
>>
>  Ok, cool, I'll move on :) I already included a test with a prange and
> a numpy buffer with indexing.

 Wow, you're just plowing away at this. Very cool.

 +1 to disallowing nested prange, that seems to get really messy with
 little benefit.

 In terms of the CEP, I'm still unconvinced that firstprivate is not
 safe to infer, but lets leave the initial values undefined rather than
 specifying them to be NaNs (we can do that as an implementation if you
 want), which will give us flexibility to change later once we've had a
 chance to play around with it.
>>>
>>> I don't see any technical issues with inferring firstprivate, the
>>> question
>>> is whether we want to. I suggest not inferring it in order to make this
>>> safer: One should be able to just try to change a loop from "range" to
>>> "prange", and either a) have things fail very hard, or b) just work
>>> correctly and be able to trust the results.
>>>
>>> Note that when I suggest using NaN, it is as initial values for EACH
>>> ITERATION, not per-thread initialization. It is not about "firstprivate"
>>> or
>>> not, but about disabling thread-private variables entirely in favor of
>>> "per-iteration" variables.
>>>
>>> I believe that by talking about "readonly" and "per-iteration" variables,
>>> rather than "thread-shared" and "thread-private" variables, this can be
>>> used
>>> much more safely and with virtually no knowledge of the details of
>>> threading. Again, what's in my mind are scientific programmers with (too)
>>> little training.
>>>
>>> In the end it's a matter of taste and what is most convenient to more
>>> users.
>>> But I believe the case of needing real thread-private variables that
>>> preserves per-thread values across iterations (and thus also can possibly
>>> benefit from firstprivate) is seldomly enough used that an explicit
>>> declaration is OK, in particular when it buys us so much in safety in the
>>> common case.
>>>
>>> To be very precise,
>>>
>>> cdef double x, z
>>> for i in prange(n):
>>>    x = f(x)
>>>    z = f(i)
>>>    ...
>>>
>>> goes to
>>>
>>> cdef double x, z
>>> for i in prange(n):
>>>    x = z = nan
>>>    x = f(x)
>>>    z = f(i)
>>>    ...
>>>
>>> and we leave it to the C compiler to (trivially) optimize away "z = nan".
>>> And, yes, it is a stopgap solution until we've got control flow analysis
>>> so
>>> that we can outright disallow such uses of x (without threadprivate
>>> declaration, which also gives firstprivate behaviour).
>>>
>>
>> I think the preliminary OpenMP support is ready for review. It
>> supports 'with cython.parallel.parallel:' and 'for i in
>> cython.parallel.prange(...):'. It works in generators and closures and
>> the docs are updated. Support for break/continue/with gil isn't there
>> yet.
>>
>> There are two remaining issue. The first is warnings for potentially
>> uninitialized variables for prange(). When you do
>>
>> for i in prange(start, stop, step): ...
>>
>> it generates code like
>>
>> nsteps = (stop - start) / step;
>> #pragma omp parallel for lastprivate(i)
>> for (temp = 0; temp<  nsteps; temp++) {
>>     i = start + temp * step;
>>     ...
>> }
>>
>> So here it will complain about 'i' being potentially uninitialized, as
>> it might not be assigned to in the loop. However, simply assigning 0
>> to 'i' can't work either, as you expect zero iterations not to touch
>> it. So for now, we have a bunch of warnings, as I don't see a
>> __attribute__ to suppress it selectively.
>
> Isn't this is orthogonal to OpenMP -- even if it said "range", your testcase
> could get such a warning? If so, the fix is simply to initialize i

Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 12:59 PM, mark florisson wrote:

On 4 May 2011 12:45, Dag Sverre Seljebotn  wrote:

On 05/04/2011 12:00 PM, mark florisson wrote:

There are two remaining issue. The first is warnings for potentially
uninitialized variables for prange(). When you do

for i in prange(start, stop, step): ...

it generates code like

nsteps = (stop - start) / step;
#pragma omp parallel for lastprivate(i)
for (temp = 0; temp < nsteps; temp++) {
    i = start + temp * step;
    ...
}

[...]

So basically we could guard against it by checking if nsteps > 0, but
the compiler doesn't detect this, so it will still issue a warning even
if 'i' is initialized (the warning is at the place of the lastprivate
declaration).


Ah. But this is then more important than I initially thought it was. You 
are saying that this is the case:


cdef int i = 0
with nogil:
    for i in prange(n):
        ...
print i # garbage when n == 0?

It would be in the interest of fewer semantic differences w.r.t. range to 
deal better with this case.


Will it silence the warning if we make "i" firstprivate as well as 
lastprivate? firstprivate would only affect the case of zero iterations, 
since we overwrite with NaN if the loop is entered...


Dag


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 13:15, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 12:59 PM, mark florisson wrote:
>>
>> On 4 May 2011 12:45, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> On 05/04/2011 12:00 PM, mark florisson wrote:

 There are two remaining issue. The first is warnings for potentially
 uninitialized variables for prange(). When you do

 for i in prange(start, stop, step): ...

 it generates code like

 nsteps = (stop - start) / step;
 #pragma omp parallel for lastprivate(i)
 for (temp = 0; temp<    nsteps; temp++) {
     i = start + temp * step;
     ...
 }

 So here it will complain about 'i' being potentially uninitialized, as
 it might not be assigned to in the loop. However, simply assigning 0
 to 'i' can't work either, as you expect zero iterations not to touch
 it. So for now, we have a bunch of warnings, as I don't see a
 __attribute__ to suppress it selectively.
>>>
>>> Isn't this is orthogonal to OpenMP -- even if it said "range", your
>>> testcase
>>> could get such a warning? If so, the fix is simply to initialize i in
>>> your
>>> testcase code.
>>
>> No, the problem is that 'i' needs to be lastprivate, and 'i' is
>> assigned to in the loop body. It's irrelevant whether 'i' is assigned
>> to before the loop. I think this is the case because the spec says
>> that lastprivate variables will get the value of the private variable
>> of the last sequential iteration, but it cannot at compile time know
>> whether there might be zero iterations, which I believe the spec
>> doesn't have anything to say about. So basically we could guard
>> against it by checking if nsteps>  0, but the compiler doesn't detect
>> this, so it will still issue a warning even if 'i' is initialized (the
>> warning is at the place of the lastprivate declaration).
>
> Ah. But this is then more important than I initially thought it was. You are
> saying that this is the case:
>
> cdef int i = 0
> with nogil:
>    for i in prange(n):
>        ...
> print i # garbage when n == 0?

I think it may be, depending on the implementation. With libgomp it
returns 0. With the check it should also return 0.

> It would be in the interest of less semantic differences w.r.t. range to
> deal better with this case.
>
> Will it silence the warning if we make "i" firstprivate as well as
> lastprivate? firstprivate would only affect the case of zero iterations,
> since we overwrite with NaN if the loop is entered...

Well, it wouldn't be NaN, it would be start + step * temp :) But, yes,
that works. So we need both the check and an initialization in there:

if (nsteps > 0) {
    i = 0;
    #pragma omp parallel for firstprivate(i) lastprivate(i)
    for (temp = 0; ...; ...) ...
}

Now any subsequent read of 'i' will only issue a warning if 'i' is not
initialized before the prange() by the user. So if you leave your
index variable uninitialized (because you know in advance nsteps will
be greater than zero), you'll still get a warning. But at least you
will be able to shut up the compiler :)
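
Spelled out a little further, the generated C would then look roughly
like this (a sketch only: loop_sketch stands in for the surrounding
generated function, and the loop body is elided):

/* sketch of the code generated for: for i in prange(start, stop, step) */
void loop_sketch(int start, int stop, int step)
{
    int i;              /* the user's loop variable, possibly uninitialized */
    int temp, nsteps;

    nsteps = (stop - start) / step;
    if (nsteps > 0) {
        i = 0;  /* dummy init so firstprivate never copies an indeterminate value */
        #pragma omp parallel for firstprivate(i) lastprivate(i)
        for (temp = 0; temp < nsteps; temp++) {
            i = start + temp * step;
            /* ... loop body ... */
        }
    }
    /* when nsteps == 0 the guard is skipped, so 'i' keeps its previous value */
}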

> Dag


Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 01:30 PM, mark florisson wrote:

On 4 May 2011 13:15, Dag Sverre Seljebotn  wrote:

On 05/04/2011 12:59 PM, mark florisson wrote:


On 4 May 2011 12:45, Dag Sverre Seljebotn
  wrote:


On 05/04/2011 12:00 PM, mark florisson wrote:


There are two remaining issue. The first is warnings for potentially
uninitialized variables for prange(). When you do

for i in prange(start, stop, step): ...

it generates code like

nsteps = (stop - start) / step;
#pragma omp parallel for lastprivate(i)
for (temp = 0; temp<  nsteps; temp++) {
 i = start + temp * step;
 ...
}

So here it will complain about 'i' being potentially uninitialized, as
it might not be assigned to in the loop. However, simply assigning 0
to 'i' can't work either, as you expect zero iterations not to touch
it. So for now, we have a bunch of warnings, as I don't see a
__attribute__ to suppress it selectively.


Isn't this is orthogonal to OpenMP -- even if it said "range", your
testcase
could get such a warning? If so, the fix is simply to initialize i in
your
testcase code.


No, the problem is that 'i' needs to be lastprivate, and 'i' is
assigned to in the loop body. It's irrelevant whether 'i' is assigned
to before the loop. I think this is the case because the spec says
that lastprivate variables will get the value of the private variable
of the last sequential iteration, but it cannot at compile time know
whether there might be zero iterations, which I believe the spec
doesn't have anything to say about. So basically we could guard
against it by checking if nsteps>0, but the compiler doesn't detect
this, so it will still issue a warning even if 'i' is initialized (the
warning is at the place of the lastprivate declaration).


Ah. But this is then more important than I initially thought it was. You are
saying that this is the case:

cdef int i = 0
with nogil:
for i in prange(n):
...
print i # garbage when n == 0?


I think it may be, depending on the implementation. With libgomp it
return 0. With the check it should also return 0.


It would be in the interest of less semantic differences w.r.t. range to
deal better with this case.

Will it silence the warning if we make "i" firstprivate as well as
lastprivate? firstprivate would only affect the case of zero iterations,
since we overwrite with NaN if the loop is entered...


Well, it wouldn't be NaN, it would be start + step * temp :) But, yes,


Doh.


that works. So we need both the check and an initialization in there:

if (nsteps>  0) {
 i = 0;
 #pragma omp parallel for firstprivate(i) lastprivate(i)
 for (temp = 0; ...; ...) ...
}


Why do you need the if-test? Won't simply

#pragma omp parallel for firstprivate(i) lastprivate(i)
for (temp = 0; ...; ...) ...

do the job -- any initial value will be copied into all threads, 
including the "last" thread, even if there are no iterations?


Dag Sverre


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 13:39, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 01:30 PM, mark florisson wrote:
>>
>> On 4 May 2011 13:15, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> On 05/04/2011 12:59 PM, mark florisson wrote:

 On 4 May 2011 12:45, Dag Sverre Seljebotn
  wrote:
>
> On 05/04/2011 12:00 PM, mark florisson wrote:
>>
>> There are two remaining issue. The first is warnings for potentially
>> uninitialized variables for prange(). When you do
>>
>> for i in prange(start, stop, step): ...
>>
>> it generates code like
>>
>> nsteps = (stop - start) / step;
>> #pragma omp parallel for lastprivate(i)
>> for (temp = 0; temp<      nsteps; temp++) {
>>     i = start + temp * step;
>>     ...
>> }
>>
>> So here it will complain about 'i' being potentially uninitialized, as
>> it might not be assigned to in the loop. However, simply assigning 0
>> to 'i' can't work either, as you expect zero iterations not to touch
>> it. So for now, we have a bunch of warnings, as I don't see a
>> __attribute__ to suppress it selectively.
>
> Isn't this is orthogonal to OpenMP -- even if it said "range", your
> testcase
> could get such a warning? If so, the fix is simply to initialize i in
> your
> testcase code.

 No, the problem is that 'i' needs to be lastprivate, and 'i' is
 assigned to in the loop body. It's irrelevant whether 'i' is assigned
 to before the loop. I think this is the case because the spec says
 that lastprivate variables will get the value of the private variable
 of the last sequential iteration, but it cannot at compile time know
 whether there might be zero iterations, which I believe the spec
 doesn't have anything to say about. So basically we could guard
 against it by checking if nsteps>    0, but the compiler doesn't detect
 this, so it will still issue a warning even if 'i' is initialized (the
 warning is at the place of the lastprivate declaration).
>>>
>>> Ah. But this is then more important than I initially thought it was. You
>>> are
>>> saying that this is the case:
>>>
>>> cdef int i = 0
>>> with nogil:
>>>    for i in prange(n):
>>>        ...
>>> print i # garbage when n == 0?
>>
>> I think it may be, depending on the implementation. With libgomp it
>> return 0. With the check it should also return 0.
>>
>>> It would be in the interest of less semantic differences w.r.t. range to
>>> deal better with this case.
>>>
>>> Will it silence the warning if we make "i" firstprivate as well as
>>> lastprivate? firstprivate would only affect the case of zero iterations,
>>> since we overwrite with NaN if the loop is entered...
>>
>> Well, it wouldn't be NaN, it would be start + step * temp :) But, yes,
>
> Doh.
>
>> that works. So we need both the check and an initialization in there:
>>
>> if (nsteps>  0) {
>>     i = 0;
>>     #pragma omp parallel for firstprivate(i) lastprivate(i)
>>     for (temp = 0; ...; ...) ...
>> }
>
> Why do you need the if-test? Won't simply
>
> #pragma omp parallel for firstprivate(i) lastprivate(i)
> for (temp = 0; ...; ...) ...
>
> do the job -- any initial value will be copied into all threads, including
> the "last" thread, even if there are no iterations?

It will, but you don't expect your iteration variable to change with
zero iterations.

> Dag Sverre


Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 01:41 PM, mark florisson wrote:

On 4 May 2011 13:39, Dag Sverre Seljebotn  wrote:

On 05/04/2011 01:30 PM, mark florisson wrote:


On 4 May 2011 13:15, Dag Sverre Seljebotn
  wrote:


On 05/04/2011 12:59 PM, mark florisson wrote:


On 4 May 2011 12:45, Dag Sverre Seljebotn
  wrote:


On 05/04/2011 12:00 PM, mark florisson wrote:


There are two remaining issue. The first is warnings for potentially
uninitialized variables for prange(). When you do

for i in prange(start, stop, step): ...

it generates code like

nsteps = (stop - start) / step;
#pragma omp parallel for lastprivate(i)
for (temp = 0; temp < nsteps; temp++) {
    i = start + temp * step;
    ...
}

[...]

So basically we could guard against it by checking if nsteps > 0, but
the compiler doesn't detect this, so it will still issue a warning even
if 'i' is initialized (the warning is at the place of the lastprivate
declaration).


Ah. But this is then more important than I initially thought it was. You
are
saying that this is the case:

cdef int i = 0
with nogil:
for i in prange(n):
...
print i # garbage when n == 0?


I think it may be, depending on the implementation. With libgomp it
return 0. With the check it should also return 0.


It would be in the interest of less semantic differences w.r.t. range to
deal better with this case.

Will it silence the warning if we make "i" firstprivate as well as
lastprivate? firstprivate would only affect the case of zero iterations,
since we overwrite with NaN if the loop is entered...


Well, it wouldn't be NaN, it would be start + step * temp :) But, yes,


Doh.


that works. So we need both the check and an initialization in there:

if (nsteps>0) {
 i = 0;
 #pragma omp parallel for firstprivate(i) lastprivate(i)
 for (temp = 0; ...; ...) ...
}


Why do you need the if-test? Won't simply

#pragma omp parallel for firstprivate(i) lastprivate(i)
for (temp = 0; ...; ...) ...

do the job -- any initial value will be copied into all threads, including
the "last" thread, even if there are no iterations?


It will, but you don't expect your iteration variable to change with
zero iterations.


Look.

i = 42
for i in prange(n):
    f(i)
print i # want 42 whenever n == 0

Now, translate this to:

i = 42;
#pragma omp parallel for firstprivate(i) lastprivate(i)
for (temp = 0; ...; ...) {
    i = ...
}
#pragma omp parallel end
/* At this point, i == 42 if n == 0 */

Am I missing something?

DS


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 13:45, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 01:41 PM, mark florisson wrote:
>>
>> On 4 May 2011 13:39, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> On 05/04/2011 01:30 PM, mark florisson wrote:

 On 4 May 2011 13:15, Dag Sverre Seljebotn
  wrote:
>
> On 05/04/2011 12:59 PM, mark florisson wrote:
>>
>> On 4 May 2011 12:45, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> On 05/04/2011 12:00 PM, mark florisson wrote:

 There are two remaining issue. The first is warnings for potentially
 uninitialized variables for prange(). When you do

 for i in prange(start, stop, step): ...

 it generates code like

 nsteps = (stop - start) / step;
 #pragma omp parallel for lastprivate(i)
 for (temp = 0; temp<        nsteps; temp++) {
     i = start + temp * step;
     ...
 }

 So here it will complain about 'i' being potentially uninitialized,
 as
 it might not be assigned to in the loop. However, simply assigning 0
 to 'i' can't work either, as you expect zero iterations not to touch
 it. So for now, we have a bunch of warnings, as I don't see a
 __attribute__ to suppress it selectively.
>>>
>>> Isn't this is orthogonal to OpenMP -- even if it said "range", your
>>> testcase
>>> could get such a warning? If so, the fix is simply to initialize i in
>>> your
>>> testcase code.
>>
>> No, the problem is that 'i' needs to be lastprivate, and 'i' is
>> assigned to in the loop body. It's irrelevant whether 'i' is assigned
>> to before the loop. I think this is the case because the spec says
>> that lastprivate variables will get the value of the private variable
>> of the last sequential iteration, but it cannot at compile time know
>> whether there might be zero iterations, which I believe the spec
>> doesn't have anything to say about. So basically we could guard
>> against it by checking if nsteps>      0, but the compiler doesn't
>> detect
>> this, so it will still issue a warning even if 'i' is initialized (the
>> warning is at the place of the lastprivate declaration).
>
> Ah. But this is then more important than I initially thought it was.
> You
> are
> saying that this is the case:
>
> cdef int i = 0
> with nogil:
>    for i in prange(n):
>        ...
> print i # garbage when n == 0?

 I think it may be, depending on the implementation. With libgomp it
 return 0. With the check it should also return 0.

> It would be in the interest of less semantic differences w.r.t. range
> to
> deal better with this case.
>
> Will it silence the warning if we make "i" firstprivate as well as
> lastprivate? firstprivate would only affect the case of zero
> iterations,
> since we overwrite with NaN if the loop is entered...

 Well, it wouldn't be NaN, it would be start + step * temp :) But, yes,
>>>
>>> Doh.
>>>
 that works. So we need both the check and an initialization in there:

 if (nsteps>    0) {
     i = 0;
     #pragma omp parallel for firstprivate(i) lastprivate(i)
     for (temp = 0; ...; ...) ...
 }
>>>
>>> Why do you need the if-test? Won't simply
>>>
>>> #pragma omp parallel for firstprivate(i) lastprivate(i)
>>> for (temp = 0; ...; ...) ...
>>>
>>> do the job -- any initial value will be copied into all threads,
>>> including
>>> the "last" thread, even if there are no iterations?
>>
>> It will, but you don't expect your iteration variable to change with
>> zero iterations.
>
> Look.
>
> i = 42
> for i in prange(n):
>    f(i)
> print i # want 42 whenever n == 0
>
> Now, translate this to:
>
> i = 42;
> #pragma omp parallel for firstprivate(i) lastprivate(i)
> for (temp = 0; ...; ...) {
>    i = ...
> }
> #pragma omp parallel end
> /* At this point, i == 42 if n == 0 */
>
> Am I missing something?

Yes, 'i' may be uninitialized with nsteps > 0 (this should be valid
code). So if nsteps > 0, we need to initialize 'i' to something to get
correct behaviour with firstprivate.


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 13:47, mark florisson  wrote:
> On 4 May 2011 13:45, Dag Sverre Seljebotn  wrote:
>> On 05/04/2011 01:41 PM, mark florisson wrote:
>>>
>>> On 4 May 2011 13:39, Dag Sverre Seljebotn
>>>  wrote:

 On 05/04/2011 01:30 PM, mark florisson wrote:
>
> On 4 May 2011 13:15, Dag Sverre Seljebotn
>  wrote:
>>
>> On 05/04/2011 12:59 PM, mark florisson wrote:
>>>
>>> On 4 May 2011 12:45, Dag Sverre Seljebotn
>>>  wrote:

 On 05/04/2011 12:00 PM, mark florisson wrote:
>
> There are two remaining issue. The first is warnings for potentially
> uninitialized variables for prange(). When you do
>
> for i in prange(start, stop, step): ...
>
> it generates code like
>
> nsteps = (stop - start) / step;
> #pragma omp parallel for lastprivate(i)
> for (temp = 0; temp<        nsteps; temp++) {
>     i = start + temp * step;
>     ...
> }
>
> So here it will complain about 'i' being potentially uninitialized,
> as
> it might not be assigned to in the loop. However, simply assigning 0
> to 'i' can't work either, as you expect zero iterations not to touch
> it. So for now, we have a bunch of warnings, as I don't see a
> __attribute__ to suppress it selectively.

 Isn't this is orthogonal to OpenMP -- even if it said "range", your
 testcase
 could get such a warning? If so, the fix is simply to initialize i in
 your
 testcase code.
>>>
>>> No, the problem is that 'i' needs to be lastprivate, and 'i' is
>>> assigned to in the loop body. It's irrelevant whether 'i' is assigned
>>> to before the loop. I think this is the case because the spec says
>>> that lastprivate variables will get the value of the private variable
>>> of the last sequential iteration, but it cannot at compile time know
>>> whether there might be zero iterations, which I believe the spec
>>> doesn't have anything to say about. So basically we could guard
>>> against it by checking if nsteps>      0, but the compiler doesn't
>>> detect
>>> this, so it will still issue a warning even if 'i' is initialized (the
>>> warning is at the place of the lastprivate declaration).
>>
>> Ah. But this is then more important than I initially thought it was.
>> You
>> are
>> saying that this is the case:
>>
>> cdef int i = 0
>> with nogil:
>>    for i in prange(n):
>>        ...
>> print i # garbage when n == 0?
>
> I think it may be, depending on the implementation. With libgomp it
> return 0. With the check it should also return 0.
>
>> It would be in the interest of less semantic differences w.r.t. range
>> to
>> deal better with this case.
>>
>> Will it silence the warning if we make "i" firstprivate as well as
>> lastprivate? firstprivate would only affect the case of zero
>> iterations,
>> since we overwrite with NaN if the loop is entered...
>
> Well, it wouldn't be NaN, it would be start + step * temp :) But, yes,

 Doh.

> that works. So we need both the check and an initialization in there:
>
> if (nsteps>    0) {
>     i = 0;
>     #pragma omp parallel for firstprivate(i) lastprivate(i)
>     for (temp = 0; ...; ...) ...
> }

 Why do you need the if-test? Won't simply

 #pragma omp parallel for firstprivate(i) lastprivate(i)
 for (temp = 0; ...; ...) ...

 do the job -- any initial value will be copied into all threads,
 including
 the "last" thread, even if there are no iterations?
>>>
>>> It will, but you don't expect your iteration variable to change with
>>> zero iterations.
>>
>> Look.
>>
>> i = 42
>> for i in prange(n):
>>    f(i)
>> print i # want 42 whenever n == 0
>>
>> Now, translate this to:
>>
>> i = 42;
>> #pragma omp parallel for firstprivate(i) lastprivate(i)
>> for (temp = 0; ...; ...) {
>>    i = ...
>> }
>> #pragma omp parallel end
>> /* At this point, i == 42 if n == 0 */
>>
>> Am I missing something?
>
> Yes, 'i' may be uninitialized with nsteps > 0 (this should be valid
> code). So if nsteps > 0, we need to initialize 'i' to something to get
> correct behaviour with firstprivate.
>
 And of course, if you initialize 'i' unconditionally, you change 'i'
whereas you might have to leave it unaffected.


Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 01:48 PM, mark florisson wrote:

On 4 May 2011 13:47, mark florisson  wrote:

On 4 May 2011 13:45, Dag Sverre Seljebotn  wrote:



Look.

i = 42
for i in prange(n):
f(i)
print i # want 42 whenever n == 0

Now, translate this to:

i = 42;
#pragma omp parallel for firstprivate(i) lastprivate(i)
for (temp = 0; ...; ...) {
i = ...
}
#pragma omp parallel end
/* At this point, i == 42 if n == 0 */

Am I missing something?


Yes, 'i' may be uninitialized with nsteps>  0 (this should be valid
code). So if nsteps>  0, we need to initialize 'i' to something to get
correct behaviour with firstprivate.


This I don't see. I think I need to be spoon-fed on this one.


  And of course, if you initialize 'i' unconditionally, you change 'i'
whereas you might have to leave it unaffected.


This I see.

Dag Sverre


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 13:54, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 01:48 PM, mark florisson wrote:
>>
>> On 4 May 2011 13:47, mark florisson  wrote:
>>>
>>> On 4 May 2011 13:45, Dag Sverre Seljebotn
>>>  wrote:
>
 Look.

 i = 42
 for i in prange(n):
    f(i)
 print i # want 42 whenever n == 0

 Now, translate this to:

 i = 42;
 #pragma omp parallel for firstprivate(i) lastprivate(i)
 for (temp = 0; ...; ...) {
    i = ...
 }
 #pragma omp parallel end
 /* At this point, i == 42 if n == 0 */

 Am I missing something?
>>>
>>> Yes, 'i' may be uninitialized with nsteps>  0 (this should be valid
>>> code). So if nsteps>  0, we need to initialize 'i' to something to get
>>> correct behaviour with firstprivate.
>
> This I don't see. I think I need to be spoon-fed on this one.

So assume this code

cdef int i

for i in prange(10): ...

Now if we transform this without the guard we get

int i;

#pragma omp parallel for firstprivate(i) lastprivate(i)
for (...) { ...}

This is invalid C code, but valid Cython code. So we need to
initialize 'i', but then we get our "leave it unaffected for 0
iterations" paradox. So we need a guard.

>>  And of course, if you initialize 'i' unconditionally, you change 'i'
>> whereas you might have to leave it unaffected.
>
> This I see.
>
> Dag Sverre


Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 01:59 PM, mark florisson wrote:

On 4 May 2011 13:54, Dag Sverre Seljebotn  wrote:

On 05/04/2011 01:48 PM, mark florisson wrote:


On 4 May 2011 13:47, mark florissonwrote:


On 4 May 2011 13:45, Dag Sverre Seljebotn
  wrote:



Look.

i = 42
for i in prange(n):
f(i)
print i # want 42 whenever n == 0

Now, translate this to:

i = 42;
#pragma omp parallel for firstprivate(i) lastprivate(i)
for (temp = 0; ...; ...) {
i = ...
}
#pragma omp parallel end
/* At this point, i == 42 if n == 0 */

Am I missing something?


Yes, 'i' may be uninitialized with nsteps>0 (this should be valid
code). So if nsteps>0, we need to initialize 'i' to something to get
correct behaviour with firstprivate.


This I don't see. I think I need to be spoon-fed on this one.


So assume this code

cdef int i

for i in prange(10): ...

Now if we transform this without the guard we get

int i;

#pragma omp parallel for firstprivate(i) lastprivate(i)
for (...) { ...}

This is invalid C code, but valid Cython code. So we need to
initialize 'i', but then we get our "leave it unaffected for 0
iterations" paradox. So we need a guard.


You mean C code won't compile if i is firstprivate and not initialized? 
(Sorry, I'm not aware of such things.)


My first instinct is to initialize i to 0xbadabada. After all, its value 
is not specified -- we're not violating any Cython specs by initializing 
it to garbage ourselves.


OTOH, I see that your approach with an if-test is more 
Valgrind-friendly, so I'm OK with that.


Would it work to do

if (nsteps > 0) {
    #pragma omp parallel
    i = 0;
    #pragma omp for lastprivate(i)
    for (temp = 0; ...) ...
    ...
}

instead, to get rid of the warning without using a firstprivate? Not 
sure if there's an efficiency difference here, I suppose a good C 
compiler could compile them to the same thing.


Dag Sverre


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 14:10, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 01:59 PM, mark florisson wrote:
>>
>> On 4 May 2011 13:54, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> On 05/04/2011 01:48 PM, mark florisson wrote:

 On 4 May 2011 13:47, mark florisson    wrote:
>
> On 4 May 2011 13:45, Dag Sverre Seljebotn
>  wrote:
>>>
>> Look.
>>
>> i = 42
>> for i in prange(n):
>>    f(i)
>> print i # want 42 whenever n == 0
>>
>> Now, translate this to:
>>
>> i = 42;
>> #pragma omp parallel for firstprivate(i) lastprivate(i)
>> for (temp = 0; ...; ...) {
>>    i = ...
>> }
>> #pragma omp parallel end
>> /* At this point, i == 42 if n == 0 */
>>
>> Am I missing something?
>
> Yes, 'i' may be uninitialized with nsteps>    0 (this should be valid
> code). So if nsteps>    0, we need to initialize 'i' to something to
> get
> correct behaviour with firstprivate.
>>>
>>> This I don't see. I think I need to be spoon-fed on this one.
>>
>> So assume this code
>>
>> cdef int i
>>
>> for i in prange(10): ...
>>
>> Now if we transform this without the guard we get
>>
>> int i;
>>
>> #pragma omp parallel for firstprivate(i) lastprivate(i)
>> for (...) { ...}
>>
>> This is invalid C code, but valid Cython code. So we need to
>> initialize 'i', but then we get our "leave it unaffected for 0
>> iterations" paradox. So we need a guard.
>
> You mean C code won't compile if i is firstprivate and not initialized?
> (Sorry, I'm not aware of such things.)

It will compile and warn, but it is technically invalid, as you're
reading an uninitialized variable, which has undefined behavior. If
e.g. the variable contains a trap representation on a certain
architecture, it might halt the program (I'm not sure which
architecture that would be, but I believe they exist).

> My first instinct is to initialize i to 0xbadabada. After all, its value is
> not specified -- we're not violating any Cython specs by initializing it to
> garbage ourselves.

The problem is that we don't know whether the user has initialized the
variable. So if we want firstprivate to suppress warnings, we should
assume that the user hasn't and do it ourselves.

> OTOH, I see that your approach with an if-test is more Valgrind-friendly, so
> I'm OK with that.
>
> Would it work to do
>
> if (nsteps > 0) {
>    #pragma omp parallel
>    i = 0;
>    #pragma omp for lastprivate(i)
>    for (temp = 0; ...) ...
>    ...
> }

I'm assuming you mean #pragma omp parallel private(i), otherwise you
have a race (I'm not sure how much that matters for assignment). In
any case, with the private() clause 'i' would be uninitialized
afterwards. In either case it won't do anything useful.

> instead, to get rid of the warning without using a firstprivate? Not sure if
> there's an efficiency difference here, I suppose a good C compiler could
> compile them to the same thing.
>
> Dag Sverre


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 14:17, mark florisson  wrote:
> On 4 May 2011 14:10, Dag Sverre Seljebotn  wrote:
>> On 05/04/2011 01:59 PM, mark florisson wrote:
>>>
>>> On 4 May 2011 13:54, Dag Sverre Seljebotn
>>>  wrote:

 On 05/04/2011 01:48 PM, mark florisson wrote:
>
> On 4 May 2011 13:47, mark florisson    wrote:
>>
>> On 4 May 2011 13:45, Dag Sverre Seljebotn
>>  wrote:

>>> Look.
>>>
>>> i = 42
>>> for i in prange(n):
>>>    f(i)
>>> print i # want 42 whenever n == 0
>>>
>>> Now, translate this to:
>>>
>>> i = 42;
>>> #pragma omp parallel for firstprivate(i) lastprivate(i)
>>> for (temp = 0; ...; ...) {
>>>    i = ...
>>> }
>>> #pragma omp parallel end
>>> /* At this point, i == 42 if n == 0 */
>>>
>>> Am I missing something?
>>
>> Yes, 'i' may be uninitialized with nsteps>    0 (this should be valid
>> code). So if nsteps>    0, we need to initialize 'i' to something to
>> get
>> correct behaviour with firstprivate.

 This I don't see. I think I need to be spoon-fed on this one.
>>>
>>> So assume this code
>>>
>>> cdef int i
>>>
>>> for i in prange(10): ...
>>>
>>> Now if we transform this without the guard we get
>>>
>>> int i;
>>>
>>> #pragma omp parallel for firstprivate(i) lastprivate(i)
>>> for (...) { ...}
>>>
>>> This is invalid C code, but valid Cython code. So we need to
>>> initialize 'i', but then we get our "leave it unaffected for 0
>>> iterations" paradox. So we need a guard.
>>
>> You mean C code won't compile if i is firstprivate and not initialized?
>> (Sorry, I'm not aware of such things.)
>
> It will compile and warn, but it is technically invalid, as you're
> reading an uninitialized variable, which has undefined behavior. If
> e.g. the variable contains a trap representation on a certain
> architecture, it might halt the program (I'm not sure which
> architecture that would be, but I believe they exist).
>
>> My first instinct is to initialize i to 0xbadabada. After all, its value is
>> not specified -- we're not violating any Cython specs by initializing it to
>> garbage ourselves.
>
> The problem is that we don't know whether the user has initialized the
> variable. So if we want firstprivate to suppress warnings, we should
> assume that the user hasn't and do it ourselves.

The alternative would be to give 'cdef int i' initialized semantics,
to whatever value we please. So instead of generating 'int i;' code,
we could always generate 'int i = ...;'. But currently we don't do
that.

>> OTOH, I see that your approach with an if-test is more Valgrind-friendly, so
>> I'm OK with that.
>>
>> Would it work to do
>>
>> if (nsteps > 0) {
>>    #pragma omp parallel
>>    i = 0;
>>    #pragma omp for lastprivate(i)
>>    for (temp = 0; ...) ...
>>    ...
>> }
>
> I'm assuming you mean #pragma omp parallel private(i), otherwise you
> have a race (I'm not sure how much that matters for assignment). In
> any case, with the private() clause 'i' would be uninitialized
> afterwards. In either case it won't do anything useful.
>
>> instead, to get rid of the warning without using a firstprivate? Not sure if
>> there's an efficiency difference here, I suppose a good C compiler could
>> compile them to the same thing.
>>
>> Dag Sverre


Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 02:17 PM, mark florisson wrote:

On 4 May 2011 14:10, Dag Sverre Seljebotn  wrote:

On 05/04/2011 01:59 PM, mark florisson wrote:


On 4 May 2011 13:54, Dag Sverre Seljebotn
  wrote:


On 05/04/2011 01:48 PM, mark florisson wrote:


On 4 May 2011 13:47, mark florisson  wrote:


On 4 May 2011 13:45, Dag Sverre Seljebotn
  wrote:



Look.

i = 42
for i in prange(n):
f(i)
print i # want 42 whenever n == 0

Now, translate this to:

i = 42;
#pragma omp parallel for firstprivate(i) lastprivate(i)
for (temp = 0; ...; ...) {
i = ...
}
#pragma omp parallel end
/* At this point, i == 42 if n == 0 */

Am I missing something?


Yes, 'i' may be uninitialized with nsteps>  0 (this should be valid
code). So if nsteps>  0, we need to initialize 'i' to something to
get
correct behaviour with firstprivate.


This I don't see. I think I need to be spoon-fed on this one.


So assume this code

cdef int i

for i in prange(10): ...

Now if we transform this without the guard we get

int i;

#pragma omp parallel for firstprivate(i) lastprivate(i)
for (...) { ...}

This is invalid C code, but valid Cython code. So we need to
initialize 'i', but then we get our "leave it unaffected for 0
iterations" paradox. So we need a guard.


You mean C code won't compile if i is firstprivate and not initialized?
(Sorry, I'm not aware of such things.)


It will compile and warn, but it is technically invalid, as you're
reading an uninitialized variable, which has undefined behavior. If
e.g. the variable contains a trap representation on a certain
architecture, it might halt the program (I'm not sure which
architecture that would be, but I believe they exist).


My first instinct is to initialize i to 0xbadabada. After all, its value is
not specified -- we're not violating any Cython specs by initializing it to
garbage ourselves.


The problem is that we don't know whether the user has initialized the
variable. So if we want firstprivate to suppress warnings, we should
assume that the user hasn't and do it ourselves.


I meant that if we don't care about Valgrindability, we can initialize i 
at the top of our function (i.e. where it says "int __pyx_v_i").



OTOH, I see that your approach with an if-test is more Valgrind-friendly, so
I'm OK with that.

Would it work to do

if (nsteps > 0) {
#pragma omp parallel
i = 0;
#pragma omp for lastprivate(i)
for (temp = 0; ...) ...
...
}


I'm assuming you mean #pragma omp parallel private(i), otherwise you
have a race (I'm not sure how much that matters for assignment). In
any case, with the private() clause 'i' would be uninitialized
afterwards. In either case it won't do anything useful.


Sorry, I meant that lastprivate(i) should go on the parallel line.

if (nsteps > 0) {
#pragma omp parallel lastprivate(i)
i = 0;
#pragma omp for
for (temp = 0; ...) ...
...
}

won't this silence the warning? At any rate, it's obvious you have a 
better handle on this than me, so I'll shut up now and leave you to it :-)


Dag Sverre
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 14:23, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 02:17 PM, mark florisson wrote:
>>
>> On 4 May 2011 14:10, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> On 05/04/2011 01:59 PM, mark florisson wrote:

 On 4 May 2011 13:54, Dag Sverre Seljebotn
  wrote:
>
> On 05/04/2011 01:48 PM, mark florisson wrote:
>>
>> On 4 May 2011 13:47, mark florisson
>>  wrote:
>>>
>>> On 4 May 2011 13:45, Dag Sverre Seljebotn
>>>  wrote:
>
 Look.

 i = 42
 for i in prange(n):
    f(i)
 print i # want 42 whenever n == 0

 Now, translate this to:

 i = 42;
 #pragma omp parallel for firstprivate(i) lastprivate(i)
 for (temp = 0; ...; ...) {
    i = ...
 }
 #pragma omp parallel end
 /* At this point, i == 42 if n == 0 */

 Am I missing something?
>>>
>>> Yes, 'i' may be uninitialized with nsteps>      0 (this should be
>>> valid
>>> code). So if nsteps>      0, we need to initialize 'i' to something
>>> to
>>> get
>>> correct behaviour with firstprivate.
>
> This I don't see. I think I need to be spoon-fed on this one.

 So assume this code

 cdef int i

 for i in prange(10): ...

 Now if we transform this without the guard we get

 int i;

 #pragma omp parallel for firstprivate(i) lastprivate(i)
 for (...) { ...}

 This is invalid C code, but valid Cython code. So we need to
 initialize 'i', but then we get our "leave it unaffected for 0
 iterations" paradox. So we need a guard.
>>>
>>> You mean C code won't compile if i is firstprivate and not initialized?
>>> (Sorry, I'm not aware of such things.)
>>
>> It will compile and warn, but it is technically invalid, as you're
>> reading an uninitialized variable, which has undefined behavior. If
>> e.g. the variable contains a trap representation on a certain
>> architecture, it might halt the program (I'm not sure which
>> architecture that would be, but I believe they exist).
>>
>>> My first instinct is to initialize i to 0xbadabada. After all, its value
>>> is
>>> not specified -- we're not violating any Cython specs by initializing it
>>> to
>>> garbage ourselves.
>>
>> The problem is that we don't know whether the user has initialized the
>> variable. So if we want firstprivate to suppress warnings, we should
>> assume that the user hasn't and do it ourselves.
>
> I meant that if we don't care about Valgrindability, we can initialize i at
> the top of our function (i.e. where it says "int __pyx_v_i").

Indeed, but as the current semantics don't do this, I think we also
shouldn't. The good thing is that if we don't do it, the user will see
warnings from the C compiler if the variable is used uninitialized.

>>> OTOH, I see that your approach with an if-test is more Valgrind-friendly,
>>> so
>>> I'm OK with that.
>>>
>>> Would it work to do
>>>
>>> if (nsteps > 0) {
>>>    #pragma omp parallel
>>>    i = 0;
>>>    #pragma omp for lastprivate(i)
>>>    for (temp = 0; ...) ...
>>>    ...
>>> }
>>
>> I'm assuming you mean #pragma omp parallel private(i), otherwise you
>> have a race (I'm not sure how much that matters for assignment). In
>> any case, with the private() clause 'i' would be uninitialized
>> afterwards. In either case it won't do anything useful.
>
> Sorry, I meant that lastprivate(i) should go on the parallel line.
>
> if (nsteps > 0) {
>    #pragma omp parallel lastprivate(i)
>    i = 0;
>    #pragma omp for
>    for (temp = 0; ...) ...
>    ...
> }
>
> won't this silence the warning? At any rate, it's obvious you have a better
> handle on this than me, so I'll shut up now and leave you to it :-)

lastprivate() is not valid on a plain parallel construct, as it's not
a loop. Clauses like private() and shared() apply there, but not
lastprivate().
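
For reference, a minimal sketch of a placement OpenMP does accept (made-up
names, not the generated code):

int last_index(int nsteps)
{
    int i = -1, temp;

    /* lastprivate() is not allowed on the parallel construct itself... */
    #pragma omp parallel
    {
        /* ...but it is allowed on the worksharing loop inside it */
        #pragma omp for lastprivate(i)
        for (temp = 0; temp < nsteps; temp++)
            i = temp;
    }
    return i;   /* value written back from the sequentially last iteration */
}
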
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn
Moving pull request discussion 
(https://github.com/cython/cython/pull/28) over here:


First, I got curious why you'd strip off "-pthread" from CC. I'd 
think you could just execute it with "-pthread", which seems simpler.


Second: If parallel.parallel is not callable, how are scheduling 
parameters for parallel blocks handled? Is there a reason to not support 
that? Do you think it should stay this way, or will parallel take 
parameters in the future?


Dag Sverre
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 18:35, Dag Sverre Seljebotn  wrote:
> Moving pull request discussion (https://github.com/cython/cython/pull/28)
> over here:
>
> First, I got curious why you'd strip off "-pthread" from CC. I'd
> think you could just execute it with "-pthread", which seems simpler.

It needs to end up in a list of arguments, and it's not needed at all
as I only need the version. I guess I could do (cc + " -v").split()
but eh.

> Second: If parallel.parallel is not callable, how are scheduling parameters
> for parallel blocks handled? Is there a reason to not support that? Do you
> think it should stay this way, or will parallel take parameters in the
> future?

Well, as I mentioned a while back, you cannot schedule parallel
blocks; there is no worksharing involved. All a parallel block does is
execute a code block in however many threads there are available. The
scheduling parameters are valid for a worksharing for loop only, as
you schedule (read "distribute") the work among the threads.
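
To make that concrete, a sketch (do_work() and the chunk size are made up):

void do_work(int i);   /* made-up work function */

void scheduled_loop(int n)
{
    int i;

    /* the bare parallel block has nothing to schedule... */
    #pragma omp parallel
    {
        /* ...the schedule() clause distributes iterations of the loop */
        #pragma omp for schedule(dynamic, 4)
        for (i = 0; i < n; i++)
            do_work(i);
    }
}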

> Dag Sverre
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
>
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 07:03 PM, mark florisson wrote:

On 4 May 2011 18:35, Dag Sverre Seljebotn  wrote:

Moving pull request discussion (https://github.com/cython/cython/pull/28)
over here:

First, I got curious why you'd strip off "-pthread" from CC. I'd
think you could just execute it with "-pthread", which seems simpler.


It needs to end up in a list of arguments, and it's not needed at all
as I only need the version. I guess I could do (cc + " -v").split()
but eh.


OK, that's reassuring, thought perhaps you had encountered a strange gcc 
strain.





Second: If parallel.parallel is not callable, how are scheduling parameters
for parallel blocks handled? Is there a reason to not support that? Do you
think it should stay this way, or will parallel take parameters in the
future?


Well, as I mentioned a while back, you cannot schedule parallel
blocks; there is no worksharing involved. All a parallel block does is
execute a code block in however many threads there are available. The
scheduling parameters are valid for a worksharing for loop only, as
you schedule (read "distribute") the work among the threads.


Perhaps I used the wrong terms; but checking the specs, I guess I meant 
"num_threads", which definitely applies to parallel.


Dag Sverre
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 19:44, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 07:03 PM, mark florisson wrote:
>>
>> On 4 May 2011 18:35, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> Moving pull request discussion
>>> (https://github.com/cython/cython/pull/28)
>>> over here:
>>>
>>> First, I got curious why you'd strip off "-pthread" from CC. I'd
>>> think you could just execute it with "-pthread", which seems
>>> simpler.
>>
>> It needs to end up in a list of arguments, and it's not needed at all
>> as I only need the version. I guess I could do (cc + " -v").split()
>> but eh.
>
> OK, that's reassuring, thought perhaps you had encountered a strange gcc
> strain.
>
>>
>>> Second: If parallel.parallel is not callable, how are scheduling
>>> parameters
>>> for parallel blocks handled? Is there a reason to not support that? Do
>>> you
>>> think it should stay this way, or will parallel take parameters in the
>>> future?
>>
>> Well, as I mentioned a while back, you cannot schedule parallel
>> blocks; there is no worksharing involved. All a parallel block does is
>> execute a code block in however many threads there are available. The
>> scheduling parameters are valid for a worksharing for loop only, as
>> you schedule (read "distribute") the work among the threads.
>
> Perhaps I used the wrong terms; but checking the specs, I guess I meant
> "num_threads", which definitely applies to parallel.

Ah, that level of scheduling :) Right, so it doesn't take that, but I
don't think it's a big issue. If dynamic scheduling is enabled, it's
only a suggestion; if dynamic scheduling is disabled (whether it's
turned on or off by default is implementation defined) it will give
the number of threads requested, if available.
The user can still use omp_set_num_threads(), although admittedly that
modifies a global setting.
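
In plain OpenMP C the two options look roughly like this (sketch; the
thread counts are arbitrary):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    /* global setting: affects every later parallel region */
    omp_set_num_threads(4);

    /* per-region clause: overrides the global setting for this region only */
    #pragma omp parallel num_threads(2)
    printf("thread %d of %d\n",
           omp_get_thread_num(), omp_get_num_threads());

    return 0;
}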

> Dag Sverre
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
>
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Fused Types

2011-05-04 Thread Robert Bradshaw
On Wed, May 4, 2011 at 1:47 AM, mark florisson
 wrote:
> On 4 May 2011 10:24, Dag Sverre Seljebotn  wrote:
>> On 05/04/2011 01:07 AM, Greg Ewing wrote:
>>>
>>> mark florisson wrote:
>>>
 cdef func(floating x, floating y):
 ...

 you get a "float, float" version, and a "double, double" version, but
 not "float, double" or "double, float".
>>>
>>> It's hard to draw conclusions from this example because
>>> it's degenerate. You don't really need multiple versions of a
>>> function like that, because of float <-> double coercions.
>>>
>>> A more telling example might be
>>>
>>> cdef double dot_product(floating *u, floating *v, int length)
>>>
>>> By your current rules, this would give you one version that
>>> takes two float vectors, and another that takes two double
>>> vectors.
>>>
>>> But if you want to find the dot product of a float vector and
>>> a double vector, you're out of luck.
>>
>> First, I'm open for your proposed syntax too...But in the interest of seeing
>> how we got here:
>>
>> The argument to the above goes that you *should* be out of luck. For
>> instance, talking about dot products, BLAS itself has float-float and
>> double-double, but not float-double AFAIK.
>>
>> What you are saying is that this does not have the full power of C++ templates.
>> And the answer is that yes, this does not have the full power of C++
>> templates.
>>
>> At the same time we discussed this, we also discussed better support for
>> string-based templating languages (so that, e.g., compilation error messages
>> could refer to the template file). The two are complementary.
>>
>> Going back to Greg's syntax: What I don't like is that it makes the simple
>> unambiguous cases, where this would actually be used in real life, less
>> readable.
>>
>> Would it be too complicated to have both? For instance;
>>
>>  i) You are allowed to use a *single* fused_type on a *function* without
>> declaration.
>>
>> def f(floating x, floating *y): # ok
>>
>> Turns into
>>
>> def f[floating T](T x, T *y):
>>
>> This is NOT ok:
>>
>> def f(floating x, integral y):
>> # ERROR: Please explicitly declare fused types inside []
>>
>>  ii) Using more than one fused type, or using it on a cdef class or struct,
>> you need to use the [] declaration.
>>
>
> I don't think it would be too complicated, but as you mention it's
> probably not a very likely case, and if the user does need it, a new
> (equivalent) fused type can be created. The current way reads a lot
> nicer than the indexed one in my opinion. So I'd be fine with
> implementing it, but I find the current way more elegant.

I was actually thinking of exactly the same thing--supporting syntax
(i) for the case of a single type parameter, but the drawback is the
introduction of two distinct syntaxes for essentially the same
feature. Something like this is necessary to give an ordering to the
types for structs and classes, or when a fused type is used for
intermediate results but not in the argument list. I really like the
elegance of the (much more common) single-parameter variant.

Another option is using the with syntax, which was also considered for
supporting C++ templates.

>> Finally: It is a bit uncomfortable that we seem to be hashing things out
>> even as Mark is implementing this. Would it be feasible to have a Skype
>> session sometimes this week where everybody interested in the outcome of
>> this come together for an hour and actually decide on something?
>>
>> Mark: How much does this discussion of syntax impact your development? Are
>> you able to treat them just as polish on top and work on the "engine"
>> undisturbed by this?
>
> Thanks for your consideration, I admit it feels a bit uncomfortable
> :) But at least this change shouldn't have such a big impact on the
> code, it would mean some changes in a select few places, so it's
> definitely polish. In any event, before we settle on this, I'd like to
> do the cpdef support first and work on indexing from Python space, so
> I think we have enough time to settle this argument on the ML.
> Before that, I'm just going to finish up a pull request for the
> OpenMP branch; I'd like to see if I can get rid of some warnings.

Yes, please feel free to focus on the back end and move on to other
things while the syntax is still in limbo, rather than implementing
every whim of the mailing list :).

- Robert
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] jenkins problems

2011-05-04 Thread Yury V. Zaytsev
On Wed, 2011-05-04 at 10:35 +0400, Vitja Makarov wrote: 
> > Can you please provide me jenkins account and I'll try to fix the issues 
> > myself?
> >
> 
> It's better to use:
> 
> $ git fetch origin
> $ git checkout -f origin/master
> 
> Instead of git pull

Or

$ git fetch origin
$ git reset --hard origin/master

which is what we used for our buildbot.

-- 
Sincerely yours,
Yury V. Zaytsev


___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] jenkins problems

2011-05-04 Thread Vitja Makarov
2011/5/4 Yury V. Zaytsev :
> On Wed, 2011-05-04 at 10:35 +0400, Vitja Makarov wrote:
>> > Can you please provide me jenkins account and I'll try to fix the issues 
>> > myself?
>> >
>>
>> It's better to use:
>>
>> $ git fetch origin
>> $ git checkout -f origin/master
>>
>> Instead of git pull
>
> Or
>
> $ git fetch origin
> $ git reset --hard origin/master
>
> which is what we used for our buildbot.
>
> --
> Sincerely yours,
> Yury V. Zaytsev
>

Thanks! Am I right: when you do reset '--hard origin/master' you are
on the master branch and when you do checkout you are in a 'detached
state'?

But it seems to me that the problem is somewhere in the jenkins configuration.


-- 
vitja.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] jenkins problems

2011-05-04 Thread Yury V. Zaytsev
On Wed, 2011-05-04 at 22:43 +0400, Vitja Makarov wrote:

> Thanks! Am I right: when you do reset '--hard origin/master' you are
> on the master branch and when you do checkout you are in a 'detached
> state'?

Yes, I think that you are right, that's why we used to do reset instead:

$ git fetch origin
$ git checkout master
$ git reset --hard origin/master

By the way, you can also do

$ git clean -dfx

to make sure that EVERYTHING that doesn't belong to the tree is plainly
wiped out (don't do that on your real checkouts unless you definitely
have nothing to lose).

> But it seems to me that the problem is somewhere in the jenkins configuration.

I didn't mean to say that there's no problem with Jenkins, just wanted
to suggest a possibly better way of updating the CI checkout :-)

-- 
Sincerely yours,
Yury V. Zaytsev


___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] jenkins problems

2011-05-04 Thread Stefan Behnel

Vitja Makarov, 04.05.2011 07:09:

Jenkins doesn't work for me. It seems that it can't do pull and is
running tests against obsolete sources.
Maybe because of forced push.

There are only 6 errors here:
https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/


According to the build logs of the sdist job, it was previously checking 
out the "master" branch and it seems you reconfigured it to use the 
"unreachable_code" branch now. At least the recent checkouts have used the 
latest snapshot of the branches, so ISTM that everything is working 
correctly. Could you point me to a build where something was going wrong 
for you?


Stefan
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] prange CEP updated

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 08:07 PM, mark florisson wrote:

On 4 May 2011 19:44, Dag Sverre Seljebotn  wrote:

On 05/04/2011 07:03 PM, mark florisson wrote:


On 4 May 2011 18:35, Dag Sverre Seljebotn
  wrote:


Moving pull request discussion
(https://github.com/cython/cython/pull/28)
over here:

First, I got curious why you'd strip off "-pthread" from CC. I'd
think you could just execute it with "-pthread", which seems
simpler.


It needs to end up in a list of arguments, and it's not needed at all
as I only need the version. I guess I could do (cc + " -v").split()
but eh.


OK, that's reassuring, thought perhaps you had encountered a strange gcc
strain.




Second: If parallel.parallel is not callable, how are scheduling
parameters
for parallel blocks handled? Is there a reason to not support that? Do
you
think it should stay this way, or will parallel take parameters in the
future?


Well, as I mentioned a while back, you cannot schedule parallel
blocks; there is no worksharing involved. All a parallel block does is
execute a code block in however many threads there are available. The
scheduling parameters are valid for a worksharing for loop only, as
you schedule (read "distribute") the work among the threads.


Perhaps I used the wrong terms; but checking the specs, I guess I meant
"num_threads", which definitely applies to parallel.


Ah, that level of scheduling :) Right, so it doesn't take that, but I
don't think it's a big issue. If dynamic scheduling is enabled, it's
only a suggestion; if dynamic scheduling is disabled (whether it's
turned on or off by default is implementation defined) it will give
the number of threads requested, if available.
The user can still use omp_set_num_threads(), although admittedly that
modifies a global setting.


Hmm...I'm not completely happy about this. For now I just worry about 
not shutting off the possibility of adding thread-pool-spawning 
parameters in the future. Specifying the number of threads can be 
useful, and omp_set_num_threads is a bad way of doing it, as you say.


And other backends than OpenMP may call for something we don't yet know
about?


Anyway, all I'm asking is whether we should require trailing () on parallel:

with nogil, parallel(): ...

I think we should, to keep the window open for options. Unless, that is, 
we're OK both with and without trailing () down the line.


Dag Sverre
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] jenkins problems

2011-05-04 Thread Vitja Makarov
2011/5/4 Stefan Behnel :
> Vitja Makarov, 04.05.2011 07:09:
>>
>> Jenkins doesn't work for me. It seems that it can't do pull and is
>> running tests against obsolete sources.
>> Maybe because of forced push.
>>
>> There are only 6 errors here:
>>
>> https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/
>
> According to the build logs of the sdist job, it was previously checking out
> the "master" branch and it seems you reconfigured it to use the
> "unreachable_code" branch now. At least the recent checkouts have used the
> latest snapshot of the branches, so ISTM that everything is working
> correctly. Could you point me to a build where something was going wrong for
> you?
>
> Stefan


I've added the following line to the sdist target

+rm -fr $WORKSPACE/dist
$WORKSPACE/python/bin/python setup.py clean sdist --formats=gztar
--cython-profile --no-cython-compile

Hope that helps; that's the only difference between
cython-devel-sdist and cython-vitek-sdist.


See here:
https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/166/

You can't find w_unreachable in the logs; it seems that the Cython code
there is outdated.

-- 
vitja.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Fused Types

2011-05-04 Thread Dag Sverre Seljebotn

On 05/04/2011 08:13 PM, Robert Bradshaw wrote:

On Wed, May 4, 2011 at 1:47 AM, mark florisson
  wrote:

On 4 May 2011 10:24, Dag Sverre Seljebotn  wrote:

On 05/04/2011 01:07 AM, Greg Ewing wrote:


mark florisson wrote:


cdef func(floating x, floating y):
...

you get a "float, float" version, and a "double, double" version, but
not "float, double" or "double, float".


It's hard to draw conclusions from this example because
it's degenerate. You don't really need multiple versions of a
function like that, because of float <-> double coercions.

A more telling example might be

cdef double dot_product(floating *u, floating *v, int length)

By your current rules, this would give you one version that
takes two float vectors, and another that takes two double
vectors.

But if you want to find the dot product of a float vector and
a double vector, you're out of luck.


First, I'm open for your proposed syntax too...But in the interest of seeing
how we got here:

The argument to the above goes that you *should* be out of luck. For
instance, talking about dot products, BLAS itself has float-float and
double-double, but not float-double AFAIK.

What you are saying is that this does not have the full power of C++ templates.
And the answer is that yes, this does not have the full power of C++
templates.

At the same time we discussed this, we also discussed better support for
string-based templating languages (so that, e.g., compilation error messages
could refer to the template file). The two are complementary.

Going back to Greg's syntax: What I don't like is that it makes the simple
unambiguous cases, where this would actually be used in real life, less
readable.

Would it be too complicated to have both? For instance;

  i) You are allowed to use a *single* fused_type on a *function* without
declaration.

def f(floating x, floating *y): # ok

Turns into

def f[floating T](T x, T *y):

This is NOT ok:

def f(floating x, integral y):
# ERROR: Please explicitly declare fused types inside []

  ii) Using more than one fused type, or using it on a cdef class or struct,
you need to use the [] declaration.



I don't think it would be too complicated, but as you mention it's
probably not a very likely case, and if the user does need it, a new
(equivalent) fused type can be created. The current way reads a lot
nicer than the indexed one in my opinion. So I'd be fine with
implementing it, but I find the current way more elegant.


I was actually thinking of exactly the same thing--supporting syntax
(i) for the case of a single type parameter, but the drawback is the
introduction of two distinct syntaxes for essentially the same
feature. Something like this is necessary to give an ordering to the
types for structs and classes, or when a fused type is used for
intermediate results but not in the argument list. I really like the
elegance of the (much more common) single-parameter variant.

Another option is using the with syntax, which was also considered for
supporting C++ templates.


In particular since that will work in pure Python mode. One thing I 
worry about with the func[]()-syntax is that it is not Python compatible.


That's one thing I like about the CEP, that in time we can do

def f(x: floating) -> floating:
...

and have something that's nice in both Python and Cython.

Dag Sverre
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] jenkins problems

2011-05-04 Thread Stefan Behnel

Vitja Makarov, 04.05.2011 21:14:

2011/5/4 Stefan Behnel:

Vitja Makarov, 04.05.2011 07:09:


Jenkins doesn't work for me. It seems that it can't do pull and is
running tests against obsolete sources.
Maybe because of forced push.

There are only 6 errors here:

https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/


According to the build logs of the sdist job, it was previously checking out
the "master" branch and it seems you reconfigured it to use the
"unreachable_code" branch now. At least the recent checkouts have used the
latest snapshot of the branches, so ISTM that everything is working
correctly. Could you point me to a build where something was going wrong for
you?


I've added the following line to sdist target

+rm -fr $WORKSPACE/dist
$WORKSPACE/python/bin/python setup.py clean sdist --formats=gztar
--cython-profile --no-cython-compile

Hope that should help, that's the only difference between
cython-devel-sdist and cython-vitek-sdist.


See here:
https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/166/

You can't find w_unreachable in the logs, it seems that cython code
there is outdated.


Ah, right, that's it. You can see the problem here:

https://sage.math.washington.edu:8091/hudson/job/cython-vitek-sdist/97/artifact/dist/Cython-0.14+.tar.gz/*fingerprint*/

It's been using the 0.14+ sdist for ages instead of the 0.14.1+ one.

That could also explain why your CPython regression tests are running much 
faster than in cython-devel.


Stefan
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] prange CEP updated

2011-05-04 Thread mark florisson
On 4 May 2011 21:13, Dag Sverre Seljebotn  wrote:
> On 05/04/2011 08:07 PM, mark florisson wrote:
>>
>> On 4 May 2011 19:44, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> On 05/04/2011 07:03 PM, mark florisson wrote:

 On 4 May 2011 18:35, Dag Sverre Seljebotn
  wrote:
>
> Moving pull request discussion
> (https://github.com/cython/cython/pull/28)
> over here:
>
> First, I got curious why you'd strip off "-pthread" from CC. I'd
> think you could just execute it with "-pthread", which seems
> simpler.

 It needs to end up in a list of arguments, and it's not needed at all
 as I only need the version. I guess I could do (cc + " -v").split()
 but eh.
>>>
>>> OK, that's reassuring, thought perhaps you had encountered a strange gcc
>>> strain.
>>>

> Second: If parallel.parallel is not callable, how are scheduling
> parameters
> for parallel blocks handled? Is there a reason to not support that? Do
> you
> think it should stay this way, or will parallel take parameters in the
> future?

 Well, as I mentioned a while back, you cannot schedule parallel
 blocks, there is no worksharing involved. All a parallel block does is
 executed a code block in however many threads there are available. The
 scheduling parameters are valid for a worksharing for loop only, as
 you schedule (read "distribute") the work among the threads.
>>>
>>> Perhaps I used the wrong terms; but checking the specs, I guess I meant
>>> "num_threads", which definitely applies to parallel.
>>
>> Ah, that level of scheduling :) Right, so it doesn't take that, but I
>> don't think it's a big issue. If dynamic scheduling is enabled, it's
>> only a suggestion; if dynamic scheduling is disabled (whether it's
>> turned on or off by default is implementation defined) it will give
>> the number of threads requested, if available.
>> The user can still use omp_set_num_threads(), although admittedly that
>> modifies a global setting.
>
> Hmm...I'm not completely happy about this. For now I just worry about not
> shutting off the possibility of adding thread-pool-spawning parameters in
> the future. Specifying the number of threads can be useful, and
> omp_set_num_threads is a bad way of doing it, as you say.
>
> And other backends than OpenMP may call for something we don't yet know
> about?
>
> Anyway, all I'm asking is whether we should require trailing () on parallel:
>
> with nogil, parallel(): ...
>
> I think we should, to keep the window open for options. Unless, that is,
> we're OK both with and without trailing () down the line.

Ok, sure, that's fine with me.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] jenkins problems

2011-05-04 Thread Vitja Makarov
2011/5/4 Stefan Behnel :
> Vitja Makarov, 04.05.2011 21:14:
>>
>> 2011/5/4 Stefan Behnel:
>>>
>>> Vitja Makarov, 04.05.2011 07:09:

 Jenkins doesn't work for me. It seems that it can't do pull and is
 running tests again obsolete sources.
 May be because of forced push.

 There are only 6 errors here:


 https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/
>>>
>>> According to the build logs of the sdist job, it was previously checking
>>> out
>>> the "master" branch and it seems you reconfigured it to use the
>>> "unreachable_code" branch now. At least the recent checkouts have used
>>> the
>>> latest snapshot of the branches, so ISTM that everything is working
>>> correctly. Could you point me to a build where something was going wrong
>>> for
>>> you?
>>
>> I've added the following line to sdist target
>>
>> +rm -fr $WORKSPACE/dist
>> $WORKSPACE/python/bin/python setup.py clean sdist --formats=gztar
>> --cython-profile --no-cython-compile
>>
>> Hope that should help, that's the only difference between
>> cython-devel-sdist and cython-vitek-sdist.
>>
>>
>> See here:
>>
>> https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/166/
>>
>> You can't find w_unreachable in the logs, it seems that cython code
>> there is outdated.
>
> Ah, right, that's it. You can see the problem here:
>
> https://sage.math.washington.edu:8091/hudson/job/cython-vitek-sdist/97/artifact/dist/Cython-0.14+.tar.gz/*fingerprint*/
>
> It's been using the 0.14+ sdist for ages instead of the 0.14.1+ one.
>
> That could also explain why your CPython regression tests are running much
> faster than in cython-devel.
>

OK, so I should take a closer look at the pyregr tests.

-- 
vitja.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Fused Types

2011-05-04 Thread Greg Ewing

Dag Sverre Seljebotn wrote:

The argument to the above goes that you *should* be out of luck. For 
instance, talking about dot products, BLAS itself has float-float and 
double-double, but not float-double AFAIK.


Seems to me that's more because generating lots of versions
of a function in C is hard work, and the designers of BLAS
didn't think it was worth providing more than two versions.
If they'd had a tool that would magically generate all the
combinations for them, they might have made a different choice.
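
For reference, the single- and double-precision dot products CBLAS exposes
(declarations as in a standard cblas.h; there is no variant that mixes
float and double input vectors):

float  cblas_sdot(const int N, const float  *X, const int incX,
                  const float  *Y, const int incY);
double cblas_ddot(const int N, const double *X, const int incX,
                  const double *Y, const int incY);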

What you seem to be trying to do here is enable compile-time
duck typing, so that you can write a function that "just works"
with a variety of argument types, without having to think
about the details.

With that mindset, seeing a function declared as

  cdef func(floating x, floating y)

one would expect that x and y could be independently chosen
as any of the types classified as "floating", because that's
the way duck typing usually works. For example, if a Python
function is documented as taking two sequences, you expect
that to mean *any* two sequences, not two sequences of the
same type.


What you are saying is that this does not have the full power of C++ 
templates. And the answer is that yes, this does not have the full power 
of C++ templates.


What I'm suggesting doesn't have the full power of C++ templates
either, because the range of possible values for each type
parameter would still have to be specified in advance. However,
it makes the dependencies between the type parameters explicit,
rather than being hidden in some rather unintuitive implicit rules.

Would it be feasible to have a Skype 
session sometimes this week where everybody interested in the outcome of 
this come together for an hour and actually decide on something?


I'm not sure that would help much. Reaching good decisions about
things like this requires time to think through all the issues.

--
Greg
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Hudson pyregr testing takes too long

2011-05-04 Thread Vitja Makarov
2011/4/25 Vitja Makarov :
> 2011/4/25 Stefan Behnel :
>> Vitja Makarov, 25.04.2011 11:04:
>>>
>>> 2011/4/25 Stefan Behnel:

 Vitja Makarov, 25.04.2011 08:19:
>
> 2011/4/25 Stefan Behnel:
>>
>> Stefan Behnel, 07.04.2011 13:52:
>>>
>>> Stefan Behnel, 07.04.2011 13:46:

 I just noticed that the CPython pyregr tests have jumped up from ~14
 minutes for a run to ~4 hours when we added generator support.




 https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests-pyregr-py26-c/buildTimeTrend

 I currently have no idea why that is (well, it's likely because we
 compile
 more tests now, but Vitja's branch ran the tests in ~30 minutes). It
 would
 be great if someone could find the time to analyse this problem. The
 current run time makes it basically impossible to keep these tests
 enabled.
>>>
>>> Ok, it looks like this is mostly an issue with the Py2.6 tests. The
>>> Py2.7
>>> tests take 30-45 minutes, which is very long, but not completely out
>>> of
>>> bounds. I've disabled the Py2.6 pyregr tests for now.
>>
>> There seems to be a huge memory leak which almost certainly accounts
>> for
>> this. The Python process that runs the pyregr suite ends up with about
>> 50GB
>> of memory at the end, also in the latest Py3k builds.
>>
>> I have no idea where it may be, but it started to show when we merged
>> the
>> generator support. That's where I noticed the instant jump in the
>> runtime.
>
> That's very strange; for my branch it takes about 30 minutes, which is OK.

 There's also a second path that's worth investigating. As part of the
 merge,
 there was another change that came in: the CythonPyregrTestCase
 implementation. This means that the regression tests are now being run
 differently than before. The massive memory consumption may simply be due
 to
 the mass of unit tests being loaded into memory.
>>>
>>>    def run_test():
>>> ..
>>>         try:
>>>             module = __import__(self.module)
>>>             if hasattr(module, 'test_main'):
>>>                 module.test_main()
>>>         except (unittest.SkipTest, support.ResourceDenied):
>>>             result.addSkip(self, 'ok')
>>>
>>>
>>> It seems that all the modules stay loaded so may be they should be
>>> unloaded with del sys.modules[module_name]?
>>
>> (Binary) module unloading isn't really supported in CPython. There's PEP
>> 3121 that has the potential to change it, but it's not completely
>> implemented, neither in CPython nor in Cython. A major problem is that
>> unloading a module deletes its globals but not necessarily the code that
>> uses them. For example, instances of types defined in the module can still
>> be alive at that point.
>>
>> The way runtests.py deals with this is forking before loading a module.
>> However, this does not currently work with the "xmlrunner" which we use on
>> Hudson, so we let all tests run in a single process there.
>>
>
>
> Btw, when running plain Python code with generators, the total ref counter
> doesn't get back to its initial value.
> I tried to trace scope and generator destructors and they are run as
> expected. So I'm not sure about leaks in generators.
>

Recently I've found that the pyregr.test_dict test (test_mutatingiteration)
is what makes it slow:

def test_mutatingiteration():
    d = {}
    d[1] = 1
    for i in d:
        print i
        d[i+1] = 1

test_mutatingiteration()


In CPython this code raises "RuntimeError: dictionary changed size
during iteration", while in Cython you get an infinite loop. So we can
disable this test for now.


-- 
vitja.
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel