Re: [Cython] Fused Types
On 4 May 2011 01:07, Greg Ewing wrote:
> mark florisson wrote:
>
>> cdef func(floating x, floating y):
>>     ...
>>
>> you get a "float, float" version, and a "double, double" version, but
>> not "float, double" or "double, float".
>
> It's hard to draw conclusions from this example because it's degenerate.
> You don't really need multiple versions of a function like that, because
> of float <-> double coercions.

It's only degenerate if you want a real-world example, and not one that
provides a simple answer to your original question...

> A more telling example might be
>
>     cdef double dot_product(floating *u, floating *v, int length)
>
> By your current rules, this would give you one version that takes two
> float vectors, and another that takes two double vectors.
>
> But if you want to find the dot product of a float vector and a double
> vector, you're out of luck.

Sure, so you can create two fused types. I do however somewhat like your
proposal with the indexing in the definition.
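To make Greg's point concrete: under the rule being discussed, a single fused
type binds to one concrete type per specialization, so the dot_product
declaration above expands into (roughly) the following two C functions and
nothing else. This is an illustrative hand-written sketch, not Cython's actual
generated code, and the function names are made up:

    /* Sketch only: the two specializations implied by
     * cdef double dot_product(floating *u, floating *v, int length)
     * when every use of `floating` must resolve to the same type. */
    static double dot_product_float(float *u, float *v, int length) {
        double acc = 0.0;
        int i;
        for (i = 0; i < length; i++)
            acc += (double)u[i] * (double)v[i];
        return acc;
    }

    static double dot_product_double(double *u, double *v, int length) {
        double acc = 0.0;
        int i;
        for (i = 0; i < length; i++)
            acc += u[i] * v[i];
        return acc;
    }

    /* No float/double or double/float mix is generated; for that you would
     * declare two distinct fused types, as suggested above. */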
Re: [Cython] Fused Types
On 05/04/2011 01:07 AM, Greg Ewing wrote:

mark florisson wrote:

    cdef func(floating x, floating y):
        ...

you get a "float, float" version, and a "double, double" version, but not
"float, double" or "double, float".

It's hard to draw conclusions from this example because it's degenerate.
You don't really need multiple versions of a function like that, because of
float <-> double coercions.

A more telling example might be

    cdef double dot_product(floating *u, floating *v, int length)

By your current rules, this would give you one version that takes two float
vectors, and another that takes two double vectors.

But if you want to find the dot product of a float vector and a double
vector, you're out of luck.

First, I'm open to your proposed syntax too... But in the interest of
seeing how we got here:

The argument to the above goes that you *should* be out of luck. For
instance, talking about dot products, BLAS itself has float-float and
double-double, but not float-double AFAIK.

What you are saying is that this does not have the full power of C++
templates. And the answer is that yes, this does not have the full power of
C++ templates.

At the same time we discussed this, we also discussed better support for
string-based templating languages (so that, e.g., compilation error
messages could refer to the template file). The two are complementary.

Going back to Greg's syntax: what I don't like is that it makes the simple,
unambiguous cases, where this would actually be used in real life, less
readable.

Would it be too complicated to have both? For instance:

i) You are allowed to use a *single* fused_type on a *function* without
declaration.

    def f(floating x, floating *y): # ok

turns into

    def f[floating T](T x, T *y):

This is NOT ok:

    def f(floating x, integral y):
    # ERROR: Please explicitly declare fused types inside []

ii) Using more than one fused type, or using it on a cdef class or struct,
you need to use the [] declaration.

Finally: it is a bit uncomfortable that we seem to be hashing things out
even as Mark is implementing this. Would it be feasible to have a Skype
session sometime this week where everybody interested in the outcome of
this can come together for an hour and actually decide on something?

Mark: how much does this discussion of syntax impact your development? Are
you able to treat them just as polish on top and work on the "engine"
undisturbed by this?

Dag Sverre
Re: [Cython] Fused Types
On 4 May 2011 10:24, Dag Sverre Seljebotn wrote: > On 05/04/2011 01:07 AM, Greg Ewing wrote: >> >> mark florisson wrote: >> >>> cdef func(floating x, floating y): >>> ... >>> >>> you get a "float, float" version, and a "double, double" version, but >>> not "float, double" or "double, float". >> >> It's hard to draw conclusions from this example because >> it's degenerate. You don't really need multiple versions of a >> function like that, because of float <-> double coercions. >> >> A more telling example might be >> >> cdef double dot_product(floating *u, floating *v, int length) >> >> By your current rules, this would give you one version that >> takes two float vectors, and another that takes two double >> vectors. >> >> But if you want to find the dot product of a float vector and >> a double vector, you're out of luck. > > First, I'm open for your proposed syntax too...But in the interest of seeing > how we got here: > > The argument to the above goes that you *should* be out of luck. For > instance, talking about dot products, BLAS itself has float-float and > double-double, but not float-double AFAIK. > > What you are saying that this does not have the full power of C++ templates. > And the answer is that yes, this does not have the full power of C++ > templates. > > At the same time we discussed this, we also discussed better support for > string-based templating languages (so that, e.g., compilation error messages > could refer to the template file). The two are complementary. > > Going back to Greg's syntax: What I don't like is that it makes the simple > unambiguous cases, where this would actually be used in real life, less > readable. > > Would it be too complicated to have both? For instance; > > i) You are allowed to use a *single* fused_type on a *function* without > declaration. > > def f(floating x, floating *y): # ok > > Turns into > > def f[floating T](T x, T *y): > > This is NOT ok: > > def f(floating x, integral y): > # ERROR: Please explicitly declare fused types inside [] > > ii) Using more than one fused type, or using it on a cdef class or struct, > you need to use the [] declaration. > I don't think it would be too complicated, but as you mention it's probably not a very likely case, and if the user does need it, a new (equivalent) fused type can be created. The current way reads a lot nicer than the indexed one in my opinion. So I'd be fine with implementing it, but I find the current way more elegant. > Finally: It is a bit uncomfortable that we seem to be hashing things out > even as Mark is implementing this. Would it be feasible to have a Skype > session sometimes this week where everybody interested in the outcome of > this come together for an hour and actually decide on something? > > Mark: How much does this discussion of syntax impact your development? Are > you able to treat them just as polish on top and work on the "engine" > undisturbed by this? Thanks for your consideration, I admit it it feels a bit uncomfortable :) But at least this change shouldn't have such a big impact on the code, it would mean some changes in a select few places, so it's definitely polish. In any event, before we settle on this, I'd like to do the cpdef support first and work on indexing from Python space, so I think we have enough time to settle this argument on the ML. Before that, I'm just going to finish up for a pull request for the OpenMP branch, I'd like to see if I can get rid of some warnings. 
Re: [Cython] prange CEP updated
On 21 April 2011 20:13, Dag Sverre Seljebotn wrote: > On 04/21/2011 10:37 AM, Robert Bradshaw wrote: >> >> On Mon, Apr 18, 2011 at 7:51 AM, mark florisson >> wrote: >>> >>> On 18 April 2011 16:41, Dag Sverre Seljebotn >>> wrote: Excellent! Sounds great! (as I won't have my laptop for some days I can't have a look yet but I will later) You're right about (the current) buffers and the gil. A testcase explicitly for them would be good. Firstprivate etc: i think it'd be nice myself, but it is probably better to take a break from it at this point so that we can think more about that and not do anything rash; perhaps open up a specific thread on them and ask for more general input. Perhaps you want to take a break or task-switch to something else (fused types?) until I can get around to review and merge what you have so far? You'll know best what works for you though. If you decide to implement explicit threadprivate variables because you've got the flow I certainly wom't object myself. >>> Ok, cool, I'll move on :) I already included a test with a prange and >>> a numpy buffer with indexing. >> >> Wow, you're just plowing away at this. Very cool. >> >> +1 to disallowing nested prange, that seems to get really messy with >> little benefit. >> >> In terms of the CEP, I'm still unconvinced that firstprivate is not >> safe to infer, but lets leave the initial values undefined rather than >> specifying them to be NaNs (we can do that as an implementation if you >> want), which will give us flexibility to change later once we've had a >> chance to play around with it. > > I don't see any technical issues with inferring firstprivate, the question > is whether we want to. I suggest not inferring it in order to make this > safer: One should be able to just try to change a loop from "range" to > "prange", and either a) have things fail very hard, or b) just work > correctly and be able to trust the results. > > Note that when I suggest using NaN, it is as initial values for EACH > ITERATION, not per-thread initialization. It is not about "firstprivate" or > not, but about disabling thread-private variables entirely in favor of > "per-iteration" variables. > > I believe that by talking about "readonly" and "per-iteration" variables, > rather than "thread-shared" and "thread-private" variables, this can be used > much more safely and with virtually no knowledge of the details of > threading. Again, what's in my mind are scientific programmers with (too) > little training. > > In the end it's a matter of taste and what is most convenient to more users. > But I believe the case of needing real thread-private variables that > preserves per-thread values across iterations (and thus also can possibly > benefit from firstprivate) is seldomly enough used that an explicit > declaration is OK, in particular when it buys us so much in safety in the > common case. > > To be very precise, > > cdef double x, z > for i in prange(n): > x = f(x) > z = f(i) > ... > > goes to > > cdef double x, z > for i in prange(n): > x = z = nan > x = f(x) > z = f(i) > ... > > and we leave it to the C compiler to (trivially) optimize away "z = nan". > And, yes, it is a stopgap solution until we've got control flow analysis so > that we can outright disallow such uses of x (without threadprivate > declaration, which also gives firstprivate behaviour). > I think the preliminary OpenMP support is ready for review. It supports 'with cython.parallel.parallel:' and 'for i in cython.parallel.prange(...):'. 
It works in generators and closures and the docs are updated. Support for
break/continue/with gil isn't there yet.

There are two remaining issues. The first is warnings for potentially
uninitialized variables for prange(). When you do

    for i in prange(start, stop, step): ...

it generates code like

    nsteps = (stop - start) / step;
    #pragma omp parallel for lastprivate(i)
    for (temp = 0; temp < nsteps; temp++) {
        i = start + temp * step;
        ...
    }

So here it will complain about 'i' being potentially uninitialized, as it
might not be assigned to in the loop. However, simply assigning 0 to 'i'
can't work either, as you expect zero iterations not to touch it. So for
now, we have a bunch of warnings, as I don't see a __attribute__ to
suppress it selectively.

The second is NaN-ing private variables; NaN isn't part of C. For gcc, the
docs ( http://www.delorie.com/gnu/docs/glibc/libc_407.html ) have the
following to say:

"You can use `#ifdef NAN' to test whether the machine supports NaN. (Of
course, you must arrange for GNU extensions to be visible, such as by
defining _GNU_SOURCE, and then you must include `math.h'.)"

So I'm thinking that if NaN is not available (or the compiler is not GCC),
we can use FLT_MAX, DBL_MAX and LDBL_MAX instead from float.h. Would this
be the proper way to handle this?
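For reference, the fallback Mark describes would amount to something like the
following preprocessor logic (a sketch under the assumptions above; the
__PYX_PRIV_INIT_* macro names are invented for illustration and are not what
Cython emits):

    /* _GNU_SOURCE must be defined before any system header so the GNU
     * extensions (and NAN detection via math.h) become visible. */
    #define _GNU_SOURCE
    #include <math.h>
    #include <float.h>

    /* Poison values for per-iteration private variables: NaN where the
     * platform provides it, otherwise the type's maximum from float.h. */
    #ifdef NAN
        #define __PYX_PRIV_INIT_FLOAT   NAN
        #define __PYX_PRIV_INIT_DOUBLE  NAN
        #define __PYX_PRIV_INIT_LDOUBLE NAN
    #else
        #define __PYX_PRIV_INIT_FLOAT   FLT_MAX
        #define __PYX_PRIV_INIT_DOUBLE  DBL_MAX
        #define __PYX_PRIV_INIT_LDOUBLE LDBL_MAX
    #endif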
Re: [Cython] prange CEP updated
On 05/04/2011 12:00 PM, mark florisson wrote: On 21 April 2011 20:13, Dag Sverre Seljebotn wrote: On 04/21/2011 10:37 AM, Robert Bradshaw wrote: On Mon, Apr 18, 2011 at 7:51 AM, mark florisson wrote: On 18 April 2011 16:41, Dag Sverre Seljebotn wrote: Excellent! Sounds great! (as I won't have my laptop for some days I can't have a look yet but I will later) You're right about (the current) buffers and the gil. A testcase explicitly for them would be good. Firstprivate etc: i think it'd be nice myself, but it is probably better to take a break from it at this point so that we can think more about that and not do anything rash; perhaps open up a specific thread on them and ask for more general input. Perhaps you want to take a break or task-switch to something else (fused types?) until I can get around to review and merge what you have so far? You'll know best what works for you though. If you decide to implement explicit threadprivate variables because you've got the flow I certainly wom't object myself. Ok, cool, I'll move on :) I already included a test with a prange and a numpy buffer with indexing. Wow, you're just plowing away at this. Very cool. +1 to disallowing nested prange, that seems to get really messy with little benefit. In terms of the CEP, I'm still unconvinced that firstprivate is not safe to infer, but lets leave the initial values undefined rather than specifying them to be NaNs (we can do that as an implementation if you want), which will give us flexibility to change later once we've had a chance to play around with it. I don't see any technical issues with inferring firstprivate, the question is whether we want to. I suggest not inferring it in order to make this safer: One should be able to just try to change a loop from "range" to "prange", and either a) have things fail very hard, or b) just work correctly and be able to trust the results. Note that when I suggest using NaN, it is as initial values for EACH ITERATION, not per-thread initialization. It is not about "firstprivate" or not, but about disabling thread-private variables entirely in favor of "per-iteration" variables. I believe that by talking about "readonly" and "per-iteration" variables, rather than "thread-shared" and "thread-private" variables, this can be used much more safely and with virtually no knowledge of the details of threading. Again, what's in my mind are scientific programmers with (too) little training. In the end it's a matter of taste and what is most convenient to more users. But I believe the case of needing real thread-private variables that preserves per-thread values across iterations (and thus also can possibly benefit from firstprivate) is seldomly enough used that an explicit declaration is OK, in particular when it buys us so much in safety in the common case. To be very precise, cdef double x, z for i in prange(n): x = f(x) z = f(i) ... goes to cdef double x, z for i in prange(n): x = z = nan x = f(x) z = f(i) ... and we leave it to the C compiler to (trivially) optimize away "z = nan". And, yes, it is a stopgap solution until we've got control flow analysis so that we can outright disallow such uses of x (without threadprivate declaration, which also gives firstprivate behaviour). I think the preliminary OpenMP support is ready for review. It supports 'with cython.parallel.parallel:' and 'for i in cython.parallel.prange(...):'. It works in generators and closures and the docs are updated. Support for break/continue/with gil isn't there yet. There are two remaining issue. 
The first is warnings for potentially uninitialized variables for prange().
When you do

    for i in prange(start, stop, step): ...

it generates code like

    nsteps = (stop - start) / step;
    #pragma omp parallel for lastprivate(i)
    for (temp = 0; temp < nsteps; temp++) {
        i = start + temp * step;
        ...
    }

So here it will complain about 'i' being potentially uninitialized, as it
might not be assigned to in the loop. However, simply assigning 0 to 'i'
can't work either, as you expect zero iterations not to touch it. So for
now, we have a bunch of warnings, as I don't see a __attribute__ to
suppress it selectively.

Isn't this orthogonal to OpenMP -- even if it said "range", your testcase
could get such a warning? If so, the fix is simply to initialize i in your
testcase code.

The second is NaN-ing private variables, NaN isn't part of C. For gcc, the
docs ( http://www.delorie.com/gnu/docs/glibc/libc_407.html ) have the
following to say: "You can use `#ifdef NAN' to test whether the machine
supports NaN. (Of course, you must arrange for GNU extensions to be
visible, such as by defining _GNU_SOURCE, and then you must include
`math.h'.)" So I'm thinking that if NaN is not available (or the compiler
is not GCC), we can use FLT_MAX, DBL_MAX and LDBL_MAX instead from float.h.
Would this be the proper way to handle this?

I think it is sufficient. A relati
Re: [Cython] prange CEP updated
On 4 May 2011 12:45, Dag Sverre Seljebotn wrote: > On 05/04/2011 12:00 PM, mark florisson wrote: >> >> On 21 April 2011 20:13, Dag Sverre Seljebotn >> wrote: >>> >>> On 04/21/2011 10:37 AM, Robert Bradshaw wrote: On Mon, Apr 18, 2011 at 7:51 AM, mark florisson wrote: > > On 18 April 2011 16:41, Dag Sverre > Seljebotn > wrote: >> >> Excellent! Sounds great! (as I won't have my laptop for some days I >> can't >> have a look yet but I will later) >> >> You're right about (the current) buffers and the gil. A testcase >> explicitly >> for them would be good. >> >> Firstprivate etc: i think it'd be nice myself, but it is probably >> better >> to >> take a break from it at this point so that we can think more about >> that >> and >> not do anything rash; perhaps open up a specific thread on them and >> ask >> for >> more general input. Perhaps you want to take a break or task-switch to >> something else (fused types?) until I can get around to review and >> merge >> what you have so far? You'll know best what works for you though. If >> you >> decide to implement explicit threadprivate variables because you've >> got >> the >> flow I certainly wom't object myself. >> > Ok, cool, I'll move on :) I already included a test with a prange and > a numpy buffer with indexing. Wow, you're just plowing away at this. Very cool. +1 to disallowing nested prange, that seems to get really messy with little benefit. In terms of the CEP, I'm still unconvinced that firstprivate is not safe to infer, but lets leave the initial values undefined rather than specifying them to be NaNs (we can do that as an implementation if you want), which will give us flexibility to change later once we've had a chance to play around with it. >>> >>> I don't see any technical issues with inferring firstprivate, the >>> question >>> is whether we want to. I suggest not inferring it in order to make this >>> safer: One should be able to just try to change a loop from "range" to >>> "prange", and either a) have things fail very hard, or b) just work >>> correctly and be able to trust the results. >>> >>> Note that when I suggest using NaN, it is as initial values for EACH >>> ITERATION, not per-thread initialization. It is not about "firstprivate" >>> or >>> not, but about disabling thread-private variables entirely in favor of >>> "per-iteration" variables. >>> >>> I believe that by talking about "readonly" and "per-iteration" variables, >>> rather than "thread-shared" and "thread-private" variables, this can be >>> used >>> much more safely and with virtually no knowledge of the details of >>> threading. Again, what's in my mind are scientific programmers with (too) >>> little training. >>> >>> In the end it's a matter of taste and what is most convenient to more >>> users. >>> But I believe the case of needing real thread-private variables that >>> preserves per-thread values across iterations (and thus also can possibly >>> benefit from firstprivate) is seldomly enough used that an explicit >>> declaration is OK, in particular when it buys us so much in safety in the >>> common case. >>> >>> To be very precise, >>> >>> cdef double x, z >>> for i in prange(n): >>> x = f(x) >>> z = f(i) >>> ... >>> >>> goes to >>> >>> cdef double x, z >>> for i in prange(n): >>> x = z = nan >>> x = f(x) >>> z = f(i) >>> ... >>> >>> and we leave it to the C compiler to (trivially) optimize away "z = nan". 
>>> And, yes, it is a stopgap solution until we've got control flow analysis >>> so >>> that we can outright disallow such uses of x (without threadprivate >>> declaration, which also gives firstprivate behaviour). >>> >> >> I think the preliminary OpenMP support is ready for review. It >> supports 'with cython.parallel.parallel:' and 'for i in >> cython.parallel.prange(...):'. It works in generators and closures and >> the docs are updated. Support for break/continue/with gil isn't there >> yet. >> >> There are two remaining issue. The first is warnings for potentially >> uninitialized variables for prange(). When you do >> >> for i in prange(start, stop, step): ... >> >> it generates code like >> >> nsteps = (stop - start) / step; >> #pragma omp parallel for lastprivate(i) >> for (temp = 0; temp< nsteps; temp++) { >> i = start + temp * step; >> ... >> } >> >> So here it will complain about 'i' being potentially uninitialized, as >> it might not be assigned to in the loop. However, simply assigning 0 >> to 'i' can't work either, as you expect zero iterations not to touch >> it. So for now, we have a bunch of warnings, as I don't see a >> __attribute__ to suppress it selectively. > > Isn't this is orthogonal to OpenMP -- even if it said "range", your testcase > could get such a warning? If so, the fix is simply to initialize i
Re: [Cython] prange CEP updated
On 05/04/2011 12:59 PM, mark florisson wrote:

On 4 May 2011 12:45, Dag Sverre Seljebotn wrote:

On 05/04/2011 12:00 PM, mark florisson wrote:

There are two remaining issues. The first is warnings for potentially
uninitialized variables for prange(). When you do

    for i in prange(start, stop, step): ...

it generates code like

    nsteps = (stop - start) / step;
    #pragma omp parallel for lastprivate(i)
    for (temp = 0; temp < nsteps; temp++) {
        i = start + temp * step;
        ...
    }

So here it will complain about 'i' being potentially uninitialized, as it
might not be assigned to in the loop. However, simply assigning 0 to 'i'
can't work either, as you expect zero iterations not to touch it. So for
now, we have a bunch of warnings, as I don't see a __attribute__ to
suppress it selectively.

Isn't this orthogonal to OpenMP -- even if it said "range", your testcase
could get such a warning? If so, the fix is simply to initialize i in your
testcase code.

No, the problem is that 'i' needs to be lastprivate, and 'i' is assigned to
in the loop body. It's irrelevant whether 'i' is assigned to before the
loop. I think this is the case because the spec says that lastprivate
variables will get the value of the private variable of the last sequential
iteration, but it cannot at compile time know whether there might be zero
iterations, which I believe the spec doesn't have anything to say about. So
basically we could guard against it by checking if nsteps > 0, but the
compiler doesn't detect this, so it will still issue a warning even if 'i'
is initialized (the warning is at the place of the lastprivate declaration).

Ah. But this is then more important than I initially thought it was. You
are saying that this is the case:

    cdef int i = 0
    with nogil:
        for i in prange(n):
            ...
    print i # garbage when n == 0?

It would be in the interest of less semantic differences w.r.t. range to
deal better with this case.

Will it silence the warning if we make "i" firstprivate as well as
lastprivate? firstprivate would only affect the case of zero iterations,
since we overwrite with NaN if the loop is entered...

Dag
Re: [Cython] prange CEP updated
On 4 May 2011 13:15, Dag Sverre Seljebotn wrote: > On 05/04/2011 12:59 PM, mark florisson wrote: >> >> On 4 May 2011 12:45, Dag Sverre Seljebotn >> wrote: >>> >>> On 05/04/2011 12:00 PM, mark florisson wrote: There are two remaining issue. The first is warnings for potentially uninitialized variables for prange(). When you do for i in prange(start, stop, step): ... it generates code like nsteps = (stop - start) / step; #pragma omp parallel for lastprivate(i) for (temp = 0; temp< nsteps; temp++) { i = start + temp * step; ... } So here it will complain about 'i' being potentially uninitialized, as it might not be assigned to in the loop. However, simply assigning 0 to 'i' can't work either, as you expect zero iterations not to touch it. So for now, we have a bunch of warnings, as I don't see a __attribute__ to suppress it selectively. >>> >>> Isn't this is orthogonal to OpenMP -- even if it said "range", your >>> testcase >>> could get such a warning? If so, the fix is simply to initialize i in >>> your >>> testcase code. >> >> No, the problem is that 'i' needs to be lastprivate, and 'i' is >> assigned to in the loop body. It's irrelevant whether 'i' is assigned >> to before the loop. I think this is the case because the spec says >> that lastprivate variables will get the value of the private variable >> of the last sequential iteration, but it cannot at compile time know >> whether there might be zero iterations, which I believe the spec >> doesn't have anything to say about. So basically we could guard >> against it by checking if nsteps> 0, but the compiler doesn't detect >> this, so it will still issue a warning even if 'i' is initialized (the >> warning is at the place of the lastprivate declaration). > > Ah. But this is then more important than I initially thought it was. You are > saying that this is the case: > > cdef int i = 0 > with nogil: > for i in prange(n): > ... > print i # garbage when n == 0? I think it may be, depending on the implementation. With libgomp it return 0. With the check it should also return 0. > It would be in the interest of less semantic differences w.r.t. range to > deal better with this case. > > Will it silence the warning if we make "i" firstprivate as well as > lastprivate? firstprivate would only affect the case of zero iterations, > since we overwrite with NaN if the loop is entered... Well, it wouldn't be NaN, it would be start + step * temp :) But, yes, that works. So we need both the check and an initialization in there: if (nsteps > 0) { i = 0; #pragma omp parallel for firstprivate(i) lastprivate(i) for (temp = 0; ...; ...) ... } Now any subsequent read of 'i' will only issue a warning if 'i' is not initialized before the prange() by the user. So if you leave your index variable uninitialized (because you know in advance nsteps will be greater than zero), you'll still get a warning. But at least you will be able to shut up the compiler :) > Dag > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel > ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
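Putting the pieces of this exchange together, the guarded expansion being
proposed would look roughly like this (a sketch only; variable names such as
nsteps and temp follow the examples above, not Cython's real mangled names):

    /* Expansion of: for i in prange(start, stop, step): ...
     * firstprivate(i) gives every thread a defined starting value so the C
     * compiler stops warning, lastprivate(i) copies the value from the
     * sequentially last iteration back out, and the guard leaves i
     * untouched when there are zero iterations. */
    nsteps = (stop - start) / step;
    if (nsteps > 0) {
        i = 0;  /* defined value for firstprivate; overwritten in the loop */
        #pragma omp parallel for firstprivate(i) lastprivate(i)
        for (temp = 0; temp < nsteps; temp++) {
            i = start + temp * step;
            /* ... loop body ... */
        }
    }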
Re: [Cython] prange CEP updated
On 05/04/2011 01:30 PM, mark florisson wrote: On 4 May 2011 13:15, Dag Sverre Seljebotn wrote: On 05/04/2011 12:59 PM, mark florisson wrote: On 4 May 2011 12:45, Dag Sverre Seljebotn wrote: On 05/04/2011 12:00 PM, mark florisson wrote: There are two remaining issue. The first is warnings for potentially uninitialized variables for prange(). When you do for i in prange(start, stop, step): ... it generates code like nsteps = (stop - start) / step; #pragma omp parallel for lastprivate(i) for (temp = 0; temp< nsteps; temp++) { i = start + temp * step; ... } So here it will complain about 'i' being potentially uninitialized, as it might not be assigned to in the loop. However, simply assigning 0 to 'i' can't work either, as you expect zero iterations not to touch it. So for now, we have a bunch of warnings, as I don't see a __attribute__ to suppress it selectively. Isn't this is orthogonal to OpenMP -- even if it said "range", your testcase could get such a warning? If so, the fix is simply to initialize i in your testcase code. No, the problem is that 'i' needs to be lastprivate, and 'i' is assigned to in the loop body. It's irrelevant whether 'i' is assigned to before the loop. I think this is the case because the spec says that lastprivate variables will get the value of the private variable of the last sequential iteration, but it cannot at compile time know whether there might be zero iterations, which I believe the spec doesn't have anything to say about. So basically we could guard against it by checking if nsteps>0, but the compiler doesn't detect this, so it will still issue a warning even if 'i' is initialized (the warning is at the place of the lastprivate declaration). Ah. But this is then more important than I initially thought it was. You are saying that this is the case: cdef int i = 0 with nogil: for i in prange(n): ... print i # garbage when n == 0? I think it may be, depending on the implementation. With libgomp it return 0. With the check it should also return 0. It would be in the interest of less semantic differences w.r.t. range to deal better with this case. Will it silence the warning if we make "i" firstprivate as well as lastprivate? firstprivate would only affect the case of zero iterations, since we overwrite with NaN if the loop is entered... Well, it wouldn't be NaN, it would be start + step * temp :) But, yes, Doh. that works. So we need both the check and an initialization in there: if (nsteps> 0) { i = 0; #pragma omp parallel for firstprivate(i) lastprivate(i) for (temp = 0; ...; ...) ... } Why do you need the if-test? Won't simply #pragma omp parallel for firstprivate(i) lastprivate(i) for (temp = 0; ...; ...) ... do the job -- any initial value will be copied into all threads, including the "last" thread, even if there are no iterations? Dag Sverre ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 4 May 2011 13:39, Dag Sverre Seljebotn wrote: > On 05/04/2011 01:30 PM, mark florisson wrote: >> >> On 4 May 2011 13:15, Dag Sverre Seljebotn >> wrote: >>> >>> On 05/04/2011 12:59 PM, mark florisson wrote: On 4 May 2011 12:45, Dag Sverre Seljebotn wrote: > > On 05/04/2011 12:00 PM, mark florisson wrote: >> >> There are two remaining issue. The first is warnings for potentially >> uninitialized variables for prange(). When you do >> >> for i in prange(start, stop, step): ... >> >> it generates code like >> >> nsteps = (stop - start) / step; >> #pragma omp parallel for lastprivate(i) >> for (temp = 0; temp< nsteps; temp++) { >> i = start + temp * step; >> ... >> } >> >> So here it will complain about 'i' being potentially uninitialized, as >> it might not be assigned to in the loop. However, simply assigning 0 >> to 'i' can't work either, as you expect zero iterations not to touch >> it. So for now, we have a bunch of warnings, as I don't see a >> __attribute__ to suppress it selectively. > > Isn't this is orthogonal to OpenMP -- even if it said "range", your > testcase > could get such a warning? If so, the fix is simply to initialize i in > your > testcase code. No, the problem is that 'i' needs to be lastprivate, and 'i' is assigned to in the loop body. It's irrelevant whether 'i' is assigned to before the loop. I think this is the case because the spec says that lastprivate variables will get the value of the private variable of the last sequential iteration, but it cannot at compile time know whether there might be zero iterations, which I believe the spec doesn't have anything to say about. So basically we could guard against it by checking if nsteps> 0, but the compiler doesn't detect this, so it will still issue a warning even if 'i' is initialized (the warning is at the place of the lastprivate declaration). >>> >>> Ah. But this is then more important than I initially thought it was. You >>> are >>> saying that this is the case: >>> >>> cdef int i = 0 >>> with nogil: >>> for i in prange(n): >>> ... >>> print i # garbage when n == 0? >> >> I think it may be, depending on the implementation. With libgomp it >> return 0. With the check it should also return 0. >> >>> It would be in the interest of less semantic differences w.r.t. range to >>> deal better with this case. >>> >>> Will it silence the warning if we make "i" firstprivate as well as >>> lastprivate? firstprivate would only affect the case of zero iterations, >>> since we overwrite with NaN if the loop is entered... >> >> Well, it wouldn't be NaN, it would be start + step * temp :) But, yes, > > Doh. > >> that works. So we need both the check and an initialization in there: >> >> if (nsteps> 0) { >> i = 0; >> #pragma omp parallel for firstprivate(i) lastprivate(i) >> for (temp = 0; ...; ...) ... >> } > > Why do you need the if-test? Won't simply > > #pragma omp parallel for firstprivate(i) lastprivate(i) > for (temp = 0; ...; ...) ... > > do the job -- any initial value will be copied into all threads, including > the "last" thread, even if there are no iterations? It will, but you don't expect your iteration variable to change with zero iterations. > Dag Sverre > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel > ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 05/04/2011 01:41 PM, mark florisson wrote:

On 4 May 2011 13:39, Dag Sverre Seljebotn wrote:

On 05/04/2011 01:30 PM, mark florisson wrote:

On 4 May 2011 13:15, Dag Sverre Seljebotn wrote:

On 05/04/2011 12:59 PM, mark florisson wrote:

On 4 May 2011 12:45, Dag Sverre Seljebotn wrote:

On 05/04/2011 12:00 PM, mark florisson wrote:

There are two remaining issues. The first is warnings for potentially
uninitialized variables for prange(). When you do

    for i in prange(start, stop, step): ...

it generates code like

    nsteps = (stop - start) / step;
    #pragma omp parallel for lastprivate(i)
    for (temp = 0; temp < nsteps; temp++) {
        i = start + temp * step;
        ...
    }

So here it will complain about 'i' being potentially uninitialized, as it
might not be assigned to in the loop. However, simply assigning 0 to 'i'
can't work either, as you expect zero iterations not to touch it. So for
now, we have a bunch of warnings, as I don't see a __attribute__ to
suppress it selectively.

Isn't this orthogonal to OpenMP -- even if it said "range", your testcase
could get such a warning? If so, the fix is simply to initialize i in your
testcase code.

No, the problem is that 'i' needs to be lastprivate, and 'i' is assigned to
in the loop body. It's irrelevant whether 'i' is assigned to before the
loop. I think this is the case because the spec says that lastprivate
variables will get the value of the private variable of the last sequential
iteration, but it cannot at compile time know whether there might be zero
iterations, which I believe the spec doesn't have anything to say about. So
basically we could guard against it by checking if nsteps > 0, but the
compiler doesn't detect this, so it will still issue a warning even if 'i'
is initialized (the warning is at the place of the lastprivate declaration).

Ah. But this is then more important than I initially thought it was. You
are saying that this is the case:

    cdef int i = 0
    with nogil:
        for i in prange(n):
            ...
    print i # garbage when n == 0?

I think it may be, depending on the implementation. With libgomp it returns
0. With the check it should also return 0.

It would be in the interest of less semantic differences w.r.t. range to
deal better with this case.

Will it silence the warning if we make "i" firstprivate as well as
lastprivate? firstprivate would only affect the case of zero iterations,
since we overwrite with NaN if the loop is entered...

Well, it wouldn't be NaN, it would be start + step * temp :) But, yes,

Doh.

that works. So we need both the check and an initialization in there:

    if (nsteps > 0) {
        i = 0;
        #pragma omp parallel for firstprivate(i) lastprivate(i)
        for (temp = 0; ...; ...) ...
    }

Why do you need the if-test? Won't simply

    #pragma omp parallel for firstprivate(i) lastprivate(i)
    for (temp = 0; ...; ...) ...

do the job -- any initial value will be copied into all threads, including
the "last" thread, even if there are no iterations?

Dag Sverre
Re: [Cython] prange CEP updated
On 4 May 2011 13:45, Dag Sverre Seljebotn wrote: > On 05/04/2011 01:41 PM, mark florisson wrote: >> >> On 4 May 2011 13:39, Dag Sverre Seljebotn >> wrote: >>> >>> On 05/04/2011 01:30 PM, mark florisson wrote: On 4 May 2011 13:15, Dag Sverre Seljebotn wrote: > > On 05/04/2011 12:59 PM, mark florisson wrote: >> >> On 4 May 2011 12:45, Dag Sverre Seljebotn >> wrote: >>> >>> On 05/04/2011 12:00 PM, mark florisson wrote: There are two remaining issue. The first is warnings for potentially uninitialized variables for prange(). When you do for i in prange(start, stop, step): ... it generates code like nsteps = (stop - start) / step; #pragma omp parallel for lastprivate(i) for (temp = 0; temp< nsteps; temp++) { i = start + temp * step; ... } So here it will complain about 'i' being potentially uninitialized, as it might not be assigned to in the loop. However, simply assigning 0 to 'i' can't work either, as you expect zero iterations not to touch it. So for now, we have a bunch of warnings, as I don't see a __attribute__ to suppress it selectively. >>> >>> Isn't this is orthogonal to OpenMP -- even if it said "range", your >>> testcase >>> could get such a warning? If so, the fix is simply to initialize i in >>> your >>> testcase code. >> >> No, the problem is that 'i' needs to be lastprivate, and 'i' is >> assigned to in the loop body. It's irrelevant whether 'i' is assigned >> to before the loop. I think this is the case because the spec says >> that lastprivate variables will get the value of the private variable >> of the last sequential iteration, but it cannot at compile time know >> whether there might be zero iterations, which I believe the spec >> doesn't have anything to say about. So basically we could guard >> against it by checking if nsteps> 0, but the compiler doesn't >> detect >> this, so it will still issue a warning even if 'i' is initialized (the >> warning is at the place of the lastprivate declaration). > > Ah. But this is then more important than I initially thought it was. > You > are > saying that this is the case: > > cdef int i = 0 > with nogil: > for i in prange(n): > ... > print i # garbage when n == 0? I think it may be, depending on the implementation. With libgomp it return 0. With the check it should also return 0. > It would be in the interest of less semantic differences w.r.t. range > to > deal better with this case. > > Will it silence the warning if we make "i" firstprivate as well as > lastprivate? firstprivate would only affect the case of zero > iterations, > since we overwrite with NaN if the loop is entered... Well, it wouldn't be NaN, it would be start + step * temp :) But, yes, >>> >>> Doh. >>> that works. So we need both the check and an initialization in there: if (nsteps> 0) { i = 0; #pragma omp parallel for firstprivate(i) lastprivate(i) for (temp = 0; ...; ...) ... } >>> >>> Why do you need the if-test? Won't simply >>> >>> #pragma omp parallel for firstprivate(i) lastprivate(i) >>> for (temp = 0; ...; ...) ... >>> >>> do the job -- any initial value will be copied into all threads, >>> including >>> the "last" thread, even if there are no iterations? >> >> It will, but you don't expect your iteration variable to change with >> zero iterations. > > Look. > > i = 42 > for i in prange(n): > f(i) > print i # want 42 whenever n == 0 > > Now, translate this to: > > i = 42; > #pragma omp parallel for firstprivate(i) lastprivate(i) > for (temp = 0; ...; ...) { > i = ... 
> } > #pragma omp parallel end > /* At this point, i == 42 if n == 0 */ > > Am I missing something? Yes, 'i' may be uninitialized with nsteps > 0 (this should be valid code). So if nsteps > 0, we need to initialize 'i' to something to get correct behaviour with firstprivate. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 4 May 2011 13:47, mark florisson wrote: > On 4 May 2011 13:45, Dag Sverre Seljebotn wrote: >> On 05/04/2011 01:41 PM, mark florisson wrote: >>> >>> On 4 May 2011 13:39, Dag Sverre Seljebotn >>> wrote: On 05/04/2011 01:30 PM, mark florisson wrote: > > On 4 May 2011 13:15, Dag Sverre Seljebotn > wrote: >> >> On 05/04/2011 12:59 PM, mark florisson wrote: >>> >>> On 4 May 2011 12:45, Dag Sverre Seljebotn >>> wrote: On 05/04/2011 12:00 PM, mark florisson wrote: > > There are two remaining issue. The first is warnings for potentially > uninitialized variables for prange(). When you do > > for i in prange(start, stop, step): ... > > it generates code like > > nsteps = (stop - start) / step; > #pragma omp parallel for lastprivate(i) > for (temp = 0; temp< nsteps; temp++) { > i = start + temp * step; > ... > } > > So here it will complain about 'i' being potentially uninitialized, > as > it might not be assigned to in the loop. However, simply assigning 0 > to 'i' can't work either, as you expect zero iterations not to touch > it. So for now, we have a bunch of warnings, as I don't see a > __attribute__ to suppress it selectively. Isn't this is orthogonal to OpenMP -- even if it said "range", your testcase could get such a warning? If so, the fix is simply to initialize i in your testcase code. >>> >>> No, the problem is that 'i' needs to be lastprivate, and 'i' is >>> assigned to in the loop body. It's irrelevant whether 'i' is assigned >>> to before the loop. I think this is the case because the spec says >>> that lastprivate variables will get the value of the private variable >>> of the last sequential iteration, but it cannot at compile time know >>> whether there might be zero iterations, which I believe the spec >>> doesn't have anything to say about. So basically we could guard >>> against it by checking if nsteps> 0, but the compiler doesn't >>> detect >>> this, so it will still issue a warning even if 'i' is initialized (the >>> warning is at the place of the lastprivate declaration). >> >> Ah. But this is then more important than I initially thought it was. >> You >> are >> saying that this is the case: >> >> cdef int i = 0 >> with nogil: >> for i in prange(n): >> ... >> print i # garbage when n == 0? > > I think it may be, depending on the implementation. With libgomp it > return 0. With the check it should also return 0. > >> It would be in the interest of less semantic differences w.r.t. range >> to >> deal better with this case. >> >> Will it silence the warning if we make "i" firstprivate as well as >> lastprivate? firstprivate would only affect the case of zero >> iterations, >> since we overwrite with NaN if the loop is entered... > > Well, it wouldn't be NaN, it would be start + step * temp :) But, yes, Doh. > that works. So we need both the check and an initialization in there: > > if (nsteps> 0) { > i = 0; > #pragma omp parallel for firstprivate(i) lastprivate(i) > for (temp = 0; ...; ...) ... > } Why do you need the if-test? Won't simply #pragma omp parallel for firstprivate(i) lastprivate(i) for (temp = 0; ...; ...) ... do the job -- any initial value will be copied into all threads, including the "last" thread, even if there are no iterations? >>> >>> It will, but you don't expect your iteration variable to change with >>> zero iterations. >> >> Look. >> >> i = 42 >> for i in prange(n): >> f(i) >> print i # want 42 whenever n == 0 >> >> Now, translate this to: >> >> i = 42; >> #pragma omp parallel for firstprivate(i) lastprivate(i) >> for (temp = 0; ...; ...) { >> i = ... 
>> } >> #pragma omp parallel end >> /* At this point, i == 42 if n == 0 */ >> >> Am I missing something? > > Yes, 'i' may be uninitialized with nsteps > 0 (this should be valid > code). So if nsteps > 0, we need to initialize 'i' to something to get > correct behaviour with firstprivate. > And of course, if you initialize 'i' unconditionally, you change 'i' whereas you might have to leave it unaffected. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 05/04/2011 01:48 PM, mark florisson wrote:

On 4 May 2011 13:47, mark florisson wrote:

On 4 May 2011 13:45, Dag Sverre Seljebotn wrote:

Look.

    i = 42
    for i in prange(n):
        f(i)
    print i # want 42 whenever n == 0

Now, translate this to:

    i = 42;
    #pragma omp parallel for firstprivate(i) lastprivate(i)
    for (temp = 0; ...; ...) {
        i = ...
    }
    #pragma omp parallel end
    /* At this point, i == 42 if n == 0 */

Am I missing something?

Yes, 'i' may be uninitialized with nsteps > 0 (this should be valid code).
So if nsteps > 0, we need to initialize 'i' to something to get correct
behaviour with firstprivate.

This I don't see. I think I need to be spoon-fed on this one.

And of course, if you initialize 'i' unconditionally, you change 'i'
whereas you might have to leave it unaffected.

This I see.

Dag Sverre
Re: [Cython] prange CEP updated
On 4 May 2011 13:54, Dag Sverre Seljebotn wrote: > On 05/04/2011 01:48 PM, mark florisson wrote: >> >> On 4 May 2011 13:47, mark florisson wrote: >>> >>> On 4 May 2011 13:45, Dag Sverre Seljebotn >>> wrote: > Look. i = 42 for i in prange(n): f(i) print i # want 42 whenever n == 0 Now, translate this to: i = 42; #pragma omp parallel for firstprivate(i) lastprivate(i) for (temp = 0; ...; ...) { i = ... } #pragma omp parallel end /* At this point, i == 42 if n == 0 */ Am I missing something? >>> >>> Yes, 'i' may be uninitialized with nsteps> 0 (this should be valid >>> code). So if nsteps> 0, we need to initialize 'i' to something to get >>> correct behaviour with firstprivate. > > This I don't see. I think I need to be spoon-fed on this one. So assume this code cdef int i for i in prange(10): ... Now if we transform this without the guard we get int i; #pragma omp parallel for firstprivate(i) lastprivate(i) for (...) { ...} This is invalid C code, but valid Cython code. So we need to initialize 'i', but then we get our "leave it unaffected for 0 iterations" paradox. So we need a guard. >> And of course, if you initialize 'i' unconditionally, you change 'i' >> whereas you might have to leave it unaffected. > > This I see. > > Dag Sverre > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel > ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 05/04/2011 01:59 PM, mark florisson wrote: On 4 May 2011 13:54, Dag Sverre Seljebotn wrote: On 05/04/2011 01:48 PM, mark florisson wrote: On 4 May 2011 13:47, mark florissonwrote: On 4 May 2011 13:45, Dag Sverre Seljebotn wrote: Look. i = 42 for i in prange(n): f(i) print i # want 42 whenever n == 0 Now, translate this to: i = 42; #pragma omp parallel for firstprivate(i) lastprivate(i) for (temp = 0; ...; ...) { i = ... } #pragma omp parallel end /* At this point, i == 42 if n == 0 */ Am I missing something? Yes, 'i' may be uninitialized with nsteps>0 (this should be valid code). So if nsteps>0, we need to initialize 'i' to something to get correct behaviour with firstprivate. This I don't see. I think I need to be spoon-fed on this one. So assume this code cdef int i for i in prange(10): ... Now if we transform this without the guard we get int i; #pragma omp parallel for firstprivate(i) lastprivate(i) for (...) { ...} This is invalid C code, but valid Cython code. So we need to initialize 'i', but then we get our "leave it unaffected for 0 iterations" paradox. So we need a guard. You mean C code won't compile if i is firstprivate and not initialized? (Sorry, I'm not aware of such things.) My first instinct is to initialize i to 0xbadabada. After all, its value is not specified -- we're not violating any Cython specs by initializing it to garbage ourselves. OTOH, I see that your approach with an if-test is more Valgrind-friendly, so I'm OK with that. Would it work to do if (nsteps > 0) { #pragma omp parallel i = 0; #pragma omp for lastprivate(i) for (temp = 0; ...) ... ... } instead, to get rid of the warning without using a firstprivate? Not sure if there's an efficiency difference here, I suppose a good C compiler could compile them to the same thing. Dag Sverre ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 4 May 2011 14:10, Dag Sverre Seljebotn wrote: > On 05/04/2011 01:59 PM, mark florisson wrote: >> >> On 4 May 2011 13:54, Dag Sverre Seljebotn >> wrote: >>> >>> On 05/04/2011 01:48 PM, mark florisson wrote: On 4 May 2011 13:47, mark florisson wrote: > > On 4 May 2011 13:45, Dag Sverre Seljebotn > wrote: >>> >> Look. >> >> i = 42 >> for i in prange(n): >> f(i) >> print i # want 42 whenever n == 0 >> >> Now, translate this to: >> >> i = 42; >> #pragma omp parallel for firstprivate(i) lastprivate(i) >> for (temp = 0; ...; ...) { >> i = ... >> } >> #pragma omp parallel end >> /* At this point, i == 42 if n == 0 */ >> >> Am I missing something? > > Yes, 'i' may be uninitialized with nsteps> 0 (this should be valid > code). So if nsteps> 0, we need to initialize 'i' to something to > get > correct behaviour with firstprivate. >>> >>> This I don't see. I think I need to be spoon-fed on this one. >> >> So assume this code >> >> cdef int i >> >> for i in prange(10): ... >> >> Now if we transform this without the guard we get >> >> int i; >> >> #pragma omp parallel for firstprivate(i) lastprivate(i) >> for (...) { ...} >> >> This is invalid C code, but valid Cython code. So we need to >> initialize 'i', but then we get our "leave it unaffected for 0 >> iterations" paradox. So we need a guard. > > You mean C code won't compile if i is firstprivate and not initialized? > (Sorry, I'm not aware of such things.) It will compile and warn, but it is technically invalid, as you're reading an uninitialized variable, which has undefined behavior. If e.g. the variable contains a trap representation on a certain architecture, it might halt the program (I'm not sure which architecture that would be, but I believe they exist). > My first instinct is to initialize i to 0xbadabada. After all, its value is > not specified -- we're not violating any Cython specs by initializing it to > garbage ourselves. The problem is that we don't know whether the user has initialized the variable. So if we want firstprivate to suppress warnings, we should assume that the user hasn't and do it ourselves. > OTOH, I see that your approach with an if-test is more Valgrind-friendly, so > I'm OK with that. > > Would it work to do > > if (nsteps > 0) { > #pragma omp parallel > i = 0; > #pragma omp for lastprivate(i) > for (temp = 0; ...) ... > ... > } I'm assuming you mean #pragma omp parallel private(i), otherwise you have a race (I'm not sure how much that matters for assignment). In any case, with the private() clause 'i' would be uninitialized afterwards. In either case it won't do anything useful. > instead, to get rid of the warning without using a firstprivate? Not sure if > there's an efficiency difference here, I suppose a good C compiler could > compile them to the same thing. > > Dag Sverre > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel > ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 4 May 2011 14:17, mark florisson wrote: > On 4 May 2011 14:10, Dag Sverre Seljebotn wrote: >> On 05/04/2011 01:59 PM, mark florisson wrote: >>> >>> On 4 May 2011 13:54, Dag Sverre Seljebotn >>> wrote: On 05/04/2011 01:48 PM, mark florisson wrote: > > On 4 May 2011 13:47, mark florisson wrote: >> >> On 4 May 2011 13:45, Dag Sverre Seljebotn >> wrote: >>> Look. >>> >>> i = 42 >>> for i in prange(n): >>> f(i) >>> print i # want 42 whenever n == 0 >>> >>> Now, translate this to: >>> >>> i = 42; >>> #pragma omp parallel for firstprivate(i) lastprivate(i) >>> for (temp = 0; ...; ...) { >>> i = ... >>> } >>> #pragma omp parallel end >>> /* At this point, i == 42 if n == 0 */ >>> >>> Am I missing something? >> >> Yes, 'i' may be uninitialized with nsteps> 0 (this should be valid >> code). So if nsteps> 0, we need to initialize 'i' to something to >> get >> correct behaviour with firstprivate. This I don't see. I think I need to be spoon-fed on this one. >>> >>> So assume this code >>> >>> cdef int i >>> >>> for i in prange(10): ... >>> >>> Now if we transform this without the guard we get >>> >>> int i; >>> >>> #pragma omp parallel for firstprivate(i) lastprivate(i) >>> for (...) { ...} >>> >>> This is invalid C code, but valid Cython code. So we need to >>> initialize 'i', but then we get our "leave it unaffected for 0 >>> iterations" paradox. So we need a guard. >> >> You mean C code won't compile if i is firstprivate and not initialized? >> (Sorry, I'm not aware of such things.) > > It will compile and warn, but it is technically invalid, as you're > reading an uninitialized variable, which has undefined behavior. If > e.g. the variable contains a trap representation on a certain > architecture, it might halt the program (I'm not sure which > architecture that would be, but I believe they exist). > >> My first instinct is to initialize i to 0xbadabada. After all, its value is >> not specified -- we're not violating any Cython specs by initializing it to >> garbage ourselves. > > The problem is that we don't know whether the user has initialized the > variable. So if we want firstprivate to suppress warnings, we should > assume that the user hasn't and do it ourselves. The alternative would be to give 'cdef int i' initialized semantics, to whatever value we please. So instead of generating 'int i;' code, we could always generate 'int i = ...;'. But currently we don't do that. >> OTOH, I see that your approach with an if-test is more Valgrind-friendly, so >> I'm OK with that. >> >> Would it work to do >> >> if (nsteps > 0) { >> #pragma omp parallel >> i = 0; >> #pragma omp for lastprivate(i) >> for (temp = 0; ...) ... >> ... >> } > > I'm assuming you mean #pragma omp parallel private(i), otherwise you > have a race (I'm not sure how much that matters for assignment). In > any case, with the private() clause 'i' would be uninitialized > afterwards. In either case it won't do anything useful. > >> instead, to get rid of the warning without using a firstprivate? Not sure if >> there's an efficiency difference here, I suppose a good C compiler could >> compile them to the same thing. >> >> Dag Sverre >> ___ >> cython-devel mailing list >> cython-devel@python.org >> http://mail.python.org/mailman/listinfo/cython-devel >> > ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 05/04/2011 02:17 PM, mark florisson wrote: On 4 May 2011 14:10, Dag Sverre Seljebotn wrote: On 05/04/2011 01:59 PM, mark florisson wrote: On 4 May 2011 13:54, Dag Sverre Seljebotn wrote: On 05/04/2011 01:48 PM, mark florisson wrote: On 4 May 2011 13:47, mark florisson wrote: On 4 May 2011 13:45, Dag Sverre Seljebotn wrote: Look. i = 42 for i in prange(n): f(i) print i # want 42 whenever n == 0 Now, translate this to: i = 42; #pragma omp parallel for firstprivate(i) lastprivate(i) for (temp = 0; ...; ...) { i = ... } #pragma omp parallel end /* At this point, i == 42 if n == 0 */ Am I missing something? Yes, 'i' may be uninitialized with nsteps> 0 (this should be valid code). So if nsteps> 0, we need to initialize 'i' to something to get correct behaviour with firstprivate. This I don't see. I think I need to be spoon-fed on this one. So assume this code cdef int i for i in prange(10): ... Now if we transform this without the guard we get int i; #pragma omp parallel for firstprivate(i) lastprivate(i) for (...) { ...} This is invalid C code, but valid Cython code. So we need to initialize 'i', but then we get our "leave it unaffected for 0 iterations" paradox. So we need a guard. You mean C code won't compile if i is firstprivate and not initialized? (Sorry, I'm not aware of such things.) It will compile and warn, but it is technically invalid, as you're reading an uninitialized variable, which has undefined behavior. If e.g. the variable contains a trap representation on a certain architecture, it might halt the program (I'm not sure which architecture that would be, but I believe they exist). My first instinct is to initialize i to 0xbadabada. After all, its value is not specified -- we're not violating any Cython specs by initializing it to garbage ourselves. The problem is that we don't know whether the user has initialized the variable. So if we want firstprivate to suppress warnings, we should assume that the user hasn't and do it ourselves. I meant that if we don't care about Valgrindability, we can initialize i at the top of our function (i.e. where it says "int __pyx_v_i"). OTOH, I see that your approach with an if-test is more Valgrind-friendly, so I'm OK with that. Would it work to do if (nsteps> 0) { #pragma omp parallel i = 0; #pragma omp for lastprivate(i) for (temp = 0; ...) ... ... } I'm assuming you mean #pragma omp parallel private(i), otherwise you have a race (I'm not sure how much that matters for assignment). In any case, with the private() clause 'i' would be uninitialized afterwards. In either case it won't do anything useful. Sorry, I meant that lastprivate(i) should go on the parallel line. if (nsteps> 0) { #pragma omp parallel lastprivate(i) i = 0; #pragma omp for for (temp = 0; ...) ... ... } won't this silence the warning? At any rate, it's obvious you have a better handle on this than me, so I'll shut up now and leave you to it :-) Dag Sverre ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 4 May 2011 14:23, Dag Sverre Seljebotn wrote: > On 05/04/2011 02:17 PM, mark florisson wrote: >> >> On 4 May 2011 14:10, Dag Sverre Seljebotn >> wrote: >>> >>> On 05/04/2011 01:59 PM, mark florisson wrote: On 4 May 2011 13:54, Dag Sverre Seljebotn wrote: > > On 05/04/2011 01:48 PM, mark florisson wrote: >> >> On 4 May 2011 13:47, mark florisson >> wrote: >>> >>> On 4 May 2011 13:45, Dag Sverre Seljebotn >>> wrote: > Look. i = 42 for i in prange(n): f(i) print i # want 42 whenever n == 0 Now, translate this to: i = 42; #pragma omp parallel for firstprivate(i) lastprivate(i) for (temp = 0; ...; ...) { i = ... } #pragma omp parallel end /* At this point, i == 42 if n == 0 */ Am I missing something? >>> >>> Yes, 'i' may be uninitialized with nsteps> 0 (this should be >>> valid >>> code). So if nsteps> 0, we need to initialize 'i' to something >>> to >>> get >>> correct behaviour with firstprivate. > > This I don't see. I think I need to be spoon-fed on this one. So assume this code cdef int i for i in prange(10): ... Now if we transform this without the guard we get int i; #pragma omp parallel for firstprivate(i) lastprivate(i) for (...) { ...} This is invalid C code, but valid Cython code. So we need to initialize 'i', but then we get our "leave it unaffected for 0 iterations" paradox. So we need a guard. >>> >>> You mean C code won't compile if i is firstprivate and not initialized? >>> (Sorry, I'm not aware of such things.) >> >> It will compile and warn, but it is technically invalid, as you're >> reading an uninitialized variable, which has undefined behavior. If >> e.g. the variable contains a trap representation on a certain >> architecture, it might halt the program (I'm not sure which >> architecture that would be, but I believe they exist). >> >>> My first instinct is to initialize i to 0xbadabada. After all, its value >>> is >>> not specified -- we're not violating any Cython specs by initializing it >>> to >>> garbage ourselves. >> >> The problem is that we don't know whether the user has initialized the >> variable. So if we want firstprivate to suppress warnings, we should >> assume that the user hasn't and do it ourselves. > > I meant that if we don't care about Valgrindability, we can initialize i at > the top of our function (i.e. where it says "int __pyx_v_i"). Indeed, but as the current semantics don't do this, I think we also shouldn't. The good thing is that if we don't do it, the user will see warnings from the C compiler if used uninitialized. >>> OTOH, I see that your approach with an if-test is more Valgrind-friendly, >>> so >>> I'm OK with that. >>> >>> Would it work to do >>> >>> if (nsteps> 0) { >>> #pragma omp parallel >>> i = 0; >>> #pragma omp for lastprivate(i) >>> for (temp = 0; ...) ... >>> ... >>> } >> >> I'm assuming you mean #pragma omp parallel private(i), otherwise you >> have a race (I'm not sure how much that matters for assignment). In >> any case, with the private() clause 'i' would be uninitialized >> afterwards. In either case it won't do anything useful. > > Sorry, I meant that lastprivate(i) should go on the parallel line. > > if (nsteps> 0) { > #pragma omp parallel lastprivate(i) > i = 0; > #pragma omp for > for (temp = 0; ...) ... > ... > } > > won't this silence the warning? At any rate, it's obvious you have a better > handle on this than me, so I'll shut up now and leave you to it :-) lastprivate() is not valid on a plain parallel constructs, as it's not a loop. There's only private() and shared(). 
___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
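For reference, the guarded transformation this sub-thread converges on would look roughly like the following C (an illustrative sketch only, not Cython's actual generated code; the names nsteps, start, step and f are placeholders):

    int i;     /* cdef int i -- possibly never assigned by the user */
    int temp;
    if (nsteps > 0) {
        i = 0; /* give firstprivate() a defined value to copy, so there is
                  no uninitialized read and no C compiler warning */
        #pragma omp parallel for firstprivate(i) lastprivate(i)
        for (temp = 0; temp < nsteps; temp++) {
            i = start + temp * step;
            f(i);
        }
    }
    /* if nsteps == 0 the region is skipped entirely, so i keeps whatever
       value it had before the loop -- 42 in the example above */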
Re: [Cython] prange CEP updated
Moving pull request discussion (https://github.com/cython/cython/pull/28) over here: First, I got curious why you'd strip off "-pthread" from CC. I'd think you could just execute it with "-pthread", which seems simpler. Second: If parallel.parallel is not callable, how are scheduling parameters for parallel blocks handled? Is there a reason to not support that? Do you think it should stay this way, or will parallel take parameters in the future? Dag Sverre ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 4 May 2011 18:35, Dag Sverre Seljebotn wrote: > Moving pull requestion discussion (https://github.com/cython/cython/pull/28) > over here: > > First, I got curious why you'd have a strip off "-pthread" from CC. I'd > think you could just execute with it with "-pthread", which seems simpler. It needs to end up in a list of arguments, and it's not needed at all as I only need the version. I guess I could do (cc + " -v").split() but eh. > Second: If parallel.parallel is not callable, how are scheduling parameters > for parallel blocks handled? Is there a reason to not support that? Do you > think it should stay this way, or will parallel take parameters in the > future? Well, as I mentioned a while back, you cannot schedule parallel blocks, there is no worksharing involved. All a parallel block does is executed a code block in however many threads there are available. The scheduling parameters are valid for a worksharing for loop only, as you schedule (read "distribute") the work among the threads. > Dag Sverre > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel > ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 05/04/2011 07:03 PM, mark florisson wrote: On 4 May 2011 18:35, Dag Sverre Seljebotn wrote: Moving pull requestion discussion (https://github.com/cython/cython/pull/28) over here: First, I got curious why you'd have a strip off "-pthread" from CC. I'd think you could just execute with it with "-pthread", which seems simpler. It needs to end up in a list of arguments, and it's not needed at all as I only need the version. I guess I could do (cc + " -v").split() but eh. OK, that's reassuring, thought perhaps you had encountered a strange gcc strain. Second: If parallel.parallel is not callable, how are scheduling parameters for parallel blocks handled? Is there a reason to not support that? Do you think it should stay this way, or will parallel take parameters in the future? Well, as I mentioned a while back, you cannot schedule parallel blocks, there is no worksharing involved. All a parallel block does is executed a code block in however many threads there are available. The scheduling parameters are valid for a worksharing for loop only, as you schedule (read "distribute") the work among the threads. Perhaps I used the wrong terms; but checking the specs, I guess I meant "num_threads", which definitely applies to parallel. Dag Sverre ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 4 May 2011 19:44, Dag Sverre Seljebotn wrote: > On 05/04/2011 07:03 PM, mark florisson wrote: >> >> On 4 May 2011 18:35, Dag Sverre Seljebotn >> wrote: >>> >>> Moving pull requestion discussion >>> (https://github.com/cython/cython/pull/28) >>> over here: >>> >>> First, I got curious why you'd have a strip off "-pthread" from CC. I'd >>> think you could just execute with it with "-pthread", which seems >>> simpler. >> >> It needs to end up in a list of arguments, and it's not needed at all >> as I only need the version. I guess I could do (cc + " -v").split() >> but eh. > > OK, that's reassuring, thought perhaps you had encountered a strange gcc > strain. > >> >>> Second: If parallel.parallel is not callable, how are scheduling >>> parameters >>> for parallel blocks handled? Is there a reason to not support that? Do >>> you >>> think it should stay this way, or will parallel take parameters in the >>> future? >> >> Well, as I mentioned a while back, you cannot schedule parallel >> blocks, there is no worksharing involved. All a parallel block does is >> executed a code block in however many threads there are available. The >> scheduling parameters are valid for a worksharing for loop only, as >> you schedule (read "distribute") the work among the threads. > > Perhaps I used the wrong terms; but checking the specs, I guess I meant > "num_threads", which definitely applies to parallel. Ah, that level of scheduling :) Right, so it doesn't take that, but I don't think it's a big issue. If dynamic scheduling is enabled, it's only a suggestion, if dynamic scheduling is disabled (whether it's turned on or off by default is implementation defined) it will give the the amount of threads requested, if available. The user can still use omp_set_num_threads(), although admittedly that modifies a global setting. > Dag Sverre > ___ > cython-devel mailing list > cython-devel@python.org > http://mail.python.org/mailman/listinfo/cython-devel > ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
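For comparison, here is what the two mechanisms mentioned above look like in plain C with OpenMP (an illustrative sketch; the function name and thread counts are arbitrary):

    #include <omp.h>

    void work(void)
    {
        omp_set_num_threads(8);              /* library call: a global setting that
                                                affects all later parallel regions */

        #pragma omp parallel num_threads(4)  /* per-region clause: overrides the
                                                global setting for this region only */
        {
            /* ... block body, run by up to 4 threads; with dynamic adjustment
               enabled the requested count is only an upper bound ... */
        }
    }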
Re: [Cython] Fused Types
On Wed, May 4, 2011 at 1:47 AM, mark florisson wrote: > On 4 May 2011 10:24, Dag Sverre Seljebotn wrote: >> On 05/04/2011 01:07 AM, Greg Ewing wrote: >>> >>> mark florisson wrote: >>> cdef func(floating x, floating y): ... you get a "float, float" version, and a "double, double" version, but not "float, double" or "double, float". >>> >>> It's hard to draw conclusions from this example because >>> it's degenerate. You don't really need multiple versions of a >>> function like that, because of float <-> double coercions. >>> >>> A more telling example might be >>> >>> cdef double dot_product(floating *u, floating *v, int length) >>> >>> By your current rules, this would give you one version that >>> takes two float vectors, and another that takes two double >>> vectors. >>> >>> But if you want to find the dot product of a float vector and >>> a double vector, you're out of luck. >> >> First, I'm open for your proposed syntax too...But in the interest of seeing >> how we got here: >> >> The argument to the above goes that you *should* be out of luck. For >> instance, talking about dot products, BLAS itself has float-float and >> double-double, but not float-double AFAIK. >> >> What you are saying that this does not have the full power of C++ templates. >> And the answer is that yes, this does not have the full power of C++ >> templates. >> >> At the same time we discussed this, we also discussed better support for >> string-based templating languages (so that, e.g., compilation error messages >> could refer to the template file). The two are complementary. >> >> Going back to Greg's syntax: What I don't like is that it makes the simple >> unambiguous cases, where this would actually be used in real life, less >> readable. >> >> Would it be too complicated to have both? For instance; >> >> i) You are allowed to use a *single* fused_type on a *function* without >> declaration. >> >> def f(floating x, floating *y): # ok >> >> Turns into >> >> def f[floating T](T x, T *y): >> >> This is NOT ok: >> >> def f(floating x, integral y): >> # ERROR: Please explicitly declare fused types inside [] >> >> ii) Using more than one fused type, or using it on a cdef class or struct, >> you need to use the [] declaration. >> > > I don't think it would be too complicated, but as you mention it's > probably not a very likely case, and if the user does need it, a new > (equivalent) fused type can be created. The current way reads a lot > nicer than the indexed one in my opinion. So I'd be fine with > implementing it, but I find the current way more elegant. I was actually thinking of exactly the same thing--supporting syntax (i) for the case of a single type parameter, but the drawback is the introduction of two distinct syntaxes for essentially the same feature. Something like this is necessary to give an ordering to the types for structs and classes, or when a fused type is used for intermediate results but not in the argument list. I really like the elegance of the (much more common) single-parameter variant. Another option is using the with syntax, which was also considered for supporting C++ templates. >> Finally: It is a bit uncomfortable that we seem to be hashing things out >> even as Mark is implementing this. Would it be feasible to have a Skype >> session sometimes this week where everybody interested in the outcome of >> this come together for an hour and actually decide on something? >> >> Mark: How much does this discussion of syntax impact your development? 
Are >> you able to treat them just as polish on top and work on the "engine" >> undisturbed by this? > Thanks for your consideration, I admit it feels a bit uncomfortable > :) But at least this change shouldn't have such a big impact on the > code, it would mean some changes in a select few places, so it's > definitely polish. In any event, before we settle on this, I'd like to > do the cpdef support first and work on indexing from Python space, so > I think we have enough time to settle this argument on the ML. > Before that, I'm just going to finish up a pull request for the > OpenMP branch, I'd like to see if I can get rid of some warnings. Yes, please feel free to focus on the back end and move on to other things while the syntax is still in limbo, rather than implementing every whim of the mailing list :). - Robert ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
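To make the difference concrete, under the current "each fused type resolves to a single type per function" rule the dot_product example expands to matching-pair specializations only, roughly like this plain-C sketch (names are illustrative, not Cython's actual name mangling):

    double dot_product_float(float *u, float *v, int length)
    {
        double s = 0.0;
        int i;
        for (i = 0; i < length; i++)
            s += (double)u[i] * (double)v[i];  /* accumulate in double */
        return s;
    }

    double dot_product_double(double *u, double *v, int length)
    {
        double s = 0.0;
        int i;
        for (i = 0; i < length; i++)
            s += u[i] * v[i];
        return s;
    }

    /* no float-double or double-float variant is generated, which is the
       limitation Greg points out */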
Re: [Cython] jenkins problems
On Wed, 2011-05-04 at 10:35 +0400, Vitja Makarov wrote: > > Can you please provide me jenkins account and I'll try to fix the issues > > myself? > > > > It's better to use: > > $ git fetch origin > $ git checkout -f origin/master > > Instead of git pull Or $ git fetch origin $ git reset --hard origin/master which is what we used for our buildbot. -- Sincerely yours, Yury V. Zaytsev ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] jenkins problems
2011/5/4 Yury V. Zaytsev : > On Wed, 2011-05-04 at 10:35 +0400, Vitja Makarov wrote: >> > Can you please provide me jenkins account and I'll try to fix the issues >> > myself? >> > >> >> It's better to use: >> >> $ git fetch origin >> $ git checkout -f origin/master >> >> Instead of git pull > > Or > > $ git fetch origin > $ git reset --hard origin/master > > which is what we used for our buildbot. > > -- > Sincerely yours, > Yury V. Zaytsev > Thanks! Am I right: when you do reset '--hard origin/master' you are on the master branch and when you do checkout you are in a 'detached state'? But it seems to me that the problem is somewhere in the jenkins configuration. -- vitja. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] jenkins problems
On Wed, 2011-05-04 at 22:43 +0400, Vitja Makarov wrote: > Thanks! Am I right: when you do reset '--hard origin/master' you are > on the master branch and when you do checkout you are in a 'detached > state'? Yes, I think that you are right, that's why we used to do reset instead: $ git fetch origin $ git checkout master $ git reset --hard origin/master By the way, you can also do $ git clean -dfx to make sure that EVERYTHING that doesn't belong to the tree is plainly wiped out (don't do that on your real checkouts unless you definitely have nothing to lose). > But it seems to me that the problem is somewhere in the jenkins configuration. I didn't mean to say that there's no problem with Jenkins, just wanted to suggest a possibly better way of updating the CI checkout :-) -- Sincerely yours, Yury V. Zaytsev ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] jenkins problems
Vitja Makarov, 04.05.2011 07:09: Jenkins doesn't work for me. It seems that it can't do pull and is running tests again obsolete sources. May be because of forced push. There are only 6 errors here: https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/ According to the build logs of the sdist job, it was previously checking out the "master" branch and it seems you reconfigured it to use the "unreachable_code" branch now. At least the recent checkouts have used the latest snapshot of the branches, so ISTM that everything is working correctly. Could you point me to a build where something was going wrong for you? Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 05/04/2011 08:07 PM, mark florisson wrote: On 4 May 2011 19:44, Dag Sverre Seljebotn wrote: On 05/04/2011 07:03 PM, mark florisson wrote: On 4 May 2011 18:35, Dag Sverre Seljebotn wrote: Moving pull requestion discussion (https://github.com/cython/cython/pull/28) over here: First, I got curious why you'd have a strip off "-pthread" from CC. I'd think you could just execute with it with "-pthread", which seems simpler. It needs to end up in a list of arguments, and it's not needed at all as I only need the version. I guess I could do (cc + " -v").split() but eh. OK, that's reassuring, thought perhaps you had encountered a strange gcc strain. Second: If parallel.parallel is not callable, how are scheduling parameters for parallel blocks handled? Is there a reason to not support that? Do you think it should stay this way, or will parallel take parameters in the future? Well, as I mentioned a while back, you cannot schedule parallel blocks, there is no worksharing involved. All a parallel block does is executed a code block in however many threads there are available. The scheduling parameters are valid for a worksharing for loop only, as you schedule (read "distribute") the work among the threads. Perhaps I used the wrong terms; but checking the specs, I guess I meant "num_threads", which definitely applies to parallel. Ah, that level of scheduling :) Right, so it doesn't take that, but I don't think it's a big issue. If dynamic scheduling is enabled, it's only a suggestion, if dynamic scheduling is disabled (whether it's turned on or off by default is implementation defined) it will give the the amount of threads requested, if available. The user can still use omp_set_num_threads(), although admittedly that modifies a global setting. Hmm...I'm not completely happy about this. For now I just worry about not shutting off the possibility of adding thread-pool-spawning parameters in the future. Specifying the number of threads can be useful, and omp_set_num_threads is a bad way of doing as you say. And other backends than OpenMP may call for something we don't know what is yet? Anyway, all I'm asking is whether we should require trailing () on parallel: with nogil, parallel(): ... I think we should, to keep the window open for options. Unless, that is, we're OK both with and without trailing () down the line. Dag Sverre ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] jenkins problems
2011/5/4 Stefan Behnel : > Vitja Makarov, 04.05.2011 07:09: >> >> Jenkins doesn't work for me. It seems that it can't do pull and is >> running tests again obsolete sources. >> May be because of forced push. >> >> There are only 6 errors here: >> >> https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/ > > According to the build logs of the sdist job, it was previously checking out > the "master" branch and it seems you reconfigured it to use the > "unreachable_code" branch now. At least the recent checkouts have used the > latest snapshot of the branches, so ISTM that everything is working > correctly. Could you point me to a build where something was going wrong for > you? > > Stefan I've added the following line to sdist target +rm -fr $WORKSPACE/dist $WORKSPACE/python/bin/python setup.py clean sdist --formats=gztar --cython-profile --no-cython-compile Hope that should help, that's the only difference between cython-devel-sdist and cython-vitek-sdist. See here: https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/166/ You can't find w_unreachable in the logs, it seems that cython code there is outdated. -- vitja. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Fused Types
On 05/04/2011 08:13 PM, Robert Bradshaw wrote: On Wed, May 4, 2011 at 1:47 AM, mark florisson wrote: On 4 May 2011 10:24, Dag Sverre Seljebotn wrote: On 05/04/2011 01:07 AM, Greg Ewing wrote: mark florisson wrote: cdef func(floating x, floating y): ... you get a "float, float" version, and a "double, double" version, but not "float, double" or "double, float". It's hard to draw conclusions from this example because it's degenerate. You don't really need multiple versions of a function like that, because of float<-> double coercions. A more telling example might be cdef double dot_product(floating *u, floating *v, int length) By your current rules, this would give you one version that takes two float vectors, and another that takes two double vectors. But if you want to find the dot product of a float vector and a double vector, you're out of luck. First, I'm open for your proposed syntax too...But in the interest of seeing how we got here: The argument to the above goes that you *should* be out of luck. For instance, talking about dot products, BLAS itself has float-float and double-double, but not float-double AFAIK. What you are saying that this does not have the full power of C++ templates. And the answer is that yes, this does not have the full power of C++ templates. At the same time we discussed this, we also discussed better support for string-based templating languages (so that, e.g., compilation error messages could refer to the template file). The two are complementary. Going back to Greg's syntax: What I don't like is that it makes the simple unambiguous cases, where this would actually be used in real life, less readable. Would it be too complicated to have both? For instance; i) You are allowed to use a *single* fused_type on a *function* without declaration. def f(floating x, floating *y): # ok Turns into def f[floating T](T x, T *y): This is NOT ok: def f(floating x, integral y): # ERROR: Please explicitly declare fused types inside [] ii) Using more than one fused type, or using it on a cdef class or struct, you need to use the [] declaration. I don't think it would be too complicated, but as you mention it's probably not a very likely case, and if the user does need it, a new (equivalent) fused type can be created. The current way reads a lot nicer than the indexed one in my opinion. So I'd be fine with implementing it, but I find the current way more elegant. I was actually thinking of exactly the same thing--supporting syntax (i) for the case of a single type parameter, but the drawback is the introduction of two distinct syntaxes for essentially the same feature. Something like this is necessary to give an ordering to the types for structs and classes, or when a fused type is used for intermediate results but not in the argument list. I really like the elegance of the (much more common) single-parameter variant. Another option is using the with syntax, which was also considered for supporting C++ templates. In particular since that will work in pure Python mode. One thing I worry about with the func[]()-syntax is that it is not Python compatible. That's one thing I like about the CEP, that in time we can do def f(x: floating) -> floating: ... and have something that's nice in both Python and Cython. Dag Sverre ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] jenkins problems
Vitja Makarov, 04.05.2011 21:14: 2011/5/4 Stefan Behnel: Vitja Makarov, 04.05.2011 07:09: Jenkins doesn't work for me. It seems that it can't do pull and is running tests again obsolete sources. May be because of forced push. There are only 6 errors here: https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/ According to the build logs of the sdist job, it was previously checking out the "master" branch and it seems you reconfigured it to use the "unreachable_code" branch now. At least the recent checkouts have used the latest snapshot of the branches, so ISTM that everything is working correctly. Could you point me to a build where something was going wrong for you? I've added the following line to sdist target +rm -fr $WORKSPACE/dist $WORKSPACE/python/bin/python setup.py clean sdist --formats=gztar --cython-profile --no-cython-compile Hope that should help, that's the only difference between cython-devel-sdist and cython-vitek-sdist. See here: https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/166/ You can't find w_unreachable in the logs, it seems that cython code there is outdated. Ah, right, that's it. You can see the problem here: https://sage.math.washington.edu:8091/hudson/job/cython-vitek-sdist/97/artifact/dist/Cython-0.14+.tar.gz/*fingerprint*/ It's been using the 0.14+ sdist for ages instead of the 0.14.1+ one. That could also explain why your CPython regression tests are running much faster than in cython-devel. Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] prange CEP updated
On 4 May 2011 21:13, Dag Sverre Seljebotn wrote: > On 05/04/2011 08:07 PM, mark florisson wrote: >> >> On 4 May 2011 19:44, Dag Sverre Seljebotn >> wrote: >>> >>> On 05/04/2011 07:03 PM, mark florisson wrote: On 4 May 2011 18:35, Dag Sverre Seljebotn wrote: > > Moving pull requestion discussion > (https://github.com/cython/cython/pull/28) > over here: > > First, I got curious why you'd have a strip off "-pthread" from CC. I'd > think you could just execute with it with "-pthread", which seems > simpler. It needs to end up in a list of arguments, and it's not needed at all as I only need the version. I guess I could do (cc + " -v").split() but eh. >>> >>> OK, that's reassuring, thought perhaps you had encountered a strange gcc >>> strain. >>> > Second: If parallel.parallel is not callable, how are scheduling > parameters > for parallel blocks handled? Is there a reason to not support that? Do > you > think it should stay this way, or will parallel take parameters in the > future? Well, as I mentioned a while back, you cannot schedule parallel blocks, there is no worksharing involved. All a parallel block does is executed a code block in however many threads there are available. The scheduling parameters are valid for a worksharing for loop only, as you schedule (read "distribute") the work among the threads. >>> >>> Perhaps I used the wrong terms; but checking the specs, I guess I meant >>> "num_threads", which definitely applies to parallel. >> >> Ah, that level of scheduling :) Right, so it doesn't take that, but I >> don't think it's a big issue. If dynamic scheduling is enabled, it's >> only a suggestion, if dynamic scheduling is disabled (whether it's >> turned on or off by default is implementation defined) it will give >> the the amount of threads requested, if available. >> The user can still use omp_set_num_threads(), although admittedly that >> modifies a global setting. > > Hmm...I'm not completely happy about this. For now I just worry about not > shutting off the possibility of adding thread-pool-spawning parameters in > the future. Specifying the number of threads can be useful, and > omp_set_num_threads is a bad way of doing as you say. > > And other backends than OpenMP may call for something we don't know what is > yet? > > Anyway, all I'm asking is whether we should require trailing () on parallel: > > with nogil, parallel(): ... > > I think we should, to keep the window open for options. Unless, that is, > we're OK both with and without trailing () down the line. Ok, sure, that's fine with me. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] jenkins problems
2011/5/4 Stefan Behnel : > Vitja Makarov, 04.05.2011 21:14: >> >> 2011/5/4 Stefan Behnel: >>> >>> Vitja Makarov, 04.05.2011 07:09: Jenkins doesn't work for me. It seems that it can't do pull and is running tests again obsolete sources. May be because of forced push. There are only 6 errors here: https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/ >>> >>> According to the build logs of the sdist job, it was previously checking >>> out >>> the "master" branch and it seems you reconfigured it to use the >>> "unreachable_code" branch now. At least the recent checkouts have used >>> the >>> latest snapshot of the branches, so ISTM that everything is working >>> correctly. Could you point me to a build where something was going wrong >>> for >>> you? >> >> I've added the following line to sdist target >> >> +rm -fr $WORKSPACE/dist >> $WORKSPACE/python/bin/python setup.py clean sdist --formats=gztar >> --cython-profile --no-cython-compile >> >> Hope that should help, that's the only difference between >> cython-devel-sdist and cython-vitek-sdist. >> >> >> See here: >> >> https://sage.math.washington.edu:8091/hudson/view/cython-vitek/job/cython-vitek-tests-py27-c/166/ >> >> You can't find w_unreachable in the logs, it seems that cython code >> there is outdated. > > Ah, right, that's it. You can see the problem here: > > https://sage.math.washington.edu:8091/hudson/job/cython-vitek-sdist/97/artifact/dist/Cython-0.14+.tar.gz/*fingerprint*/ > > It's been using the 0.14+ sdist for ages instead of the 0.14.1+ one. > > That could also explain why your CPython regression tests are running much > faster than in cython-devel. > Ok, so I should take a look at pyregr tests closer. -- vitja. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Fused Types
Dag Sverre Seljebotn wrote: The argument to the above goes that you *should* be out of luck. For instance, talking about dot products, BLAS itself has float-float and double-double, but not float-double AFAIK. Seems to me that's more because generating lots of versions of a function in C is hard work, and the designers of BLAS didn't think it was worth providing more than two versions. If they'd had a tool that would magically generate all the combinations for them, they might have made a different choice. What you seem to be trying to do here is enable compile-time duck typing, so that you can write a function that "just works" with a variety of argument types, without having to think about the details. With that mindset, seeing a function declared as cdef func(floating x, floating y) one would expect that x and y could be independently chosen as any of the types classified as "floating", because that's the way duck typing usually works. For example, if a Python function is documented as taking two sequences, you expect that to mean *any* two sequences, not two sequences of the same type. What you are saying that this does not have the full power of C++ templates. And the answer is that yes, this does not have the full power of C++ templates. What I'm suggesting doesn't have the full power of C++ templates either, because the range of possible values for each type parameter would still have to be specified in advance. However, it makes the dependencies between the type parameters explicit, rather than being hidden in some rather unintuitive implicit rules. Would it be feasible to have a Skype session sometimes this week where everybody interested in the outcome of this come together for an hour and actually decide on something? I'm not sure that would help much. Reaching good decisions about things like this requires time to think through all the issues. -- Greg ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
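Concretely, under the independent-parameter reading described here, mixed variants such as the following would be generated as well (illustrative plain C with a hypothetical name, complementing the matching-pair sketch earlier in the thread):

    double dot_product_float_double(float *u, double *v, int length)
    {
        double s = 0.0;
        int i;
        for (i = 0; i < length; i++)
            s += (double)u[i] * v[i];  /* mixed float and double inputs */
        return s;
    }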
Re: [Cython] Hudson pyregr testing takes too long
2011/4/25 Vitja Makarov : > 2011/4/25 Stefan Behnel : >> Vitja Makarov, 25.04.2011 11:04: >>> >>> 2011/4/25 Stefan Behnel: Vitja Makarov, 25.04.2011 08:19: > > 2011/4/25 Stefan Behnel: >> >> Stefan Behnel, 07.04.2011 13:52: >>> >>> Stefan Behnel, 07.04.2011 13:46: I just noticed that the CPython pyregr tests have jumped up from ~14 minutes for a run to ~4 hours when we added generator support. https://sage.math.washington.edu:8091/hudson/job/cython-devel-tests-pyregr-py26-c/buildTimeTrend I currently have no idea why that is (well, it's likely because we compile more tests now, but Vitja's branch ran the tests in ~30 minutes). It would be great if someone could find the time to analyse this problem. The current run time makes it basically impossible to keep these tests enabled. >>> >>> Ok, it looks like this is mostly an issue with the Py2.6 tests. The >>> Py2.7 >>> tests take 30-45 minutes, which is very long, but not completely out >>> of >>> bounds. I've disabled the Py2.6 pyregr tests for now. >> >> There seems to be a huge memory leak which almost certainly accounts >> for >> this. The Python process that runs the pyregr suite ends up with about >> 50GB >> of memory at the end, also in the latest Py3k builds. >> >> I have no idea where it may be, but it started to show when we merged >> the >> generator support. That's where I noticed the instant jump in the >> runtime. > > That's very strange for my branch it takes about 30 minutes that is ok. There's also a second path that's worth investigating. As part of the merge, there was another change that came in: the CythonPyregrTestCase implementation. This means that the regression tests are now being run differently than before. The massive memory consumption may simply be due to the mass of unit tests being loaded into memory. >>> >>> def run_test(): >>> .. >>> try: >>> module = __import__(self.module) >>> if hasattr(module, 'test_main'): >>> module.test_main() >>> except (unittest.SkipTest, support.ResourceDenied): >>> result.addSkip(self, 'ok') >>> >>> >>> It seems that all the modules stay loaded so may be they should be >>> unloaded with del sys.modules[module_name]? >> >> (Binary) module unloading isn't really supported in CPython. There's PEP >> 3121 that has the potential to change it, but it's not completely >> implemented, neither in CPython nor in Cython. A major problem is that >> unloading a module deletes its globals but not necessarily the code that >> uses them. For example, instances of types defined in the module can still >> be alive at that point. >> >> The way runtests.py deals with this is forking before loading a module. >> However, this does not currently work with the "xmlrunner" which we use on >> Hudson, so we let all tests run in a single process there. >> > > > Btw when running plain python code with generators total ref counter > doesn't get back to initial value. > I tried to trace scope and generator destructors and they are run as > expected. So I'm not sure about leaks in generators. > Recently I've found that pyregr.test_dict (test_mutatingiteration) test makes it slow: def test_mutatingiteration(): d = {} d[1] = 1 for i in d: print i d[i+1] = 1 test_mutatingiteration() In CPython this code raises: RuntimeError: dictionary changed size during iteration And in Cython you have infinite loop. So we can disable this test for now. -- vitja. ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel