Re: [Cython] Test runner

2011-04-13 Thread mark florisson
Another, different but related issue: how can we get useful output
from the test runner? e.g. I'm running my test with a
'@cython.test_assert_path_exists("...")' and I get this error output:

======================================================================
ERROR: runTest (__main__.CythonRunTestCase)
compiling (c) and running parallel
----------------------------------------------------------------------
Traceback (most recent call last):
  File "runtests.py", line 555, in run
    self.runCompileTest()
  File "runtests.py", line 386, in runCompileTest
    self.test_directory, self.expect_errors, self.annotate)
  File "runtests.py", line 532, in compile
    self.assertEquals(None, unexpected_error)
AssertionError: None != u'9:0: Compiler crash in TreeAssertVisitor'

So I'm seeing a traceback from the test runner (which I'm not really
interested in :), but the actual traceback is not displayed.

Can I also specify special link and compiler flags for a certain test,
like in http://wiki.cython.org/enhancements/distutils_preprocessing ?
Or do I have to export LDFLAGS and CFLAGS in my environment?
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Test runner

2011-04-13 Thread Robert Bradshaw
On Wed, Apr 13, 2011 at 4:07 AM, mark florisson
 wrote:
> Another, different but related issue: how can we get useful output
> from the test runner? e.g. I'm running my test with a
> '@cython.test_assert_path_exists("...")' and I get this error output:
>
> ======================================================================
> ERROR: runTest (__main__.CythonRunTestCase)
> compiling (c) and running parallel
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>  File "runtests.py", line 555, in run
>    self.runCompileTest()
>  File "runtests.py", line 386, in runCompileTest
>    self.test_directory, self.expect_errors, self.annotate)
>  File "runtests.py", line 532, in compile
>    self.assertEquals(None, unexpected_error)
> AssertionError: None != u'9:0: Compiler crash in TreeAssertVisitor'
>
> So I'm seeing a traceback from the test runner (which I'm not really
> interested in :), but the actual traceback is not displayed.

I agree this could be improved, but I'm not sure the best way to do it.
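One possible direction (a hedged sketch only; the helper name and the `compile_traceback` parameter are assumptions, not the real runtests.py API) would be to capture the compiler's own traceback at the point of the crash and embed it in the unittest failure message:

```python
import traceback

# Hypothetical helper: build a failure message that carries the captured
# compiler traceback along with the unittest-level assertion text.
def format_compile_failure(unexpected_error, compile_traceback=None):
    msg = "unexpected compiler output: %r" % (unexpected_error,)
    if compile_traceback:
        msg += "\n--- compiler traceback ---\n" + compile_traceback
    return msg

# The runner would capture the traceback where the compiler crashes:
try:
    raise RuntimeError("Compiler crash in TreeAssertVisitor")
except RuntimeError:
    captured = traceback.format_exc()

print(format_compile_failure('9:0: Compiler crash in TreeAssertVisitor', captured))
```

That way the assertion text Mark quoted would be followed by the actual crash traceback instead of only the runner's frames.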

> Can I also specify special link and compiler flags for a certain test,
> like in http://wiki.cython.org/enhancements/distutils_preprocessing ?
> Or do I have to export LDFLAGS and CFLAGS in my environment?

You can't right now, but it would probably be worth adding. I'm not
sure how we would handle missing dependencies in that case though.
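The wiki page linked above proposes per-file directive comments; a hypothetical sketch of what such a header, and a parser for it, could look like (none of this exists in runtests.py today):

```python
# Illustrative per-test header in the style of the distutils_preprocessing
# proposal; the exact tag names are assumptions.
SAMPLE_TEST_HEADER = """\
# distutils: extra_compile_args = -fopenmp
# distutils: extra_link_args = -fopenmp
"""

def parse_distutils_tags(source):
    """Collect '# distutils: key = value ...' comment tags from a test file."""
    tags = {}
    for line in source.splitlines():
        line = line.strip()
        if line.startswith("# distutils:") and "=" in line:
            key, _, value = line[len("# distutils:"):].partition("=")
            tags[key.strip()] = value.split()
    return tags

print(parse_distutils_tags(SAMPLE_TEST_HEADER))
```

The runner could then merge these per-test flags into the distutils Extension it builds, instead of relying on environment-wide CFLAGS/LDFLAGS.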

- Robert


Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.

2011-04-13 Thread Stefan Behnel

Robert Bradshaw, 12.04.2011 22:42:

On Tue, Apr 12, 2011 at 11:22 AM, Stefan Behnel wrote:

Arthur de Souza Ribeiro, 12.04.2011 14:59:

The code is in this repository:
https://github.com/arthursribeiro/JSON-module. Your feedback would be very
important, so that I can improve my skills and be ready to start working
on the project sooner.


I'd strongly suggest implementing this in pure Python (.py files instead of
.pyx files), with externally provided static types for performance. A single
code base is very advantageous for a large project like CPython, much more
than the ultimate 5% better performance.
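Stefan's suggestion can be illustrated with a hedged sketch (the module and function names here are made up): the source stays plain, importable Python, while a companion .pxd file would supply the static types when the same file is compiled by Cython.

```python
try:
    import cython  # available when building with Cython
except ImportError:
    cython = None  # plain CPython: the module still runs, just untyped

def scan_number(s, pos):
    """Scan a run of digits starting at pos; returns (digits, new_pos)."""
    # A companion scanner.pxd could declare, for example:
    #     cpdef scan_number(unicode s, Py_ssize_t pos)
    # so this very same .py source compiles to typed C code.
    end = pos
    while end < len(s) and s[end].isdigit():
        end += 1
    return s[pos:end], end
```

The single code base then serves both interpreters: CPython imports the .py directly, and a Cython build picks up the external declarations for speed.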


While this is advantageous for the final product, it may not be the
easiest to get up and running with.


Agreed. Arthur, it's fine if you write Cython code in a .pyx file to get 
started. You can just extract the declarations later.


Stefan


Re: [Cython] prange CEP updated

2011-04-13 Thread mark florisson
On 5 April 2011 22:29, Dag Sverre Seljebotn  wrote:
> I've done a pretty major revision to the prange CEP, bringing in a lot of
> the feedback.
>
> Thread-private variables are now split in two cases:
>
>  i) The safe cases, which really require very little technical knowledge ->
> automatically inferred
>
>  ii) As an advanced feature, unsafe cases that require some knowledge of
> threading -> must be explicitly declared
>
> I think this split simplifies things a great deal.
>
> I'm rather excited over this now; this could turn out to be a really
> user-friendly and safe feature that would not only allow us to support
> OpenMP-like threading, but be more convenient to use in a range of common
> cases.
>
> http://wiki.cython.org/enhancements/prange
>
> Dag Sverre
>
> ___
> cython-devel mailing list
> cython-devel@python.org
> http://mail.python.org/mailman/listinfo/cython-devel
>
>

If we want to support cython.parallel.threadsavailable outside of
parallel regions (which does not depend on the schedule used for
worksharing constructs!), then we have to disable dynamic scheduling.
For instance, if OpenMP sees some OpenMP threads are already busy,
then with dynamic scheduling it dynamically establishes how many
threads to use for any parallel region.
So basically, if you put omp_get_num_threads() in a parallel region,
you have a race when you depend on that result in a subsequent
parallel region, because the number of busy OpenMP threads may have
changed.

So basically, to make threadsavailable() work outside parallel
regions, we'd have to disable dynamic scheduling (omp_set_dynamic(0)).
Of course, when OpenMP cannot request the amount of threads desired
(because they are bounded by a configurable thread limit (and the OS
of course)), the behaviour will be implementation defined. So then we
could just put a warning in the docs for that, and users can check for
this in the parallel region using threadsavailable() if it's really
important.

Does that sound like a good idea? And should I update the CEP?


Re: [Cython] prange CEP updated

2011-04-13 Thread Dag Sverre Seljebotn

On 04/13/2011 09:31 PM, mark florisson wrote:

On 5 April 2011 22:29, Dag Sverre Seljebotn  wrote:

I've done a pretty major revision to the prange CEP, bringing in a lot of
the feedback.

Thread-private variables are now split in two cases:

  i) The safe cases, which really require very little technical knowledge ->
automatically inferred

  ii) As an advanced feature, unsafe cases that require some knowledge of
threading ->  must be explicitly declared

I think this split simplifies things a great deal.

I'm rather excited over this now; this could turn out to be a really
user-friendly and safe feature that would not only allow us to support
OpenMP-like threading, but be more convenient to use in a range of common
cases.

http://wiki.cython.org/enhancements/prange

Dag Sverre

___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel




If we want to support cython.parallel.threadsavailable outside of
parallel regions (which does not depend on the schedule used for
worksharing constructs!), then we have to disable dynamic scheduling.
For instance, if OpenMP sees some OpenMP threads are already busy,
then with dynamic scheduling it dynamically establishes how many
threads to use for any parallel region.
So basically, if you put omp_get_num_threads() in a parallel region,
you have a race when you depend on that result in a subsequent
parallel region, because the number of busy OpenMP threads may have
changed.


Ah, I don't know why I thought there wouldn't be a race condition. I 
wonder if the whole threadsavailable() idea should just be ditched and 
that we should think of something else. It's not a very common use case. 
Starting to disable some forms of scheduling just to, essentially, 
shoehorn in one particular syntax, doesn't seem like the way to go.


Perhaps this calls for support for the critical(?) block then, after 
all. I'm at least +1 on dropping threadsavailable() and instead require 
that you call numthreads() in a critical block:


with parallel:
    with critical:
        # call numthreads() and allocate global buffer
        # calling threadid() not allowed, if we can manage that
    # get buffer slice for each thread


So basically, to make threadsavailable() work outside parallel
regions, we'd have to disable dynamic scheduling (omp_set_dynamic(0)).
Of course, when OpenMP cannot request the amount of threads desired
(because they are bounded by a configurable thread limit (and the OS
of course)), the behaviour will be implementation defined. So then we
could just put a warning in the docs for that, and users can check for
this in the parallel region using threadsavailable() if it's really
important.


Do you have any experience with what actually happens with, say, GNU 
OpenMP? I blindly assumed from the specs that it was an error condition 
("flag an error any way you like"), but I guess that may be wrong.


Just curious, I think we can just fall back to OpenMP behaviour; unless 
it terminates the interpreter in an error condition, in which case we 
should look into how expensive it is to check for the condition up front...



Dag Sverre



Re: [Cython] prange CEP updated

2011-04-13 Thread mark florisson
On 13 April 2011 21:57, Dag Sverre Seljebotn  wrote:
> On 04/13/2011 09:31 PM, mark florisson wrote:
>>
>> On 5 April 2011 22:29, Dag Sverre Seljebotn
>>  wrote:
>>>
>>> I've done a pretty major revision to the prange CEP, bringing in a lot of
>>> the feedback.
>>>
>>> Thread-private variables are now split in two cases:
>>>
>>>  i) The safe cases, which really require very little technical knowledge
>>> ->
>>> automatically inferred
>>>
>>>  ii) As an advanced feature, unsafe cases that require some knowledge of
>>> threading ->  must be explicitly declared
>>>
>>> I think this split simplifies things a great deal.
>>>
>>> I'm rather excited over this now; this could turn out to be a really
>>> user-friendly and safe feature that would not only allow us to support
>>> OpenMP-like threading, but be more convenient to use in a range of common
>>> cases.
>>>
>>> http://wiki.cython.org/enhancements/prange
>>>
>>> Dag Sverre
>>>
>>> ___
>>> cython-devel mailing list
>>> cython-devel@python.org
>>> http://mail.python.org/mailman/listinfo/cython-devel
>>>
>>>
>>
>> If we want to support cython.parallel.threadsavailable outside of
>> parallel regions (which does not depend on the schedule used for
>> worksharing constructs!), then we have to disable dynamic scheduling.
>> For instance, if OpenMP sees some OpenMP threads are already busy,
>> then with dynamic scheduling it dynamically establishes how many
>> threads to use for any parallel region.
>> So basically, if you put omp_get_num_threads() in a parallel region,
>> you have a race when you depend on that result in a subsequent
>> parallel region, because the number of busy OpenMP threads may have
>> changed.
>
> Ah, I don't know why I thought there wouldn't be a race condition. I wonder
> if the whole threadsavailable() idea should just be ditched and that we
> should think of something else. It's not a very common use case. Starting to
> disable some forms of scheduling just to, essentially, shoehorn in one
> particular syntax, doesn't seem like the way to go.
>
> Perhaps this calls for support for the critical(?) block then, after all.
> I'm at least +1 on dropping threadsavailable() and instead require that you
> call numthreads() in a critical block:
>
> with parallel:
>    with critical:
>        # call numthreads() and allocate global buffer
>        # calling threadid() not allowed, if we can manage that
>    # get buffer slice for each thread

In that case I think you'd want single + a barrier. 'critical' means
that all threads execute the section, but exclusively. I think you
usually want to allocate either a shared worksharing buffer, or a
private thread-local buffer. In the former case you can allocate your
buffer outside any parallel section, in the latter case within the
parallel section. In the latter case the buffer will just not be
available outside of the parallel section.

We can still support any write-back to shared variables that are
explicitly declared later on (supposing we'd also support single and
barriers). Then the code would read as follows:

cdef shared(void *) buf
cdef void *localbuf

with nogil, parallel:
    with single:
        buf = malloc(n * numthreads())

    barrier()

    localbuf = buf + n * threadid()

# localbuf undefined here
# buf is well-defined here

However, I don't believe it's very common to want to use private
buffers after the loop. If you have a buffer in terms of your loop
size, you want it shared, but I can't imagine a case where you want to
examine buffers that were allocated specifically for each thread after
the parallel section. So I'm +1 on dropping threadsavailable outside
parallel sections, but currently -1 on supporting this case, because
we can solve it later on with support for explicitly declared
variables + single + barriers.

>> So basically, to make threadsavailable() work outside parallel
>> regions, we'd have to disable dynamic scheduling (omp_set_dynamic(0)).
>> Of course, when OpenMP cannot request the amount of threads desired
>> (because they are bounded by a configurable thread limit (and the OS
>> of course)), the behaviour will be implementation defined. So then we
>> could just put a warning in the docs for that, and users can check for
>> this in the parallel region using threadsavailable() if it's really
>> important.
>
> Do you have any experience with what actually happens with, say, GNU OpenMP?
> I blindly assumed from the specs that it was an error condition ("flag an
> error any way you like"), but I guess that may be wrong.
>
> Just curious, I think we can just fall back to OpenMP behaviour; unless it
> terminates the interpreter in an error condition, in which case we should
> look into how expensive it is to check for the condition up front...

With libgomp you just get the maximum amount of available threads, up
to the number requested. So this code

#include <stdio.h>
#include <omp.h>

int main(void) {
    printf("The thr

Re: [Cython] prange CEP updated

2011-04-13 Thread mark florisson
On 13 April 2011 22:53, mark florisson  wrote:
> On 13 April 2011 21:57, Dag Sverre Seljebotn  
> wrote:
>> On 04/13/2011 09:31 PM, mark florisson wrote:
>>>
>>> On 5 April 2011 22:29, Dag Sverre Seljebotn
>>>  wrote:

>>>> I've done a pretty major revision to the prange CEP, bringing in a lot of
>>>> the feedback.
>>>>
>>>> Thread-private variables are now split in two cases:
>>>>
>>>>  i) The safe cases, which really require very little technical knowledge
>>>> ->
>>>> automatically inferred
>>>>
>>>>  ii) As an advanced feature, unsafe cases that require some knowledge of
>>>> threading ->  must be explicitly declared
>>>>
>>>> I think this split simplifies things a great deal.
>>>>
>>>> I'm rather excited over this now; this could turn out to be a really
>>>> user-friendly and safe feature that would not only allow us to support
>>>> OpenMP-like threading, but be more convenient to use in a range of common
>>>> cases.
>>>>
>>>> http://wiki.cython.org/enhancements/prange
>>>>
>>>> Dag Sverre
>>>>
>>>> ___
>>>> cython-devel mailing list
>>>> cython-devel@python.org
>>>> http://mail.python.org/mailman/listinfo/cython-devel
>>>>

>>>
>>> If we want to support cython.parallel.threadsavailable outside of
>>> parallel regions (which does not depend on the schedule used for
>>> worksharing constructs!), then we have to disable dynamic scheduling.
>>> For instance, if OpenMP sees some OpenMP threads are already busy,
>>> then with dynamic scheduling it dynamically establishes how many
>>> threads to use for any parallel region.
>>> So basically, if you put omp_get_num_threads() in a parallel region,
>>> you have a race when you depend on that result in a subsequent
>>> parallel region, because the number of busy OpenMP threads may have
>>> changed.
>>
>> Ah, I don't know why I thought there wouldn't be a race condition. I wonder
>> if the whole threadsavailable() idea should just be ditched and that we
>> should think of something else. It's not a very common use case. Starting to
>> disable some forms of scheduling just to, essentially, shoehorn in one
>> particular syntax, doesn't seem like the way to go.
>>
>> Perhaps this calls for support for the critical(?) block then, after all.
>> I'm at least +1 on dropping threadsavailable() and instead require that you
>> call numthreads() in a critical block:
>>
>> with parallel:
>>    with critical:
>>        # call numthreads() and allocate global buffer
>>        # calling threadid() not allowed, if we can manage that
>>    # get buffer slice for each thread
>
> In that case I think you'd want single + a barrier. 'critical' means
> that all threads execute the section, but exclusively. I think you
> usually want to allocate either a shared worksharing buffer, or a
> private thread-local buffer. In the former case you can allocate your
> buffer outside any parallel section, in the latter case within the
> parallel section. In the latter case the buffer will just not be
> available outside of the parallel section.
>
> We can still support any write-back to shared variables that are
> explicitly declared later on (supposing we'd also support single and
> barriers). Then the code would read as follows:
>
> cdef shared(void *) buf
> cdef void *localbuf
>
> with nogil, parallel:
>    with single:
>        buf = malloc(n * numthreads())
>
>    barrier()
>
>    localbuf = buf + n * threadid()
>    
>
> # localbuf undefined here
> # buf is well-defined here
>
> However, I don't believe it's very common to want to use private
> buffers after the loop. If you have a buffer in terms of your loop
> size, you want it shared, but I can't imagine a case where you want to
> examine buffers that were allocated specifically for each thread after
> the parallel section. So I'm +1 on dropping threadsavailable outside
> parallel sections, but currently -1 on supporting this case, because
> we can solve it later on with support for explicitly declared
> variables + single + barriers.
>
>>> So basically, to make threadsavailable() work outside parallel
>>> regions, we'd have to disable dynamic scheduling (omp_set_dynamic(0)).
>>> Of course, when OpenMP cannot request the amount of threads desired
>>> (because they are bounded by a configurable thread limit (and the OS
>>> of course)), the behaviour will be implementation defined. So then we
>>> could just put a warning in the docs for that, and users can check for
>>> this in the parallel region using threadsavailable() if it's really
>>> important.
>>
>> Do you have any experience with what actually happens with, say, GNU OpenMP?
>> I blindly assumed from the specs that it was an error condition ("flag an
>> error any way you like"), but I guess that may be wrong.
>>
>> Just curious, I think we can just fall back to OpenMP behaviour; unless it
>> terminates the interpreter in an error condition, in which case we should
>> look into how expensive it is to check for the condition up front.