Re: [2/2] OpenACC routine support

2015-11-03 Thread Jakub Jelinek
On Tue, Nov 03, 2015 at 10:56:37AM -0500, Nathan Sidwell wrote:
> On 11/03/15 10:38, Jakub Jelinek wrote:
> >On Mon, Nov 02, 2015 at 02:23:19PM -0500, Nathan Sidwell wrote:
> >>Here are the tests for the routine support. The compiler tests check
> >>invalid combinations of gang, worker, vector & s

Re: [2/2] OpenACC routine support

2015-11-03 Thread Nathan Sidwell
On 11/03/15 10:38, Jakub Jelinek wrote:
> On Mon, Nov 02, 2015 at 02:23:19PM -0500, Nathan Sidwell wrote:
> > Here are the tests for the routine support. The compiler tests check
> > invalid combinations of gang, worker, vector & seq. The libgomp execution
> > tests check the expected partitioning occurs with

Re: [2/2] OpenACC routine support

2015-11-03 Thread Jakub Jelinek
On Mon, Nov 02, 2015 at 02:23:19PM -0500, Nathan Sidwell wrote:
> Here are the tests for the routine support. The compiler tests check
> invalid combinations of gang, worker, vector & seq. The libgomp execution
> tests check the expected partitioning occurs within loops. As with the
> reduction t

Re: [2/2] OpenACC routine support

2015-11-02 Thread Nathan Sidwell
On 11/02/15 14:41, Jakub Jelinek wrote:
> Does this work even with -O0? I mean, the assembler is invalid for any
> target other than PTX, so you are relying on aggressively folding this away.

Correct. As thread identification is inherently target-specific, I don't see how to do otherwise. We

Re: [2/2] OpenACC routine support

2015-11-02 Thread Jakub Jelinek
On Mon, Nov 02, 2015 at 02:23:19PM -0500, Nathan Sidwell wrote:
> +#pragma acc routine gang
> +void __attribute__ ((noinline)) gang (int ary[N])
> +{
> +#pragma acc loop gang
> +  for (unsigned ix = 0; ix < N; ix++)
> +    {
> +      if (__builtin_acc_on_device (5))
> +        {
> +          int g
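
For readers without the full patch, the shape of the quoted routine can be sketched on the host alone. This is not the test from the patch: `ON_DEVICE_SKETCH`, `gang_fill`, `count_mismatches`, `run_host_check` and the value of `N` are all hypothetical names chosen for illustration, and the device branch (which in the patch contains target-specific code) is replaced by a placeholder. Compiled without `-fopenacc`, the pragmas are ignored and the loop runs sequentially.

```c
#include <assert.h>

#define N 16                 /* hypothetical size; the real tests pick their own */
#define ON_DEVICE_SKETCH 0   /* hypothetical stand-in for the device query: host only */

/* Same outline as the quoted routine: a gang routine containing a gang loop.  */
#pragma acc routine gang
void gang_fill (int ary[N])
{
#pragma acc loop gang
  for (unsigned ix = 0; ix < N; ix++)
    {
      if (ON_DEVICE_SKETCH)
        ary[ix] = -1;        /* placeholder for the device-only branch */
      else
        ary[ix] = (int) ix;  /* host fallback: record the iteration index */
    }
}

/* Count elements that do not hold their own index.  */
int count_mismatches (const int ary[N])
{
  int bad = 0;
  for (unsigned ix = 0; ix < N; ix++)
    if (ary[ix] != (int) ix)
      bad++;
  return bad;
}

/* Fill an array and verify that every iteration ran exactly as expected.  */
void run_host_check (void)
{
  int ary[N] = { 0 };
  gang_fill (ary);
  assert (count_mismatches (ary) == 0);
}
```

Because the condition here is a literal constant, the device branch is trivially dead; the -O0 question raised in the thread concerns the real builtin and inline assembler, which this sketch does not exercise.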

Re: [2/2] OpenACC routine support

2015-11-02 Thread Nathan Sidwell
Here are the tests for the routine support. The compiler tests check invalid combinations of gang, worker, vector & seq. The libgomp execution tests check the expected partitioning occurs within loops. As with the reduction tests, these are taken from the execution model loop tests. ok