Hi!

On Mon, 18 Jan 2016 14:07:11 +0100, Tom de Vries <[email protected]> wrote:
> Add oacc_kernels_p argument to pass_parallelize_loops
> --- a/gcc/tree-parloops.c
> +++ b/gcc/tree-parloops.c
> @@ -2315,6 +2367,9 @@ gen_parallel_loop (struct loop *loop,
|   /* Ensure that the exit condition is the first statement in the loop.
|      The common case is that latch of the loop is empty (apart from the
|      increment) and immediately follows the loop exit test. Attempt to move the
|      entry of the loop directly before the exit check and increase the number of
|      iterations of the loop by one. */
|   if (try_transform_to_exit_first_loop_alt (loop, reduction_list, nit))
|     {
|       if (dump_file
|           && (dump_flags & TDF_DETAILS))
|         fprintf (dump_file,
|                  "alternative exit-first loop transform succeeded"
|                  " for loop %d\n", loop->num);
|     }
|   else
|     {
> +     if (oacc_kernels_p)
> +       n_threads = 1;
> +
|       /* Fall back on the method that handles more cases, but duplicates the
|          loop body: move the exit condition of LOOP to the beginning of its
|          header, and duplicate the part of the last iteration that gets disabled
|          to the exit of the loop. */
|       transform_to_exit_first_loop (loop, reduction_list, nit);
|     }

Just for my own education: is this "n_threads = 1" pessimization for
OpenACC kernels there because the duplicated loop bodies generated by
transform_to_exit_first_loop are not appropriate for parallel OpenACC
offloading execution? (Maybe add a source code comment here?) When
testing on gomp-4_0-branch, I see no changes in the testsuite results if
I remove this hunk.
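
To make sure I read the two code comments correctly, here is how I picture
the two transforms at the source level. This is only a hand-written sketch,
not what tree-parloops.c does on GIMPLE: the function names are made up for
illustration, and I'm assuming "nit" counts the latch executions, so the body
runs nit + 1 times.

```c
#include <assert.h>

/* Hypothetical source-level picture of the original loop shape: the body
   runs first and the exit test comes last, so the body executes nit + 1
   times.  */
static int
sum_exit_last (const int *a, int nit)
{
  assert (nit >= 0);
  int s = 0, i = 0;
  do
    {
      s += a[i];
      i++;
    }
  while (i <= nit);
  return s;
}

/* Sketch of the fallback described in the second comment: the exit test
   moves to the loop header, and the part of the last iteration that the
   test now disables is duplicated after the loop.  */
static int
sum_exit_first_dup (const int *a, int nit)
{
  assert (nit >= 0);
  int s = 0, i = 0;
  while (i < nit)
    {
      s += a[i];
      i++;
    }
  s += a[i];	/* Duplicated body, executed once with i == nit.  */
  return s;
}

/* Sketch of the alternative described in the first comment: test first
   and run one more iteration, so nothing has to be duplicated.  */
static int
sum_exit_first_alt (const int *a, int nit)
{
  assert (nit >= 0);
  int s = 0;
  for (int i = 0; i < nit + 1; i++)
    s += a[i];
  return s;
}
```

All three compute the same result; the difference is only that the second
form leaves a copy of the body outside the loop, which is the part I'm
asking about above.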

Regards
Thomas
