Hi,
this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.
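
For illustration only, here is a minimal sketch of the kind of input this series targets: a kernels region containing a plain loop with no acc loop directive, which parloops would have to analyze and parallelize on its own. The function name vec_add, the size N and the data clauses are assumptions made for this example, not taken from the testcases in the series.

```c
#define N 1024

/* Hypothetical example: element-wise add inside an OpenACC kernels
   region.  The loop carries no acc loop directive, so any
   parallelization is left to the compiler.  Without -fopenacc the
   pragma is ignored and the loop runs sequentially on the host.  */
static void
vec_add (const unsigned int *a, const unsigned int *b, unsigned int *c)
{
#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
  {
    for (int i = 0; i < N; i++)
      c[i] = a[i] + b[i];
  }
}
```

The result should be the same whether the loop is offloaded or executed on the host, so a caller can check c[i] == a[i] + b[i] for every i in either configuration.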
The patch series contains these patches:
1 Insert new exit block only when needed in
transform_to_exit_first_loop_alt
2 Make create_parallel_loop return void
3 Ignore reduction clause on kernels directive
4 Implement -foffload-alias
5 Add in_oacc_kernels_region in struct loop
6 Add pass_oacc_kernels
7 Add pass_dominator_oacc_kernels
8 Add pass_ch_oacc_kernels
9 Add pass_parallelize_loops_oacc_kernels
10 Add pass_oacc_kernels pass group in passes.def
11 Update testcases after adding kernels pass group
12 Handle acc loop directive
13 Add c-c++-common/goacc/kernels-*.c
14 Add gfortran.dg/goacc/kernels-*.f95
15 Add libgomp.oacc-c-c++-common/kernels-*.c
16 Add libgomp.oacc-fortran/kernels-*.f95
The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.
Bootstrapped and reg-tested on x86_64.
Built and reg-tested with an nvidia accelerator, in combination with a
patch that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
I'll post the individual patches in reply to this message.
Thanks,
- Tom
---
1
Insert new exit block only when needed in transform_to_exit_first_loop_alt
2015-06-30 Tom de Vries <t...@codesourcery.com>
* tree-parloops.c (transform_to_exit_first_loop_alt): Insert new exit
block only when needed.
---
2
Make create_parallel_loop return void
2015-11-09 Tom de Vries <t...@codesourcery.com>
* tree-parloops.c (create_parallel_loop): Return void.
---
3
Ignore reduction clause on kernels directive
2015-11-08 Tom de Vries <t...@codesourcery.com>
* c-omp.c (c_oacc_split_loop_clauses): Don't copy OMP_CLAUSE_REDUCTION,
classify as loop clause.
---
4
Implement -foffload-alias
2015-11-03 Tom de Vries <t...@codesourcery.com>
* common.opt (foffload-alias): New option.
* flag-types.h (enum offload_alias): New enum.
* omp-low.c (install_var_field): Handle flag_offload_alias.
* doc/invoke.texi (@item Code Generation Options): Add -foffload-alias.
(@item -foffload-alias): New item.
* c-c++-common/goacc/kernels-loop-offload-alias-none.c: New test.
* c-c++-common/goacc/kernels-loop-offload-alias-ptr.c: New test.
---
5
Add in_oacc_kernels_region in struct loop
2015-11-09 Tom de Vries <t...@codesourcery.com>
* cfgloop.h (struct loop): Add in_oacc_kernels_region field.
* omp-low.c (mark_loops_in_oacc_kernels_region): New function.
(expand_omp_target): Call mark_loops_in_oacc_kernels_region.
---
6
Add pass_oacc_kernels
2015-11-09 Tom de Vries <t...@codesourcery.com>
* tree-pass.h (make_pass_oacc_kernels): Declare.
* tree-ssa-loop.c (gate_oacc_kernels): New static function.
(pass_data_oacc_kernels): New pass_data.
(class pass_oacc_kernels): New pass.
(make_pass_oacc_kernels): New function.
---
7
Add pass_dominator_oacc_kernels
2015-11-09 Tom de Vries <t...@codesourcery.com>
* tree-pass.h (make_pass_dominator_oacc_kernels): Declare.
* tree-ssa-dom.c (class dominator_base): New class. Factor out of ...
(class pass_dominator): ... here.
(dominator_base::may_peel_loop_headers_p)
(pass_dominator::may_peel_loop_headers_p): New functions.
(pass_dominator_oacc_kernels): New pass.
(make_pass_dominator_oacc_kernels): New function.
(dominator_base::execute): Use may_peel_loop_headers_p.
---
8
Add pass_ch_oacc_kernels
2015-11-09 Tom de Vries <t...@codesourcery.com>
* tree-pass.h (make_pass_ch_oacc_kernels): Declare.
* tree-ssa-loop-ch.c (pass_ch::pass_ch (pass_data, gcc::context)): New
constructor.
(pass_data_ch_oacc_kernels): New pass_data.
(class pass_ch_oacc_kernels): New pass.
(pass_ch_oacc_kernels::process_loop_p): New function.
(make_pass_ch_oacc_kernels): New function.
---
9
Add pass_parallelize_loops_oacc_kernels
2015-11-09 Tom de Vries <t...@codesourcery.com>
* omp-low.c (set_oacc_fn_attrib): Make extern.
(expand_omp_atomic_fetch_op): Release defs of update stmt.
* omp-low.h (set_oacc_fn_attrib): Declare.
* tree-parloops.c (struct reduction_info): Add reduc_addr field.
(create_call_for_reduction_1): Handle case that reduc_addr is non-NULL.
(create_parallel_loop, gen_parallel_loop, try_create_reduction_list):
Add and handle function parameter oacc_kernels_p.
(get_omp_data_i_param): New function.
(ref_conflicts_with_region, oacc_entry_exit_ok_1)
(oacc_entry_exit_single_gang, oacc_entry_exit_ok): New functions.
(parallelize_loops): Add and handle function parameter oacc_kernels_p.
Calculate dominance info. Skip loops that are not in a kernels region
in oacc_kernels_p mode. Skip inner loops of parallelized loops.
(pass_parallelize_loops::execute): Call parallelize_loops with false
argument.
(pass_data_parallelize_loops_oacc_kernels): New pass_data.
(class pass_parallelize_loops_oacc_kernels): New pass.
(pass_parallelize_loops_oacc_kernels::execute)
(make_pass_parallelize_loops_oacc_kernels): New functions.
* tree-pass.h (make_pass_parallelize_loops_oacc_kernels): Declare.
---
10
Add pass_oacc_kernels pass group in passes.def
2015-11-09 Tom de Vries <t...@codesourcery.com>
* omp-low.c (pass_expand_omp_ssa::clone): New function.
* tree-ssa-loop.c (pass_scev_cprop::clone, pass_tree_loop_init::clone)
(pass_tree_loop_done::clone): New functions.
* passes.def: Add pass_oacc_kernels pass group.
---
11
Update testcases after adding kernels pass group
2015-11-09 Tom de Vries <t...@codesourcery.com>
* c-c++-common/restrict-2.c: Update after adding pass_oacc_kernels pass
group.
* c-c++-common/restrict-4.c: Same.
* g++.dg/tree-ssa/copyprop-1.C: Same.
* g++.dg/tree-ssa/pr33615.C: Same.
* g++.dg/tree-ssa/restrict1.C: Same.
* gcc.dg/gomp/notify-new-function-3.c: Same.
* gcc.dg/pr23911.c: Same.
* gcc.dg/pr41488.c: Same.
* gcc.dg/tm/pub-safety-1.c: Same.
* gcc.dg/tm/reg-promotion.c: Same.
* gcc.dg/tree-ssa/20030709-2.c: Same.
* gcc.dg/tree-ssa/20030731-2.c: Same.
* gcc.dg/tree-ssa/20040729-1.c: Same.
* gcc.dg/tree-ssa/20050314-1.c: Same.
* gcc.dg/tree-ssa/cfgcleanup-1.c: Same.
* gcc.dg/tree-ssa/loop-17.c: Same.
* gcc.dg/tree-ssa/loop-32.c: Same.
* gcc.dg/tree-ssa/loop-33.c: Same.
* gcc.dg/tree-ssa/loop-34.c: Same.
* gcc.dg/tree-ssa/loop-35.c: Same.
* gcc.dg/tree-ssa/loop-36.c: Same.
* gcc.dg/tree-ssa/loop-39.c: Same.
* gcc.dg/tree-ssa/loop-7.c: Same.
* gcc.dg/tree-ssa/pr21086.c: Same.
* gcc.dg/tree-ssa/pr23109.c: Same.
* gcc.dg/tree-ssa/restrict-3.c: Same.
* gcc.dg/tree-ssa/restrict-5.c: Same.
* gcc.dg/tree-ssa/scev-7.c: Same.
* gcc.dg/tree-ssa/ssa-dce-1.c: Same.
* gcc.dg/tree-ssa/ssa-dce-2.c: Same.
* gcc.dg/tree-ssa/ssa-lim-1.c: Same.
* gcc.dg/tree-ssa/ssa-lim-10.c: Same.
* gcc.dg/tree-ssa/ssa-lim-11.c: Same.
* gcc.dg/tree-ssa/ssa-lim-12.c: Same.
* gcc.dg/tree-ssa/ssa-lim-2.c: Same.
* gcc.dg/tree-ssa/ssa-lim-3.c: Same.
* gcc.dg/tree-ssa/ssa-lim-6.c: Same.
* gcc.dg/tree-ssa/ssa-lim-7.c: Same.
* gcc.dg/tree-ssa/ssa-lim-8.c: Same.
* gcc.dg/tree-ssa/ssa-lim-9.c: Same.
* gcc.dg/tree-ssa/structopt-1.c: Same.
* gcc.dg/vect/pr26359.c: Same.
* gfortran.dg/pr32921.f: Same.
---
12
Handle acc loop directive
2015-11-09 Tom de Vries <t...@codesourcery.com>
* omp-low.c (struct omp_region): Add inside_kernels_p field.
(expand_omp_for_generic): Set address taken for istart0 and end0
only when necessary. Generate a 'sequential' loop when the GOMP
builtin arguments are BUILT_IN_NONE.
(expand_omp_for): Use expand_omp_for_generic() to generate a
non-parallelized loop for OMP_FORs inside OpenACC kernels regions.
(expand_omp): Mark inside_kernels_p field true for regions
nested inside OpenACC kernels constructs.
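
As a rough illustration of the case this patch handles, the sketch below shows an acc loop directive nested inside a kernels region. Going by the ChangeLog entry above, the OMP_FOR for such a loop would be expanded as a non-parallelized loop and left for later passes to handle. The function name scale_array, the size N and the copy clause are assumptions made for this example.

```c
#define N 1024

/* Hypothetical example: an explicit acc loop directive inside an
   OpenACC kernels region.  Without -fopenacc both pragmas are
   ignored and the loop runs sequentially on the host.  */
static void
scale_array (unsigned int *a, unsigned int k)
{
#pragma acc kernels copy (a[0:N])
  {
#pragma acc loop
    for (int i = 0; i < N; i++)
      a[i] = a[i] * k;
  }
}
```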
---
13
Add c-c++-common/goacc/kernels-*.c
2015-11-09 Tom de Vries <t...@codesourcery.com>
* c-c++-common/goacc/kernels-acc-loop-reduction.c: New test.
* c-c++-common/goacc/kernels-acc-loop-smaller-equal.c: New test.
* c-c++-common/goacc/kernels-counter-var-redundant-load.c: New test.
* c-c++-common/goacc/kernels-counter-vars-function-scope.c: New test.
* c-c++-common/goacc/kernels-double-reduction.c: New test.
* c-c++-common/goacc/kernels-empty.c: New test.
* c-c++-common/goacc/kernels-eternal.c: New test.
* c-c++-common/goacc/kernels-loop-2-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-2.c: New test.
* c-c++-common/goacc/kernels-loop-3-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-3.c: New test.
* c-c++-common/goacc/kernels-loop-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-data-2.c: New test.
* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: New test.
* c-c++-common/goacc/kernels-loop-data-enter-exit.c: New test.
* c-c++-common/goacc/kernels-loop-data-update.c: New test.
* c-c++-common/goacc/kernels-loop-data.c: New test.
* c-c++-common/goacc/kernels-loop-g.c: New test.
* c-c++-common/goacc/kernels-loop-mod-not-zero.c: New test.
* c-c++-common/goacc/kernels-loop-n-acc-loop.c: New test.
* c-c++-common/goacc/kernels-loop-n.c: New test.
* c-c++-common/goacc/kernels-loop-nest.c: New test.
* c-c++-common/goacc/kernels-loop.c: New test.
* c-c++-common/goacc/kernels-noreturn.c: New test.
* c-c++-common/goacc/kernels-one-counter-var.c: New test.
* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: New test.
* c-c++-common/goacc/kernels-reduction.c: New test.
---
14
Add gfortran.dg/goacc/kernels-*.f95
2015-11-09 Tom de Vries <t...@codesourcery.com>
* gfortran.dg/goacc/kernels-loop-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-update.f95: New test.
* gfortran.dg/goacc/kernels-loop-data.f95: New test.
* gfortran.dg/goacc/kernels-loop.f95: New test.
* gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: New test.
---
15
Add libgomp.oacc-c-c++-common/kernels-*.c
2015-11-09 Tom de Vries <t...@codesourcery.com>
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-2.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit-2.c:
Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit.c:
Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-update.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop.c: Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-parallel-loop-data-enter-exit.c:
Same.
* testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c: Same.
---
16
Add libgomp.oacc-fortran/kernels-*.f95
2015-11-09 Tom de Vries <t...@codesourcery.com>
* testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: New test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95: Same.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95:
Same.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95: Same.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95: Same.
* testsuite/libgomp.oacc-fortran/kernels-loop-data.f95: Same.
* testsuite/libgomp.oacc-fortran/kernels-loop.f95: Same.
* testsuite/libgomp.oacc-fortran/kernels-parallel-loop-data-enter-exit.f95:
Same.
---