On 09/11/15 16:35, Tom de Vries wrote:
Hi,

this patch series for stage1 trunk adds support to:
- parallelize oacc kernels regions using parloops, and
- map the loops onto the oacc gang dimension.

The patch series contains these patches:

     1  Insert new exit block only when needed in
        transform_to_exit_first_loop_alt
     2  Make create_parallel_loop return void
     3  Ignore reduction clause on kernels directive
     4  Implement -foffload-alias
     5  Add in_oacc_kernels_region in struct loop
     6  Add pass_oacc_kernels
     7  Add pass_dominator_oacc_kernels
     8  Add pass_ch_oacc_kernels
     9  Add pass_parallelize_loops_oacc_kernels
    10  Add pass_oacc_kernels pass group in passes.def
    11  Update testcases after adding kernels pass group
    12  Handle acc loop directive
    13  Add c-c++-common/goacc/kernels-*.c
    14  Add gfortran.dg/goacc/kernels-*.f95
    15  Add libgomp.oacc-c-c++-common/kernels-*.c
    16  Add libgomp.oacc-fortran/kernels-*.f95

The first 9 patches are more or less independent, but patches 10-16 are
intended to be committed at the same time.

Bootstrapped and reg-tested on x86_64.

Built and reg-tested with nvidia accelerator, in combination with a patch
that enables accelerator testing (which is submitted at
https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).

I'll post the individual patches in reply to this message.
This patch adds and initializes the in_oacc_kernels_region field in struct loop.

The field is used to signal to subsequent passes that we're dealing with a loop in a kernels region that we're trying to parallelize.

Note that we do not parallelize kernels regions with more than one loop nest. [ In general, kernels regions with more than one loop nest should be split up into separate kernels regions, but that's not supported atm. ]
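
To make the restriction concrete, here is a small sketch of the two shapes of kernels region discussed above. The function names (scale, scale_then_shift) and the copy clauses are made up for illustration and are not taken from the patch or its testcases. The first function has a single loop nest in its kernels region, which is the case this series targets. The second shows the manual workaround for two loop nests: one kernels region per nest.

```cpp
#include <vector>

// Hypothetical example, not from the patch: one kernels region that
// contains exactly one loop nest.
void
scale (std::vector<float> &a, float f)
{
  float *p = a.data ();
  const int n = static_cast<int> (a.size ());
#pragma acc kernels copy (p[0:n])
  {
    for (int i = 0; i < n; i++)
      p[i] *= f;
  }
}

// Hypothetical example: two loop nests, split by hand into two kernels
// regions so that each region contains a single loop nest.
void
scale_then_shift (std::vector<float> &a, float f, float d)
{
  float *p = a.data ();
  const int n = static_cast<int> (a.size ());
#pragma acc kernels copy (p[0:n])
  {
    for (int i = 0; i < n; i++)
      p[i] *= f;
  }
#pragma acc kernels copy (p[0:n])
  {
    for (int i = 0; i < n; i++)
      p[i] += d;
  }
}
```

When built without OpenACC support the pragmas are ignored and the loops simply run serially on the host, so the sketch can be tried with any C++ compiler.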
Thanks,
- Tom
Add in_oacc_kernels_region in struct loop

2015-11-09  Tom de Vries  <[email protected]>

        * cfgloop.h (struct loop): Add in_oacc_kernels_region field.
        * omp-low.c (mark_loops_in_oacc_kernels_region): New function.
        (expand_omp_target): Call mark_loops_in_oacc_kernels_region.

---
 gcc/cfgloop.h |  3 +++
 gcc/omp-low.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 6af6893..ee73bf9 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -191,6 +191,9 @@ struct GTY ((chain_next ("%h.next"))) loop {
   /* True if we should try harder to vectorize this loop.  */
   bool force_vectorize;
 
+  /* True if the loop is part of an oacc kernels region.  */
+  bool in_oacc_kernels_region;
+
   /* For SIMD loops, this is a unique identifier of the loop, referenced
      by IFN_GOMP_SIMD_VF, IFN_GOMP_SIMD_LANE and IFN_GOMP_SIMD_LAST_LANE
      builtins.  */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index d052c13..7121d73 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -12429,6 +12429,61 @@ get_oacc_ifn_dim_arg (const gimple *stmt)
   return (int) axis;
 }
 
+/* Mark the loops inside the kernels region starting at REGION_ENTRY and ending
+   at REGION_EXIT.  */
+
+static void
+mark_loops_in_oacc_kernels_region (basic_block region_entry,
+                                   basic_block region_exit)
+{
+  bitmap dominated_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
+  unsigned di;
+  basic_block bb;
+
+  bitmap_clear (dominated_bitmap);
+  bitmap_clear (excludes_bitmap);
+
+  /* Get all the blocks dominated by the region entry.  That will include the
+     entire region.  */
+  vec<basic_block> dominated
+    = get_all_dominated_blocks (CDI_DOMINATORS, region_entry);
+  FOR_EACH_VEC_ELT (dominated, di, bb)
+    bitmap_set_bit (dominated_bitmap, bb->index);
+
+  /* Exclude all the blocks which are not in the region: the blocks dominated by
+     the region exit.  */
+  if (region_exit != NULL)
+    {
+      vec<basic_block> excludes
+        = get_all_dominated_blocks (CDI_DOMINATORS, region_exit);
+      FOR_EACH_VEC_ELT (excludes, di, bb)
+        bitmap_set_bit (excludes_bitmap, bb->index);
+    }
+
+  /* Don't parallelize the kernels region if it contains more than one outer
+     loop.  */
+  unsigned int nr_outer_loops = 0;
+  struct loop *loop;
+  FOR_EACH_LOOP (loop, 0)
+    {
+      if (loop_outer (loop) != current_loops->tree_root)
+        continue;
+
+      if (bitmap_bit_p (dominated_bitmap, loop->header->index)
+          && !bitmap_bit_p (excludes_bitmap, loop->header->index))
+        nr_outer_loops++;
+    }
+  if (nr_outer_loops != 1)
+    return;
+
+  /* Mark the loops in the region.  */
+  FOR_EACH_LOOP (loop, 0)
+    if (bitmap_bit_p (dominated_bitmap, loop->header->index)
+        && !bitmap_bit_p (excludes_bitmap, loop->header->index))
+      loop->in_oacc_kernels_region = true;
+}
+
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
 static void
@@ -12483,6 +12538,9 @@ expand_omp_target (struct omp_region *region)
   entry_bb = region->entry;
   exit_bb = region->exit;
 
+  if (gimple_omp_target_kind (entry_stmt) == GF_OMP_TARGET_KIND_OACC_KERNELS)
+    mark_loops_in_oacc_kernels_region (region->entry, region->exit);
+
   if (offloaded)
     {
       unsigned srcidx, dstidx, num;
-- 
1.9.1
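
For readers who want to experiment with the marking logic outside of GCC, the following is a minimal standalone model of mark_loops_in_oacc_kernels_region. It is only a sketch: ToyLoop and mark_loops are hypothetical names, std::set stands in for GCC's bitmaps, and the two dominance sets are passed in precomputed rather than derived from a CFG with get_all_dominated_blocks. A loop counts as inside the region when its header is dominated by the region entry but not by the region exit, and nothing is marked unless exactly one outermost loop is found there.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in for the parts of GCC's struct loop used here.
struct ToyLoop
{
  int header;                   // Index of the loop header block.
  bool is_outermost;            // True if the parent is the loop tree root.
  bool in_oacc_kernels_region;  // The flag the patch adds.
};

// Model of the marking step.  DOMINATED_BY_ENTRY and DOMINATED_BY_EXIT hold
// the block indices dominated by the region entry and the region exit.
// Returns true if the loops in the region were marked.
bool
mark_loops (std::vector<ToyLoop> &loops,
            const std::set<int> &dominated_by_entry,
            const std::set<int> &dominated_by_exit)
{
  // In the region: dominated by the entry, not dominated by the exit.
  auto in_region = [&] (const ToyLoop &l)
    {
      return (dominated_by_entry.count (l.header) != 0
              && dominated_by_exit.count (l.header) == 0);
    };

  // Give up unless the region contains exactly one outermost loop.
  unsigned nr_outer_loops = 0;
  for (const ToyLoop &l : loops)
    if (l.is_outermost && in_region (l))
      nr_outer_loops++;
  if (nr_outer_loops != 1)
    return false;

  // Mark every loop in the region, including inner loops of the nest.
  for (ToyLoop &l : loops)
    if (in_region (l))
      l.in_oacc_kernels_region = true;
  return true;
}
```

For instance, with entry-dominated blocks {1, 2, 3, 4, 5, 9} and exit-dominated blocks {5, 9}, an outermost loop headed at block 2 and its inner loop headed at block 3 are both marked, while an outermost loop headed at block 9 (after the region exit) is left alone.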
