On Mon, 9 Nov 2015, Tom de Vries wrote:

> On 09/11/15 16:35, Tom de Vries wrote:
> > Hi,
> >
> > this patch series for stage1 trunk adds support to:
> > - parallelize oacc kernels regions using parloops, and
> > - map the loops onto the oacc gang dimension.
> >
> > The patch series contains these patches:
> >
> >      1  Insert new exit block only when needed in
> >         transform_to_exit_first_loop_alt
> >      2  Make create_parallel_loop return void
> >      3  Ignore reduction clause on kernels directive
> >      4  Implement -foffload-alias
> >      5  Add in_oacc_kernels_region in struct loop
> >      6  Add pass_oacc_kernels
> >      7  Add pass_dominator_oacc_kernels
> >      8  Add pass_ch_oacc_kernels
> >      9  Add pass_parallelize_loops_oacc_kernels
> >     10  Add pass_oacc_kernels pass group in passes.def
> >     11  Update testcases after adding kernels pass group
> >     12  Handle acc loop directive
> >     13  Add c-c++-common/goacc/kernels-*.c
> >     14  Add gfortran.dg/goacc/kernels-*.f95
> >     15  Add libgomp.oacc-c-c++-common/kernels-*.c
> >     16  Add libgomp.oacc-fortran/kernels-*.f95
> >
> > The first 9 patches are more or less independent, but patches 10-16 are
> > intended to be committed at the same time.
> >
> > Bootstrapped and reg-tested on x86_64.
> >
> > Built and reg-tested with nvidia accelerator, in combination with a
> > patch that enables accelerator testing (which is submitted at
> > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
> >
> > I'll post the individual patches in reply to this message.
>
> this patch adds and initializes the field in_oacc_kernels_region in
> struct loop.
>
> The field is used to signal to subsequent passes that we're dealing with a
> loop in a kernels region that we're trying to parallelize.
>
> Note that we do not parallelize kernels regions with more than one loop nest.
> [ In general, kernels regions with more than one loop nest should be split up
> into separate kernels regions, but that's not supported atm. ]

I think mark_loops_in_oacc_kernels_region can be greatly simplified.

Both region entry and exit should have the same ->loop_father (a SESE
region).  Then you can just walk that loop's inner loops (and their
siblings), checking the domination relation of their headers with the
region entry and exit (only necessary for the direct inner loops).

Richard.

> Thanks,
> - Tom
>
>

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
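
For illustration, the simplification suggested for mark_loops_in_oacc_kernels_region could look roughly like the sketch below. It is a self-contained toy model, not GCC code: the toy_bb and toy_loop types, toy_dominated_by and toy_mark_loops_in_region are hypothetical stand-ins, with dominance modelled by walking immediate-dominator links. The exact test used here (header dominated by the region entry, and not dominated by the region exit) is an assumption about what the "domination relation" check would be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_loop;

/* Hypothetical stand-in for a basic block.  */
struct toy_bb {
  struct toy_bb *idom;          /* immediate dominator, NULL at CFG entry */
  struct toy_loop *loop_father; /* innermost loop containing this block */
};

/* Hypothetical stand-in for struct loop.  */
struct toy_loop {
  struct toy_bb *header;
  struct toy_loop *inner;       /* first directly nested loop */
  struct toy_loop *next;        /* next sibling loop */
  bool in_region;               /* plays the role of in_oacc_kernels_region */
};

/* Return true if DOM dominates BB (a block dominates itself).  */
static bool
toy_dominated_by (const struct toy_bb *bb, const struct toy_bb *dom)
{
  for (; bb != NULL; bb = bb->idom)
    if (bb == dom)
      return true;
  return false;
}

/* Mark LOOP and everything nested in it; no dominance checks are needed
   below the direct inner loops of the region's loop_father.  */
static void
toy_mark_subtree (struct toy_loop *loop)
{
  loop->in_region = true;
  for (struct toy_loop *l = loop->inner; l != NULL; l = l->next)
    toy_mark_subtree (l);
}

/* Walk the direct inner loops of the common loop_father and mark those
   whose header lies inside the region.  Returns the number of outer
   loops marked.  */
static int
toy_mark_loops_in_region (struct toy_bb *region_entry,
                          struct toy_bb *region_exit)
{
  struct toy_loop *outer = region_entry->loop_father;
  int marked = 0;

  /* SESE region: entry and exit share the same loop_father.  */
  assert (outer != NULL && outer == region_exit->loop_father);

  for (struct toy_loop *l = outer->inner; l != NULL; l = l->next)
    {
      if (!toy_dominated_by (l->header, region_entry))
        continue;   /* loop comes before the region */
      if (toy_dominated_by (l->header, region_exit))
        continue;   /* loop comes after the region */
      toy_mark_subtree (l);
      marked++;
    }
  return marked;
}
```

The return value could also be used to implement the "no more than one loop nest" restriction mentioned in the quoted message, by refusing to parallelize when it is not 1; the sketch marks first and leaves that policy to the caller.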