On Tue, 24 Nov 2015, Tom de Vries wrote:

> On 19/11/15 14:50, Tom de Vries wrote:
> > On 11/11/15 11:58, Richard Biener wrote:
> > > On Mon, 9 Nov 2015, Tom de Vries wrote:
> > >
> > > > On 09/11/15 16:35, Tom de Vries wrote:
> > > > > Hi,
> > > > >
> > > > > this patch series for stage1 trunk adds support to:
> > > > > - parallelize oacc kernels regions using parloops, and
> > > > > - map the loops onto the oacc gang dimension.
> > > > >
> > > > > The patch series contains these patches:
> > > > >
> > > > >  1 Insert new exit block only when needed in
> > > > >    transform_to_exit_first_loop_alt
> > > > >  2 Make create_parallel_loop return void
> > > > >  3 Ignore reduction clause on kernels directive
> > > > >  4 Implement -foffload-alias
> > > > >  5 Add in_oacc_kernels_region in struct loop
> > > > >  6 Add pass_oacc_kernels
> > > > >  7 Add pass_dominator_oacc_kernels
> > > > >  8 Add pass_ch_oacc_kernels
> > > > >  9 Add pass_parallelize_loops_oacc_kernels
> > > > > 10 Add pass_oacc_kernels pass group in passes.def
> > > > > 11 Update testcases after adding kernels pass group
> > > > > 12 Handle acc loop directive
> > > > > 13 Add c-c++-common/goacc/kernels-*.c
> > > > > 14 Add gfortran.dg/goacc/kernels-*.f95
> > > > > 15 Add libgomp.oacc-c-c++-common/kernels-*.c
> > > > > 16 Add libgomp.oacc-fortran/kernels-*.f95
> > > > >
> > > > > The first 9 patches are more or less independent, but patches
> > > > > 10-16 are intended to be committed at the same time.
> > > > >
> > > > > Bootstrapped and reg-tested on x86_64.
> > > > >
> > > > > Build and reg-tested with nvidia accelerator, in combination with a
> > > > > patch that enables accelerator testing (which is submitted at
> > > > > https://gcc.gnu.org/ml/gcc-patches/2015-10/msg01771.html ).
> > > > >
> > > > > I'll post the individual patches in reply to this message.
> > > >
> > > > this patch adds a pass group pass_oacc_kernels (which will be added
> > > > to the pass list as a whole in patch 10).
> > >
> > > Just to understand (while also skimming the HSA patches).
> > >
> > > You are basically relying on autopar for what the HSA patches call
> > > "gridification"? That is, OMP lowering produces loopy kernels
> > > and autopar then will basically strip the outermost loop?
> >
> > Short answer: no. In more detail...
> <SNIP>
>
> Reposting patch, after splitting the pass group into two.

Ok.

Richard.

> Thanks,
> - Tom
>

--
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)