On Thu, 18 Jun 2015, Tom de Vries wrote:

> Hi,
> 
> I ran into a problem with fortran loops in oacc kernels regions not being
> parallelized, after introducing transform_to_exit_first_loop_alt.
> 
> For gfortran.dg/goacc/kernels-loop.f95, we get:
> ...
> #pragma omp target oacc_parallel num_gangs(1)
> ...
> instead of the desired num_gangs (32).
> 
> transform_to_exit_first_loop_alt fails because nit is _135, where nit is
> defined by:
> ...
> *_105 = 0;
> D__lsm.27_50 = *_105;
> _32 = (unsigned int) D__lsm.27_50;
> _135 = 1023 - _32;
> ...
> 
> pass_fre would manage to propagate the '*_105 = 0' assignment. But in the
> current pass order, pass_fre is run before pass_lim, where this pattern is
> introduced:
> ...
>               NEXT_PASS (pass_ch_oacc_kernels);
>               NEXT_PASS (pass_fre);
>               NEXT_PASS (pass_tree_loop_init);
>               NEXT_PASS (pass_lim);
>               NEXT_PASS (pass_copy_prop);
>               NEXT_PASS (pass_scev_cprop);
>               NEXT_PASS (pass_parallelize_loops_oacc_kernels);
>               NEXT_PASS (pass_expand_omp_ssa);
>               NEXT_PASS (pass_tree_loop_done);
> ...
> 
> The patch moves pass_fre to the position of pass_copy_prop, replacing the latter.
> Furthermore, it adds scans to the fortran test-cases to make sure they get
> properly parallelized.

You may now find that LIM needs FRE to detect equal memory
references in order to apply store motion.  But maybe the issues that
oacc lowering introduces are limited and under your control.

Richard.

> Bootstrapped and reg-tested on x86_64.
> 
> Committed to gomp-4_0-branch.
> 
> Thanks,
> - Tom
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)
