On Thu, Jul 21, 2022 at 02:33:32PM +0200, Tobias Burnus wrote:
> OpenMP: Support reverse offload (middle end part)
> 
> gcc/ChangeLog:
> 
>       * internal-fn.cc (expand_GOMP_TARGET_REV): New.
>       * internal-fn.def (GOMP_TARGET_REV): New.
>       * lto-cgraph.cc (lto_output_node, verify_node_partition): Mark
>       'omp target device_ancestor_host' as in_other_partition and don't
>       error if absent.
>       * omp-low.cc (create_omp_child_function): Mark as 'noclone'.
>       * omp-expand.cc (expand_omp_target): For reverse offload, remove
>       sorry, use device = GOMP_DEVICE_HOST_FALLBACK and create
>       empty-body nohost function.
>       * omp-offload.cc (execute_omp_device_lower): Handle
>       IFN_GOMP_TARGET_REV.
>       (pass_omp_target_link::execute): For ACCEL_COMPILER, don't
>       nullify fn argument for reverse offload.
> 
> libgomp/ChangeLog:
> 
>       * libgomp.texi (OpenMP 5.0): Mark 'ancestor' as implemented but
>       refer to 'requires'.
>       * testsuite/libgomp.c-c++-common/reverse-offload-1-aux.c: New test.
>       * testsuite/libgomp.c-c++-common/reverse-offload-1.c: New test.
>       * testsuite/libgomp.fortran/reverse-offload-1-aux.f90: New test.
>       * testsuite/libgomp.fortran/reverse-offload-1.f90: New test.
> 
> gcc/testsuite/ChangeLog:
> 
>       * c-c++-common/gomp/reverse-offload-1.c: Remove dg-sorry.
>       * c-c++-common/gomp/target-device-ancestor-4.c: Likewise.
>       * gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.
>       * gfortran.dg/gomp/target-device-ancestor-5.f90: Likewise.
>       * c-c++-common/goacc/classify-kernels-parloops.c: Add 'noclone' to
>       scan-tree-dump-times.
>       * c-c++-common/goacc/classify-kernels-unparallelized-parloops.c:
>       Likewise.
>       * c-c++-common/goacc/classify-kernels-unparallelized.c: Likewise.
>       * c-c++-common/goacc/classify-kernels.c: Likewise.
>       * c-c++-common/goacc/classify-parallel.c: Likewise.
>       * c-c++-common/goacc/classify-serial.c: Likewise.
>       * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-2.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-3.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-data-2.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-data-enter-exit.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-data-update.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-data.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-g.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-n.c: Likewise.
>       * c-c++-common/goacc/kernels-loop-nest.c: Likewise.
>       * c-c++-common/goacc/kernels-loop.c: Likewise.
>       * c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
>       * c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: Likewise.
>       * gfortran.dg/goacc/classify-kernels-parloops.f95: Likewise.
>       * gfortran.dg/goacc/classify-kernels-unparallelized-parloops.f95:
>       Likewise.
>       * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
>       * gfortran.dg/goacc/classify-kernels.f95: Likewise.
>       * gfortran.dg/goacc/classify-parallel.f95: Likewise.
>       * gfortran.dg/goacc/classify-serial.f95: Likewise.
>       * gfortran.dg/goacc/kernels-loop-2.f95: Likewise.
>       * gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
>       * gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise.
>       * gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise.
>       * gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
>       * gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
>       * gfortran.dg/goacc/kernels-loop-n.f95: Likewise.
>       * gfortran.dg/goacc/kernels-loop.f95: Likewise.
>       * gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: Likewise.

Ok for trunk, just a comment regarding the FIXME below (can be handled
incrementally).

> +       case IFN_GOMP_TARGET_REV:
> +         {
> +#ifndef ACCEL_COMPILER
> +           gimple_stmt_iterator gsi2 = gsi;
> +           gsi_next (&gsi2);
> +           gcc_assert (!gsi_end_p (gsi2));
> +           gcc_assert (gimple_call_builtin_p (gsi_stmt (gsi2),
> +                                              BUILT_IN_GOMP_TARGET));
> +           tree old_decl
> +             = TREE_OPERAND (gimple_call_arg (gsi_stmt (gsi2), 1), 0);
> +           tree new_decl = gimple_call_arg (gsi_stmt (gsi), 0);
> +           gimple_call_set_arg (gsi_stmt (gsi2), 1, new_decl);
> +           update_stmt (gsi_stmt (gsi2));
> +           new_decl = TREE_OPERAND (new_decl, 0);
> +           unsigned i;
> +           unsigned num_funcs = vec_safe_length (offload_funcs);
> +           for (i = 0; i < num_funcs; i++)
> +             {
> +               if ((*offload_funcs)[i] == old_decl)
> +                 {
> +                   (*offload_funcs)[i] = new_decl;
> +                   break;
> +                 }
> +               else if ((*offload_funcs)[i] == new_decl)
> +                 break;  /* This can happen due to inlining.  */
> +             }
> +           gcc_assert (i < num_funcs);
> +#else
> +           tree old_decl = TREE_OPERAND (gimple_call_arg (gsi_stmt (gsi), 0),
> +                                         0);
> +#endif
> +           /* FIXME: Find a way to actually prevent outputting the empty-body
> +              old_decl as debug symbol + function in the assembly file.  */

The debug info ought to be controlled through DECL_IGNORED_P on the
FUNCTION_DECL.  If you want it set on just one side and clear on the other,
perhaps set or clear it while the decl is streamed in by the offload lto1?
As for emitting the function itself, perhaps turn it from a definition into
an external declaration afterwards?
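
For readers following the quoted hunk, the table update it performs on
offload_funcs can be modelled in isolation.  This is only a sketch in plain C,
not GCC code: decl_t and replace_offload_func are hypothetical names standing
in for FUNCTION_DECL trees and the inline loop, and the -1 return stands in
for the case the patch rejects with gcc_assert (i < num_funcs).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a FUNCTION_DECL; entries are compared by
   pointer identity, as the quoted loop does with trees.  */
typedef const char *decl_t;

/* Replace OLD_DECL by NEW_DECL in FUNCS[0..N-1].  Return the index of the
   slot that holds NEW_DECL afterwards, or -1 if neither decl is present.  */
static int
replace_offload_func (decl_t *funcs, size_t n, decl_t old_decl,
                      decl_t new_decl)
{
  assert (funcs != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (funcs[i] == old_decl)
        {
          funcs[i] = new_decl;
          return (int) i;
        }
      else if (funcs[i] == new_decl)
        return (int) i;  /* Already replaced, e.g. due to inlining.  */
    }
  return -1;
}
```

Only the first match is touched, and a second call with the same pair is a
no-op that reports the same slot, which mirrors the "due to inlining" branch.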

        Jakub
