On Thu, Nov 10, 2016 at 08:11:45PM +0300, Alexander Monakov wrote:
> libgomp/
>
> * Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c.
> * Makefile.in: Regenerate.
> * configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it...
> (LIBGOMP_USE_PTHREADS): ...here; new define.
> * configure: Regenerate.
> * config.h.in: Likewise.
> * config/posix/affinity.c: Move to...
> * affinity.c: ...here (new file). Guard use of PThreads-specific
I have never seen pthreads capitalized this way; use pthreads, Pthreads,
or POSIX Threads.
> interface by LIBGOMP_USE_PTHREADS.
> * critical.c: Split out GOMP_atomic_{start,end} into...
> * atomic.c: ...here (new file).
> * env.c: Split out ICV definitions into...
> * icv.c: ...here (new file) and...
> * icv-device.c: ...here. New file.
> * config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c.
> (gomp_destroy_lock_30): Ditto.
> (gomp_set_lock_30): Ditto.
> (gomp_unset_lock_30): Ditto.
> (gomp_test_lock_30): Ditto.
> (gomp_init_nest_lock_30): Ditto.
> (gomp_destroy_nest_lock_30): Ditto.
> (gomp_set_nest_lock_30): Ditto.
> (gomp_unset_nest_lock_30): Ditto.
> (gomp_test_nest_lock_30): Ditto.
> * lock.c: New.
> * config/nvptx/lock.c: New.
> * config/nvptx/bar.c: New.
> * config/nvptx/bar.h: New.
> * config/nvptx/doacross.h: New.
> * config/nvptx/error.c: New.
> * config/nvptx/icv-device.c: New.
> * config/nvptx/mutex.h: New.
> * config/nvptx/pool.h: New.
> * config/nvptx/proc.c: New.
> * config/nvptx/ptrlock.h: New.
> * config/nvptx/sem.h: New.
> * config/nvptx/simple-bar.h: New.
> * config/nvptx/target.c: New.
> * config/nvptx/task.c: New.
> * config/nvptx/team.c: New.
> * config/nvptx/time.c: New.
> * config/posix/simple-bar.h: New.
> * libgomp.h: Guard pthread.h inclusion. Include simple-bar.h.
> (gomp_num_teams_var): Declare.
> (struct gomp_thread_pool): Change threads_dock member to
> gomp_simple_barrier_t.
> [__nvptx__] (gomp_thread): New implementation.
> (gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS.
> (gomp_thread_destructor): Ditto.
> (gomp_init_thread_affinity): Ditto.
> * team.c: Guard uses of PThreads-specific interfaces by
Ditto.
> LIBGOMP_USE_PTHREADS. Adjust all uses of threads_dock.
> (gomp_free_thread) [__nvptx__]: Do not call 'free'.
>
> * config/nvptx/alloc.c: Delete.
> * config/nvptx/barrier.c: Ditto.
> * config/nvptx/fortran.c: Ditto.
> * config/nvptx/iter.c: Ditto.
> * config/nvptx/iter_ull.c: Ditto.
> * config/nvptx/loop.c: Ditto.
> * config/nvptx/loop_ull.c: Ditto.
> * config/nvptx/ordered.c: Ditto.
> * config/nvptx/parallel.c: Ditto.
> * config/nvptx/section.c: Ditto.
> * config/nvptx/single.c: Ditto.
> * config/nvptx/splay-tree.c: Ditto.
> * config/nvptx/work.c: Ditto.
>
> * testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass
> -foffload=-lgfortran in addition to -lgfortran.
> * testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto.
>
> * plugin/plugin-nvptx.c: Include <limits.h>.
> (struct targ_fn_descriptor): Add new fields.
> (struct ptx_device): Ditto. Set them...
> (nvptx_open_device): ...here.
> (nvptx_adjust_launch_bounds): New.
> (nvptx_host2dev): Allow NULL 'nvthd'.
> (nvptx_dev2host): Ditto.
> (GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400.
> (link_ptx): Adjust log sizes.
> (nvptx_host2dev): Allow NULL 'nvthd'.
> (nvptx_dev2host): Ditto.
> (nvptx_set_clocktick): New. Use it...
> (GOMP_OFFLOAD_load_image): ...here. Set new targ_fn_descriptor
> fields.
> (GOMP_OFFLOAD_dev2dev): New.
> (nvptx_adjust_launch_bounds): New.
> (nvptx_stacks_size): New.
> (nvptx_stacks_alloc): New.
> (nvptx_stacks_free): New.
> (GOMP_OFFLOAD_run): New.
> (GOMP_OFFLOAD_async_run): New (stub).
Ok for trunk, assuming the config/nvptx bits it relies on are checked
in first. Two nits inline: the first one can be handled incrementally;
for the latter one, probably just remove the #if 0 stuff and, if needed,
replace it with something different incrementally.
> +void
> +gomp_barrier_wait_last (gomp_barrier_t *bar)
> +{
> +#if 0
> + gomp_barrier_state_t state = gomp_barrier_wait_start (bar);
> + if (state & BAR_WAS_LAST)
> + gomp_barrier_wait_end (bar, state);
> +#else
> + gomp_barrier_wait (bar);
> +#endif
> +}
Are there any plans to change that later, or should the #if 0 stuff just
be removed?
> +/* NVPTX is an accelerator-only target, so this should never be called. */
> +
> +bool
> +gomp_target_task_fn (void *data)
> +{
> + __builtin_unreachable ();
> +}
I'm not sure we wouldn't rather call gomp_fatal here instead, or do
something similarly loud.
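To illustrate what "loud" could look like, here is a minimal host-side
sketch. Everything prefixed sketch_ is hypothetical: sketch_fatal only
stands in for libgomp's gomp_fatal (print a message, then abort), and
sketch_stub_aborts is just a fork-based check that the stub really dies
with SIGABRT instead of running into undefined behavior. It is not meant
as the nvptx implementation, where fork and stderr are not available.

```c
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for gomp_fatal: print a message and abort.  */
static void sketch_fatal (const char *fmt, ...)
  __attribute__ ((noreturn, format (printf, 1, 2)));

static void
sketch_fatal (const char *fmt, ...)
{
  va_list ap;

  va_start (ap, fmt);
  fputs ("\nlibgomp: ", stderr);
  vfprintf (stderr, fmt, ap);
  fputc ('\n', stderr);
  va_end (ap);
  abort ();
}

/* Loud variant of the stub: a stray call terminates the program with a
   diagnostic instead of invoking undefined behavior.  */
bool
sketch_target_task_fn (void *data)
{
  (void) data;
  sketch_fatal ("gomp_target_task_fn called on an accelerator-only target");
}

/* Host-only check: run the stub in a child process.  Returns 1 if the
   child was killed by SIGABRT, 0 if it ended some other way, and -1 if
   fork or waitpid failed.  */
int
sketch_stub_aborts (void)
{
  pid_t pid = fork ();
  int status = 0;

  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      sketch_target_task_fn (NULL);
      _exit (0);  /* Not reached when the stub is loud.  */
    }
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFSIGNALED (status) && WTERMSIG (status) == SIGABRT;
}
```

With __builtin_unreachable the same stray call would have no defined
outcome at all, which is the reason for preferring an explicit abort.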
On a related topic, it might be useful to #ifdef out parts of task.c
(gomp_target_task_completion, GOMP_PLUGIN_target_task_completion,
gomp_create_target_task) for the nvptx libgomp.a: the first one should be
stubbed, the rest left out. And perhaps, at least for now, simplify the
task priority stuff, as the OMP_MAX_TASK_PRIORITY variable will not be
present on the offloading side anyway. This can be done incrementally.
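Roughly what I have in mind for the #ifdef, as a sketch only. The
sketch_ names and struct sketch_target_task are made up for illustration
and do not match the real task.c declarations; whether the stub should be
empty or abort is also left open here. Only the __nvptx__ guard is taken
from the patch.

```c
#include <stddef.h>

/* Hypothetical task record; the real struct in libgomp is different.  */
struct sketch_target_task
{
  int completed;
};

#ifdef __nvptx__

/* Accelerator-only build: keep a stub so that remaining callers still
   link.  The other two functions are deliberately not compiled.  */
void
sketch_target_task_completion (struct sketch_target_task *ttask)
{
  (void) ttask;
}

#else

/* Host build: full (here: toy) implementations of all three.  */
void
sketch_target_task_completion (struct sketch_target_task *ttask)
{
  ttask->completed = 1;
}

/* Plugin-facing entry point, forwarding to the function above.  */
void
sketch_plugin_target_task_completion (void *data)
{
  sketch_target_task_completion ((struct sketch_target_task *) data);
}

/* Marks the task as not yet completed and reports success.  */
int
sketch_create_target_task (struct sketch_target_task *ttask)
{
  ttask->completed = 0;
  return 1;
}

#endif
```

On the host all three functions are available; in an nvptx build only the
stub would end up in libgomp.a.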
Jakub