On Tue, Apr 14, 2015 at 05:43:26PM +0200, Thomas Schwinge wrote:
> On Tue, 14 Apr 2015 15:15:02 +0100, Julian Brown <[email protected]>
> wrote:
> > On Wed, 8 Apr 2015 17:58:56 +0300
> > Ilya Verbin <[email protected]> wrote:
> > > I see several regressions:
> > > FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/acc_on_device-1.c
> > > -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test
> > > FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/if-1.c
> > > -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test
> >
> > I think there may be multiple issues here. The attached patch addresses
> > one -- acc_device_type not distinguishing between "offloaded" and host
> > code with the host_nonshm plugin.
>
> (You mean acc_on_device?)
>
> > --- libgomp/oacc-init.c (revision 221922)
> > +++ libgomp/oacc-init.c (working copy)
> > @@ -548,7 +549,14 @@ ialias (acc_set_device_num)
> > int
> > acc_on_device (acc_device_t dev)
> > {
> > - if (acc_get_device_type () == acc_device_host_nonshm)
> > + struct goacc_thread *thr = goacc_thread ();
> > +
> > + /* We only want to appear to be the "host_nonshm" plugin from "offloaded"
> > + code -- i.e. within a parallel region. Test a flag set by the
> > + openacc_parallel hook of the host_nonshm plugin to determine that. */
> > + if (acc_get_device_type () == acc_device_host_nonshm
> > + && thr && thr->target_tls
> > + && ((struct nonshm_thread *)thr->target_tls)->nonshm_exec)
> > return dev == acc_device_host_nonshm || dev == acc_device_not_host;
> >
> > /* Just rely on the compiler builtin. */
>
> Really, acc_on_device is implemented as a compiler builtin (which is just
> disabled for a few libgomp test cases, in order to test the acc_on_device
> library function in libgomp), and I never understood why the "fallback"
> implementation in libgomp (cited above) should be doing anything
> different from the GCC builtin. Is the "problem" actually, that some

The question is whether the builtin expansion itself is wrong, at least as
long as the host_nonshm device is meant to be supported.  The
#ifdef ACCEL_COMPILER
case is easier, at least as long as ACCEL_COMPILER compiled code is not
meant to be able to offload to other devices (or host again), but the
non-ACCEL_COMPILER case means the code is either on the host, or
host_nonshm, or e.g. with Intel MIC you could have some shared library
compiled by the host compiler, but then actually linked into the MIC
offloaded path.  In all those cases, I think only the library can
determine the return value.
E.g. the OpenMP omp_is_initial_device function is also implemented only in
the library; perhaps at some point I could expand it as a builtin for the
#ifdef ACCEL_COMPILER case, but not for the host code, if only because of
the host_nonshm plugin.
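
To illustrate the point about host-compiled code, here is a rough
standalone sketch, not libgomp code.  All sk_* names, the enum and the
thread-local flag are made up for the example; the flag only stands in
for the kind of per-thread state the quoted patch tests.  It compares an
answer fixed at compile time with one computed by the library at run time:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device kinds for this sketch; not libgomp's acc_device_t.  */
enum sk_device { SK_HOST, SK_HOST_NONSHM, SK_NOT_HOST };

/* Assumed per-thread flag that a plugin would set while it runs
   "offloaded" code on the host.  */
static _Thread_local bool sk_in_offload_region = false;

/* Compile-time style answer: it can only reflect which compiler built
   the code, not where the code ends up running.  */
static int
sk_on_device_builtin (enum sk_device dev)
{
#ifdef ACCEL_COMPILER
  return dev == SK_NOT_HOST;
#else
  return dev == SK_HOST;
#endif
}

/* Library style answer: consult the runtime state.  */
static int
sk_on_device_library (enum sk_device dev)
{
  if (sk_in_offload_region)
    return dev == SK_HOST_NONSHM || dev == SK_NOT_HOST;
  return dev == SK_HOST;
}

/* Stand-in for a plugin hook that runs FN as an offloaded region.  */
static int
sk_run_offloaded (int (*fn) (void))
{
  sk_in_offload_region = true;
  int ret = fn ();
  sk_in_offload_region = false;
  return ret;
}

/* Host-compiled bodies that ask "am I on host_nonshm?".  */
static int sk_region_builtin (void) { return sk_on_device_builtin (SK_HOST_NONSHM); }
static int sk_region_library (void) { return sk_on_device_library (SK_HOST_NONSHM); }

/* Inside the region only the library variant notices the change.  */
static void
sk_self_check (void)
{
  assert (!sk_run_offloaded (sk_region_builtin));
  assert (sk_run_offloaded (sk_region_library));
}
```

Built without ACCEL_COMPILER defined, sk_region_builtin keeps answering
as plain host code even when run through sk_run_offloaded, while
sk_region_library follows the runtime flag.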

	Jakub