On Wed, Jun 29, 2022 at 08:10:10PM +0200, Tobias Burnus wrote:
> > > +  if (output_requires)
> > > +    {
> > > +      HOST_WIDE_INT val = ((HOST_WIDE_INT) omp_requires_mask
> > > +                       & (OMP_REQUIRES_UNIFIED_ADDRESS
> > > +                          | OMP_REQUIRES_UNIFIED_SHARED_MEMORY
> > > +                          | OMP_REQUIRES_REVERSE_OFFLOAD
> > > +                          | OMP_REQUIRES_TARGET_USED));
> > Why is the OMP_REQUIRES_TARGET_USED bit saved there?  It is always set
> > if output_requires...
> > If we want to make sure it is set in omp_requires_mask after streaming
> > in, we can just or it in back there.
> 
> It is there because it is later needed: With -flto, we otherwise
> wouldn't generate the variable in omp-offload.cc. And as this value is
> only used internally, I thought it could just stay there. However, it
> could also be excluded here and ...

Ok, let's keep it then.  Wanted to save that 1 bit somewhere, but as it
isn't a pack, it is insignificant anyway.
More could be saved by reordering the bitmasks such that the atomic stuff is
in upper bits.

> > > +              char buf[64], buf2[64];
> > > +              omp_requires_to_name (buf, sizeof (buf), 
> > > omp_requires_mask);
> > > +              omp_requires_to_name (buf2, sizeof (buf2), val);
> > > +              error ("OpenMP %<requires%> directive with non-identical "
> > > +                     "clauses in multiple compilation units: %qs vs. 
> > > %qs",
> > > +                     buf, buf2);
> > I think the user should be told also where, so file_data->file_name
> 
> With -save-temps, the filename is indeed helpful, e.g.:
> "a-requires-1.o". However, without, I get names like: "/tmp/ccIgRkOW.o".

Even when using
gcc -fopenmp -c foo.c -o foo.o
gcc -fopenmp -c bar.c -o bar.o
gcc -fopenmp -o foo foo.o bar.o
?
Anyway, I think it would be good to ask Richi or Honza what is the best to
get at the TU's filename.
Perhaps location_t from TRANSLATION_UNIT_DECL or something similar?

> > Can't we arrange for GOMP_offload_register_ver to see the bitmasks somewhere
> > instead (and force GOMP_offload_register_ver even if there are no vars or
> > funcs and just the requires mask)?
> 
> This probably works – but means some restructuring.
> GOMP_offload_register_ver assumes that num_devices is already available
> – and we need to exclude those for which the 'omp requires' cannot be
> fulfilled.

GOMP_offload_register_ver is called from initializers of programs and shared
libraries.  For the program and non-dlopened shared libraries, the usual
case is that it is called before we initialize devices, so we just record it
in offload_images and continue.  When the devices are initialized, we push
it to those devices.

Another case is registration after number of devices is initialized.

The former should just enqueue also those bitmasks, and when we initialize
devices we should complain on mismatches and/or filter out unsuitable
devices.

For the latter case, in theory (at least when we also catch calls to the
device routines with the meaning of device related omp_* APIs) that means
some TU already had the mask and so the later loaded masks should be the
same, so we should just complain on mismatches.

Now, the target data I'm afraid is device specific, but for GOMP_VERSION 2
we could e.g. have the pointer passed to the function point to target data
with the mask at offset -4 bytes before it, or can just have always the
bitmask followed by real target data.

        Jakub

Reply via email to