Hi! On Thu, 5 Sep 2013 18:11:05 +0200, Jakub Jelinek <[email protected]> wrote: > 3) I figured out we need to tell the runtime library not just > address, size and kind, but also alignment (we won't need that for > the #pragma omp declare target global vars though), so that the > runtime library can properly align it. As TYPE_ALIGN/DECL_ALIGN > is in bits and is 32 bit wide, when that is in bytes and we only care > about power of twos, I've decided to encode it in the upper 5 bits > of the kind (lower 3 bits are used for OMP_CLAUSE_MAP_* kind).
Unfortunately, this scheme breaks down with OpenACC: we need an
additional bit to codify a flag for present_or_* map clauses (meaning:
only map the data (allocate/to/from/tofrom, as for OpenMP) if not already
present on the device).
With five bits available for the OpenMP case, we can describe alignments
up to 2 GiB, and I've empirically found on my development system that the
largest possible alignment is MAX_OFILE_ALIGNMENT, 256 MiB for ELF
systems, so that's fine. But with only four bits available, we get to
describe alignments up to 1 << ((1 << 4) - 1) = 32 KiB, which is too
small -- even though it'd be fine for "normal" usage of __attribute__
((aligned (x))).
So it seems our options are to use a bigger datatype for the kinds array,
to split off from the kinds array a new alignments array, or to generally
switch to using an array of a struct containing hostaddr, size,
alignment, kind. The latter would require additional changes in the
child_fn.
As it's an ABI change no matter what, would you like to see this limited
to OpenACC? Changing it also for OpenMP's GOMP_target would have the
advantage to have them not diverge (especially at the generating side in
omp-low.c's lowering functions), but I'm not sure whether such an ABI
change would easily be possible now, with the OpenMP 4 support merged
into trunk -- though, it is not yet part of a regular GCC release?
> --- gcc/omp-low.c.jj 2013-09-05 09:19:03.000000000 +0200
> +++ gcc/omp-low.c 2013-09-05 17:11:14.693638660 +0200
> @@ -9342,6 +9349,11 @@ lower_omp_target (gimple_stmt_iterator *
| unsigned char tkind = 0;
| switch (OMP_CLAUSE_CODE (c))
| {
| case OMP_CLAUSE_MAP:
| tkind = OMP_CLAUSE_MAP_KIND (c);
| break;
| case OMP_CLAUSE_TO:
| tkind = OMP_CLAUSE_MAP_TO;
| break;
| case OMP_CLAUSE_FROM:
| tkind = OMP_CLAUSE_MAP_FROM;
| break;
> default:
> gcc_unreachable ();
> }
> + unsigned int talign = TYPE_ALIGN_UNIT (TREE_TYPE (ovar));
> + if (DECL_P (ovar) && DECL_ALIGN_UNIT (ovar) > talign)
> + talign = DECL_ALIGN_UNIT (ovar);
> + talign = ceil_log2 (talign);
> + tkind |= talign << 3;
> CONSTRUCTOR_APPEND_ELT (vkind, purpose,
> build_int_cst (unsigned_char_type_node,
> tkind));
The use of OMP_CLAUSE_MAP_* on the generating and integer numerals on the
receiving (libgomp) side is a bit unesthetic, likewise for the hard-coded
3 in the bit shift. What would be the standard GCC way of sharing a
description of the tkind layout between gcc/omp-low.c and
libgomp/target.c? Are we allowed to #include (a new header file)
libgomp/target.h from gcc/omp-low.c?
To avoid silent breakage should alignments bigger than 2 GiB be allowed
in a distant future, would a check like the following be appropriate?
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -10378,6 +10383,11 @@ lower_omp_target (gimple_stmt_iterator *gsi_p,
omp_context *ctx)
unsigned int talign = TYPE_ALIGN_UNIT (TREE_TYPE (ovar));
if (DECL_P (ovar) && DECL_ALIGN_UNIT (ovar) > talign)
talign = DECL_ALIGN_UNIT (ovar);
+ const unsigned int talign_max
+ = 1 << ((1 << (BITS_PER_UNIT - 3)) - 1);
+ if (talign > talign_max)
+ sorry ("can't encode alignment of %u bytes, which is bigger than "
+ "%u bytes", talign, talign_max);
talign = ceil_log2 (talign);
tkind |= talign << 3;
CONSTRUCTOR_APPEND_ELT (vkind, purpose,
Grüße,
Thomas
pgp_bUwzkKWWX.pgp
Description: PGP signature
