https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102571

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|libgomp                     |tree-optimization

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Seems like a tree DSE bug to me.
We have before dse1:
  D.2119 = 0.0;
  MEM <char[6]> [(long double *)&D.2119 + 10B] = {};
  __builtin_GOMP_atomic_start ();
  _58 = MEM[(long double * {ref-all})&ld];
  D.2118 = _58;
  MEM <char[6]> [(long double *)&D.2118 + 10B] = {};
  _13 = __builtin_memcmp (&D.2118, &D.2119, 16);
where D.2119 and D.2118 have long double type.
But dse1 says:
  Deleted redundant store: MEM <char[6]> [(long double *)&D.2119 + 10B] = {};
and removes that store, but the store isn't really redundant: long double has
padding bits, and the D.2119 = 0.0 assignment need not write them.
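
To make the failure mode concrete, here is a minimal byte-level model, not GCC
code. It assumes the x86-64 layout of long double (16-byte object, 10 value
bytes, 6 trailing padding bytes) and assumes that storing the value writes only
the value bytes. The names ld_image, store_zero_value, clear_padding_bytes and
zero_images_compare_equal are hypothetical and exist only for this sketch.

```c
#include <string.h>

/* Assumed x86-64 layout: 16-byte object, 10 value bytes, 6 padding bytes.  */
enum { LD_SIZE = 16, LD_VALUE_BYTES = 10 };

/* Hypothetical byte-level stand-in for a long double temporary.  */
typedef struct { unsigned char bytes[LD_SIZE]; } ld_image;

/* Models "D = 0.0" under the assumption that only value bytes are written.  */
static void
store_zero_value (ld_image *obj)
{
  memset (obj->bytes, 0, LD_VALUE_BYTES);
}

/* Models "MEM <char[6]> [&D + 10B] = {}": zero the padding bytes.  */
static void
clear_padding_bytes (ld_image *obj)
{
  memset (obj->bytes + LD_VALUE_BYTES, 0, LD_SIZE - LD_VALUE_BYTES);
}

/* Returns 1 if two temporaries that both hold 0.0 compare equal under a
   full-width memcmp.  keep_padding_store selects whether the padding
   store is kept (1) or has been deleted as "redundant" (0).  */
static int
zero_images_compare_equal (int keep_padding_store)
{
  ld_image a, b;

  /* Different stale contents, as two distinct stack slots might have.  */
  memset (a.bytes, 0xAA, LD_SIZE);
  memset (b.bytes, 0x55, LD_SIZE);

  store_zero_value (&a);
  store_zero_value (&b);
  if (keep_padding_store)
    {
      clear_padding_bytes (&a);
      clear_padding_bytes (&b);
    }
  return memcmp (a.bytes, b.bytes, LD_SIZE) == 0;
}
```

In this model zero_images_compare_equal (1) reports equality, while
zero_images_compare_equal (0) does not, because the stale padding bytes still
differ when the 16-byte memcmp runs.
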
For OpenMP, perhaps I could work around this: if I can prove that the
non-padding bits in the type are contiguous and followed only by padding bits,
and that the boundary between the two is on a byte boundary, I could optimize
  __builtin_clear_padding (&arg1); __builtin_clear_padding (&arg2);
  x = __builtin_memcmp (&arg1, &arg2, 16); // or 12 for ia32
into just
  x = __builtin_memcmp (&arg1, &arg2, 10);
(and I'll probably implement it tomorrow).
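
The precondition for that shortcut can be sketched as a check on a per-byte
padding mask. This is only an illustration under assumptions: the mask
representation (one mask byte per object byte, set bits marking padding bits)
and the helpers value_prefix_bytes and compare_value_prefix are hypothetical,
not what the compiler actually uses.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper.  mask[i] has a bit set for every padding bit in
   byte i of the object.  Returns the number of leading value bytes if
   the mask has the shape "only value bytes, then only padding bytes",
   or -1 if padding appears elsewhere or the boundary is inside a byte.  */
static long
value_prefix_bytes (const unsigned char *mask, size_t size)
{
  size_t i = 0;

  while (i < size && mask[i] == 0x00)
    i++;
  size_t prefix = i;
  while (i < size && mask[i] == 0xFF)
    i++;
  return i == size ? (long) prefix : -1;
}

/* Returns 1 and stores the memcmp result over the value prefix in
   *result when the shortcut applies; returns 0 and leaves *result
   untouched otherwise (the caller must then clear padding first).  */
static int
compare_value_prefix (const void *a, const void *b,
                      const unsigned char *mask, size_t size, int *result)
{
  long prefix = value_prefix_bytes (mask, size);

  if (prefix < 0)
    return 0;
  *result = memcmp (a, b, (size_t) prefix);
  return 1;
}
```

For a mask of ten 0x00 bytes followed by six 0xFF bytes (the assumed x86-64
long double layout), value_prefix_bytes yields 10, so only the first 10 bytes
are compared and differing padding no longer matters.
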
But that won't really help libstdc++, which, once the
https://gcc.gnu.org/pipermail/libstdc++/2021-September/053210.html patch is in,
will use __builtin_clear_padding + __atomic_*, most likely in different inline
functions, and will almost certainly suffer from this.
In
  val = 0;
  MEM ... [&val + ...] = {};
the second stmt is only redundant if the type doesn't have any padding bits...
I guess gimple-fold.c could provide a helper that answers whether the type has
padding bits more cheaply than what is currently provided.
Then there is the question of unions: __builtin_clear_padding needs to be
conservative for unions and only clear bits that are proven to be padding in
all union members.  But to disable the DSE optimization, we probably need to
ask a different question: instead of "does the type have any padding bits that
would be cleared by __builtin_clear_padding?", ask "can the type have any
padding bits?", where for unions the answer would be true if any member has
them, rather than only if all the members have them in the same positions.
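
The difference between the two union questions can be sketched with per-member
padding masks. This is a simplified model under assumptions: each member is
described by a mask spanning the whole union (set bits mark padding bits, and
the caller marks bytes past the end of a smaller member as 0xFF), and the
helpers union_clearable_mask and union_may_have_padding are hypothetical names,
not gimple-fold.c functions.

```c
#include <stddef.h>

/* First question: which bits may __builtin_clear_padding clear?
   Only bits that are padding in every member (intersection).  */
static void
union_clearable_mask (const unsigned char *const *member_masks,
                      size_t n_members, size_t size, unsigned char *out)
{
  for (size_t i = 0; i < size; i++)
    {
      unsigned char m = 0xFF;

      for (size_t k = 0; k < n_members; k++)
        m &= member_masks[k][i];
      out[i] = m;
    }
}

/* Second question, the one DSE would need: can the union have padding
   bits at all?  True as soon as any member has any padding bit.  */
static int
union_may_have_padding (const unsigned char *const *member_masks,
                        size_t n_members, size_t size)
{
  for (size_t k = 0; k < n_members; k++)
    for (size_t i = 0; i < size; i++)
      if (member_masks[k][i] != 0)
        return 1;
  return 0;
}
```

For something like union { long double ld; char raw[16]; } with the assumed
x86-64 layout, the clearable mask comes out empty because raw has no padding,
yet union_may_have_padding is true because ld has six padding bytes. That is
the case where the two questions give different answers.
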
