https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82142

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2017-09-12
                 CC|                            |jamborm at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I believe this is SRA's fault for exposing the padding in the first place.  Later
we do try to merge the stores "back", but obviously we can't do that across the holes.

Eventually we could make SRA emit CLOBBERs for the padding, but I'm not sure
whether that alone would help.

I think we may have duplicates of this SRA "issue".  Without vectorization
we end up with:

assignzero (struct foo * p)
{
  <bb 2> [100.00%] [count: INV]:
  MEM[(struct foo *)p_2(D)] = 0;
  MEM[(struct foo *)p_2(D) + 8B] = 0;
  MEM[(struct foo *)p_2(D) + 12B] = 0;
  MEM[(struct foo *)p_2(D) + 14B] = 0;
  MEM[(struct foo *)p_2(D) + 16B] = 0;
  MEM[(struct foo *)p_2(D) + 24B] = 0;
  MEM[(struct foo *)p_2(D) + 32B] = 0;
  MEM[(struct foo *)p_2(D) + 40B] = 0;
  MEM[(struct foo *)p_2(D) + 48B] = 0;
  MEM[(struct foo *)p_2(D) + 56B] = 0;
  return;

}

with vectorization:

assignzero (struct foo * p)
{
  <bb 2> [100.00%] [count: INV]:
  MEM[(struct foo *)p_2(D)] = 0;
  MEM[(struct foo *)p_2(D) + 8B] = 0;
  MEM[(struct foo *)p_2(D) + 12B] = 0;
  MEM[(struct foo *)p_2(D) + 14B] = 0;
  MEM[(struct foo *)p_2(D) + 16B] = { 0, 0, 0, 0, 0, 0, 0, 0 };
  MEM[(struct foo *)p_2(D) + 48B] = 0;
  MEM[(struct foo *)p_2(D) + 56B] = 0;
  return;

}

I think store-merging could make use of

  MEM[(...)p_2(D) + CST] = CLOBBER;

for the holes.
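
For reference, the store offsets in the dumps above (0, 8, 12, 14, 16, 24, 32,
40, 48, 56) are consistent with a layout along these lines on an LP64 target.
This is a speculative reconstruction for illustration, not taken from the PR
testcase; the holes store-merging stumbles over would be padding byte 15 and
tail-padding bytes 57-63:

```c
#include <stddef.h>

/* Hypothetical layout matching the store offsets in the dumps above.  */
struct foo {
  void *a;     /* offset 0, 8 bytes */
  int b;       /* offset 8, 4 bytes */
  short c;     /* offset 12, 2 bytes */
  char d;      /* offset 14; byte 15 is padding */
  double e[5]; /* offsets 16, 24, 32, 40, 48 */
  char f;      /* offset 56; bytes 57-63 are tail padding */
};

void
assignzero (struct foo *p)
{
  struct foo z = { 0 };  /* aggregate zero-init; SRA scalarizes this */
  *p = z;
}
```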

Note that, strictly speaking, since we optimize memset (.., 0, ..) to = {}, SRA
cannot simply replace = {} with setting only the non-padding members to zero.
Likewise we optimize memcpy (A, B, ...) to *A = *B; the same arguments about
padding apply.  OK, so we don't actually fold that aggressively.
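
The distinction can be illustrated with a small sketch (hypothetical struct,
standard C semantics): memset is guaranteed to zero every byte, padding
included, whereas a structure assignment leaves padding bytes with unspecified
values, so the two forms are not interchangeable once padding is observable:

```c
#include <string.h>

struct s {
  char c;  /* bytes 1-3 are padding on a typical ABI */
  int i;
};

/* Zeroes every byte of *p, padding included.  */
void
zero_memset (struct s *p)
{
  memset (p, 0, sizeof *p);
}

/* Zeroes the members; the padding bytes take unspecified values.  */
void
zero_assign (struct s *p)
{
  struct s z = { 0 };
  *p = z;
}
```

This is why folding memset (p, 0, n) to *p = {} and then letting SRA skip the
padding stores would change observable behavior for code that inspects the
object representation.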
