https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102436

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Target Milestone|---                         |11.3
           Priority|P3                          |P2
           Keywords|                            |missed-optimization
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
   Last reconfirmed|                            |2021-09-22
             Status|UNCONFIRMED                 |ASSIGNED

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Memory reference 3: numb_moves
Memory reference 4: _24->from
...
Querying dependency of refs 3 and 4: dependent.
Querying SM WAW dependencies of ref 3 in loop 1: dependent

The issue is that we require conditionally executed stores to be independent
of all other stores, as we cannot re-issue other stores on exit in the proper
order.

Now, in this case the dependent stores are executed under the same condition
and in fact ordered in a way that we don't have to re-issue any dependent
store.

We're failing to handle this special case after the store-motion re-write that
fixed the TBAA issues.

Smaller testcase where we can just issue the conditional store to 'p':

unsigned p;
void foo (float *q)
{
  for (int i = 0; i < 256; ++i)
    {
      if (p)
        {
          unsigned a = p;
          *(q++) = 1.;
          p = a + 1;
        }
    }
}
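
For illustration only, a hand-written sketch of what store motion of 'p'
could look like for this testcase. It is not output from the pass, and the
names foo_sm, p_reg and p_stored are made up here. The store through 'q'
stays in the loop and always precedes the store to 'p' under the same
condition, so a single flagged store on exit is enough:

```c
#include <stdbool.h>

unsigned p;

/* Hypothetical hand-applied store motion for the "trivial" testcase.  */
void foo_sm (float *q)
{
  unsigned p_reg = p;     /* load 'p' once before the loop */
  bool p_stored = false;  /* set when the conditional store executed */
  for (int i = 0; i < 256; ++i)
    {
      if (p_reg)
        {
          unsigned a = p_reg;
          *(q++) = 1.;    /* dependent store, stays in the loop */
          p_reg = a + 1;  /* store to 'p' becomes a register update */
          p_stored = true;
        }
    }
  if (p_stored)
    p = p_reg;            /* single conditional store on exit */
}
```

No other store needs to be re-issued after the exit store, because in every
iteration that stores to 'p' the store through 'q' comes first.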

The following is much more difficult to handle
(we have to issue a conditional sequence of two stores, remember the
location the non-invariant store stored to _and_ verify we can re-emit that
out-of-order, and remember the value stored):

unsigned p;
void foo (float *q)
{
  for (int i = 0; i < 256; ++i)
    {
      if (p)
        {
          unsigned a = p;
          p = a + 1;
          *(q++) = 1.;
        }
    }
}

a bit easier (the store we have to re-issue is always executed after the
last conditional store):

unsigned p;
void foo (float *q)
{
  for (int i = 0; i < 256; ++i)
    {
      if (p)
        {
          unsigned a = p;
          p = a + 1;
        }
      *(q++) = 1.;
    }
}
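
Again purely as a sketch, under the assumption that re-issuing the last
store through 'q' after the exit store to 'p' is what restores the original
store order (foo_sm2, p_reg and p_stored are made-up names, and this is not
what the pass currently does):

```c
#include <stdbool.h>

unsigned p;

/* Hypothetical hand-applied store motion for the "bit easier" testcase.  */
void foo_sm2 (float *q)
{
  unsigned p_reg = p;     /* load 'p' once before the loop */
  bool p_stored = false;  /* set when the conditional store executed */
  for (int i = 0; i < 256; ++i)
    {
      if (p_reg)
        {
          p_reg = p_reg + 1;
          p_stored = true;
        }
      *(q++) = 1.;        /* unconditional, always after the 'p' store */
    }
  if (p_stored)
    {
      p = p_reg;          /* conditional store on exit */
      q[-1] = 1.;         /* re-issue the last store through 'q' */
    }
}
```

The re-issued store needs neither a remembered location nor a remembered
value beyond the final 'q' and the constant, which is what makes this
variant easier than the previous one.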

impossible / invalid:

unsigned p;
void foo (float *q)
{
  for (int i = 0; i < 256; ++i)
    {
      *(q++) = 1.;
      if (p)
        {
          unsigned a = p;
          p = a + 1;
        }
    }
}

I will see how difficult it is to teach the already intertwined code the
"trivial" case and whether the slightly easier case falls out naturally.
