http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60538

            Bug ID: 60538
           Summary: [SH] improve support for cmp/str insn
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: olegendo at gcc dot gnu.org
            Target: sh*-*-*

Currently the cmp/str insn is only utilized in some builtin string functions. 
There are some known higher level operations that could be implemented via the
cmp/str insn, most notably a 'contains zero byte' function, which can be done
in various ways:

bool haszero (unsigned int x)
{
  return (x - 0x01010101) & ~x & 0x80808080;
}

bool haszero2 (unsigned int x)
{
  return ~((((x & 0x7F7F7F7F) + 0x7F7F7F7F) | x) | 0x7F7F7F7F);
}

bool haszero3 (unsigned int x)
{
  return ((x & 0xFF) && (x & 0xFF00) && (x & 0xFF0000) && (x & 0xFF000000));
}

bool haszero4 (unsigned int x)
{
  return ((x & 0xFF) & ((x >> 8) & 0xFF)
         & ((x >> 16) & 0xFF) & ((x >> 24) & 0xFF)) == 0;
}

bool haszero5 (unsigned int x)
{
  // this one looks quite difficult for combine.
  const unsigned char* p = (const unsigned char*)&x;
  return (*p && *(p + 1) && *(p + 2) && *(p + 3)) == 0;
}

bool haszero6 (unsigned int x)
{
  // this one is probably impossible to do with combine patterns
  // due to the conditional branch involved.

  bool r = ((x + 0x7EFEFEFF) ^ ~x) & 0x81010100;
  if (r)
    r = ~((((x & 0x7F7F7F7F) + 0x7F7F7F7F) | v) | 0x7F7F7F7F);

  return r;
}

bool haszero7 (unsigned int a)
{
  return ((a & 0xFF) == 0
      | (a & 0xFF00) == 0
      | (a & 0xFF0000) == 0
      | (a & 0xFF000000) == 0);
}

The expected minimal SH code for the functions above those would be:

    mov       #0,r1
    cmp/str   r1,r4
    rts
    movt      r0


It would also be nice if the actual cmp/str operation was recognized, although
that's a bit more difficult to match.

A while ago I've already tried out adding combine patterns for those kinds of
things, with limited success.  The problem is that combine would try out only 3
or 4 insns at once and matching such complex operations require lots of combine
bridge patterns.
The annoying thing with combine bridge patterns are that they have to be split
again manually if they don't make it into the final complex pattern.

Moreover, if a combine bridge pattern is matched but doesn't make it into the
final pattern, there will be missed combine opportunities which would have been
taken if the bridge pattern wasn't matched.

One possible solution could be to have explicit combine bridge patterns.  If a
bridge pattern doesn't get combined with another (bridge) pattern, combine
should discard the match and try out other options.
Combine bridge patterns could be identified by e.g. special insn attributes or
by adding a 'define_bridge_insn' or something like that.

Another rather brute force option would be to run the combine pass (and split
pass) again after split1.

Reply via email to