http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59533

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> ---
The shll trick above will not work properly in the following case:

int
test_00 (unsigned char* a, int b)
{
  return a[0] - (a[0] < 128);
}

which results in:
        mov.b   @r4,r1
        extu.b  r1,r0
        exts.b  r1,r1
        cmp/pz  r1
        movt    r1
        rts
        sub     r1,r0

It is initially expanded as

(insn 10 9 11 2 (set (reg:SI 173)
        (not:SI (reg:SI 172))) sh_tmp.cpp:188 -1
     (nil))
(insn 11 10 12 2 (parallel [
            (set (reg:SI 171 [ D.1381+-3 ])
                (lshiftrt:SI (reg:SI 173)
                    (const_int 31 [0x1f])))
            (clobber (reg:SI 147 t))
        ]) sh_tmp.cpp:188 -1
     (nil))
(insn 12 11 13 2 (set (reg:SI 170 [ D.1379 ])
        (minus:SI (reg:SI 160 [ D.1378+-3 ])
            (reg:SI 171 [ D.1381+-3 ]))) sh_tmp.cpp:188 -1
     (nil))

and then combine does not try to integrate the not insn; it only tries:
Failed to match this instruction:
(set (reg:SI 170 [ D.1379 ])
    (minus:SI (zero_extend:SI (reg:QI 169 [ *a_2(D) ]))
        (lshiftrt:SI (reg:SI 173)
            (const_int 31 [0x1f]))))

The cmp/pz insn is split out only after combine, and thus it never gets
combined into a subc insn.

One problem is that the function emit_store_flag_1 in expmed.c always expands
the not-shift form because it assumes that form is cheaper.  That assumption
does not hold for SH, and ideally the expansion should check insn costs.
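For reference, the not-shift expansion for the first example can be written
out in C (hypothetical helper name, not from the compiler sources); it mirrors
insns 10 and 11 above:

```c
/* Hypothetical C rendering of how emit_store_flag_1 expands
   a[0] < 128: sign-extend the byte, invert it, then bring the
   inverted sign bit down with a logical shift.  Assumes 32-bit
   int, as on SH.  */
int
lt_128_via_not_shift (unsigned char a)
{
  int sext = (signed char) a;                /* exts.b               */
  unsigned int inv = ~ (unsigned int) sext;  /* insn 10: not         */
  return inv >> 31;                          /* insn 11: lshiftrt 31 */
}
```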



On the other hand, there are other cases such as

unsigned int
test_00 (unsigned int a)
{
  return (a >> 31) ^ 1;
}

which currently results in:
        shll    r4
        movt    r0
        rts
        xor     #1,r0

and could be done more simply as:
        cmp/pz  r4
        rts
        movt    r0

In this case combine will try to merge the shift and the xor into:
Failed to match this instruction:
(set (reg:SI 164 [ D.1394 ])
    (ge:SI (reg:SI 4 r4 [ a ])
        (const_int 0 [0])))
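The equivalence combine is looking for here can be sketched in C (hypothetical
function names); both forms compute the "sign bit clear" test that cmp/pz
implements directly:

```c
/* (a >> 31) ^ 1 on an unsigned 32-bit value is the signed
   "a >= 0" test -- exactly what cmp/pz computes into T.  */
unsigned int
shift_xor (unsigned int a)
{
  return (a >> 31) ^ 1;
}

unsigned int
ge_zero (unsigned int a)
{
  /* Portable form of (int) a >= 0, avoiding the implementation-
     defined unsigned-to-signed conversion.  */
  return a <= 0x7fffffffu;
}
```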

Adding such a pattern will also result in cmp/pz being used for all the other
cases above (without the shll trick), but it will fail to combine with any
other insn that takes the result as a T_REG operand, such as addc, subc or
rotcl/rotcr.  To get that working, every insn that can take T_REG as an
operand would need an insn_and_split variant that matches (ge:SI (reg:SI)
(const_int 0)), which would be quite a lot of patterns.

Maybe it would be better to add an SH-specific RTL pass that performs insn
pre-combination before the generic combine pass and catches such special
cases.  It could probably also be used to address the issues mentioned in
PR 59291.

Alternatively, the combine pass could be run twice on SH, although that seems
to cause some problems.  Even if it worked, I'm afraid it would probably have
a negative impact on compilation time.
