http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59533
--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> ---
The shll trick above will not work properly in the following case:

int test_00 (unsigned char* a, int b)
{
  return a[0] - (a[0] < 128);
}

results in:

        mov.b   @r4,r1
        extu.b  r1,r0
        exts.b  r1,r1
        cmp/pz  r1
        movt    r1
        rts
        sub     r1,r0

It is initially expanded as:

(insn 10 9 11 2 (set (reg:SI 173)
        (not:SI (reg:SI 172))) sh_tmp.cpp:188 -1
     (nil))

(insn 11 10 12 2 (parallel [
            (set (reg:SI 171 [ D.1381+-3 ])
                (lshiftrt:SI (reg:SI 173)
                    (const_int 31 [0x1f])))
            (clobber (reg:SI 147 t))
        ]) sh_tmp.cpp:188 -1
     (nil))

(insn 12 11 13 2 (set (reg:SI 170 [ D.1379 ])
        (minus:SI (reg:SI 160 [ D.1378+-3 ])
            (reg:SI 171 [ D.1381+-3 ]))) sh_tmp.cpp:188 -1
     (nil))

and then combine will not try to integrate the not insn and will only try:

Failed to match this instruction:
(set (reg:SI 170 [ D.1379 ])
    (minus:SI (zero_extend:SI (reg:QI 169 [ *a_2(D) ]))
        (lshiftrt:SI (reg:SI 173)
            (const_int 31 [0x1f]))))

The cmp/pz insn will be split out after combine and thus will not be
combined into a subc insn.

One problem is that the function emit_store_flag_1 in expmed.c always
expands the not-shift sequence on the assumption that it is cheaper.
This is not true for SH; ideally it would check insn costs during
expansion (a small C illustration of the not-shift idiom is sketched at
the end of this comment).

On the other hand, there are other cases such as

unsigned int test_00 (unsigned int a)
{
  return (a >> 31) ^ 1;
}

which currently results in:

        shll    r4
        movt    r0
        rts
        xor     #1,r0

and could be done more simply as:

        cmp/pz  r4
        rts
        movt    r0

In this case combine will try to merge the shift and the xor into:

Failed to match this instruction:
(set (reg:SI 164 [ D.1394 ])
    (ge:SI (reg:SI 4 r4 [ a ])
        (const_int 0 [0])))

Adding such a pattern will also result in cmp/pz being used for all the
other cases above (without the shll trick), but it will fail to combine
with any other insn that takes the result as an operand in the T_REG,
such as addc, subc or rotcl/rotcr.  In order to get that working, all
the insns that can take the T_REG as an operand would require
insn_and_split variants that include (ge:SI (reg:SI) (const_int 0)),
which will be quite a lot of patterns (a rough sketch of such a pattern
is given at the end of this comment).

Maybe it would be better to add a kind of SH-specific RTL pass that
performs insn pre-combination before the generic combine pass and
catches such special cases.  It could probably also be used to address
the issues mentioned in PR 59291.
Alternatively, the combine pass could be run twice on SH, although that
seems to cause some problems.  Even if it worked, I'm afraid it would
probably have a negative impact on compilation time.
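For reference, here is a minimal C sketch (not GCC source) of the
not-shift idiom that emit_store_flag_1 produces for an "x >= 0" test on
a 32-bit target; inverting x flips the sign bit, and the logical shift
moves it down into bit 0:

static inline int
ge_zero (int x)
{
  /* ~x has its sign bit set exactly when x >= 0; the logical shift by
     31 extracts that bit, yielding 1 if x >= 0, else 0.  */
  return (int) (~(unsigned int) x >> 31);
}

On SH the logical shift by 31 itself needs the shll/movt trick shown
above, which is why cmp/pz + movt ends up being the cheaper form here.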
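The missed combine pattern above could be caught by something along the
lines of the following insn_and_split sketch.  This is a rough,
assumption-laden draft, not actual sh.md contents: the pattern name is
made up and the predicates merely follow the SH backend's existing
conventions.  It matches the reg = (op >= 0) form that combine produces
and splits it into the T_REG compare (cmp/pz) followed by movt:

(define_insn_and_split "*movt_ge_0"  ;; hypothetical name
  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
        (ge:SI (match_operand:SI 1 "arith_reg_operand" "r")
               (const_int 0)))
   (clobber (reg:SI T_REG))]
  "TARGET_SH1"
  "#"
  "&& 1"
  [;; cmp/pz: T = (op >= 0)
   (set (reg:SI T_REG) (ge:SI (match_dup 1) (const_int 0)))
   ;; movt: reg = T
   (set (match_dup 0) (reg:SI T_REG))])

As noted above, this alone would not help the T_REG-consuming insns
(addc, subc, rotcl/rotcr); each of those would need a similar variant
that accepts the (ge:SI (reg:SI) (const_int 0)) form in place of
(reg:SI T_REG).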