https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52898
--- Comment #10 from Oleg Endo <olegendo at gcc dot gnu.org> ---
For a case like

int test3 (long long a)
{
  return a == 40;
}

what happens on SH2+ during RTL expansion:

  cstoredi4
  -> sh_emit_compare_and_set
  -> sh_emit_scc_to_t
  -> force operands to regs
  -> emit cmpeqdi_t insn

Then combine tries e.g.

Trying 6 -> 7:
Failed to match this instruction:
(set (reg:SI 147 t)
    (eq:SI (reg:DI 4 r4 [ a ])
        (const_int 40 [0x28])))

and in split1 this pattern

(define_split
  [(set (reg:SI T_REG)
        (eq:SI (match_operand:DI 0 "arith_reg_operand" "")
               (match_operand:DI 1 "arith_reg_or_0_operand" "")))]

splits everything up and the resulting code becomes:

        mov     #0,r3
        cmp/eq  r3,r5
        bt.s    .L5
        mov     #40,r2
        rts
        movt    r0
        .align 1
.L5:
        cmp/eq  r2,r4
        rts
        movt    r0

If the split pattern is disabled, the cmpeqdi_t pattern survives until the end:

        mov     #40,r2
        mov     #0,r3
        cmp/eq  r3,r5
        bf      0f
        cmp/eq  r2,r4
0:
        rts
        movt    r0

which is obviously less code, but has one more branch in the execution path.
This pattern probably should be used when optimizing for size or when
zero-displacement branches are fast.

On SH1 the cstoredi4 pattern is disabled because it might result in e.g.
cmpgtdi_t, which needs branches with delay slots. Because of that the middle
end expands some target independent code like:

        mov     #40,r1
        xor     r1,r4
        or      r4,r5
        tst     r5,r5
        rts
        movt    r0

which is actually a good branch-less alternative.
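
For illustration, the two strategies can be modelled in plain C. This is only a sketch, not anything taken from the SH backend: the function names are made up here, and it assumes that r4 holds the low word and r5 the high word of the argument, as the listings above suggest. What the compiler actually emits still depends on the patterns discussed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the SH1 sequence: xor / or / tst / movt.
   No branch is needed; the result is 1 only if both words match.  */
static int eq40_branchless (int64_t a)
{
  uint32_t lo = (uint32_t) a;                    /* assumed to be r4 */
  uint32_t hi = (uint32_t) ((uint64_t) a >> 32); /* assumed to be r5 */

  return ((lo ^ 40u) | hi) == 0;
}

/* Hypothetical model of the cmpeqdi_t sequence: compare the high
   words first and skip the low word compare if they differ.  */
static int eq40_two_compares (int64_t a)
{
  uint32_t lo = (uint32_t) a;
  uint32_t hi = (uint32_t) ((uint64_t) a >> 32);

  if (hi != 0u)          /* cmp/eq r3,r5 ; bf 0f */
    return 0;
  return lo == 40u;      /* cmp/eq r2,r4 ; movt r0 */
}

/* Small self-check: both models must agree on a few sample values.  */
static void check_agreement (void)
{
  const int64_t samples[] = { 0, 40, -1, 41, 40 + ((int64_t) 1 << 32) };
  unsigned int i;

  for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (eq40_branchless (samples[i]) == eq40_two_compares (samples[i]));
}
```

Both functions compute the same value; the difference is only whether a conditional branch sits in the execution path, which is the trade-off described above.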