[Bug target/49263] SH Target: underutilized "TST #imm, R0" instruction

olegendo at gcc dot gnu.org Sun, 08 Dec 2013 05:47:22 -0800

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263


--- Comment #21 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #18)
> 
> It seems that combine is trying to look for the following patterns:
> 
> Failed to match this instruction:
> (set (pc)
>     (if_then_else (ne (and:SI (reg:SI 5 r5 [ xb ])
>                 (const_int 85 [0x55]))
>             (const_int 0 [0]))
>         (label_ref:SI 15)
>         (pc)))

Implementing a combine pattern such as ...
(define_insn_and_split "*tst_cbranch"
  [(set (pc)
    (if_then_else (ne (and:SI (match_operand:SI 0 "logical_operand")
                  (match_operand:SI 1 "const_int_operand"))
              (const_int 0))
              (label_ref (match_operand 2))
              (pc)))
   (clobber (reg:SI T_REG))]
  "TARGET_SH1"
  "#"
  "&& 1"
  [(set (reg:SI T_REG) (eq:SI (and:SI (match_dup 0) (match_dup 1))
                  (const_int 0)))
   (set (pc) (if_then_else (eq (reg:SI T_REG) (const_int 0))
               (label_ref (match_dup 2))
               (pc)))])


results in code such as the following:
        mov     #33,r1
        mov     r5,r0
        tst     #33,r0
        bf/s    .L3
        and     r5,r1
        mov.l   r1,@r4
.L3:
        rts
        nop

which is worse, because the and insn is still emitted in addition to the tst.
What happens is that the sequence is expanded to RTL as follows:

(insn 7 4 8 2 (set (reg:SI 163 [ D.1856 ])
        (and:SI (reg/v:SI 162 [ xb ])
            (const_int 33 [0x21]))) sh_tmp.cpp:17 -1
     (nil))
(insn 8 7 9 2 (set (reg:SI 147 t)
        (eq:SI (reg:SI 163 [ D.1856 ])
            (const_int 0 [0]))) sh_tmp.cpp:17 -1
     (nil))
(jump_insn 9 8 10 2 (set (pc)
        (if_then_else (eq (reg:SI 147 t)
                (const_int 0 [0]))
            (label_ref:SI 15)
            (pc))) sh_tmp.cpp:17 301 {*cbranch_t}
     (int_list:REG_BR_PROB 3900 (nil))
 -> 15)
(note 10 9 11 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 11 10 12 4 (set (reg:SI 164)
        (const_int 0 [0])) sh_tmp.cpp:18 -1
     (nil))
(insn 12 11 15 4 (set (mem:SI (reg/v/f:SI 161 [ x ]) [2 *x_5(D)+0 S4 A32])
        (reg:SI 164)) sh_tmp.cpp:18 -1
     (nil))


and the cse1 pass decides that the result of the and operation can be shared,
since it is known to be zero on the fall-through path, and replaces the source
operand in insn 12 with reg:SI 163:

(insn 12 11 15 3 (set (mem:SI (reg/v/f:SI 161 [ x ]) [2 *x_5(D)+0 S4 A32])
        (reg:SI 163 [ D.1856 ])) sh_tmp.cpp:18 258 {movsi_ie}
     (expr_list:REG_DEAD (reg:SI 164)
        (expr_list:REG_DEAD (reg/v/f:SI 161 [ x ])
            (nil))))

and insn 11 becomes dead code and is eliminated.
All of that happens long before combine, so the tst combine patterns have
no chance to reconstruct the original code.
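
To make the effect of cse1 easier to see, here is a rough C-level sketch.
The source of sh_tmp.cpp is not shown in this comment, so the function below
is only a guess reconstructed from the RTL above (x in r4, xb in r5, mask 33);
the names test_orig, test_after_cse and check_equivalent are made up for
illustration.  The second function is what the code effectively looks like
after cse1 has replaced the stored constant with the result of the and.

```c
#include <assert.h>

/* Hypothetical reconstruction of the test case; the real sh_tmp.cpp
   source is not part of this comment.  */
void test_orig (int *x, int xb)
{
  if ((xb & 33) == 0)   /* insns 7-9: and, compare with 0, branch */
    *x = 0;             /* insns 11-12: load constant 0, store it */
}

/* Roughly what remains after cse1: the and result (reg 163) is known
   to be zero on the fall-through path, so the store reuses it and the
   separate constant load (insn 11) goes away.  */
void test_after_cse (int *x, int xb)
{
  int tmp = xb & 33;    /* reg 163, now live across the branch */
  if (tmp == 0)
    *x = tmp;           /* store uses reg 163 instead of a fresh 0 */
}

/* Both forms must leave *x in the same state for any xb.  */
void check_equivalent (int xb)
{
  int a = -1, b = -1;
  test_orig (&a, xb);
  test_after_cse (&b, xb);
  assert (a == b);
}
```

The two functions behave the same, but in the second one the and result has
a real use besides the comparison, so a tst insn (which only sets T) can no
longer replace the and.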

A sequence such as

        mov     r5,r0
        mov     #0,r1
        tst     #33,r0
        bf      .L3
        mov.l   r1,@r4
.L3:
        rts
        nop

could probably be achieved by combining insn 7 and insn 8 shortly after RTL
expansion, or even during the expansion of insn 8 (by looking at previous
already expanded insns and emitting a tst insn directly).
The idea would be to reduce dependencies on the tested register, which allows
better scheduling.  In addition to that, on SH4A "mov #imm8,Rn" is an MT group
instruction which has a higher probability of being executed in parallel.
