http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

--- Comment #22 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #21)
> What happens is that the sequence is expanded to RTL as follows:
> 
> (insn 7 4 8 2 (set (reg:SI 163 [ D.1856 ])
>         (and:SI (reg/v:SI 162 [ xb ])
>             (const_int 33 [0x21]))) sh_tmp.cpp:17 -1
>      (nil))
> (insn 8 7 9 2 (set (reg:SI 147 t)
>         (eq:SI (reg:SI 163 [ D.1856 ])
>             (const_int 0 [0]))) sh_tmp.cpp:17 -1
>      (nil))
> (jump_insn 9 8 10 2 (set (pc)
>         (if_then_else (eq (reg:SI 147 t)
>                 (const_int 0 [0]))
>             (label_ref:SI 15)
>             (pc))) sh_tmp.cpp:17 301 {*cbranch_t}
>      (int_list:REG_BR_PROB 3900 (nil))
>  -> 15)
> (note 10 9 11 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> (insn 11 10 12 4 (set (reg:SI 164)
>         (const_int 0 [0])) sh_tmp.cpp:18 -1
>      (nil))
> (insn 12 11 15 4 (set (mem:SI (reg/v/f:SI 161 [ x ]) [2 *x_5(D)+0 S4 A32])
>         (reg:SI 164)) sh_tmp.cpp:18 -1
>      (nil))
> 
> 
> and insn 11 becomes dead code and is eliminated.
> All of that happens a long time before combine, so the tst combine patterns
> have no chance to reconstruct the original code.
> 
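For reference, here is a minimal sketch of a source shape that would expand to RTL like the dump quoted above. The actual sh_tmp.cpp is not shown in this comment, so the function name and signature below are assumptions made up for illustration; only the constant 33, the comparison against zero and the store of 0 through the pointer are taken from the dump.

```cpp
// Hypothetical reconstruction of the test case; the real sh_tmp.cpp is not
// part of this comment, so the name and signature are illustrative only.
void clear_if_bits_unset(int* x, int xb)
{
    // insn 7 / insn 8: T = ((xb & 33) == 0)
    // jump_insn 9:     branch around the store when T is clear
    if ((xb & 33) == 0)
        *x = 0;   // insn 11 / insn 12: store the constant 0 through x
}
```

On the fall-through path the result of the 'and' is known to be zero, which is presumably why the separate constant load in insn 11 ends up dead.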

Adding an early peephole pass as described in
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59533#c2 and then adding the
following peephole:

;; Peephole after initial expansion.
(define_peephole2
  [(set (match_operand:SI 0 "arith_reg_dest")
    (and:SI (match_operand:SI 1 "arith_reg_operand")
        (match_operand:SI 2 "logical_operand")))
   (set (reg:SI T_REG) (eq:SI (match_dup 0) (const_int 0)))]
  "TARGET_SH1 && can_create_pseudo_p ()"
  [(set (reg:SI T_REG) (eq:SI (and:SI (match_dup 1) (match_dup 2))
                  (const_int 0)))
   (set (match_dup 0) (and:SI (match_dup 1) (match_dup 2)))])

... fixes the problem and results in more uses of the tst #imm,r0 insn
according to the CSiBE set.  On the other hand, total code size over the
whole set increases by 792 bytes.  Below are some things that get worse
in the Linux source (mm/filemap.c):

        mov.b   @(15,r1),r0    ->    mov.b   @(15,r1),r0
        cmp/pz  r0                   tst     #128,r0     // cmp/pz has less
        bf      .L1016               bf      .L1001      // pressure on r0


        mov.b   @(15,r0),r0     ->   mov.b   @(15,r0),r0
        tst     #4,r0                shar    r0
        bf      .L107                shar    r0
                                     tst     #1,r0


        add     #16,r0          ->   add     #16,r0
        mov.b   @(15,r0),r0          mov.b   @(15,r0),r0
        tst     #16,r0               mov     #-4,r1
        bf/s    .L509                shad    r1,r0
                                     tst     #1,r0
                                     bf/s    .L509
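
In each of the three pairs above the left and right sequences compute the same T bit for a byte loaded with mov.b (which sign-extends into r0); only the instruction count and register usage differ. A small sketch to check that, assuming the usual SH semantics (tst sets T when the masked value is zero, cmp/pz sets T when the value is >= 0, shar and shad with a negative count are arithmetic right shifts). The helper names are made up for this example and are not GCC code:

```cpp
#include <cstdint>

// Arithmetic shift right by n, written so that it does not depend on how
// the compiler implements >> on negative values.
inline std::int32_t sar(std::int32_t x, int n)
{
    return x < 0 ? ~(~x >> n) : (x >> n);
}

// T bit after "tst #imm,r0": set when (r0 & imm) == 0.
inline bool tst(std::int32_t r0, std::int32_t imm)
{
    return (r0 & imm) == 0;
}

// T bit after "cmp/pz r0": set when r0 >= 0.
inline bool cmp_pz(std::int32_t r0)
{
    return r0 >= 0;
}

// Returns true if, for every sign-extended byte value, both columns of
// each of the three examples produce the same T bit.
inline bool all_pairs_agree()
{
    for (int v = -128; v <= 127; ++v) {
        std::int32_t r0 = v;
        // cmp/pz r0  vs.  tst #128,r0
        if (cmp_pz(r0) != tst(r0, 128)) return false;
        // tst #4,r0  vs.  shar r0; shar r0; tst #1,r0
        if (tst(r0, 4) != tst(sar(sar(r0, 1), 1), 1)) return false;
        // tst #16,r0  vs.  mov #-4,r1; shad r1,r0; tst #1,r0
        if (tst(r0, 16) != tst(sar(r0, 4), 1)) return false;
    }
    return true;
}
```

The check only covers the sign-extended byte range, which is what the mov.b loads in the examples produce.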
