https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117111

--- Comment #2 from Kazumoto Kojima <kkojima at gcc dot gnu.org> ---
dbr_schedule is trying to fill the delay slot of

(jump_insn 17 16 42 (set (pc)
        (if_then_else (eq (reg:SI 147 t)
                (const_int 0 [0]))
            (label_ref:SI 94)
            (pc))) "fpone.c":7:11 232 {*cbranch_t}
     (int_list:REG_BR_PROB 536870916 (nil))
 -> 94)

with the fill_slots_from_thread function.  It takes the insn for the slot from
the thread

(insn 46 39 79 (set (reg:SI 8 r8 [orig:167 _1 ] [167])
        (reg:SI 147 t)) "fpone.c":7:11 303 {movt}
     (expr_list:REG_DEAD (reg:SI 147 t)
        (nil)))
(insn 79 46 80 (set (reg/i:SI 0 r0)
        (reg:SI 8 r8 [orig:167 _1 ] [167])) "fpone.c":8:1 191 {movsi_ie}
     (expr_list:REG_DEAD (reg:SI 8 r8 [orig:167 _1 ] [167])
        (nil)))

and makes a candidate insn

(insn 79 46 80 (set (reg/i:SI 0 r0)
        (reg:SI 147 t)) "fpone.c":8:1 303 {movt}
     (expr_list:REG_DEAD (reg:SI 8 r8 [orig:167 _1 ] [167])
        (nil)))

for trial.  fill_slots_from_thread calls try_split for this candidate first.
Here is a gdb backtrace where trial is the insn 79 above.

#0  try_split (pat=pat@entry=0x7ffff6f5ec48, trial=trial@entry=0x7ffff6f5d400, 
    last=last@entry=0) at /git/gcc/gcc/emit-rtl.cc:3932
#1  0x0000000000f5a748 in fill_slots_from_thread (insn=0x7ffff6e0cd38, 
    condition=0x7ffff6f5f0d8, thread_or_return=<optimized out>, 
    opposite_thread=<optimized out>, likely=false, 
    thread_if_true=<optimized out>, own_thread=true, 
    slots_to_fill=<optimized out>, pslots_filled=0x7fffffffd7ac, 
    delay_list=0x7fffffffd7d0) at /git/gcc/gcc/reorg.cc:2430
#2  0x0000000000f5d381 in fill_eager_delay_slots ()
    at /git/gcc/gcc/reorg.cc:2843
#3  dbr_schedule (first=<optimized out>) at /git/gcc/gcc/reorg.cc:3705

and try_split applies the splitter

sh.md: 11147
;; This is not a peephole, but it's here because it's actually supposed
;; to be one.  It tries to convert a sequence such as
;;      movt    r2      ->      movt    r2
;;      movt    r13             mov     r2,r13
;; This gives the schduler a bit more freedom to hoist a following
;; comparison insn.  Moreover, it the reg-reg mov insn is MT group which has
;; better chances for parallel execution.
;; We can do this with a peephole2 pattern, but then the cprop_hardreg
;; pass will revert the change.  See also PR 64331.
;; Thus do it manually in one of the split passes after register allocation.
;; Sometimes the cprop_hardreg pass might also eliminate the reg-reg copy.
(define_split
  [(set (match_operand:SI 0 "arith_reg_dest")
        (match_operand:SI 1 "t_reg_operand"))]
...

and returns

(insn 113 46 80 (set (reg/i:SI 0 r0)
        (reg:SI 8 r8 [orig:167 _1 ] [167])) "fpone.c":8:1 -1
     (nil))

Thus fill_slots_from_thread fills the slot with it.  Neither
fill_slots_from_thread nor the above splitter seems to expect this situation.
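
As a rough illustration of why the filled slot looks wrong, here is a toy C
model.  It is not GCC code; the names (struct state, exec, run_thread) are
made up for this sketch.  It assumes that the delay-slot insn executes before
insn 46 (movt r8), which stays in the thread, so r8 still holds a stale value
when the split result "mov r8,r0" runs.

```c
/* Toy machine state: the SH T bit plus the two registers involved. */
struct state { int t, r0, r8; };

/* Hypothetical opcodes standing in for the insns in the dumps above. */
enum opcode { MOVT_R0, MOVT_R8, MOV_R8_R0 };

static void exec(struct state *s, enum opcode op)
{
    switch (op) {
    case MOVT_R0:   s->r0 = s->t;  break;   /* movt r0   : r0 = T  */
    case MOVT_R8:   s->r8 = s->t;  break;   /* movt r8   : r8 = T  */
    case MOV_R8_R0: s->r0 = s->r8; break;   /* mov r8,r0 : r0 = r8 */
    }
}

/* Execute the delay-slot insn first, then the rest of the thread
   (assumed here to be just insn 46), and return the final r0. */
static int run_thread(int t, int stale_r8, enum opcode delay_slot)
{
    struct state s = { t, -1, stale_r8 };
    exec(&s, delay_slot);   /* delay slot runs before the thread */
    exec(&s, MOVT_R8);      /* insn 46 is still in the thread    */
    return s.r0;
}
```

With T = 1 and a stale r8 = 0, run_thread(1, 0, MOVT_R0) yields 1, while
run_thread(1, 0, MOV_R8_R0) yields 0: under the stated assumption, the split
insn reads r8 before anything has set it.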

BTW, the old RA generates slightly worse code for the thread:

.L6:
        movt    r1
        mov.l   r1,@r15
        mov.l   @r15,r0
        add     #4,r15
        lds.l   @r15+,pr

and the splitter couldn't be applied.
