https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93055

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Seems it is LRA doing this.
We have:
(insn 128 127 129 2 (parallel [
            (set (reg/f:DI 1118)
                (plus:DI (reg/f:DI 19 frame)
                    (const_int -8000 [0xffffffffffffe0c0])))
            (clobber (reg:CC 17 flags))
        ]) "/aux/hubicka/trunk-install/include/c++/10.0.0/bits/stl_deque.h":1772:24 186 {*adddi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_EQUIV (plus:DI (reg/f:DI 19 frame)
                (const_int -8000 [0xffffffffffffe0c0]))
            (nil))))
somewhere early in the function and then:
(code_label 254 5751 251 21 1895 (nil) [1 uses])
(note 251 254 252 21 [bb 21] NOTE_INSN_BASIC_BLOCK)
(insn 252 251 253 21 (set (reg:V4SI 857 [ vect_result_2400.5003 ])
        (plus:V4SI (reg:V4SI 857 [ vect_result_2400.5003 ])
            (mem:V4SI (reg:DI 2697 [orig:714 ivtmp.5596 ] [714]) [7 MEM[base:
_2182, offset: 0B]+0 S16 A128]))) "benchmark_algorithms.h":260:34 340
0 {*addv4si3}
     (nil))
(insn 253 252 255 21 (parallel [
            (set (reg:DI 2697 [orig:714 ivtmp.5596 ] [714])
                (plus:DI (reg:DI 2697 [orig:714 ivtmp.5596 ] [714])
                    (const_int 16 [0x10])))
            (clobber (reg:CC 17 flags))
        ]) 186 {*adddi_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
(insn 255 253 256 21 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg/f:DI 1118)
            (reg:DI 2697 [orig:714 ivtmp.5596 ] [714]))) 12 {*cmpdi_1}
     (nil))
(jump_insn 256 255 257 21 (set (pc)
        (if_then_else (ne (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (label_ref:DI 254)
            (pc))) 724 {*jcc}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (int_list:REG_BR_PROB 894784862 (nil)))
 -> 254)
as the inner loop, and LRA emits the lea inside of the loop:
(code_label 254 7 251 21 1895 (nil) [1 uses])
(note 251 254 252 21 [bb 21] NOTE_INSN_BASIC_BLOCK)
(insn 252 251 253 21 (set (reg:V4SI 20 xmm0 [orig:857 vect_result_2400.5003 ] [857])
        (plus:V4SI (reg:V4SI 20 xmm0 [orig:857 vect_result_2400.5003 ] [857])
            (mem:V4SI (reg:DI 0 ax [orig:714 ivtmp.5596 ] [714]) [7 MEM[base:
_2182, offset: 0B]+0 S16 A128]))) "benchmark_algorithms.h":260:34 340
0 {*addv4si3}
     (nil))
(insn 253 252 6971 21 (parallel [
            (set (reg:DI 0 ax [orig:714 ivtmp.5596 ] [714])
                (plus:DI (reg:DI 0 ax [orig:714 ivtmp.5596 ] [714])
                    (const_int 16 [0x10])))
            (clobber (reg:CC 17 flags))
        ]) 186 {*adddi_1}
     (nil))
(insn 6971 253 255 21 (set (reg:DI 5 di [2731])
        (plus:DI (reg/f:DI 7 sp)
            (const_int 8448 [0x2100]))) 182 {*leadi}
     (nil))
(insn 255 6971 256 21 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:DI 5 di [2731])
            (reg:DI 0 ax [orig:714 ivtmp.5596 ] [714]))) 12 {*cmpdi_1}
     (nil))
(jump_insn 256 255 257 21 (set (pc)
        (if_then_else (ne (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (label_ref:DI 254)
            (pc))) 724 {*jcc}
     (int_list:REG_BR_PROB 894784862 (nil))
 -> 254)

Now, I'm not sure which is easier: having IRA/LRA find out that it can actually
put the lea before the loop, or adding some post-reload loop invariant motion.
The latter would be harder if the chosen register were used elsewhere inside
the loop, but in this case the register has just a single setter and a single
use in the loop, so it should be very easy to move the invariant before the
loop.
