https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org
           Keywords|                            |ra

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
So coming back here.  We're presenting RA with a quite hard problem given we
have

(insn 7 4 8 2 (set (reg:TI 84 [ _9 ])
        (mem:TI (reg:DI 101) [0 MEM <__int128 unsigned> [(char * {ref-all})in_8(D)]+0 S16 A8])) 73 {*movti_internal}
     (expr_list:REG_DEAD (reg:DI 101)
        (nil)))
(insn 8 7 9 2 (parallel [
            (set (reg:DI 95)
                (lshiftrt:DI (subreg:DI (reg:TI 84 [ _9 ]) 8)
                    (const_int 63 [0x3f])))
            (clobber (reg:CC 17 flags))
        ]) "t.c":7:26 703 {*lshrdi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
..
(insn 10 9 11 2 (parallel [
            (set (reg:DI 97)
                (lshiftrt:DI (subreg:DI (reg:TI 84 [ _9 ]) 0)
                    (const_int 63 [0x3f])))
            (clobber (reg:CC 17 flags))
        ]) "t.c":8:30 703 {*lshrdi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
..
(insn 12 11 13 2 (set (reg:V2DI 98 [ vect__5.3 ])
        (ashift:V2DI (subreg:V2DI (reg:TI 84 [ _9 ]) 0)
            (const_int 1 [0x1]))) "t.c":9:16 3611 {ashlv2di3}
     (expr_list:REG_DEAD (reg:TI 84 [ _9 ])
        (nil)))

where I wonder why we keep the (subreg:DI (reg:TI 84 ...) 8) around
for so long.  Probably the subreg pass gives up because of the V2DImode
subreg of that reg.
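
For orientation, here is a rough guess at the source shape that would give
the three uses of reg:TI 84 shown above. It is reconstructed from the RTL
only and is NOT the testcase from comment #6; the function name
shift_halves and the parameters out, carry and in are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch derived from the RTL dump, not the real testcase. */
void shift_halves(uint64_t *out, uint64_t *carry, const uint8_t *in)
{
    uint64_t w[2];

    /* One 16-byte load, cf. insn 7 (movti_internal). */
    memcpy(w, in, sizeof w);

    /* Scalar use of the high half, cf. insn 8 (subreg:DI ... 8). */
    carry[0] = w[1] >> 63;

    /* Scalar use of the low half, cf. insn 10 (subreg:DI ... 0). */
    carry[1] = w[0] >> 63;

    /* Both halves shifted together, which a vectorizer may turn into
       a single V2DI shift, cf. insn 12 (ashlv2di3). */
    out[0] = w[0] << 1;
    out[1] = w[1] << 1;
}
```

The point of the sketch is only that the same 128-bit value is wanted both
as two DImode scalars and as one V2DImode vector, which is what RA has to
deal with.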

That said, RA chooses an xmm register for reg:84 but then spills it
immediately to satisfy the subregs, even though movq and pextrq could
be used, or the reload could use the original mem.  That we reload
even the xmm use is another odd thing.

Vlad, I'm not sure about the possibilities LRA has here, but maybe
you can have a look at the testcase in comment #6 (use -O3 -march=znver2
or -march=core-avx2).  For one, I expected

        vmovdqu (%rsi), %xmm2
        vmovdqa %xmm2, -24(%rsp)
        movq    -16(%rsp), %rax   (2a)
        vmovdqa -24(%rsp), %xmm4  (1)
...
        movq    -24(%rsp), %rdx   (2b)

(1) not to be there (not sure how that even survives the postreload
optimizations...), and
(2a/b) to be 'inherited' by loading from 8(%rsi) and (%rsi) instead, which
may be asking too much because it requires aliasing considerations.
That is, even if we don't consider using

   pextrq $1, %xmm2, %rax (2a)
   movq %xmm2, %rdx (2b)

I expected us not to spill.
