https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org
           Keywords|                            |ra

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
So coming back here.  We're presenting RA with a quite hard problem given
we have

(insn 7 4 8 2 (set (reg:TI 84 [ _9 ])
        (mem:TI (reg:DI 101) [0 MEM <__int128 unsigned> [(char * {ref-all})in_8(D)]+0 S16 A8])) 73 {*movti_internal}
     (expr_list:REG_DEAD (reg:DI 101)
        (nil)))
(insn 8 7 9 2 (parallel [
            (set (reg:DI 95)
                (lshiftrt:DI (subreg:DI (reg:TI 84 [ _9 ]) 8)
                    (const_int 63 [0x3f])))
            (clobber (reg:CC 17 flags))
        ]) "t.c":7:26 703 {*lshrdi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (nil)))
..
(insn 10 9 11 2 (parallel [
            (set (reg:DI 97)
                (lshiftrt:DI (subreg:DI (reg:TI 84 [ _9 ]) 0)
                    (const_int 63 [0x3f])))
            (clobber (reg:CC 17 flags))
        ]) "t.c":8:30 703 {*lshrdi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
..
(insn 12 11 13 2 (set (reg:V2DI 98 [ vect__5.3 ])
        (ashift:V2DI (subreg:V2DI (reg:TI 84 [ _9 ]) 0)
            (const_int 1 [0x1]))) "t.c":9:16 3611 {ashlv2di3}
     (expr_list:REG_DEAD (reg:TI 84 [ _9 ])
        (nil)))

where I wonder why we keep the (subreg:DI (reg:TI 84 ...) 8) around
for so long.  Probably the subreg pass gives up because of the V2DImode
subreg of that reg.

That said RA chooses xmm for reg:84 but then spills it immediately
to fulfil the subregs even though there's mov and pextrd that could
be used or the reload could use the original mem.  That we reload
even the xmm use is another odd thing.

Vlad, I'm not sure about the possibilities LRA has here but maybe
you can have a look at the testcase in comment#6 (use -O3 -march=znver2
or -march=core-avx2).  For one I expected

        vmovdqu (%rsi), %xmm2
        vmovdqa %xmm2, -24(%rsp)
        movq    -16(%rsp), %rax       (2a)
        vmovdqa -24(%rsp), %xmm4      (1)
...
        movq    -24(%rsp), %rdx       (2b)

 (1) to be not there (not sure how that even survives postreload
     optimizations...)
 (2a/b) to be 'inherited' by instead loading from (%rsi) and 8(%rsi)
     which is maybe too much being asked because it requires aliasing
     considerations

That is, even if we don't consider using

        movq    %xmm2, %rax           (2a)
        pextrd  %xmm2, %rdx, 1        (2b)

I expected us to not spill.
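
For readers without the testcase at hand: the RTL above has one 16-byte load feeding three consumers, a DImode shift of the high half, a DImode shift of the low half, and a V2DImode shift of the whole value. A minimal C sketch of source with that shape might look like the following. It is a hypothetical reconstruction, not the testcase from comment#6; the function name `shl1_128` and the way the two carries are combined are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch, NOT the testcase from comment#6.
   Shifts a 128-bit value (stored as two 64-bit words) left by one bit
   and returns the bit shifted out of the top.  */
uint64_t shl1_128(unsigned char *out, const unsigned char *in)
{
    uint64_t w[2];

    assert(out != NULL && in != NULL);

    /* One 16-byte load; this corresponds to the single TImode pseudo.  */
    memcpy(w, in, 16);

    /* Scalar (DImode) uses of each 64-bit half of that value.  */
    uint64_t carry_out = w[1] >> 63;   /* half at byte offset 8 */
    uint64_t carry_mid = w[0] >> 63;   /* half at byte offset 0 */

    /* Two identical shifts on adjacent words; a vectorizer may turn
       these into one V2DI shift of the same 16-byte value.  */
    w[0] <<= 1;
    w[1] <<= 1;

    /* Propagate the bit that crossed the 64-bit boundary.  */
    w[1] |= carry_mid;

    memcpy(out, w, 16);
    return carry_out;
}
```

Compiling something of this shape with the flags mentioned above (-O3 -march=znver2) and reading the assembly is one way to look for the spill of the xmm register; whether this particular sketch reproduces it has not been checked here.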