https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591
--- Comment #21 from Kewen Lin <linkw at gcc dot gnu.org> ---
In the optimized IR:
a$raw$3_220 = D.39813.rawD.30221[3];
vect_a_raw_4_70.539_1584 = MEM <vector(4) short intD.20> [(short intD.20
*)&D.39813 + 8B];
_1640 = a$raw$0_221 & 255;
_1649 = a$raw$1_74 & 255;
_1658 = a$raw$2_264 & 255;
_52 = a$raw$3_220 & 255;
vD.39776 = bD.39739; // involved decl1
MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})&b00D.39742] = MEM
<unsigned charD.25[16]> [(charD.5 * {ref-all})&vD.39776];
vD.39776 ={v} {CLOBBER(eol)};
vD.39779 = b00D.39742; // involved decl2
raw_u_1614 = vD.39779.rawD.30221[0];
_1615 = raw_u_1614 << 8;
vD.39779.rawD.30221[0] = _1615;
raw_u_1622 = vD.39779.rawD.30221[1];
_1623 = raw_u_1622 << 8;
vD.39779.rawD.30221[1] = _1623;
...
Partition 1: size 16 align 16
D.39819 vD.39749 vD.39756 vD.39764 aD.39773
vD.39779 vD.39735 vD.39736 aD.39630 vD.39636
aD.39640 vD.39753 vD.39761 vD.39776 vD.39782
So vD.39776 and vD.39779 are coalesced into the same partition (partition 1).
This is expanded as:
vD.39776 = bD.39739;
(insn 383 382 384 (set (reg:V2DI 616)
(mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 48 [0x30])) [7 MEM[(struct Vec128D.30433 *)_1274]+0
S16 A128])) -1
(nil))
(insn 384 383 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 16 [0x10])) [7 MEM[(struct Vec128D.30433 *)_10]+0
S16 A128])
(reg:V2DI 616)) -1
(nil))
MEM <unsigned charD.25[16]> [(charD.5 * {ref-all})&b00D.39742] = MEM
<unsigned charD.25[16]> [(charD.5 * {ref-all})&vD.39776];
(insn 385 384 386 (set (reg:V2DI 617)
(mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 16 [0x10])) [0 MEM <unsigned charD.25[16]> [(charD.5
* {ref-all})_10]+0 S16 A128])) "test.cc":14:19 -1
(nil))
(insn 386 385 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 80 [0x50])) [0 MEM <unsigned charD.25[16]> [(charD.5
* {ref-all})_1277]+0 S16 A128])
(reg:V2DI 617)) "test.cc":14:19 -1
(nil))
vD.39776 ={v} {CLOBBER(eol)};
vD.39779 = b00D.39742;
(insn 387 386 388 (set (reg:V2DI 618)
(mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 80 [0x50])) [5 MEM[(struct Vec128D.30212 *)_1277]+0
S16 A128])) -1
(nil))
(insn 388 387 0 (set (mem/c:V2DI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 16 [0x10])) [5 MEM[(struct Vec128D.30212 *)_10]+0
S16 A128])
(reg:V2DI 618)) -1
(nil))
raw_u_1614 = vD.39779.rawD.30221[0];
_1615 = raw_u_1614 << 8;
vD.39779.rawD.30221[0] = _1615;
;; v.raw[0] = _1615;
(insn 389 388 390 (set (reg:HI 619)
(mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
*)_10].rawD.30221[0]+0 S2 A128])) "test.cc":218:14 -1
(nil))
(insn 390 389 391 (set (reg:SI 620)
(ashift:SI (subreg:SI (reg:HI 619) 0)
(const_int 8 [0x8]))) "test.cc":218:14 -1
(nil))
(insn 391 390 0 (set (mem/c:HI (plus:DI (reg/f:DI 112 virtual-stack-vars)
(const_int 16 [0x10])) [4 MEM[(struct Vec128D.30212
*)_10].rawD.30221[0]+0 S2 A128])
(subreg:HI (reg:SI 620) 2)) "test.cc":218:14 -1
(nil))
=========
Later, insn 388 gets removed (along with insns 387 and 385), because the value
it stores is exactly the same as what insn 384 already stored. The scheduler
then does not see a dependence between insn 389 and insn 384, which results in
an unexpected move.
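
For illustration only, the source-level shape behind the IR above can be sketched in C++ as follows. This is not the reporter's test case and is not claimed to reproduce the miscompile. The names Vec128 and raw are taken from the dumps, while BitCastSketch, ShiftLeft8Sketch and FirstLaneAfterShift are hypothetical names made up for this sketch. It only shows the pattern: a 16-byte value is copied through a short-lived temporary with a byte-wise (ref-all) copy, and a second temporary of a different struct type is then read and written lane by lane.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the 16-byte vector wrapper seen in the dumps.
template <typename T>
struct Vec128 {
  alignas(16) T raw[16 / sizeof(T)];
};

// Reinterpret the 16 bytes of the argument as lanes of another type.
// The by-value parameter `v` plays the role of the first temporary
// (decl1), whose lifetime ends right after the byte-wise copy.
template <typename To, typename From>
Vec128<To> BitCastSketch(Vec128<From> v) {
  Vec128<To> out;
  std::memcpy(&out, &v, sizeof(out));  // unsigned char[16] style copy
  return out;
}

// Shift every 16-bit lane left by 8. The by-value parameter `v` plays the
// role of the second temporary (decl2), accessed through raw[i].
Vec128<uint16_t> ShiftLeft8Sketch(Vec128<uint16_t> v) {
  for (auto& lane : v.raw) {
    lane = static_cast<uint16_t>(lane << 8);
  }
  return v;
}

// Helper for checking the expected semantics on lane 0.
uint16_t FirstLaneAfterShift(uint16_t lane0) {
  Vec128<int16_t> b{};
  b.raw[0] = static_cast<int16_t>(lane0);
  Vec128<uint16_t> b00 = BitCastSketch<uint16_t>(b);
  return ShiftLeft8Sketch(b00).raw[0];
}
```

With correct code generation, lane 0 holds the input shifted left by 8 and truncated to 16 bits. If the two temporaries share one stack slot and the lane load is moved ahead of the store that fills the slot, the load would read a stale value instead.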
Hi Richi, do you think this is an exact duplicate of the known -fstack-reuse
issue?