https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77686
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jwakely.gcc at gmail dot com
          Component|target                      |libstdc++

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
So it's

@(insn:TI 21 24 15 (parallel [
@            (set (reg:SI 0 r0)
@                (mem/c:SI (reg/f:SI 3 r3 [120]) [23 MEM[(union _Any_data &)&D.50945]+0 S4 A64]))
@            (set (reg:SI 1 r1)
@                (mem/c:SI (plus:SI (reg/f:SI 3 r3 [120])
@                        (const_int 4 [0x4])) [23 MEM[(union _Any_data &)&D.50945]+4 S4 A32]))
@        ]) t.ii:1926 383 {*ldm2_}
@     (nil))
        ldm     r3, {r0, r1}    @ 21    *ldm2_  [length = 4]

vs.

@(insn:TI 17 25 18 (set (mem/f/c:SI (plus:SI (reg/f:SI 13 sp)
@                (const_int 32 [0x20])) [30 MEM[(struct __lambda1 *)&D.50945]+0 S4 A64])
@        (reg/f:SI 5 r5 [orig:114 this ] [114])) 630 {*arm_movsi_vfp}
@     (nil))
        str     r5, [sp, #32]   @ 17    *arm_movsi_vfp/6        [length = 4]
@(insn:TI 18 17 118 (set (mem/f/c:SI (plus:SI (reg/f:SI 13 sp)
@                (const_int 36 [0x24])) [30 MEM[(struct __lambda1 *)&D.50945 + 4B]+0 S4 A32])
@        (reg/f:SI 5 r5 [orig:114 this ] [114])) 630 {*arm_movsi_vfp}
@     (nil))
        str     r5, [sp, #36]   @ 18    *arm_movsi_vfp/6        [length = 4]

and obviously r3 == sp + 32 and thus this is a must-alias.  But the alias
sets are 30 vs. 23 here.

At RTL expansion time:

;; MEM[(struct __lambda1 *)&D.50945] = this_4(D);

(insn 17 16 0 (set (mem/f/c:SI (plus:SI (reg/f:SI 105 virtual-stack-vars)
                (const_int -16 [0xfffffffffffffff0])) [30 MEM[(struct __lambda1 *)&D.50945]+0 S4 A64])
        (reg/f:SI 114 [ this ])) -1
     (nil))

;; MEM[(struct __lambda1 *)&D.50945 + 4B] = this_4(D);

(insn 18 17 0 (set (mem/f/c:SI (plus:SI (reg/f:SI 105 virtual-stack-vars)
                (const_int -12 [0xfffffffffffffff4])) [30 MEM[(struct __lambda1 *)&D.50945 + 4B]+0 S4 A32])
        (reg/f:SI 114 [ this ])) -1
     (nil))

...

;; __tmp = MEM[(union _Any_data &)&D.50945];

(insn 19 18 20 (set (reg:SI 119)
        (plus:SI (reg/f:SI 105 virtual-stack-vars)
            (const_int -40 [0xffffffffffffffd8]))) t.ii:1926 -1
     (nil))

(insn 20 19 21 (set (reg:SI 120)
        (plus:SI (reg/f:SI 105 virtual-stack-vars)
            (const_int -16 [0xfffffffffffffff0]))) t.ii:1926 -1
     (nil))

(insn 21 20 22 (parallel [
            (set (reg:SI 0 r0)
                (mem/c:SI (reg:SI 120) [23 MEM[(union _Any_data &)&D.50945]+0 S4 A64]))
            (set (reg:SI 1 r1)
                (mem/c:SI (plus:SI (reg:SI 120)
                        (const_int 4 [0x4])) [23 MEM[(union _Any_data &)&D.50945]+4 S4 A32]))
        ]) t.ii:1926 -1
     (nil))

(insn 22 21 0 (parallel [
            (set (mem/c:SI (reg:SI 119) [23 __tmp+0 S4 A64])
                (reg:SI 0 r0))
            (set (mem/c:SI (plus:SI (reg:SI 119)
                        (const_int 4 [0x4])) [23 __tmp+4 S4 A32])
                (reg:SI 1 r1))
        ]) t.ii:1926 -1
     (nil))

From GIMPLE with more context:

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
  dummy_a = 1;
  std::__ostream_insert<char, std::char_traits<char> > (&cout, "", 0);
  MEM[(struct  &)&f] ={v} {CLOBBER};
  MEM[(struct  &)&f] ={v} {CLOBBER};
  MEM[(union _Any_data *)&f] = this_4(D);
  MEM[(union _Any_data *)&f + 4B] = &dummy_a;
  MEM[(struct  &)&D.50945] ={v} {CLOBBER};
  MEM[(struct  &)&D.50945] ={v} {CLOBBER};
  MEM[(struct __lambda1 *)&D.50945] = this_4(D);       <---
  MEM[(struct __lambda1 *)&D.50945 + 4B] = this_4(D);  <---
  __tmp = MEM[(union _Any_data &)&D.50945];            <---
  MEM[(union _Any_data *)&D.50945] = MEM[(union _Any_data &)&f];
  MEM[(union _Any_data *)&f] = __tmp;
  __tmp ={v} {CLOBBER};
...

not very well optimized either, the copy to __tmp could be elided.

What SRA does is obviously correct now (and incorrect before):

-  MEM[(struct __lambda0 *)&D.47785] = __f;
+  MEM[(struct __lambda0 *)&D.47785] = __f$__this_17;
+  MEM[(struct __lambda0 *)&D.47785 + 4B] = __f$__dummy_a_19;
...
-  MEM[(struct __lambda1 *)&D.47815] = __f;
+  MEM[(struct __lambda1 *)&D.47815] = __f$__this_17;
+  MEM[(struct __lambda1 *)&D.47815 + 4B] = __f$__tmp_19;

so you see it now preserves the alias sets for the stores.

That _Any_data identifier makes me suspicious of the testcase invoking
undefined behavior:

  union _Any_data
  {
    void*       _M_access()       { return &_M_pod_data[0]; }
    const void* _M_access() const { return &_M_pod_data[0]; }

    template<typename _Tp>
      _Tp&
      _M_access()
      { return *static_cast<_Tp*>(_M_access()); }

    template<typename _Tp>
      const _Tp&
      _M_access() const
      { return *static_cast<const _Tp*>(_M_access()); }

    _Nocopy_types _M_unused;
    char _M_pod_data[sizeof(_Nocopy_types)];
  };

it seems to fall foul of the common misconception that you can do aggregate
copies of type _Any_data and that this will properly transfer objects
constructed into _Any_data::_M_pod_data.  That is obviously not the case.

-> latent libstdc++ issue unless proved otherwise.
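
For illustration only, here is a minimal sketch of one way to move a small closure through raw union storage without depending on how the compiler assigns alias sets to an aggregate copy. It is not the libstdc++ code and not a proposed fix. Closure, AnyStorage, store, load and transfer are hypothetical names, and the storage size of two pointers is an assumption chosen to resemble the testcase. All accesses go through std::memcpy on the character array, which is valid for trivially copyable types.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a lambda closure capturing two pointers.
struct Closure {
    void* self;
    int*  counter;
};

// Hypothetical, simplified analogue of _Any_data: raw storage only.
union AnyStorage {
    void* unused;                       // forces pointer alignment
    char  pod_data[2 * sizeof(void*)];  // assumed size, not libstdc++'s
};

static_assert(std::is_trivially_copyable<Closure>::value,
              "byte-wise transfer is only valid for trivially copyable types");
static_assert(sizeof(Closure) <= sizeof(AnyStorage), "closure must fit");

// Store the closure by copying its object representation into the bytes.
inline void store(AnyStorage& dst, const Closure& c) {
    std::memcpy(dst.pod_data, &c, sizeof c);
}

// Load the closure by copying the bytes back into a real Closure object.
inline Closure load(const AnyStorage& src) {
    Closure c;
    std::memcpy(&c, src.pod_data, sizeof c);
    return c;
}

// Transfer storage to storage as bytes, not as an aggregate assignment.
inline void transfer(AnyStorage& dst, const AnyStorage& src) {
    std::memcpy(dst.pod_data, src.pod_data, sizeof dst.pod_data);
}

// Round trip: store into a, transfer a -> b, load from b.
inline bool transfer_preserves_closure() {
    int counter = 1;
    int self_obj = 0;
    AnyStorage a, b;
    store(a, Closure{&self_obj, &counter});
    transfer(b, a);
    const Closure c = load(b);
    return c.self == &self_obj && c.counter == &counter;
}
```

Calling transfer_preserves_closure() should return true. Because every store and load here is a character-type copy, no access is made through an lvalue of a struct type other than the one that was stored. That type mismatch is what produces the differing alias sets (30 vs. 23) in the dumps above.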