https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64601
Bug ID: 64601
Summary: Missed PRE on std::vector move assignment
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: glisse at gcc dot gnu.org

Comparing these 2 ways (not necessarily 100% equivalent) to move a std::vector, we generate significantly better code for f than for g, which is not really what users expect.

#include <vector>
#include <utility>
typedef std::vector<int> V;
void f(V&v,V&w){ V(std::move(w)).swap(v); }
void g(V&v,V&w){ v=std::move(w); }

For f (good code):

  <bb 2>:
  _3 = MEM[(int * &)w_2(D)];
  MEM[(int * &)w_2(D)] = 0B;
  _6 = MEM[(int * &)w_2(D) + 8];
  MEM[(int * &)w_2(D) + 8] = 0B;
  _7 = MEM[(int * &)w_2(D) + 16];
  MEM[(int * &)w_2(D) + 16] = 0B;
  _8 = MEM[(int * &)v_4(D)];
  MEM[(int * &)v_4(D)] = _3;
  MEM[(int * &)v_4(D) + 8] = _6;
  MEM[(int * &)v_4(D) + 16] = _7;
  if (_8 != 0B)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  operator delete (_8); [tail call]

  <bb 4>:
  return;

While for g (not as good):

  <bb 2>:
  __tmp.0_5 = MEM[(int * &)v_4(D)];
  MEM[(int * &)v_4(D)] = 0B;
  MEM[(int * &)v_4(D) + 8] = 0B;
  MEM[(int * &)v_4(D) + 16] = 0B;
  _6 = MEM[(int * &)w_2(D)];
  MEM[(int * &)v_4(D)] = _6;
  MEM[(int * &)w_2(D)] = 0B;
  __tmp.0_7 = MEM[(int * &)v_4(D) + 8];
  _8 = MEM[(int * &)w_2(D) + 8];
  MEM[(int * &)v_4(D) + 8] = _8;
  MEM[(int * &)w_2(D) + 8] = __tmp.0_7;
  __tmp.0_9 = MEM[(int * &)v_4(D) + 16];
  _10 = MEM[(int * &)w_2(D) + 16];
  MEM[(int * &)v_4(D) + 16] = _10;
  MEM[(int * &)w_2(D) + 16] = __tmp.0_9;
  if (__tmp.0_5 != 0B)
[...]

The first really surprising line is the definition of __tmp.0_7 (always 0). Gcc should know that the stores to v+0 and v+16 don't alias v+8 (same base, known size+offset), and it should also know that w+0 and v+8 don't alias (same base type, different fields).
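To make the two dumps easier to compare at source level, here is a minimal sketch that models the vector as three raw pointers. The Impl struct and all function names are hypothetical stand-ins, not libstdc++ code; the statement order simply follows the dumps above, and the sketch assumes v and w are distinct objects.

```cpp
#include <cassert>
#include <utility>

// Hypothetical model of the three pointers held by a std::vector<int>.
struct Impl {
    int* start;
    int* finish;
    int* end_of_storage;
};

// Follows the dump for f: read w, null it out, then overwrite v.
// Returns the old v.start, which the real code passes to operator delete.
int* move_like_f(Impl& v, Impl& w) {
    Impl tmp = w;
    w = Impl{};
    int* old = v.start;
    v = tmp;
    return old;
}

// Follows the dump for g: null out v first, then swap field by field.
int* move_like_g(Impl& v, Impl& w) {
    int* old = v.start;
    v = Impl{};                                     // v.finish is now null
    std::swap(v.start, w.start);
    std::swap(v.finish, w.finish);                  // reload of v.finish: __tmp.0_7
    std::swap(v.end_of_storage, w.end_of_storage);  // reload: __tmp.0_9
    return old;
}

// Runs both strategies on identical inputs and compares the results.
bool strategies_agree() {
    int a[4] = {0, 0, 0, 0};
    int b[4] = {0, 0, 0, 0};
    Impl v1 = {a, a + 2, a + 4}, w1 = {b, b + 1, b + 4};
    Impl v2 = v1, w2 = w1;
    int* old1 = move_like_f(v1, w1);
    int* old2 = move_like_g(v2, w2);
    return old1 == old2
        && v1.start == v2.start && v1.finish == v2.finish
        && v1.end_of_storage == v2.end_of_storage
        && w1.start == nullptr && w2.start == nullptr
        && w1.finish == nullptr && w2.finish == nullptr
        && w1.end_of_storage == nullptr && w2.end_of_storage == nullptr;
}
```

In move_like_g the values read back from v.finish and v.end_of_storage are the nulls stored a few statements earlier, which is why the reloads in the dump for g look redundant.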
I am wondering if the issue could be related to the accesses being direct int* MEM_REFs (they appear during early inlining) instead of several COMPONENT_REFs (which would show as ._M_impl._M_start in the dump).

(Alternatively, gcc could notice that the potentially clobbering store writes the same value as the old one, but that's a separate, known issue.)
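The "same value" alternative can be illustrated with a reduced example. This is a hypothetical sketch, not code from the report: whether or not q aliases p, the second store writes the value that *p already holds, so the final load always yields 0.

```cpp
// Hypothetical reduced example of a possibly aliasing store that
// writes the same value as the one already in memory.
int reload_after_same_value_store(int* p, int* q) {
    *p = 0;
    *q = 0;     // may alias *p, but stores the value *p already has
    return *p;  // 0 in both the aliasing and the non-aliasing case
}
```

An optimizer that reasons about the stored value rather than only about aliasing could fold the return to the constant 0.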