https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104480
            Bug ID: 104480
           Summary: [trunk] Combining stores across memory locations might violate [intro.memory]/3
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marc.mutz at hotmail dot com
  Target Milestone: ---

I'm not sure whether GCC trunk just became much smarter or introduced a regression. Sorry if it's the former.

Consider:

// https://gcc.godbolt.org/z/ch8rTob7c
struct S1 {
    int a1 : 16;
    int a2 : 16;
};

struct S2 {
    short a1;
    short a2;
};

extern char x;

template<typename T>
void f(T &t) {
    t.a1 = x;
    t.a2 = x + 1;
}

template void f(S1 &);
template void f(S2 &);

All GCC versions up to 11.2 use two movw instructions to implement both f() instantiations. GCC trunk now uses one movl in both instantiations. That's clearly allowed for f<S1>() by [intro.memory]/3, because adjacent bit-fields share a single memory location. It's less clear that it's an allowed optimisation for S2, because a1 and a2 are two separate memory locations there. Clang, in fact, produces different code for the two instantiations.

Of course, GCC might be the clever kid here and realize that it can combine the writes because that's a valid observable sequence. But an object of type S2, having alignment 2, may cross a cache-line boundary, in which case the movl might not be atomic, even on x86. A different core may then observe the writes out of order, which probably shouldn't happen.
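
The alignment concern can be checked with a minimal sketch. It is not part of the original test case: the 64-byte line size (`kAssumedCacheLine`) and the helper `crosses_line` are hypothetical names introduced for illustration, and the `static_assert`s assume a typical x86-64 ABI with a 2-byte `short`.

```cpp
#include <cstddef>
#include <cstdint>

struct S2 {
    short a1;
    short a2;
};

// Layout facts the report relies on (assumed: typical x86-64 ABI).
static_assert(sizeof(S2) == 4, "two 2-byte shorts, no padding");
static_assert(alignof(S2) == 2, "S2 is only 2-byte aligned");
static_assert(offsetof(S2, a2) == 2, "a2 starts 2 bytes after a1");

// Assumption: 64-byte cache lines, common on x86 but not guaranteed.
constexpr std::size_t kAssumedCacheLine = 64;

// Hypothetical helper: true if the byte range [addr, addr + size)
// spans more than one cache line of the assumed size.
constexpr bool crosses_line(std::uintptr_t addr, std::size_t size) {
    return addr / kAssumedCacheLine != (addr + size - 1) / kAssumedCacheLine;
}

// An S2 at offset 62 within a line is validly aligned (62 % 2 == 0),
// yet a1 occupies bytes 62..63 and a2 bytes 64..65: a single 4-byte
// store covering both would straddle the boundary.
static_assert(62 % alignof(S2) == 0, "offset 62 satisfies S2's alignment");
static_assert(crosses_line(62, sizeof(S2)), "S2 at offset 62 spans two lines");
static_assert(!crosses_line(60, sizeof(S2)), "S2 at offset 60 fits in one line");
```

If the assumptions hold, the block compiles cleanly. That only shows such a placement is possible for S2; whether the merged store is actually observable as torn depends on the hardware.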