https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104480

            Bug ID: 104480
           Summary: [trunk] Combining stores across memory locations might
                    violate [intro.memory]/3
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marc.mutz at hotmail dot com
  Target Milestone: ---

I'm not sure whether GCC trunk just became much smarter, or introduced a
regression. Sorry if it's the former.

Consider:

// https://gcc.godbolt.org/z/ch8rTob7c
struct S1
{
    int a1 : 16;
    int a2 : 16;
};
struct S2
{
    short a1;
    short a2;
};

extern char x;

template<typename T> void f(T &t) {
    t.a1 = x;
    t.a2 = x + 1;
}
template void f(S1 &);
template void f(S2 &);

All GCC versions up to 11.2 use two movw to implement both f()
instantiations. GCC trunk now uses one movl in both instantiations. That's
clearly allowed for f<S1>() by [intro.memory]/3, but it's less clear that it's
an allowed optimisation for S2, because a1 and a2 are two separate memory
locations there. Clang, in fact, produces different code for the two
instantiations.
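
To make the layout difference concrete, here is a minimal sketch that is not
part of the original testcase. It assumes a target with 32-bit int and 16-bit
short (e.g. x86-64), and member_distance() is a hypothetical helper introduced
only for illustration. In S1 the two adjacent bit-fields form one memory
location; in S2 each short is its own object, and therefore its own memory
location, at its own address.

```cpp
#include <cassert>
#include <cstddef>

struct S1 { int a1 : 16; int a2 : 16; };
struct S2 { short a1; short a2; };

// Assumption: 32-bit int, so both bit-fields share a single int-sized unit.
static_assert(sizeof(S1) == sizeof(int), "S1 packs both bit-fields into one int");

// S2 holds two separate scalar objects, laid out back to back.
static_assert(sizeof(S2) == 2 * sizeof(short), "no padding expected in S2");
static_assert(offsetof(S2, a2) == sizeof(short), "a2 directly follows a1");
static_assert(alignof(S2) == alignof(short), "S2 is only short-aligned");

// Hypothetical helper: byte distance between the two members of an S2.
inline std::ptrdiff_t member_distance(S2 &s) {
    auto *base = reinterpret_cast<unsigned char *>(&s);
    auto *p1 = reinterpret_cast<unsigned char *>(&s.a1);
    auto *p2 = reinterpret_cast<unsigned char *>(&s.a2);
    assert(p1 == base);  // a1 is the first member of a standard-layout struct
    return p2 - p1;
}
```

No such check can be written for S1, because the address of a bit-field
cannot be taken.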

Of course, GCC might be the clever kid here and realize that it can combine the
writes because that's a valid observable sequence. But an object of type S2,
having alignment 2, may cross a cache-line boundary, in which case the movl
might not be atomic, even on x86. A different core could then observe the
writes out of order, which probably shouldn't happen.
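
The straddling case can be sketched as follows. This is only an illustration
under stated assumptions: a 64-byte cache line (kCacheLine), and the names
straddles_cache_line(), buffer and make_straddling_s2() are all hypothetical.
It shows that an S2 may legitimately sit at offset 62 within a line; it does
not demonstrate the reordering itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

struct S2 { short a1; short a2; };

// Assumption: 64-byte cache lines, as on most current x86 parts.
constexpr std::size_t kCacheLine = 64;

// True if the byte range [p, p + size) spans more than one cache line.
inline bool straddles_cache_line(const void *p, std::size_t size) {
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    return addr / kCacheLine != (addr + size - 1) / kCacheLine;
}

// Two cache lines of storage, aligned so that offsets map directly to lines.
alignas(kCacheLine) unsigned char buffer[2 * kCacheLine];

// Place an S2 in the last alignof(S2) bytes of the first line, so that
// a1 ends up in the first line and a2 in the second.
inline S2 *make_straddling_s2() {
    unsigned char *where = buffer + kCacheLine - alignof(S2);
    assert(reinterpret_cast<std::uintptr_t>(where) % alignof(S2) == 0);
    return new (where) S2{};
}
```

A single 4-byte store to such an object touches both lines, whereas each
2-byte store stays within one line.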
