http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448
--- Comment #4 from algrant at acm dot org ---
So using g++,
#include <atomic>
int f1(std::atomic<int> const *p, std::atomic<int> const *q)
{
    int flag = p->load(std::memory_order_consume);
    return flag ? (q + flag - flag)->load(std::memory_order_relaxed) : 0;
}
demonstrates the same lack of ordering. You suggest that this might
be a problem with the atomic built-ins - and yes, if this had been a
load-acquire, it would be a problem with the built-in not introducing a
barrier or using a load-acquire instruction. But for a load-consume on
this architecture, no barrier is necessary to separate the load-consume
from a load that is address-dependent on it. The programmer wrote a
dependency but the compiler lost track of it.
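
To make "lost track of it" concrete, here is a minimal sketch of the two forms involved. It assumes the dependency is lost because q + flag - flag is folded to plain q. The names f1_dependent, f1_folded and check_same_result are made up for illustration and do not come from GCC.

```cpp
#include <atomic>
#include <cassert>

// Same shape as f1: the address of the second load is computed from flag,
// so the source carries an address dependency from the load-consume.
int f1_dependent(std::atomic<int> const *p, std::atomic<int> const *q)
{
    int flag = p->load(std::memory_order_consume);
    return flag ? (q + flag - flag)->load(std::memory_order_relaxed) : 0;
}

// Hypothetical result after an optimizer folds q + flag - flag to q.
// The second load's address no longer depends on flag, so on a weakly
// ordered core nothing orders it after the load from p.
int f1_folded(std::atomic<int> const *p, std::atomic<int> const *q)
{
    int flag = p->load(std::memory_order_consume);
    return flag ? q->load(std::memory_order_relaxed) : 0;
}

// Single-threaded check: both forms compute the same value, which is why
// the fold looks valid to a compiler that does not track dependencies.
void check_same_result()
{
    std::atomic<int> p{1}, q{42};
    assert(f1_dependent(&p, &q) == f1_folded(&p, &q));
}
```

The two functions differ only in the ordering they permit between the loads, which a single-threaded test cannot observe.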
It's not necessary to demonstrate failure - there's an architectural
race condition here. Even if it doesn't fail now, there's no guarantee
it will never fail on future cores that reorder more aggressively.
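
For comparison, a sketch of the load-acquire variant mentioned above. It does not rely on the compiler preserving the dependency, at the cost of a barrier or load-acquire instruction. The name f1_acquire is hypothetical, and this is a conservative alternative rather than a proposed fix for the bug.

```cpp
#include <atomic>
#include <cassert>

// Conservative variant: the acquire load orders the later relaxed load
// after it regardless of how the address of that load is computed.
int f1_acquire(std::atomic<int> const *p, std::atomic<int> const *q)
{
    int flag = p->load(std::memory_order_acquire);
    return flag ? q->load(std::memory_order_relaxed) : 0;
}

// Single-threaded check of the returned values only; ordering between
// threads is not something this check can observe.
void check_acquire_values()
{
    std::atomic<int> p{1}, q{7};
    assert(f1_acquire(&p, &q) == 7);
    p.store(0);
    assert(f1_acquire(&p, &q) == 0);
}
```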