On Fri, Jun 03, 2016 at 02:26:09PM +0200, Torvald Riegel wrote:
> And that would be fine, IMO.  If you can't even load atomically, doing
> something useful with this type will be hard except in special cases.
> Also, doing a CAS (compare-and-swap) and thus potentially bringing in
> the cache line in exclusive mode can be a lot more costly than what
> users might expect from a load.  A short critical section might not be
> much slower.
>
> If you only have a CAS as base of the atomic operations on a type, then
> a CAS operation exposed to the user will still be just a single HW
> CAS.  But any other operation besides the CAS and a load will need *two*
> CAS operations; even an atomic store has to be implemented as a CAS
> loop.

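To make the cost described above concrete, here is a minimal sketch of a load and a store built from nothing but CAS. This is not what GCC actually emits: the helper names are made up for illustration, and uint64_t merely stands in for a wider type (say, 128-bit) on a target that offers only a CAS for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: atomic load emulated with a single CAS.
   We try to replace 0 with 0.  If *p is 0 the CAS succeeds and stores
   0 back (value unchanged); otherwise it fails and copies the current
   value into 'expected'.  Either way the location is accessed as a
   read-modify-write, so the pointer cannot be const and the cache line
   is likely to be requested in exclusive mode. */
static inline uint64_t cas_only_load(uint64_t *p)
{
    uint64_t expected = 0;
    __atomic_compare_exchange_n(p, &expected, 0, false,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}

/* Hypothetical helper: atomic store emulated with a CAS loop.
   The first CAS uses a guess (0) for the old value; when the guess is
   wrong it fails, refreshes 'expected', and the next iteration retries.
   Without contention that is typically two CAS operations. */
static inline void cas_only_store(uint64_t *p, uint64_t val)
{
    uint64_t expected = 0;
    while (!__atomic_compare_exchange_n(p, &expected, val, false,
                                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
        ;   /* 'expected' now holds the value just observed; retry */
}

/* Small self-check showing the intended use. */
static inline void cas_only_demo(void)
{
    uint64_t x = 41;
    cas_only_store(&x, 42);
    assert(cas_only_load(&x) == 42);
}
```

Note that cas_only_load() takes a non-const pointer: it cannot be applied to an object in read-only memory, which is the concern raised later in this thread.
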
Would we just stop expanding all those __atomic_*/__sync_* builtins inline
then (which would IMHO break tons of stuff), or just some predicate that
atomic.h/atomic headers use?

> > But doesn't that mean you should fall back to locked operation also for any
> > other atomic operation on such types, because otherwise if you atomic_store
> > or any other kind of atomic operation, it wouldn't use the locking, while
> > for atomic load it would?
>
> I suppose you mean that one must fall back to using locking for all
> operations?  If load isn't atomic, then it can't be made atomic using
> external locks if the other operations don't use the locks.
>
> > That would be an ABI change and quite significant
> > pessimization in many cases.
>
> A change from wide CAS to locking would be an ABI change I suppose, but
> it could also be considered a necessary bugfix if we don't want to write
> to read-only memory.  Does this affect anything but i686?

Also x86_64 (for 128-bit atomics), clearly also either arm or aarch64
(judging from who initiated this thread), I bet there are many others.

	Jakub
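
For reference, a rough sketch of what "locking for all operations" means in the exchange quoted above. Everything here is hypothetical: wide_t, the single global spinlock and the locked_* helpers are invented for illustration and are not libatomic's implementation. The point is only that load, store and CAS must all take the same external lock; if any one of them bypassed it, the locked load could observe a torn value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word type, assumed too wide for a native atomic load. */
typedef struct { uint64_t lo, hi; } wide_t;

/* One global spinlock, external to the object, so the object's layout
   is unchanged.  (Assumption: a real implementation would more likely
   use a table of locks keyed by address.) */
static bool wide_lock;

static inline void wide_acquire(void)
{
    while (__atomic_test_and_set(&wide_lock, __ATOMIC_ACQUIRE))
        ;   /* spin until the flag was previously clear */
}

static inline void wide_release(void)
{
    __atomic_clear(&wide_lock, __ATOMIC_RELEASE);
}

/* The load only reads the object itself, so it accepts a const pointer. */
static inline wide_t locked_load(const wide_t *p)
{
    wide_acquire();
    wide_t v = *p;
    wide_release();
    return v;
}

static inline void locked_store(wide_t *p, wide_t v)
{
    wide_acquire();
    *p = v;
    wide_release();
}

/* CAS must take the same lock, otherwise the load above is not atomic
   with respect to it.  On failure, *expected receives the current value. */
static inline bool locked_compare_exchange(wide_t *p, wide_t *expected,
                                           wide_t desired)
{
    wide_acquire();
    bool ok = (p->lo == expected->lo && p->hi == expected->hi);
    if (ok)
        *p = desired;
    else
        *expected = *p;
    wide_release();
    return ok;
}

/* Small self-check showing the intended use. */
static inline void locked_demo(void)
{
    wide_t a = { 1, 2 };
    locked_store(&a, (wide_t){ 3, 4 });
    wide_t r = locked_load(&a);
    assert(r.lo == 3 && r.hi == 4);
}
```

Since the lock lives outside the object, locked_load() never writes to the object and would work on read-only data, but mixing these helpers with inline wide-CAS code on the same object would not be safe, which is where the ABI concern comes from.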