On Fri, Jun 03, 2016 at 02:26:09PM +0200, Torvald Riegel wrote:
> And that would be fine, IMO.  If you can't even load atomically, doing
> something useful with this type will be hard except in special cases.
> Also, doing a CAS (compare-and-swap) and thus potentially bringing in
> the cache line in exclusive mode can be a lot more costly than what
> users might expect from a load.  A short critical section might not be
> much slower.
> 
> If you only have a CAS as base of the atomic operations on a type, then
> a CAS operation exposed to the user will still be just a single HW
> CAS.  But any other operation besides the CAS and a load will need *two*
> CAS operations; even an atomic store has to be implemented as a CAS
> loop.
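
To make the quoted point concrete, here is a minimal sketch of a load and a store built on nothing but CAS. It uses a 64-bit C11 atomic purely for illustration; the helper names cas_only_load and cas_only_store are hypothetical and are not taken from GCC or libatomic.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helpers: pretend CAS is the only atomic primitive we have. */

/* Load via CAS: compare against a guess and, if the guess matches, write
   the same value back.  Either way `expected` ends up holding the current
   value.  Note that this needs write access to *p, which is why it cannot
   work on read-only memory. */
static uint64_t cas_only_load(_Atomic uint64_t *p)
{
    uint64_t expected = 0;
    atomic_compare_exchange_strong(p, &expected, expected);
    return expected;
}

/* Store via CAS: one CAS to learn the current value, then at least one
   more to replace it, retrying if another thread changed it in between. */
static void cas_only_store(_Atomic uint64_t *p, uint64_t desired)
{
    uint64_t expected = cas_only_load(p);          /* first CAS */
    while (!atomic_compare_exchange_weak(p, &expected, desired))
        ;                                          /* expected was refreshed; retry */
}
```

Even in the uncontended case the store costs two CAS operations, and the load may pull the cache line in exclusive mode.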

Would we then just stop expanding all those __atomic_*/__sync_* builtins
inline (which would IMHO break tons of stuff), or would this be just some
predicate that the stdatomic.h/<atomic> headers use?
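
As a rough illustration of the second option, a header-level predicate could be built on the existing GCC builtin __atomic_always_lock_free. The macro name HAS_INLINE_ATOMIC and the helper below are hypothetical; this is a sketch of the idea, not what the headers do today.

```c
#include <stdint.h>

/* Hypothetical predicate: true if objects of type T with typical alignment
   are always handled by lock-free instructions on this target.  The second
   argument 0 tells the builtin to assume typical alignment. */
#define HAS_INLINE_ATOMIC(T) __atomic_always_lock_free(sizeof(T), 0)

/* Hypothetical helper showing how a header might branch on the predicate:
   returns 1 if a 64-bit atomic could be expanded inline, 0 if the header
   would have to route every operation through a locked fallback. */
static int u64_atomics_are_inline(void)
{
    return HAS_INLINE_ATOMIC(uint64_t) ? 1 : 0;
}
```

With such a predicate the builtins themselves would keep expanding inline, and only the headers would pick the locked path for types that lack a real atomic load.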

> > But doesn't that mean you should fall back to locked operation also for any
> > other atomic operation on such types, because otherwise if you atomic_store
> > or any other kind of atomic operation, it wouldn't use the locking, while
> > for atomic load it would?
> 
> I suppose you mean that one must fall back to using locking for all
> operations?  If load isn't atomic, then it can't be made atomic using
> external locks if the other operations don't use the locks.
> 
> > That would be an ABI change and quite significant
> > pessimization in many cases.
> 
> A change from wide CAS to locking would be an ABI change I suppose, but
> it could also be considered a necessary bugfix if we don't want to write
> to read-only memory.  Does this affect anything but i686?

Also x86_64 (for 128-bit atomics), clearly also either arm or aarch64
(judging from who initiated this thread), I bet there are many others.

        Jakub
