Re: -mcx16 vs. not using CAS for atomic loads

2017-01-25 Thread Torvald Riegel
On Tue, 2017-01-24 at 13:06 -0800, Richard Henderson wrote:
> On 01/24/2017 01:08 AM, Torvald Riegel wrote:
> > Unless HW transactions are guaranteed to succeed for scenarios that are
> > sufficient for the atomics, HTM won't help because we'd have to consider
> > the worst-case, which would mean s

Re: -mcx16 vs. not using CAS for atomic loads

2017-01-24 Thread Peter Bergner
On 1/24/17 3:06 PM, Richard Henderson wrote:
The only possible concern I see might be with simulators that force HTM failure, for the purpose of forcibly testing fallback paths. I guess we'd have to continue to fall back to the lock path for that case. IIRC, this was the path that valgrind was

Re: -mcx16 vs. not using CAS for atomic loads

2017-01-24 Thread Richard Henderson
On 01/24/2017 01:08 AM, Torvald Riegel wrote:
> Unless HW transactions are guaranteed to succeed for scenarios that are
> sufficient for the atomics, HTM won't help because we'd have to consider
> the worst-case, which would mean some non-HTM fallback.
We're talking about a 16 byte aligned load he

Re: -mcx16 vs. not using CAS for atomic loads

2017-01-24 Thread Torvald Riegel
On Fri, 2017-01-20 at 09:55 -0800, Richard Henderson wrote:
> On 01/19/2017 10:23 AM, Torvald Riegel wrote:
> > I think I prefer Option 3b as the short-term solution. It does not
> > break programs (except the __atomic_always_lock_free assertion scenario,
> > but that's likely to not work anyway g

Re: -mcx16 vs. not using CAS for atomic loads

2017-01-20 Thread Richard Henderson
On 01/19/2017 10:23 AM, Torvald Riegel wrote:
* Option 3a: -mcx16 continues to only mean that cmpxchg16b is available, and we keep __sync builtins unchanged. This doesn't break valid uses of __sync* (eg, if they didn't need atomic loads at all). We change __atomic for 16-byte to not use cmpxchg1
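
For readers new to the thread, here is a minimal sketch of what "using CAS for atomic loads" means. It is not code from any of the messages: the helper names load_via_cas and load_plain are hypothetical, and an 8-byte value stands in for the 16-byte case so that the example builds without -mcx16 or libatomic. It assumes GCC or Clang, since it uses the __atomic builtins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the thread: emulate an atomic load with a
   compare-and-swap. 8 bytes stand in for the 16-byte case discussed above. */
static uint64_t load_via_cas(uint64_t *p)
{
    uint64_t expected = 0;
    assert(p != NULL);
    /* If *p == 0 the CAS succeeds and stores 0 back (value unchanged).
       Otherwise it fails and copies the current value into 'expected'.
       Either way 'expected' ends up holding the value of *p, but the
       pointer must be non-const because a CAS is a read-modify-write. */
    __atomic_compare_exchange_n(p, &expected, 0, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}

/* A plain atomic load: it only reads, so it accepts const objects. */
static uint64_t load_plain(const uint64_t *p)
{
    assert(p != NULL);
    return __atomic_load_n(p, __ATOMIC_SEQ_CST);
}
```

Both helpers return the same value for an ordinary writable object. The difference is that the CAS-based version cannot be applied to a const object without a cast and, on x86, a locked cmpxchg needs write access to the location, which is presumably part of why the thread looks for alternatives to cmpxchg16b for 16-byte loads.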