On Thu, Aug 18, 2016 at 08:51:31AM -0700, Andi Kleen wrote:
> > I'd prefer to make updates atomic in multi-threaded applications.
> > The best proxy we have for that is -pthread.
> >
> > Is it slower, most definitely, but odds are we're giving folks
> > garbage data otherwise, which in many ways is even worse.
>
> It will likely be catastrophically slower in some cases.
>
> Catastrophically as in too slow to be usable.
>
> An atomic instruction is a lot more expensive than a single increment. Also
> they sometimes are really slow depending on the state of the machine.

Can't we just have thread-local copies of all the counters (perhaps
using a __thread pointer as base) and just atomically merge at thread
termination?

	Jakub
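
A rough sketch of the thread-local idea, not a proposal for the actual gcov runtime. All names here (N_COUNTERS, count_edge, merge_local_counters, run_demo) are made up for illustration, and the merge is called by hand at the end of the thread function. A real implementation would presumably need some hook at thread termination, for example a pthread_key_create destructor, which is an assumption and not something settled in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define N_COUNTERS 4    /* hypothetical number of profile counters */
#define MAX_THREADS 64  /* arbitrary limit for this sketch */

/* Shared counters, only touched when a thread merges its results. */
static _Atomic int64_t global_counters[N_COUNTERS];

/* Per-thread copies; __thread is the GCC spelling of thread-local storage. */
static __thread int64_t local_counters[N_COUNTERS];

/* Hot path: plain, non-atomic increment of the thread's own copy. */
static inline void count_edge(int idx)
{
    local_counters[idx]++;
}

/* Cold path: fold the thread's copy into the shared counters, once. */
static void merge_local_counters(void)
{
    for (int i = 0; i < N_COUNTERS; i++) {
        atomic_fetch_add_explicit(&global_counters[i], local_counters[i],
                                  memory_order_relaxed);
        local_counters[i] = 0;
    }
}

static void *worker(void *p)
{
    int iterations = *(const int *)p;

    for (int i = 0; i < iterations; i++)
        count_edge(i % N_COUNTERS);

    /* Stands in for a thread-termination hook (assumption, see above). */
    merge_local_counters();
    return NULL;
}

/* Runs nthreads workers and returns the sum of all merged counters. */
int64_t run_demo(int nthreads, int iterations)
{
    pthread_t tids[MAX_THREADS];
    int64_t total = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    for (int i = 0; i < N_COUNTERS; i++)
        atomic_store(&global_counters[i], 0);

    for (int t = 0; t < nthreads; t++) {
        int rc = pthread_create(&tids[t], NULL, worker, &iterations);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < nthreads; t++)
        pthread_join(tids[t], NULL);

    /* pthread_join orders the workers' merges before these loads. */
    for (int i = 0; i < N_COUNTERS; i++)
        total += atomic_load(&global_counters[i]);
    return total;
}
```

The point is that the per-increment cost stays a plain add on thread-local memory, and each thread pays for N_COUNTERS atomic operations exactly once. Build with -pthread.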