On Tue, Sep 22, 2015 at 9:10 AM, Tejun Heo <t...@kernel.org> wrote:
>
> That's a pentium pro era errata. Virtually no working machine is
> affected by that anymore and nobody builds kernel with that option.
> In most cases, store_release and load_acquire are cheaper as they're
> more specific. On x86, store_release and load_acquire boil down to
> compiler reordering barriers. You're running in the opposite
> direction.

Well, to be fair, there are lots of machines where acquire/release is
actually quite expensive.

In general, the cheapest barrier there is (apart from the "no barrier
at all" or just "compiler barrier") is "smp_wmb()". If an architecture
gets that one wrong, the architects were f*cking morons. It should be
a fundamentally cheap operation, since writes are buffered and it
should simply be a buffer barrier.

The acquire/release things are generally fairly cheap on modern
architectures. Not free (except on x86), but fairly low-cost.
HOWEVER, they are not at all free on some older architectures,
including 32-bit ARM.

smp_rmb() should generally be about the same cost as an acquire. It
can go either way.

So *if* the algorithm is amenable to smp_wmb()/smp_rmb() kind of
barriers, that's actually quite possibly better than acquire/release.

smp_mb() is expensive pretty much everywhere.

Looking forward, I suspect long-term acquire/release is what hardware
is going to be "reasonably good at", but as things are right now, you
can't necessarily rely on them being fast.

                Linus
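
For readers following along, here is a rough userspace sketch of the two
styles being compared above, written with C11 <stdatomic.h> rather than
the kernel primitives. Everything in it (struct mailbox, the function
names) is hypothetical and made up for illustration. Variant A has the
shape of smp_store_release()/smp_load_acquire(); variant B has the shape
of an smp_wmb()/smp_rmb() pairing. Note the assumption: C11 has no
store-store-only or load-load-only fence, so the release and acquire
fences used in variant B are stronger than smp_wmb() and smp_rmb(), and
this says nothing about the relative cost on any given architecture.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example type: a payload guarded by a "ready" flag. */
struct mailbox {
    int payload;
    atomic_bool ready;
};

/* Variant A: the ordering is attached to the flag accesses themselves. */
void publish_release(struct mailbox *m, int v)
{
    m->payload = v;
    /* Release store: the payload write cannot be reordered after it. */
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

bool consume_acquire(struct mailbox *m, int *out)
{
    /* Acquire load: the payload read cannot be reordered before it. */
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return false;
    *out = m->payload;
    return true;
}

/* Variant B: relaxed flag accesses plus standalone fences. */
void publish_fence(struct mailbox *m, int v)
{
    m->payload = v;
    /* Stand-in for smp_wmb(); a C11 release fence is stronger. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&m->ready, true, memory_order_relaxed);
}

bool consume_fence(struct mailbox *m, int *out)
{
    if (!atomic_load_explicit(&m->ready, memory_order_relaxed))
        return false;
    /* Stand-in for smp_rmb(); a C11 acquire fence is stronger. */
    atomic_thread_fence(memory_order_acquire);
    *out = m->payload;
    return true;
}
```

In both variants one thread would call the publish function and another
would poll the matching consume function until it returns true. Which
variant is cheaper depends on the architecture, which is the point made
above.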