* Richard Henderson:
> To keep all this in perspective, folks should remember that atomic
> operations are *slow*. Very very slow. Orders of magnitude slower
> than function calls. Seriously. Taking p4 as the extreme example,
> one can expect a null function call in around 10 cycles, but a locked
> memory operation to take 1000. Usually things aren't that bad, but
> I believe some poor design decisions were made for p4 here. But even
> on a platform without such problems you can expect a factor of 30
> difference.

And, as far as I know, you take this performance hit even if you aren't running SMP and could use an ordinary read-modify-write instruction instead.
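
For anyone who wants to check the gap on their own hardware, here is a rough sketch of a micro-benchmark, assuming a C11 compiler with `<stdatomic.h>` and a POSIX `clock_gettime`. The names `now_ns`, `time_plain_increment` and `time_atomic_increment` are made up for this illustration. It compares an ordinary read-modify-write on a `volatile` counter against `atomic_fetch_add`, which on x86 typically compiles to a locked instruction. It does not reproduce the cycle counts quoted above, and the ratio will depend heavily on the CPU, the compiler and the optimization level.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime under strict -std=c11 */
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current time in nanoseconds from a monotonic clock. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Ordinary read-modify-write. volatile only stops the compiler from
 * collapsing the loop; it adds no lock prefix and no memory barrier.
 * Returns elapsed nanoseconds and stores the final count in *result. */
uint64_t time_plain_increment(uint64_t iterations, uint64_t *result) {
    volatile uint64_t counter = 0;
    uint64_t start = now_ns();
    for (uint64_t i = 0; i < iterations; i++)
        counter = counter + 1;
    uint64_t elapsed = now_ns() - start;
    *result = counter;
    return elapsed;
}

/* Atomic read-modify-write with the default sequentially consistent
 * ordering. Same return convention as time_plain_increment. */
uint64_t time_atomic_increment(uint64_t iterations, uint64_t *result) {
    atomic_uint_fast64_t counter = 0;
    uint64_t start = now_ns();
    for (uint64_t i = 0; i < iterations; i++)
        atomic_fetch_add(&counter, 1);
    uint64_t elapsed = now_ns() - start;
    *result = (uint64_t)atomic_load(&counter);
    return elapsed;
}
```

Call both functions with the same large iteration count (say, a hundred million) from a small `main` and divide the two elapsed times. Since this runs in a single thread with no contention, it measures only the baseline cost of the atomic instruction itself, which is the cost you pay even when SMP is not involved.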