Ian Lance Taylor <[EMAIL PROTECTED]> writes:

> It would be nice to figure out how gcc could improve matters. The
> answer is not going to be to disable certain optimizations.

It's just very questionable whether this particular transformation is an optimization at all. Turning a shared cache line into an exclusive one can be very expensive even on small MP systems. It also increases traffic on the bus even on uniprocessor systems.

To me it looks more like a mistuned optimization: if-conversion is useful for values in registers, but questionable for arbitrary memory stores that are not guaranteed to be in L1. The worst-case memory overhead will likely swamp whatever pipeline advantages you can get from not jumping.

Or rather, if it's done for stores, it needs to guarantee that the store is cancelled in the not-taken case. That is possible even on x86 by using cmov to redirect the store to a temporary on the stack, which is likely in L1.

I guess that code just needs to cooperate better with the register allocator?

-Andi
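
To make the three shapes discussed above concrete, here is a rough sketch in C. The function names are made up for illustration and are not taken from gcc; whether a compiler actually emits cmov for the conditional expressions depends on the target and the optimization flags, so treat this as a source-level picture of the idea rather than what gcc generates.

```c
/* Branching form: *p is only written when cond is true,
 * so a shared cache line stays shared in the not-taken case. */
void store_branch(int *p, int v, int cond)
{
    if (cond)
        *p = v;
}

/* If-converted form (the questionable one): *p is always loaded and
 * always stored, so the line is pulled in exclusive and dirtied
 * even when cond is false. */
void store_unconditional(int *p, int v, int cond)
{
    int old = *p;
    *p = cond ? v : old;   /* value select, unconditional store */
}

/* Redirected form: select the destination address instead of the
 * value. In the not-taken case the store goes to a stack temporary
 * (hypothetical name "scratch"), which is likely already in L1. */
void store_redirected(int *p, int v, int cond)
{
    int scratch;
    int *dst = cond ? p : &scratch;   /* address select, candidate for cmov */
    *dst = v;
}
```

All three leave *p with the same final value for a given cond; they differ only in whether memory at p is touched when cond is false.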