Andi Kleen <[EMAIL PROTECTED]> writes:

> Ian Lance Taylor <[EMAIL PROTECTED]> writes:
>
> > It would be nice to figure out how gcc could improve matters.  The
> > answer is not going to be to disable certain optimizations.
>
> It's just very questionable this particular transformation is a
> optimization at all. Turning a shared cache line into an exclusive one
> can be very expensive even on small MP systems. Also it increases
> traffic on the bus even on uniprocessor systems.
>
> For me it looks more like a mistuned optimization: if conversion is useful
> for values in registers; but questionable for arbitary memory stores that
> are not guaranteed to be in L1. The worst memory overhead will likely swamp
> what ever pipeline advantages you can get from not jumping.
>
> Or rather if it's done for stores it needs to guarantee cancel the
> store in the not taken case (that is possible even on x86 by
> redirecting the store using cmov to a temporary on the stack which
> is likely in L1)
>
> I guess that code just needs to cooperate better with the register allocator?

The current code (noce_try_addcc in ifcvt.c) does not even consider
whether this conversion is being done for a store or not.  I agree
that it is most likely an optimization bug that this conversion is
being done for a store to a variable which is not on the stack.  This
should be filed as an optimization bug (not a correctness bug) in gcc
bugzilla.

(This is of course orthogonal to the main issue of what gcc can do to
help code correctness.)

Ian
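
For anyone following along, here is a rough C-level sketch of the three
shapes under discussion.  It is a hand-written analogue, not actual gcc
output, and the names (counter, incr_branch, incr_ifcvt, incr_redirect)
are made up for illustration.  The second function assumes the
conversion turns a conditional increment into an unconditional add of
the comparison result; the third is one way the cmov redirection
suggested above might look if written by hand.

```c
/* Hypothetical global; stands in for any variable not on the stack. */
int counter;

/* Source form: memory is written only when cond is true. */
void incr_branch(int cond)
{
    if (cond)
        counter++;
}

/* Analogue of the if-converted form: the branch is gone, but the store
   to counter now happens on every call, even when cond is false.  That
   unconditional write is what dirties the cache line. */
void incr_ifcvt(int cond)
{
    counter = counter + (cond != 0);
}

/* Analogue of redirecting the store: the destination address is
   selected without a branch in the data path (a compiler could use
   cmov for the select), so the not-taken case writes to a stack
   temporary instead of the global. */
void incr_redirect(int cond)
{
    int scratch = 0;
    int *dst = cond ? &counter : &scratch;
    *dst = *dst + 1;
}
```

All three leave counter with the same value in a single-threaded
program; they differ only in whether the global is written when cond is
false.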