Hi,

On Fri, 26 Oct 2007, Tomash Brechko wrote:

> It was already said that instead of disallowing all optimization with
> volatile, the optimization itself may be made a bit differently.
> Besides, the concern that it will hurt performance at large is a bit
> far-fetched.  You still may speculatively store to automatic var for
> which address was never taken, and this alone covers 50%--80% of
> cases.

Both the assessment of far-fetchedness and these numbers seem to be
invented ad hoc.  The latter is irrelevant (it's not interesting how many
cases there are, but how important the cases that do occur are, for some
metric, let's say performance).  And the former isn't true, i.e. the
concern is not far-fetched.  For 456.hmmer, for instance, it is crucial
that this transformation happens.  The basic situation looks like so:

  void f(int M, int *mc, int *mpp, int *tpmm, int *ip, int *tpim,
         int *dpp, int *tpdm, int xmb, int *bp, int *ms)
  {
    int k, sc;
    for (k = 1; k <= M; k++)
      {
        mc[k] = mpp[k-1] + tpmm[k-1];
        if ((sc = ip[k-1] + tpim[k-1]) > mc[k]) mc[k] = sc;
        if ((sc = dpp[k-1] + tpdm[k-1]) > mc[k]) mc[k] = sc;
        if ((sc = xmb + bp[k]) > mc[k]) mc[k] = sc;
        mc[k] += ms[k];
      }
  }

Here the conditional stores to mc[k] are better implemented as
conditional moves, otherwise you lose about 25% performance on some
platforms.  See PR27313, for which I implemented this transformation on
the tree level.  A similar transformation has been done for much longer
by the RTL if-cvt.

All of these are currently completely valid transformations, so they
could only be redefined as invalid by some other memory model.  Such
another memory model has to take into account the performance
implications, which do exist, contrary to what some proponents of a
different model claim.  Certainly some suggestions for another memory
model look quite similar to considering all non-automatic objects as
volatile, at which point the question should be allowed: why not simply
use 'volatile'?

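To make the transformation concrete, here is a minimal hand-written sketch of one conditional store from the loop above, in its branchy form and in the load/select/store form that if-conversion aims for.  This is an illustration only, not actual GCC output, and the names step_branchy and step_ifcvt are made up for this example.

```c
#include <stddef.h>

/* Branchy form: mc[k] is written only when sc is larger. */
void step_branchy(int *mc, size_t k, int sc)
{
    if (sc > mc[k])
        mc[k] = sc;
}

/* If-converted form (hypothetical hand-written equivalent): load the old
   value, select between old and new, and store unconditionally.  The
   select can be a conditional move; the store now executes on every
   call, even when the value does not change. */
void step_ifcvt(int *mc, size_t k, int sc)
{
    int old = mc[k];
    mc[k] = (sc > old) ? sc : old;
}
```

In a single thread both functions leave the same value in mc[k].  The difference is only that the second one writes mc[k] back even when nothing changes, which is exactly the store that a stricter memory model would have to forbid.
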
> Only globals, or locals whose address was passed to some
> function, should be treated specially.  Also, for the case
>
>   void
>   f(int set_v, int *v)
>   {
>     if (set_v)
>       *v = 1;
>   }
>
> there's no load-maybe_update-store optimization, so there won't be
> slowdown for such cases also (BTW, how is this case different from
> when v is global?).

The difference is that 'v' might be zero (a null pointer), hence *v could
trap, hence the store can't be moved out of its control region.  If you
could somehow determine that *v can't trap (e.g. by already having a
dominating access to it) then the transformation will be done.


Ciao,
Michael.
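
As a minimal sketch of the trapping point: in the first function below nothing proves that v is valid when set_v is zero, so the store has to stay under the condition.  In the second, an unconditional read of *v dominates the store, so the access is known not to trap and the store may be made unconditional, as the third function spells out by hand.  The function names are hypothetical and the third function is a hand-written equivalent, not compiler output.

```c
#include <stddef.h>

/* v may be NULL when set_v == 0; an unconditional store could trap. */
void set_guarded(int set_v, int *v)
{
    if (set_v)
        *v = 1;
}

/* The load of *v dominates the conditional store, so *v cannot trap
   at the point of the store. */
int set_after_load(int set_v, int *v)
{
    int old = *v;
    if (set_v)
        *v = 1;
    return old;
}

/* Hand-written if-converted equivalent of set_after_load:
   select the value, then store unconditionally. */
int set_after_load_ifcvt(int set_v, int *v)
{
    int old = *v;
    *v = set_v ? 1 : old;
    return old;
}
```
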