On Fri, Oct 26, 2007 at 19:04:10 +0200, Michael Matz wrote:
> int f(int M, int *mc, int *mpp, int *tpmm, int *ip, int *tpim, int *dpp,
>       int *tpdm, int xmb, int *bp, int *ms)
> {
>   int k, sc;
>   for (k = 1; k <= M; k++)
>     {
>       mc[k] = mpp[k-1] + tpmm[k-1];
>       if ((sc = ip[k-1] + tpim[k-1]) > mc[k]) mc[k] = sc;
>       if ((sc = dpp[k-1] + tpdm[k-1]) > mc[k]) mc[k] = sc;
>       if ((sc = xmb + bp[k]) > mc[k]) mc[k] = sc;
>       mc[k] += ms[k];
>     }
> }

Aha, but the store in this example is _never_ speculative where
concurrency is concerned: you _explicitly_ store to mc[k] anyway, so
you may as well add some stores here and there.  If mc[] is shared,
it's the programmer's responsibility to protect it with a lock.

When you remove the first and the last lines inside the loop, all
stores become conditional.  But only one value will get to mc[k], so
there's no point in making the only store unconditional.  Note that it
doesn't cancel cmoves, as those are loads, not stores.

But look at the whole matter another way: suppose GCC implements some
optimization, a really cool one, and users quickly find a lot of uses
for it.  But then it is discovered that this optimization is not
general enough, and in some cases wrong code is produced.  What would
you do?  Remove it?  But users will complain.  Ignore the matter?
Other users will complain.  But you may make it optional, like
-funsafe-math-optimizations or -funsafe-loop-optimizations, and
everyone is happy.

Our situation is a bit different, because 1) a speculative store is
not a bug per se, and 2) the program classes where it can do harm
(multi-threaded) and where it cannot (single-threaded) are clearly
separable.  Alright, not entirely, because we don't know when and how
libraries are used.  But that is the case for the -funsafe- options
above too.  Want a safe library?  Compile with
-fno-thread-unsafe-optimizations, or specify that any user data passed
to the library by pointer should not be shared (at least during the
library call).

> > void
> > f(int set_v, int *v)
> > {
> >   if (set_v)
> >     *v = 1;
> > }
> >
> > there's no load-maybe_update-store optimization, so there won't be
> > slowdown for such cases also (BTW, how this case is different from
> > when v is global?).
>
> The difference is, that 'v' might be zero, hence *v could trap, hence it
> can't be moved out of its control region.  If you somehow could determine
> that *v can't trap (e.g. by having a dominating access to it already) then
> the transformation will be done.

Good point.  But how do I tell the compiler that it is not NULL?  The
following doesn't work either:

  void
  f(int set_v, int v[1])
  {
    if (set_v)
      v[0] = 1;
  }

  void g(int set_v, int *v) __attribute__((nonnull));

  void
  g(int set_v, int *v)
  {
    if (set_v)
      *v = 1;
  }

Please note that I'm not trying to prove you wrong, I'm just curious
about the reasons why there's no optimization.

-- 
   Tomash Brechko
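
For reference, the load-maybe_update-store transformation discussed
above can be sketched by hand in C.  This is only an illustration
written for this message, not actual GCC output, and the names
store_conditional and store_speculative are hypothetical:

```c
/* Hand-written illustration; both function names are hypothetical. */

/* Source form: the store to *v happens only when set_v is nonzero. */
void store_conditional(int set_v, int *v)
{
    if (set_v)
        *v = 1;
}

/* Load-maybe_update-store form: *v is loaded and written back on every
   call.  In a single thread the result is the same as above, but when
   set_v == 0 the write-back stores the old value again and may overwrite
   an update made by another thread between the load and the store.
   It also dereferences v unconditionally, so v must be a valid pointer. */
void store_speculative(int set_v, int *v)
{
    int tmp = *v;      /* unconditional load */
    if (set_v)
        tmp = 1;       /* candidate for a conditional move */
    *v = tmp;          /* unconditional store */
}
```

In a single-threaded program the two functions are indistinguishable
for any valid v, which is why the second form is only a problem when
*v may be shared.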