Hi,

On Fri, 26 Oct 2007, Tomash Brechko wrote:

> On Fri, Oct 26, 2007 at 21:45:03 +0400, Tomash Brechko wrote:
> > Note that it doesn't cancel cmoves, as those are loads, not stores.
> 
> I just checked with x86 instruction reference, CMOVcc is reg->reg or
> mem->reg, never reg->mem.  You know God's deed when you see it. :)

I wasn't precise about which optimization is actually the important one.  
The important thing about this loop is that the data is basically random, 
so branch prediction has no chance to do any good.  Consequently all 
branches in that loop have a pretty high cost.  So high, in fact, that it's 
better to replace them with conditional moves on the value to store and 
make the store unconditional.

So, yes, there are no conditional store instructions on x86, but the 
branches need to be removed anyway for performance, and for that we need 
to make the stores unconditional (even at the cost of perhaps introducing 
another load).
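
To make that concrete, here is a minimal sketch.  It is not the loop from 
the benchmark; the names mark_branchy, mark_unconditional, key, out and 
pivot are made up for illustration.  The first function keeps the 
data-dependent branch around the store, the second selects the value to 
store (something a compiler may turn into a cmov) and then stores 
unconditionally, paying one extra load of the old value:

```c
#include <stddef.h>

/* Branchy form: out[i] is written only when the test passes.  With
   random keys the branch is essentially unpredictable. */
static void mark_branchy(const int *key, int *out, size_t n, int pivot)
{
    for (size_t i = 0; i < n; i++)
        if (key[i] < pivot)
            out[i] = 0;
}

/* Branch-free form: pick the value to store, then store always.
   The select may be compiled to a conditional move; whether it is
   depends on the compiler and target. */
static void mark_unconditional(const int *key, int *out, size_t n, int pivot)
{
    for (size_t i = 0; i < n; i++) {
        int old = out[i];                  /* the additional load */
        out[i] = (key[i] < pivot) ? 0 : old;
    }
}
```

Both functions leave the same contents in out when run single-threaded.  
The second one writes every out[i], including those the source never 
stores to, and that is exactly the thread-safety concern of this thread.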

You are also right that for that example we can determine that an 
unconditional store already dominates (and postdominates) the conditional 
stores in question, hence the code would already be thread-unsafe, so the 
transformation would be okay even with thread safety in mind.

I was merely showing that this transformation _does_ matter in some cases, 
to refute the opposite claims, which seemed to be made too lightly in this 
thread.

Now there are multiple ways out of this dilemma, retaining the 
transformation and not breaking threaded code:
1) do the transformation only if there are already other stores in an 
   outer control region.  I see that already being worked on down-thread.
2) do the transformation but also conditionalize the address of the store:
   if (cond)
     *p = val;

   --->

   __typeof__ (*p) dummy;
   if (!cond)
     p = &dummy;  // dummy is a stack slot, hence no trap, no thread 
                  // implications
   *p = val;
   
I plan to work on the latter anyway at some point, as it also allows me to 
do the transformation if unconditional non-trappingness can't be proven.
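
A compilable sketch of option 2, for illustration only: the function names 
store_if and store_redirected are made up, and the element type is fixed to 
int instead of using __typeof__.  Whether the address select really 
becomes a cmov is up to the compiler and target.

```c
#include <stddef.h>

/* Original form: conditional store through p. */
static void store_if(int *p, int cond, int val)
{
    if (cond)
        *p = val;
}

/* Option 2: conditionalize the address instead of the store.  When
   cond is false the store goes to a private stack slot, so the memory
   p points to is never written and p is never dereferenced. */
static void store_redirected(int *p, int cond, int val)
{
    int dummy;          /* stack slot, not visible to other threads */
    if (!cond)
        p = &dummy;     /* candidate for a conditional move */
    *p = val;           /* unconditional store */
}
```

With cond false, store_redirected (&x, 0, 5) leaves x untouched, and even 
store_redirected (NULL, 0, 5) does not trap, because the only store goes to 
dummy.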


Ciao,
Michael.

P.S.: I'm still somewhat disappointed about the way this discussion goes; 
it reminds me of the ugly one about signed integer overflow.  There it was 
an overly vocal set of people refusing to write ISO C, which led to a very 
intrusive change in the compiler.  Now this seems to be happening again 
(though no such intrusive changes would be required right now, but perhaps 
for the other memory model).  Then and now the presumed "deficiencies" had 
already existed for years, but for some unfathomable reason resulted in a 
tempest in a teacup only recently.  I don't think it's a good strategy to 
change the compiler in a, strictly speaking, wrong direction whenever the 
loudness of whiners reaches a certain level.
