RE: [PR tree-optimization/52558]: RFC: questions on store data race

Boehm, Hans Fri, 13 Apr 2012 16:23:44 -0700


> -----Original Message-----
> From: Aldy Hernandez [mailto:al...@redhat.com]
> Sent: Thursday, April 12, 2012 3:12 PM
> To: Richard Guenther
> Cc: Andrew MacLeod; Boehm, Hans; gcc-patches; Torvald Riegel
> Subject: [PR tree-optimization/52558]: RFC: questions on store data
> race
> 
> Here we have a testcase that affects both the C++ memory model and
> transactional memory.
> 
> [Hans, this is caused by the same problem that is causing the
> speculative register promotion issue you and Torvald pointed me at].
> 
> In the following testcase (adapted from the PR), the loop invariant
> motion pass caches a pre-existing value for g_2, and then performs a
> store to g_2 on every path, causing a store data race:
> 
> int g_1 = 1;
> int g_2 = 0;
> 
> int func_1(void)
> {
>    int l;
>    for (l = 0; l < 1234; l++)
>    {
>      if (g_1)
>        return l;
>      else
>        g_2 = 0;       <-- Store to g_2 should only happen if !g_1
>    }
>    return 999;
> }
> 
> This gets transformed into something like:
> 
>       g_2_lsm = g_2;
>       if (g_1) {
>               g_2 = g_2_lsm;  // boo! hiss!
>               return 0;
>       } else {
>               g_2_lsm = 0;
>               g_2 = g_2_lsm;
>       }
> 
> The spurious write to g_2 could cause a data race.
Presumably the g_2_lsm = g_2 is actually outside the loop?


Why does the second g_2 = g_2_lsm; get introduced?  I would have expected it 
before the return.  Was the example just over-abbreviated?

Other than that, this sounds right to me.  So does Richard's flag-based 
version, which is the approach I would have originally expected.  But that 
clearly costs you a register.  It would be interesting to see how they compare.

Hans

> 
> Andrew has suggested verifying that the store to g_2 was actually
> different than on entry, and letting PHI copy propagation optimize the
> redundant comparisons.  Like this:
> 
>       g_2_lsm = g_2;
>       if (g_1) {
>               if (g_2_lsm != g_2)     // hopefully optimized away
>                       g_2 = g_2_lsm;  // hopefully optimized away
>               return 0;
>       } else {
>               g_2_lsm = 0;
>               if (g_2_lsm != g_2)     // hopefully optimized away
>                       g_2 = g_2_lsm;
>       }
> 
> ...which PHI copy propagation and/or friends should be able to
> optimize.
> 
> For that matter, regardless of the memory model, the proposed solution
> would improve the existing code by removing the extra store (in this
> contrived test case anyhow).
> 
> Anyone see a problem with this approach (see attached patch)?
> 
> Assuming the unlikely scenario that everyone likes this :), I have two
> problems that I would like answered.  But feel free to ignore if the
> approach is a no go.
> 
> 1. No pass can figure out the equality (or inequality) of g_2_lsm and
> g_2.  In the PR, Richard mentions that both FRE/PRE can figure this
> out, but they are not run after store motion.  DOM should also be able
> to, but apparently does not :(.  Tips?
> 
> 2. The GIMPLE_CONDs I create in this patch are currently causing
> problems with complex floats/doubles.  do_jump is complaining that it
> can't compare them, so I assume it is too late (in tree-ssa-loop-im.c)
> to compare complex values since complex lowering has already happened
> (??). Is there a more canonical way of creating a GIMPLE_COND that may
> contain complex floats at this stage?
> 
> Thanks folks.

RE: [PR tree-optimization/52558]: RFC: questions on store data race

Reply via email to