Re: [PR51752] publication safety violations in loop invariant motion pass

Torvald Riegel Fri, 24 Feb 2012 05:10:50 -0800

On Fri, 2012-02-24 at 09:58 +0100, Richard Guenther wrote:
> On Thu, Feb 23, 2012 at 10:11 PM, Aldy Hernandez <al...@redhat.com> wrote:
> > On 02/23/12 12:19, Aldy Hernandez wrote:
> >
> >> about hit me. Instead now I save all loads in a function and iterate
> >> through them in a brute force way. I'd like to rewrite this into a hash
> >> of some sort, but before I go any further I'm interested to know if the
> >> main idea is ok.
> >
> >
> > For the record, it may be ideal to reuse some of the iterations we already
> > do over the function's basic blocks, so we don't have to iterate yet again
> > over the IL stream.  Though it may complicate the pass unnecessarily.
> 
> It's definitely ugly that you need to do this.


Indeed.  But that's simply the price we have to pay for making
publication with transactions easier for the programmer yet still at
acceptable performance.

Also note that what we primarily have to care about for this PR is to
not hoist loads _within_ transactions if they would violate publication
safety.  I didn't have time to look at Aldy's patch yet, but a first
safe and conservative way would be to treat transactions as full
transformation barriers, and prevent publication-safety-violating
transformations _within_ transactions.  Which I would prefer until we're
confident that we understood all of it.

For hoisting out of or across transactions, we have to reason about more
than just publication safety.

> And yes, you have to look at
> very many passes I guess, PRE comes to my mind at least.
> 
> I still do not understand the situation very well, at least why for
> 
>  transaction { for () if (b) load; } load;
> 
> it should be safe to hoist load out of the transaction

This one is funny.  *Iff* this is an atomic txn, we can assume that the
transaction does not wait for a concurrent event. If it is a relaxed
txn, then we must not have a loop which terminates depending on an
atomic load (which we don't in that example); otherwise, we cannot hoist
load to before the txn (same reason as why we must not hoist to before
an atomic load with memory_order_acquire).
Now, if these things hold, then the load will always be executed after
the txn.  Thus, we can assume that it will happen anyway and nothing can
stop it (irrespective of which position the txn gets in the
Transactional Synchronization Order (see the C++ TM spec)).  It is a
nonatomic load, and we can rely on the program being data-race-free, so
we cannot have other threads storing to the same location, so we can
hoist it across because in that part of the program, the location is
guaranteed to be thread-local data (or immutable).

As I said, this isn't related to just pub safety anymore.  And this is
tricky enough that I'd rather not try to optimize this currently but
instead wait until we have more confidence in our understanding of the
matter.

> while for
> 
>  load; transaction { for () if (b) load; }
> 
> it is not.

If the same assumptions as above hold, I think you can hoist it out,
because again you can assume that it targets thread-local/immutable
data: the nontransactional (+nonatomic!) load can happen at any time,
essentially, irrespective of b's value or how/when other threads modify
it.  Thus, it cannot have been changed between the two loads in a
data-race-free program.

> Neither do I understand why it's not ok for
> 
>  transaction { for () if (b) load; }
> 
> to hoist the load out of the transaction.

You would violate publication safety.

Also, you don't have any reason to believe that load targets
thread-local/immutable data, so you must not make it nontransactional
(otherwise, you could introduce a race condition).

> 
> I assume all the issue is about hoisting things over the trasaction start.
> So - why does the transaction start not simply clobber all global memory?
> That would avoid the hoisting.  I assume that  transforming the above to
> 
>  transaction { tem = load; for () if (b) = tem; }
> 
> is ok?

No, it is not.  Actually, this is precisely the transformation that we
need to prevent from happening.
As Aldy said, please see the explanations in the PR, or have a look at
the C++ TM specification and the C++11 memory model alternatively.  We
could also discuss this on IRC, if this would be easier.

Torvald

Re: [PR51752] publication safety violations in loop invariant motion pass

Reply via email to