On Wed, Nov 27, 2013 at 12:54:14PM +0100, Richard Biener wrote:
> On Wed, 27 Nov 2013, Jakub Jelinek wrote:
> 
> > On Wed, Nov 27, 2013 at 10:53:56AM +0100, Richard Biener wrote:
> > > Hmm.  I'm still thinking that we should handle this during the regular
> > > transform step.
> > 
> > I wonder if it can't be done instead just in vectorizable_load,
> > if LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo) and the load is
> > invariant, just emit the (broadcasted) load not inside of the loop, but on
> > the loop preheader edge.
> 
> It is safe even for !LOOP_REQUIRES_VERSIONING_FOR_ALIAS.  It's just
> a missed optimization I even noted when originally implementing
> support for invariant loads ...

True, but only for non-simd loops, or if we proved it by looking at all
relevant LOOP_VINFO_DDRSs.  But, if it is not a simd loop, and
!LOOP_REQUIRES_VERSIONING_FOR_ALIAS, wouldn't previous optimizations
hoist the load before the loop already?

> Ick.  I hate this behind-the-back stuff - so safelen doesn't mean
> that a[i] and a[0] do not alias.

My initial understanding of SIMD loops was also that they allow
up to safelen consecutive iterations to be randomly reordered or
intermixed without affecting valid programs, but further mails from Tobias
and others on this topic plus testcases changed my understanding of it.
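
For illustration, a minimal scalar sketch of the a[i] vs. a[0] concern.
The function names and N are made up for this example and are not taken
from GCC or from any testcase in this thread; it only shows why moving
the load of a[0] to the preheader can change the result when an earlier
iteration stores to a[0]:

```c
#define N 8

/* Load of a[0] kept inside the loop: it observes the store that
   iteration 0 makes to a[0].  */
void load_in_loop(int *a, int *b)
{
  for (int i = 0; i < N; i++)
    {
      a[i] = i + 1;
      b[i] = a[0];
    }
}

/* Same loop with the invariant-looking load moved to the preheader:
   a[0] is read before any iteration has stored to it.  */
void load_hoisted(int *a, int *b)
{
  int a0 = a[0];
  for (int i = 0; i < N; i++)
    {
      a[i] = i + 1;
      b[i] = a0;
    }
}
```

With a[] zero-initialised, the first version stores 1 into every b[i],
while the second stores 0.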

Note that we don't purge LOOP_VINFO_DDRSs in any way for loop->safelen,
just don't add versioning for alias or punt if there is some possible (or
proven) aliasing.  Perhaps we could add a bool flag to loop_vinfo which
would tell us whether the loop has no data dependencies at all (i.e.
either, for non-safelen loops, !LOOP_REQUIRES_VERSIONING_FOR_ALIAS holds, or,
with non-zero safelen, !LOOP_REQUIRES_VERSIONING_FOR_ALIAS would hold).
Then we could hoist if that flag is set or
LOOP_REQUIRES_VERSIONING_FOR_ALIAS (because then the runtime test
checks the dependency).
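
As a rough sketch of that condition only: the struct and function below
are hypothetical stand-ins, not GCC's loop_vinfo or any existing
vectorizer API.  no_data_dependencies models the proposed flag, and
versioned_for_alias models LOOP_REQUIRES_VERSIONING_FOR_ALIAS:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant bits of loop_vinfo.  */
struct vinfo_sketch
{
  bool no_data_dependencies;  /* the proposed new flag */
  bool versioned_for_alias;   /* models LOOP_REQUIRES_VERSIONING_FOR_ALIAS */
};

/* Hoist the invariant load to the preheader if the loop is known to
   have no data dependencies, or if the runtime alias test guards the
   vectorized loop.  */
bool can_hoist_invariant_load(const struct vinfo_sketch *v)
{
  return v->no_data_dependencies || v->versioned_for_alias;
}
```

A safelen loop with possible aliasing and no runtime check would have
both fields false, so the load would stay inside the loop.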

> Note that this will break with
> SLP stuff at least as that will re-order reads/writes.  Not sure
> how safelen applies to SLP though.  That is
> 
>     a[i] = i;
>     b[i] = a[0];
>     a[i+1] = i+1;
>     b[i+1] = a[1];
> 
> will eventually end up re-ordering reads/writes in non-obvious
> ways.

You mean SLP inside of loop vectorization, right?  Because for normal SLP
outside of the loop vectorizer safelen is ignored and normal data ref
analysis is performed without any bypassing.

        Jakub
