On Thu, Oct 17, 2013 at 10:07:43AM +0200, Richard Biener wrote:
> Which suggests we use
> 
> #pragma GCC ivdep
> 
> to not collide with eventually different semantics in existing programs
> that use variants of this pragma?

Yeah, perhaps.

> > Intel: 
> > http://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-B25ABCC2-BE6F-4599-AEDF-2434F4676E1B.htm
> > "The ivdep pragma instructs the compiler to ignore assumed vector
> >  dependencies.  To ensure correct code, the compiler treats an assumed
> >  dependence as a proven dependence, which prevents vectorization.  This
> >  pragma overrides that decision.  Use this pragma only when you know that
> >  the assumed loop dependencies are safe to ignore."
> 
> This suggests that _known_ dependences are still treated as dependences.
> But what is known obviously depends on the implementation which
> may not know that a[i] and a[i+1] depend but merely assume it.  Not
> a standard-proof definition of the pragma ;)

Very bad definition indeed.

> That said, safelen even overrides known dependences (but with unknown
> distance vector)! (that looks like a bug to me, or at least a QOI issue)

safelen is whatever the OpenMP 4.0 standard requires (and Cilk+ is, I believe,
just deferring the description to the OpenMP 4.0 definition), and fortunately
the OpenMP 4.0 standard doesn't contain any such badly worded definition.
The actual wording is not that the <= safelen consecutive iterations can be
run in any order, but that they can be performed all together using
(possibly emulated) SIMD instructions.  Thus I think it is correct to use
it when deciding whether we can vectorize a loop (without versioning it for
alias), regardless of known vs. unknown dependencies.  If we have known
dependencies that would result in known broken code, perhaps we should
warn, but safelen should probably not be used for anything further: it
should not affect aliasing of scalar or vector loads/stores in the loop, etc.
Then I think we'd handle forward but not backward dependencies.  Given the
void ignore_vec_dep(int *a, int k, int c, int m)
{
  #pragma omp simd
  for (int i = 0; i < m; i++)
    a[i] = a[i + k] * c;
}
testcase, I think we'll handle it fine for k <= -m and k >= 0.
For k <= -m or k >= m there is obviously no overlap, and even runtime versioning
for alias would handle it right; for 0 <= k < m it works because the load (vector
or non-vector) will be before the store and we don't tell aliasing the two
don't alias.  We don't vectorize any load+store operations (expressed as one
stmt in GIMPLE; struct copies or atomic stmts), do we?  Those would be
a problem with the above.  If for anything else we place all the VF loads
where the original load was in the IL and all the VF stores where the
original store was in the IL, and just leave it to other passes to reorder
if they can prove it doesn't alias, we should be fine.
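
To make the forward/backward argument concrete, here is a minimal sketch in
plain C.  It is not what the vectorizer emits; it only emulates "all VF loads
where the original load was, then all VF stores where the original store was"
and compares the result with the scalar loop.  The names scalar_loop,
emulated_simd_loop and simd_matches_scalar, the constants, and the fill
pattern are all hypothetical and chosen just for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical limits for this sketch only.  */
enum { MAX_VF = 16, MAX_M = 8, PAD = 32, LEN = MAX_M + 2 * PAD };

/* Scalar reference: the loop from the test case above.  */
void scalar_loop(int *a, int k, int c, int m)
{
  for (int i = 0; i < m; i++)
    a[i] = a[i + k] * c;
}

/* Emulated vectorized loop with vectorization factor vf: for each chunk
   of up to vf iterations, perform all loads first, then all stores.  */
void emulated_simd_loop(int *a, int k, int c, int m, int vf)
{
  int tmp[MAX_VF];

  assert(vf >= 1 && vf <= MAX_VF);
  for (int i = 0; i < m; i += vf)
    {
      int n = (m - i < vf) ? m - i : vf;  /* last chunk may be partial */
      for (int j = 0; j < n; j++)         /* the "vector load" */
        tmp[j] = a[i + j + k];
      for (int j = 0; j < n; j++)         /* the "vector store" */
        a[i + j] = tmp[j] * c;
    }
}

/* Returns 1 if the emulated SIMD loop and the scalar loop leave the
   buffer in the same state for the given k, m and vf, 0 otherwise.
   The buffer is padded so that a[i + k] stays in bounds for |k| <= PAD.  */
int simd_matches_scalar(int k, int m, int vf)
{
  int ref[LEN], vec[LEN];

  assert(m >= 0 && m <= MAX_M && k >= -PAD && k <= PAD);
  for (int i = 0; i < LEN; i++)
    ref[i] = vec[i] = i + 1;
  scalar_loop(ref + PAD, k, 2, m);
  emulated_simd_loop(vec + PAD, k, 2, m, vf);
  return memcmp(ref, vec, sizeof ref) == 0;
}
```

With m = 8 and vf = 4, simd_matches_scalar returns 1 for k = 0, 3, 8 and -8,
and 0 for k = -1, which matches the k <= -m and k >= 0 ranges above.  With
vf = 1 the emulation degenerates to the scalar loop, so k = -1 matches too.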

        Jakub
