On Thu, Oct 17, 2013 at 10:07:43AM +0200, Richard Biener wrote:
> Which suggests we use
>
> #pragma GCC ivdep
>
> to not collide with eventually different semantics in existing programs
> that use variants of this pragma?

Yeah, perhaps.

> > Intel:
> > http://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-B25ABCC2-BE6F-4599-AEDF-2434F4676E1B.htm
> > "The ivdep pragma instructs the compiler to ignore assumed vector
> > dependencies.
> > To ensure correct code, the compiler treats an assumed dependence as a
> > proven
> > dependence, which prevents vectorization. This pragma overrides that
> > decision.
> > Use this pragma only when you know that the assumed loop dependencies are
> > safe
> > to ignore."
>
> This suggests that _known_ dependences are still treated as dependences.
> But what is known obviously depends on the implementation which
> may not know that a[i] and a[i+1] depend but merely assume it.  Not
> a standard-proof definition of the pragma ;)

Very bad definition indeed.

> That said, safelen even overrides know dependences (but with unknown
> distance vector)!  (that looks like a bug to me, or at least a QOI issue)

safelen is whatever the OpenMP 4.0 standard requires (and Cilk+ is, I
believe, just deferring the description to the OpenMP 4.0 definition), and
fortunately the OpenMP 4.0 standard doesn't contain any such badly worded
definition.  The actual wording is not that the <= safelen consecutive
iterations can be run in any order, but that they can be performed all
together using (possibly emulated) SIMD instructions.

Thus I think it is correct to use it when deciding whether we can
vectorize a loop (without versioning it for alias), regardless of known
vs. unknown dependencies.  If we have known dependencies that would result
in known broken code, perhaps we should warn.  But it probably should not
be used for anything further: it should not affect aliasing of scalar or
vector loads/stores in the loop, etc.  Then I think we'd handle forward
but not backward dependencies.

Given the

void ignore_vec_dep(int *a, int k, int c, int m) {
#pragma omp simd
  for (int i = 0; i < m; i++)
    a[i] = a[i + k] * c;
}

testcase, I think we'll handle it fine for k <= -m and k >= 0.  For
k >= m (and likewise k <= -m) there is obviously no overlap, and even
runtime versioning for alias would handle it right.  For 0 <= k < m it
works because the load (vector or non-vector) will be before the store and
we don't tell aliasing that the two don't alias.

We don't vectorize any load+store operations (expressed as one stmt in
GIMPLE; struct copies or atomic stmts), do we?  Those would be a problem
with the above.  If for anything else we place all the VF loads where the
original load was in the IL and all the VF stores where the original store
was in the IL, and just leave it to other passes to reorder if they can
prove it doesn't alias, we should be fine.

	Jakub
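
The forward vs. backward distinction above can be checked with a small emulation. This is only a sketch, not what the GCC vectorizer actually emits: it assumes a hypothetical vectorization factor VF of 4 and models "all VF loads where the original load was, then all VF stores where the original store was" with a temporary array. The names scalar_loop, chunked_loop and same_result are made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { VF = 4 };  /* hypothetical vectorization factor, chosen for illustration */

/* Scalar reference: the loop from the testcase, run in source order.
   `a` must point into a larger buffer so that a[i + k] stays in bounds. */
static void scalar_loop(int *a, int k, int c, int m)
{
    for (int i = 0; i < m; i++)
        a[i] = a[i + k] * c;
}

/* Emulated vector loop: per chunk of VF iterations, do all VF loads
   first, then all VF stores.  Leftover iterations run as scalar code. */
static void chunked_loop(int *a, int k, int c, int m)
{
    int i = 0;
    for (; i + VF <= m; i += VF) {
        int tmp[VF];
        for (int j = 0; j < VF; j++)
            tmp[j] = a[i + k + j];      /* the "vector load" */
        for (int j = 0; j < VF; j++)
            a[i + j] = tmp[j] * c;      /* the "vector store" */
    }
    for (; i < m; i++)
        a[i] = a[i + k] * c;
}

/* Returns 1 if the scalar and the chunked loop give identical buffers
   for this k, 0 otherwise.  Buffers are padded by |k| on both sides. */
static int same_result(int k, int m)
{
    int pad = abs(k);
    int n = m + 2 * pad;
    int *x = malloc(n * sizeof *x);
    int *y = malloc(n * sizeof *y);
    assert(x != NULL && y != NULL);
    for (int i = 0; i < n; i++)
        x[i] = y[i] = i + 1;
    scalar_loop(x + pad, k, 3, m);
    chunked_loop(y + pad, k, 3, m);
    int same = memcmp(x, y, n * sizeof *x) == 0;
    free(x);
    free(y);
    return same;
}
```

With m = 8, same_result returns 1 for k = 0, 3, 8 and -8 (forward dependence or no overlap) and 0 for k = -1 and k = -3 (backward dependence shorter than the chunk). In this emulation k = -4 also matches, since a backward distance of at least VF never falls inside a single chunk.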