Hi!

I admit I don't fully understand why, but my experiments so far show that for read/write and write/read ddrs it is ok and desirable to ignore the dist > 0 && DDR_REVERSED_P (ddr) case, whereas for write/write ddrs it is not. See the PR for further tests; perhaps I could turn them into additional testcases.
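To make the write/write case a bit more concrete, here is a rough hand simulation of the loop from the PR. It is only a sketch, not what GCC actually emits: VF, simulated_vector_loop and count_mismatches are names made up for illustration, the array is sized N + 2 so that b[i + 1] stays in bounds, and the "vectorized" variant simply assumes that each statement is executed for a whole group of VF iterations before the next statement runs.

```c
#include <string.h>

#define N 16
#define VF 4   /* hypothetical vectorization factor, for illustration only */

/* Scalar reference: same shape as the loop in pr59594.c.  */
static void
scalar_loop (int *b)
{
  for (int i = N; i >= 0; i--)
    {
      b[i + 1] = b[i];   /* S1 */
      b[i] = 1;          /* S2 */
    }
}

/* Hand simulation of a vectorization that ignores the write/write
   dependence: for each group of VF iterations, do all loads, then all
   S1 stores, then all S2 stores.  This is an assumption about the
   failure mode, not GCC's real output.  */
static void
simulated_vector_loop (int *b)
{
  int i = N;
  for (; i - (VF - 1) >= 0; i -= VF)
    {
      int tmp[VF];
      for (int k = 0; k < VF; k++)   /* "vector load" of b[i-k] */
        tmp[k] = b[i - k];
      for (int k = 0; k < VF; k++)   /* "vector store" for S1 */
        b[i - k + 1] = tmp[k];
      for (int k = 0; k < VF; k++)   /* "vector store" for S2 */
        b[i - k] = 1;
    }
  for (; i >= 0; i--)                /* scalar epilogue */
    {
      b[i + 1] = b[i];
      b[i] = 1;
    }
}

/* Run both variants on identical input and count differing elements.  */
static int
count_mismatches (void)
{
  int ref[N + 2], vec[N + 2];
  int diff = 0;

  memset (ref, 0, sizeof ref);
  for (int i = 0; i < N + 1; i++)
    ref[i] = i;
  memcpy (vec, ref, sizeof ref);

  scalar_loop (ref);
  simulated_vector_loop (vec);

  for (int i = 0; i < N + 2; i++)
    diff += ref[i] != vec[i];
  return diff;
}
```

In the scalar loop the S2 store to b[i] in iteration i is later overwritten by the S1 store to b[i] in iteration i - 1. In the simulated variant the S2 stores of a group run after its S1 stores, so the 1s win inside each group, and count_mismatches () returns a nonzero value. With a read on one side the same reordering would be harmless in this simulation, because the loads of a group are done before any of its stores.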
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-01-22  Jakub Jelinek  <ja...@redhat.com>

        PR tree-optimization/59594
        * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Don't
        ignore dist > 0 && DDR_REVERSED_P (ddr) if step is negative and
        both DRs are writes.

        * gcc.dg/vect/no-vfa-vect-depend-2.c: New test.
        * gcc.dg/vect/pr59594.c: New test.

--- gcc/tree-vect-data-refs.c.jj        2014-01-16 20:54:59.000000000 +0100
+++ gcc/tree-vect-data-refs.c   2014-01-22 13:13:49.751362484 +0100
@@ -383,11 +383,13 @@ vect_analyze_data_ref_dependence (struct
               continue;
             }
 
-          if (dist > 0 && DDR_REVERSED_P (ddr))
+          if (dist > 0 && DDR_REVERSED_P (ddr)
+              && (DR_IS_READ (dra) || DR_IS_READ (drb)))
             {
               /* If DDR_REVERSED_P the order of the data-refs in DDR was
                  reversed (to make distance vector positive), and the actual
-                 distance is negative.  */
+                 distance is negative.  If both DRs are writes, we can't ignore
+                 the DDR.  See PR59594.  */
               if (dump_enabled_p ())
                 dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                                  "dependence distance negative.\n");
--- gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c.jj 2014-01-22 13:28:47.100724091 +0100
+++ gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c    2014-01-22 13:41:38.736778586 +0100
@@ -0,0 +1,55 @@
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 17
+
+int ia[N] = {48,45,42,39,36,33,30,27,24,21,18,15,12,9,6,3,0};
+int ib[N] = {48,45,42,39,36,33,30,27,24,21,18,15,12,9,6,3,0};
+int res[N] = {48,192,180,168,156,144,132,120,108,96,84,72,60,48,36,24,12};
+
+__attribute__ ((noinline))
+int main1 ()
+{
+  int i;
+
+  /* Not vectorizable due to data dependence: dependence distance 1.  */
+  for (i = N - 1; i >= 0; i--)
+    {
+      ia[i] = ia[i+1] * 4;
+    }
+
+  /* check results: */
+  for (i = 0; i < N; i++)
+    {
+      if (ia[i] != 0)
+        abort ();
+    }
+
+  /* Vectorizable. Dependence distance -1.  */
+  for (i = N - 1; i >= 0; i--)
+    {
+      ib[i+1] = ib[i] * 4;
+    }
+
+  /* check results: */
+  for (i = 0; i < N; i++)
+    {
+      if (ib[i] != res[i])
+        abort ();
+    }
+
+  return 0;
+}
+
+int main (void)
+{
+  check_vect();
+
+  return main1 ();
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {xfail vect_no_align } } } */
+/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
--- gcc/testsuite/gcc.dg/vect/pr59594.c.jj      2014-01-22 13:39:51.362322166 +0100
+++ gcc/testsuite/gcc.dg/vect/pr59594.c 2014-01-22 13:39:43.000000000 +0100
@@ -0,0 +1,31 @@
+/* PR tree-optimization/59594 */
+
+#include "tree-vect.h"
+
+#define N 1024
+int b[N + 1];
+
+int
+main ()
+{
+  int i;
+  check_vect ();
+  for (i = 0; i < N + 1; i++)
+    {
+      b[i] = i;
+      asm ("");
+    }
+  for (i = N; i >= 0; i--)
+    {
+      b[i + 1] = b[i];
+      b[i] = 1;
+    }
+  if (b[0] != 1)
+    __builtin_abort ();
+  for (i = 0; i < N; i++)
+    if (b[i + 1] != i)
+      __builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { cleanup-tree-dump "vect" } } */

        Jakub