On Mon, 4 Dec 2017, Richard Biener wrote: > On Mon, 4 Dec 2017, Bin.Cheng wrote: > > > On Mon, Dec 4, 2017 at 1:11 PM, Richard Biener <rguent...@suse.de> wrote: > > > > > > I've noticed we perform FP reduction association without the required > > > checks for associative math. I've added > > > gcc.dg/tree-ssa/loop-interchange-1b.c to cover this. > > > > > > I also noticed we happily interchange a loop with a reduction like > > > > > > sum = a[i] - sum; > > > > > > where a change in order of elements isn't ok. Unfortunately bwaves > > > is exactly a case where single_use != next_def (tried to simply remove > > > that case for now), because reassoc didn't have a chance to fix the > > > operand order. Thus this patch exports the relevant handling from > > > the vectorizer (for stage1 having a separate infrastructure gathering / > > > analyzing of reduction/induction infrastructure would be nice...) > > > and uses it from interchange. We then don't handle > > > gcc.dg/tree-ssa/loop-interchange-4.c anymore (similar vectorizer > > > missed-opt is PR65930). I didn't bother to split up the vectorizer > > > code further to implement relaxed validity checking but simply XFAILed > > > this testcase. > > > > > > Earlier I simplified allocation stuff in the main loop which is why > > > this part is included in this patch. > > > > > > Bootstrap running on x86_64-unknown-linux-gnu. > > > > > > I'll see to craft a testcase with the sum = a[i] - sum; mis-handling. > > > > > > Ok? > > Sure. > > Just for the record. There is also similar associative check in > > predcom. As you suggested, a path extraction/checking interface for > > associative checking would be great, given we have multiple users now. > > Yeah. Note for interchange we don't need associativeness in the sense > as implemented by associative_tree_code, we need left-associativeness > which also MINUS_EXPR provides. So my check isn't perfect. I guess > we instead need > > associative_tree_code () > || (code == MINUS_EXPR > && use_p->use == gimple_assign_rhs1_ptr (single_use)) > > where we could also allow RDIV_EXPR and other left-associative > tree codes (but check_reduction_path would do the wrong thing > with those at the moment but it has MINUS_EXPR handling that > would support sum = x - (y - sum) which the above rejects). > > So I'm retesting with > > /* Check the reduction operation. We require a left-associative > operation. > For FP math we also need to be allowed to associate operations. */ > enum tree_code code = gimple_assign_rhs_code (single_use); > if (gassign *ass = dyn_cast <gassign *> (single_use)) > { > enum tree_code code = gimple_assign_rhs_code (ass); > if (! (associative_tree_code (code) > || (code == MINUS_EXPR > && use_p->use == gimple_assign_rhs1_ptr (ass))) > || (FLOAT_TYPE_P (TREE_TYPE (var)) > && ! flag_associative_math)) > return false; > } > else > return false; > > which is more restrictive than before.
And here's two testcases, one for sum = x - sum and one for division by sum which shows wrong-code without this patch. Richard. 2017-12-05 Richard Biener <rguent...@suse.de> * gcc.dg/tree-ssa/loop-interchange-12.c: New testcase. * gcc.dg/tree-ssa/loop-interchange-13.c: Likewise. Index: gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-12.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-12.c (nonexistent) +++ gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-12.c (working copy) @@ -0,0 +1,50 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -floop-interchange -fdump-tree-linterchange-details" } */ + +/* Copied from graphite/interchange-4.c */ + +#define DEBUG 0 +#if DEBUG +#include <stdio.h> +#endif + +unsigned u[1024]; + +static void __attribute__((noinline,noclone,noipa)) +foo (int N, unsigned *res) +{ + int i, j; + unsigned sum = 1; + for (i = 0; i < N; i++) + for (j = 0; j < N; j++) + sum = u[i + 2 * j] / sum; + + *res = sum; +} + +extern void abort (); + +int +main (void) +{ + int i, j; + unsigned res; + + u[0] = 10; + u[1] = 200; + u[2] = 10; + u[3] = 10; + + foo (2, &res); + +#if DEBUG + fprintf (stderr, "res = %d \n", res); +#endif + + if (res != 0) + abort (); + + return 0; +} + +/* { dg-final { scan-tree-dump-not "is interchanged" "linterchange"} } */ Index: gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-13.c =================================================================== --- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-13.c (nonexistent) +++ gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-13.c (working copy) @@ -0,0 +1,53 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -floop-interchange -fdump-tree-linterchange-details" } */ + +/* Copied from graphite/interchange-4.c */ + +#define DEBUG 0 +#if DEBUG +#include <stdio.h> +#endif + +unsigned u[1024]; + +static void __attribute__((noinline,noclone,noipa)) +foo (int N, int M, unsigned *res) +{ + int i, j; + unsigned sum = 0; + if (N > 0) + for (i = 0; i < M; i++) + for (j = 0; j < N; j++) + sum = u[i + 3 * j] - sum; + + *res = sum; +} + +extern void abort (); + +int +main (void) +{ + int i, j; + unsigned res; + + u[0] = 1; + u[1] = 2; + u[2] = 4; + u[3] = 5; + u[4] = 7; + u[5] = 8; + + foo (2, 3, &res); + +#if DEBUG + fprintf (stderr, "res = %d \n", res); +#endif + + if (res != 13) + abort (); + + return 0; +} + +/* { dg-final { scan-tree-dump-not "is interchanged" "linterchange"} } */