On 07/30/2015 03:01 AM, Jakub Jelinek wrote:
On Wed, Jul 29, 2015 at 04:48:23PM -0700, Aldy Hernandez wrote:
@@ -7490,8 +7503,12 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
              == TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
    gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
              == TREE_VEC_LENGTH (OMP_FOR_INCR (for_stmt)));
-  gimplify_omp_ctxp->iter_vars.create (TREE_VEC_LENGTH
-                                      (OMP_FOR_INIT (for_stmt)));
+  gimplify_omp_ctxp->loop_iter_var.create (TREE_VEC_LENGTH
+                                          (OMP_FOR_INIT (for_stmt)));
+  gimplify_omp_ctxp->loop_dir.create (TREE_VEC_LENGTH
+                                     (OMP_FOR_INIT (for_stmt)));
+  gimplify_omp_ctxp->loop_const_step.create (TREE_VEC_LENGTH
+                                            (OMP_FOR_INIT (for_stmt)));
    for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
      {
        t = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i);

I think the above should be guarded with
   tree c = find_omp_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_ORDERED);
   if (c && OMP_CLAUSE_ORDERED_EXPR (c))
The most common case is that ordered(n) is not present, so we should
optimize for that.

Done.


@@ -7501,10 +7518,10 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
        gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (decl))
                  || POINTER_TYPE_P (TREE_TYPE (decl)));
        if (TREE_CODE (for_stmt) == OMP_FOR && OMP_FOR_ORIG_DECLS (for_stmt))
-       gimplify_omp_ctxp->iter_vars.quick_push
+       gimplify_omp_ctxp->loop_iter_var.quick_push
          (TREE_VEC_ELT (OMP_FOR_ORIG_DECLS (for_stmt), i));
        else
-       gimplify_omp_ctxp->iter_vars.quick_push (decl);
+       gimplify_omp_ctxp->loop_iter_var.quick_push (decl);

        /* Make sure the iteration variable is private.  */
        tree c = NULL_TREE;

And all these etc. pushes too, simply remember is_doacross in some bool
variable.

Not applicable. I've moved the code to omp-low, as suggested, where the omp_for_data is available, thus simplifying everything.


@@ -8387,6 +8435,228 @@ gimplify_transaction (tree *expr_p, gimple_seq *pre_p)
    return GS_ALL_DONE;
  }


Function comment missing.

Fixed.


+static gimple
+gimplify_omp_ordered (tree expr, gimple_seq body)
+{

+     The basic algorithm is to create a sink vector whose first
+     element is the GCD of all the first elements, and whose remaining
+     elements are the minimum of the subsequent columns.
+
+     We ignore dependence vectors whose first element is zero because
+     such dependencies are known to be executed by the same thread.
+
+     ?? ^^^^ Is this only applicable for collapse(1) loops?  If so, how
+     ?? to handle collapse(N) loops where N > 1?

For collapse(N) N > 1, you can't ignore first iter var with 0 offset, you can 
only
ignore if N first iter vars have 0 offset.

Pretty much for the purpose of the algorithm, you compute "first element"
for collapse(N) N > 1 and ordered(5-N)
        for (iv1 = ..; iv1 < ..; iv1 += step1)
          for (iv2 = ..; iv2 < ..; iv2 += step2)
            for (iv3 = ..; iv3 < ..; iv3 += step3)
              for (iv4 = ..; iv4 < ..; iv4 += step4)
                depend(iv1 + off1, iv2 + off2, iv3 + off3, iv4 + off4)
as: off(N) + off(N-1)*step(N) + iv(N-2)*step(N)*step(N-1)...
(to be checked the case if the directions differ).
So basically, you want to precompute if you'd add some loop counter
in between loop(N) and loop(N+1) that would be initially zero and incremented
by step(N).  The GCD is then performed on this, it is compared to zero,
and then finally split again into the several offsets.
The "is this invalid iteration" check (is offN divisible by stepN)
needs to be done before the merging of course.

Except, now that I think, it is not that easy.  Because we have another
test for "is this invalid iteration", in particularly one performed at
runtime on each of the depends.  Say if offN is negative and loop(N)
is forward loop (thus stepN is positive (note, for backward loop stepN
still might be huge positive value for unsigned iterators)), the check
would be if (ivN + offN >= initialN), for forward loop with positive offN
if (ivN + offN < endvalueN) etc.  Now, not sure if by computing the GCD and
merging several depend clauses into one we preserve those tests or not.

Ughh... as a followup? For now I'm going to bail on collapse > 1 and add a big FIXME.


All in all, I think it might be really better to do the depend clause
merging during omp lowering in omp-low.c, where you can call
extract_omp_for_data and inspect the iteration variables, have the steps
computed for you etc.  That is the spot where we probably want to emit some
GOMP_depend_sink call (and GOMP_depend_source) and prepare arguments for it.

Excellent suggestion.  Done.


+void
+funk ()
+{
+#pragma omp parallel for ordered(2)
+  for (i=0; i < N; i++)
+    for (j=0; j < N; ++j)
+    {
+/* We should keep the (sink:i,j-2) by virtue of it the i+0.  The
+   remaining clauses get folded with a GCD of -2 for `i' and a minimum
+   of -1 for 'j'.  */

I think we shouldn't keep the useless one (sink: i, j-2) (or keep invalid
ones).

Done.

How's this?

Aldy
commit 9c979c4c4cd53092affbf98c05ddd8c9a60915c7
Author: Aldy Hernandez <al...@redhat.com>
Date:   Wed Jul 29 13:39:06 2015 -0700

        * wide-int.h (wi::gcd): New.
        * gimplify.c (struct gimplify_omp_ctx): Rename iter_vars to
        loop_iter_var.
        (delete_omp_context): Same.
        (gimplify_expr): Move code handling OMP_ORDERED into...
        (gimplify_omp_ordered): ...here.  New.
        * omp-low.c (lower_omp_ordered_clauses): New.
        (lower_omp_ordered): Call lower_omp_ordered_clauses.
    testsuite/
        * gcc.dg/gomp/sink-fold-1.c: New.
        * gcc.dg/gomp/sink-fold-2.c: New.
        * gcc.dg/gomp/sink-fold-3.c: New.
        * c-c++-common/gomp/sink-4.c: Look in omplower dump file instead.
        * g++.dg/gomp/sink-3.C: Have sink offset be in the opposite
        direction.  Make variables more readable.

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 2331001..83eb6a1 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -153,7 +153,7 @@ struct gimplify_omp_ctx
   splay_tree variables;
   hash_set<tree> *privatized_types;
   /* Iteration variables in an OMP_FOR.  */
-  vec<tree> iter_vars;
+  vec<tree> loop_iter_var;
   location_t location;
   enum omp_clause_default_kind default_kind;
   enum omp_region_type region_type;
@@ -392,7 +392,7 @@ delete_omp_context (struct gimplify_omp_ctx *c)
 {
   splay_tree_delete (c->variables);
   delete c->privatized_types;
-  c->iter_vars.release ();
+  c->loop_iter_var.release ();
   XDELETE (c);
 }
 
@@ -7490,8 +7490,15 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
              == TREE_VEC_LENGTH (OMP_FOR_COND (for_stmt)));
   gcc_assert (TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt))
              == TREE_VEC_LENGTH (OMP_FOR_INCR (for_stmt)));
-  gimplify_omp_ctxp->iter_vars.create (TREE_VEC_LENGTH
-                                      (OMP_FOR_INIT (for_stmt)));
+
+  tree c = find_omp_clause (OMP_FOR_CLAUSES (for_stmt), OMP_CLAUSE_ORDERED);
+  bool is_doacross = false;
+  if (c && OMP_CLAUSE_ORDERED_EXPR (c))
+    {
+      is_doacross = true;
+      gimplify_omp_ctxp->loop_iter_var.create (TREE_VEC_LENGTH
+                                              (OMP_FOR_INIT (for_stmt)));
+    }
   for (i = 0; i < TREE_VEC_LENGTH (OMP_FOR_INIT (for_stmt)); i++)
     {
       t = TREE_VEC_ELT (OMP_FOR_INIT (for_stmt), i);
@@ -7500,11 +7507,14 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
       gcc_assert (DECL_P (decl));
       gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (decl))
                  || POINTER_TYPE_P (TREE_TYPE (decl)));
-      if (TREE_CODE (for_stmt) == OMP_FOR && OMP_FOR_ORIG_DECLS (for_stmt))
-       gimplify_omp_ctxp->iter_vars.quick_push
-         (TREE_VEC_ELT (OMP_FOR_ORIG_DECLS (for_stmt), i));
-      else
-       gimplify_omp_ctxp->iter_vars.quick_push (decl);
+      if (is_doacross)
+       {
+         if (TREE_CODE (for_stmt) == OMP_FOR && OMP_FOR_ORIG_DECLS (for_stmt))
+           gimplify_omp_ctxp->loop_iter_var.quick_push
+             (TREE_VEC_ELT (OMP_FOR_ORIG_DECLS (for_stmt), i));
+         else
+           gimplify_omp_ctxp->loop_iter_var.quick_push (decl);
+       }
 
       /* Make sure the iteration variable is private.  */
       tree c = NULL_TREE;
@@ -8387,6 +8397,53 @@ gimplify_transaction (tree *expr_p, gimple_seq *pre_p)
   return GS_ALL_DONE;
 }
 
+/* Gimplify an OMP_ORDERED construct.  EXPR is the tree version.  BODY
+   is the OMP_BODY of the original EXPR (which has already been
+   gimplified so it's not present in the EXPR).
+
+   Return the gimplified GIMPLE_OMP_ORDERED tuple.  */
+
+static gimple
+gimplify_omp_ordered (tree expr, gimple_seq body)
+{
+  tree c, decls;
+  int failures = 0;
+  unsigned int i;
+
+  if (gimplify_omp_ctxp)
+    for (c = OMP_ORDERED_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
+      if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
+         && OMP_CLAUSE_DEPEND_KIND (c) == OMP_CLAUSE_DEPEND_SINK)
+       {
+         bool fail = false;
+         for (decls = OMP_CLAUSE_DECL (c), i = 0;
+              decls && TREE_CODE (decls) == TREE_LIST;
+              decls = TREE_CHAIN (decls), ++i)
+           if (i < gimplify_omp_ctxp->loop_iter_var.length ()
+               && TREE_VALUE (decls) != gimplify_omp_ctxp->loop_iter_var[i])
+             {
+               error_at (OMP_CLAUSE_LOCATION (c),
+                         "variable %qE is not an iteration "
+                         "of outermost loop %d, expected %qE",
+                         TREE_VALUE (decls), i + 1,
+                         gimplify_omp_ctxp->loop_iter_var[i]);
+               fail = true;
+               failures++;
+             }
+         /* Avoid being too redundant.  */
+         if (!fail && i != gimplify_omp_ctxp->loop_iter_var.length ())
+           {
+             error_at (OMP_CLAUSE_LOCATION (c),
+                       "number of variables in depend(sink) "
+                       "clause does not match number of "
+                       "iteration variables");
+             failures++;
+           }
+       }
+
+  return gimple_build_omp_ordered (body, OMP_ORDERED_CLAUSES (expr));
+}
+
 /* Convert the GENERIC expression tree *EXPR_P to GIMPLE.  If the
    expression produces a value to be used as an operand inside a GIMPLE
    statement, the value will be stored back in *EXPR_P.  This value will
@@ -9200,38 +9257,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p,
                }
                break;
              case OMP_ORDERED:
-               if (gimplify_omp_ctxp)
-                 for (tree c = OMP_ORDERED_CLAUSES (*expr_p);
-                      c; c = OMP_CLAUSE_CHAIN (c))
-                   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND
-                       && OMP_CLAUSE_DEPEND_KIND (c) == OMP_CLAUSE_DEPEND_SINK)
-                     {
-                       unsigned int n = 0;
-                       bool fail = false;
-                       for (tree decls = OMP_CLAUSE_DECL (c);
-                            decls && TREE_CODE (decls) == TREE_LIST;
-                            decls = TREE_CHAIN (decls), ++n)
-                         if (n < gimplify_omp_ctxp->iter_vars.length ()
-                             && TREE_VALUE (decls)
-                             != gimplify_omp_ctxp->iter_vars[n])
-                           {
-                             error_at (OMP_CLAUSE_LOCATION (c),
-                                       "variable %qE is not an iteration "
-                                       "of outermost loop %d, expected %qE",
-                                       TREE_VALUE (decls), n + 1,
-                                       gimplify_omp_ctxp->iter_vars[n]);
-                             fail = true;
-                           }
-                       /* Avoid being too redundant.  */
-                       if (!fail
-                           && n != gimplify_omp_ctxp->iter_vars.length ())
-                         error_at (OMP_CLAUSE_LOCATION (c),
-                            "number of variables in depend(sink) clause "
-                            "does not match number of iteration variables");
-                     }
-
-               g = gimple_build_omp_ordered (body,
-                                             OMP_ORDERED_CLAUSES (*expr_p));
+               g = gimplify_omp_ordered (*expr_p, body);
                break;
              case OMP_CRITICAL:
                gimplify_scan_omp_clauses (&OMP_CRITICAL_CLAUSES (*expr_p),
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index e7d21ea..a63bf60 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -11969,6 +11969,230 @@ lower_omp_taskgroup (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
 }
 
 
+/* Fold the OMP_ORDERED_CLAUSES for the OMP_ORDERED in STMT if possible.  */
+
+static void
+lower_omp_ordered_clauses (gomp_ordered *ord_stmt, omp_context *ctx)
+{
+  struct omp_for_data fd;
+  if (!ctx->outer || gimple_code (ctx->outer->stmt) != GIMPLE_OMP_FOR)
+    return;
+
+  /* ?? This is stupid.  We need to call extract_omp_for_data just
+     to get the number of ordered loops... */
+  extract_omp_for_data (as_a <gomp_for *> (ctx->outer->stmt), &fd, NULL);
+  if (!fd.ordered)
+    return;
+  struct omp_for_data_loop *loops
+    = (struct omp_for_data_loop *)
+    alloca (fd.ordered * sizeof (struct omp_for_data_loop));
+  /* ?? ...and then again to get the actual loops.  */
+  extract_omp_for_data (as_a <gomp_for *> (ctx->outer->stmt), &fd, loops);
+
+  /* Canonicalize sink dependence clauses into one folded clause if
+     possible.
+
+     The basic algorithm is to create a sink vector whose first
+     element is the GCD of all the first elements, and whose remaining
+     elements are the minimum of the subsequent columns.
+
+     We ignore dependence vectors whose first element is zero because
+     such dependencies are known to be executed by the same thread.
+
+     We take into account the direction of the loop, so a minimum
+     becomes a maximum if the loop is iterating backwards.  We also
+     ignore sink clauses where the loop direction is unknown, or where
+     the offsets are clearly invalid because they are not a multiple
+     of the loop increment.
+
+     For example:
+
+            for (i=0; i < N; ++i)
+               depend(sink:i-8,j-1)
+               depend(sink:i,j-2)      // Completely ignored because i+0.
+               depend(sink:i-4,j+3)
+               depend(sink:i-6,j+2)
+
+     Folded clause is:
+
+        depend(sink:-gcd(8,4,6),min(-1,3,2))
+         -or-
+       depend(sink:-2,-1)
+  */
+
+  /* FIXME: Computing GCD's where the first element is zero is
+     non-trivial in the presence of collapsed loops.  Do this later.  */
+  if (fd.collapse > 1)
+    return;
+
+  unsigned int len = fd.ordered;
+  vec<wide_int> folded_deps;
+  folded_deps.create (len);
+  folded_deps.quick_grow_cleared (len);
+  /* Bitmap representing dimensions in the final dependency vector that
+     have been set.  */
+  sbitmap folded_deps_used = sbitmap_alloc (len);
+  bitmap_clear (folded_deps_used);
+  /* TRUE if the first dimension's offset is negative.  */
+  bool neg_offset_p = false;
+
+  /* ?? We need to save the original iteration variables stored in the
+     depend clauses, because those in fd.loops[].v have already been
+     gimplified.  Perhaps we should use the gimplified versions. ??  */
+  tree *iter_vars = (tree *) alloca (sizeof (tree) * len);
+  memset (iter_vars, 0, sizeof (tree) * len);
+
+  tree *list_p = gimple_omp_ordered_clauses_ptr (ord_stmt);
+  tree c;
+  unsigned int i;
+  while ((c = *list_p) != NULL)
+    {
+      bool remove = false;
+
+      if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_DEPEND
+         || OMP_CLAUSE_DEPEND_KIND (c) != OMP_CLAUSE_DEPEND_SINK)
+       goto next_ordered_clause;
+
+      tree decls;
+      for (decls = OMP_CLAUSE_DECL (c), i = 0;
+          decls && TREE_CODE (decls) == TREE_LIST;
+          decls = TREE_CHAIN (decls), ++i)
+       {
+         gcc_assert (i < len);
+
+         enum {
+           DIR_UNKNOWN,
+           DIR_FORWARD,
+           DIR_BACKWARD
+         } loop_dir;
+         switch (fd.loops[i].cond_code)
+           {
+           case LT_EXPR:
+           case LE_EXPR:
+             loop_dir = DIR_FORWARD;
+             break;
+           case GT_EXPR:
+           case GE_EXPR:
+             loop_dir = DIR_BACKWARD;
+             break;
+           default:
+             loop_dir = DIR_UNKNOWN;
+             gcc_unreachable ();
+           }
+
+         /* While the committee makes up its mind, bail if we have any
+            non-constant steps.  */
+         if (TREE_CODE (fd.loops[i].step) != INTEGER_CST)
+           goto gimplify_omp_ordered_ret;
+
+         wide_int offset = TREE_PURPOSE (decls);
+         if (!iter_vars[i])
+           iter_vars[i] = TREE_VALUE (decls);
+
+         /* Ignore invalid offsets that are not multiples of the step.  */
+         if (!wi::multiple_of_p
+             (wi::abs (offset), wi::abs ((wide_int) fd.loops[i].step),
+              UNSIGNED))
+           {
+             warning_at (OMP_CLAUSE_LOCATION (c), 0,
+                         "ignoring sink clause with offset that is not "
+                         "a multiple of the loop step");
+             remove = true;
+             goto next_ordered_clause;
+           }
+
+         /* Calculate the first dimension.  The first dimension of
+            the folded dependency vector is the GCD of the first
+            elements, while ignoring any first elements whose offset
+            is 0.  */
+         if (i == 0)
+           {
+             /* Ignore dependence vectors whose first dimension is 0.  */
+             if (offset == 0)
+               {
+                 remove = true;
+                 goto next_ordered_clause;
+               }
+             else
+               {
+                 neg_offset_p =
+                   wi::neg_p (offset,
+                              TYPE_SIGN (TREE_TYPE (TREE_PURPOSE (decls))));
+                 if ((loop_dir == DIR_FORWARD && !neg_offset_p)
+                     || (loop_dir == DIR_BACKWARD && neg_offset_p))
+                   {
+                     error_at (OMP_CLAUSE_LOCATION (c),
+                               "first offset must be in opposite direction "
+                               "of loop iterations");
+                     goto gimplify_omp_ordered_ret;
+                   }
+                 /* Initialize the first time around.  */
+                 if (!bitmap_bit_p (folded_deps_used, 0))
+                   {
+                     bitmap_set_bit (folded_deps_used, 0);
+                     folded_deps[0] = wi::abs (offset);
+                   }
+                 else
+                   folded_deps[i] = wi::gcd (folded_deps[0], offset, UNSIGNED);
+               }
+           }
+         /* Calculate minimum for the remaining dimensions.  */
+         else
+           {
+             if (!bitmap_bit_p (folded_deps_used, i))
+               {
+                 bitmap_set_bit (folded_deps_used, i);
+                 folded_deps[i] = offset;
+               }
+             else if ((loop_dir == DIR_FORWARD
+                       && wi::lts_p (offset, folded_deps[i]))
+                      || (loop_dir == DIR_BACKWARD
+                          && wi::gts_p (offset, folded_deps[i])))
+               folded_deps[i] = offset;
+           }
+       }
+
+      remove = true;
+
+    next_ordered_clause:
+      if (remove)
+       *list_p = OMP_CLAUSE_CHAIN (c);
+      else
+       list_p = &OMP_CLAUSE_CHAIN (c);
+    }
+
+  for (i = 0; i < len; ++i)
+    if (!bitmap_bit_p (folded_deps_used, i))
+      break;
+  if (i == len)
+    {
+      if (neg_offset_p)
+       folded_deps[0] = -folded_deps[0];
+
+      tree vec = NULL;
+      i = len;
+      do
+       {
+         i--;
+         vec = tree_cons (wide_int_to_tree (TREE_TYPE (fd.loops[i].v),
+                                            folded_deps[i]),
+                          iter_vars[i], vec);
+       }
+      while (i);
+
+      c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_DEPEND);
+      OMP_CLAUSE_DEPEND_KIND (c) = OMP_CLAUSE_DEPEND_SINK;
+      OMP_CLAUSE_DECL (c) = vec;
+      OMP_CLAUSE_CHAIN (c) = gimple_omp_ordered_clauses (ord_stmt);
+      *gimple_omp_ordered_clauses_ptr (ord_stmt) = c;
+    }
+
+ gimplify_omp_ordered_ret:
+  sbitmap_free (folded_deps_used);
+  folded_deps.release ();
+}
+
+
 /* Expand code for an OpenMP ordered directive.  */
 
 static void
@@ -11979,6 +12203,8 @@ lower_omp_ordered (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
   gcall *x;
   gbind *bind;
 
+  lower_omp_ordered_clauses (as_a <gomp_ordered *> (stmt), ctx);
+
   push_gimplify_context ();
 
   block = make_node (BLOCK);
diff --git a/gcc/testsuite/c-c++-common/gomp/sink-4.c 
b/gcc/testsuite/c-c++-common/gomp/sink-4.c
index 7934de2..111178b 100644
--- a/gcc/testsuite/c-c++-common/gomp/sink-4.c
+++ b/gcc/testsuite/c-c++-common/gomp/sink-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fopenmp -fdump-tree-gimple" } */
+/* { dg-options "-fopenmp -fdump-tree-omplower" } */
 
 /* Test that we adjust pointer offsets for sink variables
    correctly.  */
@@ -22,4 +22,4 @@ funk (foo *begin, foo *end)
     }
 }
 
-/* { dg-final { scan-tree-dump-times "depend\\(sink:p\\+400\\)" 1 "gimple" } } 
*/
+/* { dg-final { scan-tree-dump-times "depend\\(sink:p\\+400.\\)" 1 "omplower" 
} } */
diff --git a/gcc/testsuite/g++.dg/gomp/sink-3.C 
b/gcc/testsuite/g++.dg/gomp/sink-3.C
index 83a742e..4271d66 100644
--- a/gcc/testsuite/g++.dg/gomp/sink-3.C
+++ b/gcc/testsuite/g++.dg/gomp/sink-3.C
@@ -8,7 +8,7 @@ typedef struct {
     char stuff[400];
 } foo;
 
-foo *p, *q, *r;
+foo *end, *begin, *p;
 
 template<int N>
 void
@@ -16,7 +16,7 @@ funk ()
 {
   int i,j;
 #pragma omp parallel for ordered(1)
-  for (p=q; p < q; p--)
+  for (p=end; p > begin; p--)
     {
 #pragma omp ordered depend(sink:p+1)
       void bar ();
diff --git a/gcc/testsuite/gcc.dg/gomp/sink-fold-1.c 
b/gcc/testsuite/gcc.dg/gomp/sink-fold-1.c
new file mode 100644
index 0000000..f2961b9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/sink-fold-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -fdump-tree-omplower" } */
+
+/* Test depend(sink) clause folding.  */
+
+int i,j, N;
+
+extern void bar();
+
+void
+funk ()
+{
+#pragma omp parallel for ordered(2)
+  for (i=0; i < N; i++)
+    for (j=0; j < N; ++j)
+    {
+/* We remove the (sink:i,j-2) by virtue of it the i+0.  The remaining
+   clauses get folded with a GCD of -2 for `i' and a minimum of -1 for
+   'j'.  */
+#pragma omp ordered \
+  depend(sink:i-8,j-1) \
+  depend(sink:i, j-2) \
+  depend(sink:i-4,j+3) \
+  depend(sink:i-6,j+2)
+        bar();
+#pragma omp ordered depend(source)
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "omp ordered depend\\(sink:i-2,j-1\\)" 1 
"omplower" } } */
diff --git a/gcc/testsuite/gcc.dg/gomp/sink-fold-2.c 
b/gcc/testsuite/gcc.dg/gomp/sink-fold-2.c
new file mode 100644
index 0000000..b3b4ac7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/sink-fold-2.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+
+int i,j, N;
+
+extern void bar();
+
+void
+funk ()
+{
+#pragma omp parallel for ordered(2)
+  for (i=0; i < N; i += 3)
+    for (j=0; j < N; ++j)
+    {
+#pragma omp ordered depend(sink:i-8,j-1) /* { dg-warning "ignoring sink clause 
with offset that is not a multiple" } */
+#pragma omp ordered depend(sink:i+3,j-1) /* { dg-error "first offset must be 
in opposite direction" } */
+        bar();
+#pragma omp ordered depend(source)
+    }
+}
diff --git a/gcc/testsuite/gcc.dg/gomp/sink-fold-3.c 
b/gcc/testsuite/gcc.dg/gomp/sink-fold-3.c
new file mode 100644
index 0000000..4d6293c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/sink-fold-3.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -fdump-tree-omplower" } */
+
+/* Test that we fold sink offsets correctly while taking into account
+   pointer sizes.  */
+
+typedef struct {
+    char stuff[400];
+} foo;
+
+void
+funk (foo *begin, foo *end)
+{
+  foo *p;
+#pragma omp parallel for ordered(1)
+  for (p=end; p > begin; p--)
+    {
+#pragma omp ordered depend(sink:p+2) depend(sink:p+4)
+      void bar ();
+        bar();
+#pragma omp ordered depend(source)
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "depend\\(sink:p\\+800B\\)" 1 "omplower" 
} } */
diff --git a/gcc/wide-int.h b/gcc/wide-int.h
index d8f7b46..c20db61 100644
--- a/gcc/wide-int.h
+++ b/gcc/wide-int.h
@@ -514,6 +514,7 @@ namespace wi
   BINARY_FUNCTION div_round (const T1 &, const T2 &, signop, bool * = 0);
   BINARY_FUNCTION divmod_trunc (const T1 &, const T2 &, signop,
                                WI_BINARY_RESULT (T1, T2) *);
+  BINARY_FUNCTION gcd (const T1 &, const T2 &, signop = UNSIGNED);
   BINARY_FUNCTION mod_trunc (const T1 &, const T2 &, signop, bool * = 0);
   BINARY_FUNCTION smod_trunc (const T1 &, const T2 &);
   BINARY_FUNCTION umod_trunc (const T1 &, const T2 &);
@@ -2653,6 +2654,27 @@ wi::divmod_trunc (const T1 &x, const T2 &y, signop sgn,
   return quotient;
 }
 
+/* Compute the greatest common divisor of two numbers A and B using
+   Euclid's algorithm.  */
+template <typename T1, typename T2>
+inline WI_BINARY_RESULT (T1, T2)
+wi::gcd (const T1 &a, const T2 &b, signop sgn)
+{
+  T1 x, y, z;
+
+  x = wi::abs (a);
+  y = wi::abs (b);
+
+  while (gt_p (x, 0, sgn))
+    {
+      z = mod_trunc (y, x, sgn);
+      y = x;
+      x = z;
+    }
+
+  return y;
+}
+
 /* Compute X / Y, rouding towards 0, and return the remainder.
    Treat X and Y as having the signedness given by SGN.  Indicate
    in *OVERFLOW if the division overflows.  */

Reply via email to