Hi Richard,

On 1 February 2018 at 23:21, Richard Biener <richard.guent...@gmail.com> wrote:
> On Thu, Feb 1, 2018 at 5:07 AM, Kugan Vivekanandarajah
> <kugan.vivekanandara...@linaro.org> wrote:
>> Hi Richard,
>>
>> On 31 January 2018 at 21:39, Richard Biener <richard.guent...@gmail.com> 
>> wrote:
>>> On Wed, Jan 31, 2018 at 11:28 AM, Kugan Vivekanandarajah
>>> <kugan.vivekanandara...@linaro.org> wrote:
>>>> Hi Richard,
>>>>
>>>> Thanks for the review.
>>>> On 25 January 2018 at 20:04, Richard Biener <richard.guent...@gmail.com> 
>>>> wrote:
>>>>> On Wed, Jan 24, 2018 at 10:56 PM, Kugan Vivekanandarajah
>>>>> <kugan.vivekanandara...@linaro.org> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> Here is a patch for popcount builtin detection similar to LLVM. I
>>>>>> would like to queue this for review for next stage 1.
>>>>>>
>>>>>> 1. This is done part of loop-distribution and effective for -O3 and 
>>>>>> above.
>>>>>> 2. This does not distribute loop to detect popcount (like
>>>>>> memcpy/memmove). I dont think that happens in practice. Please correct
>>>>>> me if I am wrong.
>>>>>
>>>>> But then it has no business inside loop distribution but instead is
>>>>> doing final value
>>>>> replacement, right?  You are pattern-matching the whole loop after all.  
>>>>> I think
>>>>> final value replacement would already do the correct thing if you
>>>>> teached number of
>>>>> iteration analysis that niter for
>>>>>
>>>>>   <bb 3> [local count: 955630224]:
>>>>>   # b_11 = PHI <b_5(5), b_8(6)>
>>>>>   _1 = b_11 + -1;
>>>>>   b_8 = _1 & b_11;
>>>>>   if (b_8 != 0)
>>>>>     goto <bb 6>; [89.00%]
>>>>>   else
>>>>>     goto <bb 8>; [11.00%]
>>>>>
>>>>>   <bb 6> [local count: 850510900]:
>>>>>   goto <bb 3>; [100.00%]
>>>>
>>>> I am looking into this approach. What should be the scalar evolution
>>>> for b_8 (i.e. b & (b -1) in a loop) should be? This is not clear to me
>>>> and can this be represented with the scev?
>>>
>>> No, it's not affine and thus cannot be represented.  You only need the
>>> scalar evolution of the counting IV which is already handled and
>>> the number of iteration analysis needs to handle the above IV - this
>>> is the missing part.
>> Thanks for the clarification. I am now matching this loop pattern in
>> number_of_iterations_exit when number_of_iterations_exit_assumptions
>> fails. If the pattern matches, I am inserting the _builtin_popcount in
>> the loop preheater and setting the loop niter with this. This will be
>> used by the final value replacement. Is this what you wanted?
>
> No, you shouldn't insert a popcount stmt but instead the niter
> GENERIC tree should be a CALL_EXPR to popcount with the
> appropriate argument.

Thats what I tried earlier but ran into some ICEs. I wasn't sure if
niter in tree_niter_desc can take such.

Attached patch now does this. Also had to add support for CALL_EXPR in
few places to handle niter with CALL_EXPR. Does this look OK?

Thanks,
Kugan


gcc/ChangeLog:

2018-02-08  Kugan Vivekanandarajah  <kug...@linaro.org>

    * gimple-expr.c (extract_ops_from_tree): Handle CALL_EXPR.
    * tree-chrec.c (chrec_convert_1): Likewise.
    * tree-scalar-evolution.c (expression_expensive_p): Likewise.
    * tree-ssa-loop-ivopts.c (contains_abnormal_ssa_name_p): Likewise.
    * tree-ssa-loop-niter.c (check_popcount_pattern): New.
    (number_of_iterations_exit): Record niter for popcount patern.
    * tree-ssa.c (verify_ssa): Check stmt to be non NULL.

gcc/testsuite/ChangeLog:

2018-02-08  Kugan Vivekanandarajah  <kug...@linaro.org>

    * gcc.dg/tree-ssa/popcount.c: New test.


>
>> In general, there is also a condition gating the loop like
>>
>> if (b_4 != 0)
>>   goto loop;
>> else
>>   end:
>>
>> This of course will not be removed by final value replacement. Since
>> popcount (0) is defined, this is redundant and could be removed but
>> not removed.
>
> Yeah, that's probably sth for another pass though.  I suppose the
> end: case just uses zero in which case you'll have a PHI and you
> can optimize
>
>   if (b != 0)
>     return popcount (b);
>   return 0;
>
> in phiopt.
>
> Richard.
>
>> Thanks,
>> Kugan
>>
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>> Kugan
>>>>>
>>>>> is related to popcount (b_5).
>>>>>
>>>>> Richard.
>>>>>
>>>>>> Bootstrapped and regression tested on aarch64-linux-gnu with no new 
>>>>>> regressions.
>>>>>>
>>>>>> Thanks,
>>>>>> Kugan
>>>>>>
>>>>>> gcc/ChangeLog:
>>>>>>
>>>>>> 2018-01-25  Kugan Vivekanandarajah  <kug...@linaro.org>
>>>>>>
>>>>>>     PR middle-end/82479
>>>>>>     * tree-loop-distribution.c (handle_popcount): New.
>>>>>>     (pass_loop_distribution::execute): Use handle_popcount.
>>>>>>
>>>>>> gcc/testsuite/ChangeLog:
>>>>>>
>>>>>> 2018-01-25  Kugan Vivekanandarajah  <kug...@linaro.org>
>>>>>>
>>>>>>     PR middle-end/82479
>>>>>>     * gcc.dg/tree-ssa/popcount.c: New test.
From 296efa2c09cdd797e3295f0c29a5a943dc87e9f2 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandara...@linaro.org>
Date: Thu, 8 Feb 2018 09:28:57 +1100
Subject: [PATCH] popcount v2

---
 gcc/gimple-expr.c                        |   3 +-
 gcc/testsuite/gcc.dg/tree-ssa/popcount.c |  41 ++++++++++
 gcc/tree-chrec.c                         |   2 +
 gcc/tree-scalar-evolution.c              |   2 +
 gcc/tree-ssa-loop-ivopts.c               |  10 +++
 gcc/tree-ssa-loop-niter.c                | 124 ++++++++++++++++++++++++++++++-
 gcc/tree-ssa.c                           |   2 +-
 7 files changed, 181 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/popcount.c

diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index 56caaca..3fc04ff 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -548,7 +548,8 @@ extract_ops_from_tree (tree expr, enum tree_code *subcode_p, tree *op1_p,
       *op2_p = NULL_TREE;
       *op3_p = NULL_TREE;
     }
-  else if (grhs_class == GIMPLE_SINGLE_RHS)
+  else if (grhs_class == GIMPLE_SINGLE_RHS
+	   || TREE_CODE (expr) == CALL_EXPR)
     {
       *op1_p = expr;
       *op2_p = NULL_TREE;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
new file mode 100644
index 0000000..86a66cb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+extern int foo (int);
+
+int PopCount (long b) {
+    int c = 0;
+    b++;
+
+    while (b) {
+	b &= b - 1;
+	c++;
+    }
+    return c;
+}
+int PopCount2 (long b) {
+    int c = 0;
+
+    while (b) {
+	b &= b - 1;
+	c++;
+    }
+    foo (c);
+    return foo (c);
+}
+
+void PopCount3 (long b1) {
+
+    for (long i = 0; i < b1; ++i)
+      {
+	long b = i;
+	int c = 0;
+	while (b) {
+	    b &= b - 1;
+	    c++;
+	}
+	foo (c);
+      }
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_popcount" 3 "optimized" } } */
diff --git a/gcc/tree-chrec.c b/gcc/tree-chrec.c
index 924df04..098f46f 100644
--- a/gcc/tree-chrec.c
+++ b/gcc/tree-chrec.c
@@ -1382,6 +1382,8 @@ keep_cast:
 						  CHREC_RIGHT (chrec)));
       res = chrec_convert_1 (type, res, at_stmt, use_overflow_semantics, from);
     }
+  else if (TREE_CODE (chrec) == CALL_EXPR)
+    res = fold_build1 (NOP_EXPR, TREE_TYPE (TREE_TYPE (chrec)), chrec);
   else
     res = fold_convert (type, chrec);
 
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c
index 0ba1aa8..ff313c9 100644
--- a/gcc/tree-scalar-evolution.c
+++ b/gcc/tree-scalar-evolution.c
@@ -3488,6 +3488,8 @@ expression_expensive_p (tree expr)
       if (!integer_pow2p (TREE_OPERAND (expr, 1)))
 	return true;
     }
+  if (code == CALL_EXPR)
+    return false;
 
   switch (TREE_CODE_CLASS (code))
     {
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index b313571..519649a 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -985,6 +985,16 @@ contains_abnormal_ssa_name_p (tree expr)
   code = TREE_CODE (expr);
   codeclass = TREE_CODE_CLASS (code);
 
+  if (code == CALL_EXPR)
+    {
+      tree arg;
+      call_expr_arg_iterator iter;
+      FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
+	if (contains_abnormal_ssa_name_p (arg))
+	  return true;
+      return false;
+    }
+
   if (code == SSA_NAME)
     return SSA_NAME_OCCURS_IN_ABNORMAL_PHI (expr) != 0;
 
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index fa49abf..470ba2f 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2430,6 +2430,111 @@ number_of_iterations_exit_assumptions (struct loop *loop, edge exit,
   return (!integer_zerop (niter->assumptions));
 }
 
+/* See if LOOP is a popcout implementation of the form
+
+    int c = 0;
+    while (b) {
+	b = b & (b - 1);
+	c++;
+    }
+
+    If so, Set NITER to  __builtin_popcount (b) - 1
+    return true if we did, false otherwise.  */
+
+static bool
+check_popcount_pattern (loop_p loop, tree *niter)
+{
+  tree lhs, rhs;
+  tree dest, src;
+  gimple *and_minus_one;
+  int count = 0;
+  gphi *count_phi = NULL;
+
+  if (!single_exit (loop))
+    return false;
+
+  /* Check loop terminating branch is like
+     if (b != 0).  */
+  gimple *stmt = last_stmt (loop->header);
+  if (!stmt
+      || gimple_code (stmt) != GIMPLE_COND
+      || !zerop (gimple_cond_rhs (stmt)))
+    return false;
+
+  /* Cheeck "b = b & (b - 1)" is calculated.  */
+  lhs = gimple_cond_lhs (stmt);
+  if (TREE_CODE (lhs) != SSA_NAME)
+    return false;
+  gimple *and_stmt = SSA_NAME_DEF_STMT (lhs);
+  if (!is_gimple_assign (and_stmt)
+      || gimple_assign_rhs_code (and_stmt) != BIT_AND_EXPR)
+    return false;
+  lhs = gimple_assign_rhs1 (and_stmt);
+  rhs = gimple_assign_rhs2 (and_stmt);
+  if (TREE_CODE (lhs) == SSA_NAME
+      && (and_minus_one = SSA_NAME_DEF_STMT (lhs))
+      && is_gimple_assign (and_minus_one)
+      && (gimple_assign_rhs_code (and_minus_one) == PLUS_EXPR)
+      && integer_minus_onep (gimple_assign_rhs2 (and_minus_one)))
+      lhs = rhs;
+  else if (TREE_CODE (rhs) == SSA_NAME
+      && (and_minus_one = SSA_NAME_DEF_STMT (rhs))
+      && is_gimple_assign (and_minus_one)
+      && (gimple_assign_rhs_code (and_minus_one) == PLUS_EXPR)
+      && integer_minus_onep (gimple_assign_rhs2 (and_minus_one)))
+      ;
+  else
+    return false;
+  if ((gimple_assign_rhs1 (and_stmt) != gimple_assign_rhs1 (and_minus_one))
+      && (gimple_assign_rhs2 (and_stmt) != gimple_assign_rhs1 (and_minus_one)))
+    return false;
+
+  /* Check the recurrence.  */
+  gimple *phi = SSA_NAME_DEF_STMT (gimple_assign_rhs1 (and_minus_one));
+  gimple *src_phi = SSA_NAME_DEF_STMT (lhs);
+  if (gimple_code (phi) != GIMPLE_PHI
+      || gimple_code (src_phi) != GIMPLE_PHI)
+    return false;
+
+  /* Check the loop closed SSA definition for just the variable c defined in
+     loop.  */
+  src = gimple_phi_arg_def (src_phi, loop_preheader_edge (loop)->dest_idx);
+  basic_block bb = single_exit (loop)->dest;
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      count_phi = gpi.phi ();
+      count++;
+    }
+
+  if (count != 1)
+    return false;
+
+  /* Make sure we have a count by one and it starts from zero.  */
+  rhs = gimple_phi_arg_def (count_phi, 0);
+  stmt = SSA_NAME_DEF_STMT (rhs);
+  if (!is_gimple_assign (stmt)
+      || (gimple_assign_rhs_code (stmt) != PLUS_EXPR)
+      || !integer_onep (gimple_assign_rhs2 (stmt)))
+    return false;
+  rhs = gimple_assign_rhs1 (stmt);
+  stmt = SSA_NAME_DEF_STMT (rhs);
+  if (gimple_code (stmt) != GIMPLE_PHI
+      || !integer_zerop (gimple_phi_arg_def (stmt,
+					     loop_preheader_edge (loop)->dest_idx)))
+    return false;
+
+  dest = gimple_phi_result (count_phi);
+  tree var = make_ssa_name (TREE_TYPE (dest), NULL);
+  tree fn = builtin_decl_implicit (BUILT_IN_POPCOUNT);
+
+  var = build_call_expr (fn, 1, src);
+  *niter = fold_build2 (MINUS_EXPR, TREE_TYPE (dest), var,
+			build_int_cst (TREE_TYPE (dest), 1));
+  return true;
+}
+
+
 /* Like number_of_iterations_exit_assumptions, but return TRUE only if
    the niter information holds unconditionally.  */
 
@@ -2441,7 +2546,24 @@ number_of_iterations_exit (struct loop *loop, edge exit,
   gcond *stmt;
   if (!number_of_iterations_exit_assumptions (loop, exit, niter,
 					      &stmt, every_iteration))
-    return false;
+    {
+      tree count;
+      if (check_popcount_pattern (loop, &count))
+	{
+	  niter->assumptions = boolean_false_node;
+	  niter->control.base = NULL_TREE;
+	  niter->control.step = NULL_TREE;
+	  niter->control.no_overflow = false;
+	  niter->niter = count;
+	  niter->assumptions = boolean_true_node;
+	  niter->may_be_zero = boolean_false_node;
+	  niter->max = -1;
+	  niter->bound = NULL_TREE;
+	  niter->cmp = ERROR_MARK;
+	  return true;
+	}
+      return false;
+    }
 
   if (integer_nonzerop (niter->assumptions))
     return true;
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index ee311ce..572419c 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -1047,7 +1047,7 @@ verify_ssa (bool check_modified_stmt, bool check_ssa_operands)
 	  verify_ssa_name (name, virtual_operand_p (name));
 
 	  stmt = SSA_NAME_DEF_STMT (name);
-	  if (!gimple_nop_p (stmt))
+	  if (stmt && !gimple_nop_p (stmt))
 	    {
 	      basic_block bb = gimple_bb (stmt);
 	      if (verify_def (bb, definition_block,
-- 
2.7.4

Reply via email to