On Tue, 20 Aug 2019, Richard Biener wrote:

> 
> Excessive use of __builtin_assume_aligned can cause missed optimizations
> because those calls are propagation barriers.  The following removes
> those that are redundant and provide no extra information; on the
> testcase this allows store-merging to apply.
> 
> Since the bit lattice and the const/copy lattice are merged
> we cannot track this during CCP propagation (i.e. turn the
> redundant call into a copy and propagate that out).  Thus I
> apply it only in the CCP-specific folding routine called
> during substitute_and_fold.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
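
To make "provide no extra information" concrete, here is a hypothetical
contrast case (a sketch, not part of the patch or its testsuite): the
second call below asserts a stricter alignment than CCP can prove from
the first, so it does carry extra information and is not removed.

  void write_keep (void *p)
  {
    unsigned *p1 = (unsigned *) __builtin_assume_aligned (p, 8);
    *p1 = 0;
    /* Not redundant: 16-byte alignment is stricter than the 8 bytes
       known for p, so this call is kept as a real hint.  */
    unsigned *p2 = (unsigned *) __builtin_assume_aligned (p, 16);
    p2[1] = 1;
  }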

The following is what I have applied.

Richard.

2019-08-21  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/91482
        * tree-ssa-ccp.c (ccp_folder::fold_stmt): Remove useless
        BUILT_IN_ASSUME_ALIGNED calls.

        * gcc.dg/tree-ssa/pr91482.c: New testcase.

Index: gcc/tree-ssa-ccp.c
===================================================================
--- gcc/tree-ssa-ccp.c  (revision 274750)
+++ gcc/tree-ssa-ccp.c  (working copy)
@@ -2315,6 +2315,32 @@ ccp_folder::fold_stmt (gimple_stmt_itera
              }
           }
 
+       /* If there's no extra info from an assume_aligned call,
+          drop it so it doesn't act as an otherwise useless
+          dataflow barrier.  */
+       if (gimple_call_builtin_p (stmt, BUILT_IN_ASSUME_ALIGNED))
+         {
+           tree ptr = gimple_call_arg (stmt, 0);
+           ccp_prop_value_t ptrval = get_value_for_expr (ptr, true);
+           if (ptrval.lattice_val == CONSTANT
+               && TREE_CODE (ptrval.value) == INTEGER_CST
+               && ptrval.mask != 0)
+             {
+               ccp_prop_value_t val
+                 = bit_value_assume_aligned (stmt, NULL_TREE, ptrval, false);
+               unsigned int ptralign = least_bit_hwi (ptrval.mask.to_uhwi ());
+               unsigned int align = least_bit_hwi (val.mask.to_uhwi ());
+               if (ptralign == align
+                   && ((TREE_INT_CST_LOW (ptrval.value) & (align - 1))
+                       == (TREE_INT_CST_LOW (val.value) & (align - 1))))
+                 {
+                   bool res = update_call_from_tree (gsi, ptr);
+                   gcc_assert (res);
+                   return true;
+                 }
+             }
+         }
+
        /* Propagate into the call arguments.  Compared to replace_uses_in
           this can use the argument slot types for type verification
           instead of the current argument type.  We also can safely
Index: gcc/testsuite/gcc.dg/tree-ssa/pr91482.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr91482.c     (nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr91482.c     (working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1 -fdump-tree-store-merging" } */
+
+void write64 (void *p)
+{
+  unsigned *p1 = (unsigned *) __builtin_assume_aligned (p, 8);
+  *p1++ = 0;
+  unsigned *p2 = (unsigned *) __builtin_assume_aligned (p1, 4);
+  *p2++ = 1;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_assume_aligned" 1 "ccp1" } } */
+/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 
2 stores" "store-merging" { target lp64 } } } */
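
The core of the fold can be illustrated with a small self-contained
sketch of the bit-lattice check (plain C with hypothetical stand-ins,
not the GCC internals): CCP tracks a pointer as a (value, mask) pair
where set mask bits are unknown, so least_bit_hwi (mask) gives the
granularity up to which the pointer's bits are known.  The call is
dropped only when assume_aligned changes neither that granularity nor
the known low bits of the value.

  #include <stdio.h>

  /* Lowest set bit, standing in for GCC's least_bit_hwi.  */
  static unsigned long least_bit (unsigned long x) { return x & -x; }

  struct bits { unsigned long value, mask; };

  /* Effect of __builtin_assume_aligned (p, n): the low log2(n) bits
     become known zero (roughly what bit_value_assume_aligned computes
     for the two-argument form).  */
  static struct bits assume_aligned (struct bits p, unsigned long n)
  {
    struct bits r = { p.value & ~(n - 1), p.mask & ~(n - 1) };
    return r;
  }

  int main (void)
  {
    /* p1 = p + 4 with p 8-aligned: low three bits known to be 100.  */
    struct bits p1 = { 4, ~7UL };
    /* __builtin_assume_aligned (p1, 4) learns nothing new ...  */
    struct bits v = assume_aligned (p1, 4);
    unsigned long align = least_bit (v.mask);
    /* ... so both the granularity and the known low value bits match
       and the call would be replaced by its argument.  */
    int redundant = least_bit (p1.mask) == align
                    && (p1.value & (align - 1)) == (v.value & (align - 1));
    printf ("redundant: %d\n", redundant);  /* prints 1 */
    return 0;
  }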
