On Tue, 20 Aug 2019, Richard Biener wrote:
>
> Excessive use of __builtin_assume_aligned can cause missed optimizations
> because those calls are propagation barriers.  The following removes
> those that are redundant and provide no extra information, on the
> testcase allowing store-merging to apply.
>
> Since the bit lattice and the const/copy lattice are merged
> we cannot track this during CCP propagation (make a copy out
> of the redundant call and propagate that out).  Thus I apply
> it in the CCP specific folding routine called during
> substitute_and_fold only.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
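
For reference, the pattern in question looks like the sketch below (it has
the same shape as the new testcase in the patch).  The second
__builtin_assume_aligned provides no new information, since a pointer known
to be 8-byte aligned is still known to be 4-byte aligned after advancing it
by one unsigned, but before the patch the call acted as a propagation
barrier that kept store-merging from combining the two 32-bit stores:

/* Sketch only: same shape as the new gcc.dg/tree-ssa/pr91482.c below.
   The second assume_aligned is redundant (p1 is already known to be
   4-byte aligned), yet it used to keep store-merging from combining
   the two 32-bit stores into a single 64-bit store on lp64.  */
void
write64 (void *p)
{
  unsigned *p1 = (unsigned *) __builtin_assume_aligned (p, 8);
  *p1++ = 0;
  unsigned *p2 = (unsigned *) __builtin_assume_aligned (p1, 4);
  *p2++ = 1;
}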
The following is what I have applied.

Richard.

2019-08-21  Richard Biener  <rguent...@suse.de>

	PR tree-optimization/91482
	* tree-ssa-ccp.c (ccp_folder::fold_stmt): Remove useless
	BUILT_IN_ASSUME_ALIGNED calls.

	* gcc.dg/tree-ssa/pr91482.c: New testcase.

Index: gcc/tree-ssa-ccp.c
===================================================================
--- gcc/tree-ssa-ccp.c	(revision 274750)
+++ gcc/tree-ssa-ccp.c	(working copy)
@@ -2315,6 +2315,32 @@ ccp_folder::fold_stmt (gimple_stmt_itera
 	      }
 	  }
 
+	/* If there's no extra info from an assume_aligned call,
+	   drop it so it doesn't act as otherwise useless dataflow
+	   barrier.  */
+	if (gimple_call_builtin_p (stmt, BUILT_IN_ASSUME_ALIGNED))
+	  {
+	    tree ptr = gimple_call_arg (stmt, 0);
+	    ccp_prop_value_t ptrval = get_value_for_expr (ptr, true);
+	    if (ptrval.lattice_val == CONSTANT
+		&& TREE_CODE (ptrval.value) == INTEGER_CST
+		&& ptrval.mask != 0)
+	      {
+		ccp_prop_value_t val
+		  = bit_value_assume_aligned (stmt, NULL_TREE, ptrval, false);
+		unsigned int ptralign = least_bit_hwi (ptrval.mask.to_uhwi ());
+		unsigned int align = least_bit_hwi (val.mask.to_uhwi ());
+		if (ptralign == align
+		    && ((TREE_INT_CST_LOW (ptrval.value) & (align - 1))
+			== (TREE_INT_CST_LOW (val.value) & (align - 1))))
+		  {
+		    bool res = update_call_from_tree (gsi, ptr);
+		    gcc_assert (res);
+		    return true;
+		  }
+	      }
+	  }
+
 	/* Propagate into the call arguments.  Compared to replace_uses_in
 	   this can use the argument slot types for type verification
 	   instead of the current argument type.  We also can safely
Index: gcc/testsuite/gcc.dg/tree-ssa/pr91482.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/pr91482.c	(nonexistent)
+++ gcc/testsuite/gcc.dg/tree-ssa/pr91482.c	(working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1 -fdump-tree-store-merging" } */
+
+void write64 (void *p)
+{
+  unsigned *p1 = (unsigned *) __builtin_assume_aligned (p, 8);
+  *p1++ = 0;
+  unsigned *p2 = (unsigned *) __builtin_assume_aligned (p1, 4);
+  *p2++ = 1;
+}
+
+/* { dg-final { scan-tree-dump-times "__builtin_assume_aligned" 1 "ccp1" } } */
+/* { dg-final { scan-tree-dump "New sequence of 1 stores to replace old one of 2 stores" "store-merging" { target lp64 } } } */