https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78847

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
And the real inefficiency is niter analysis returning unsimplified

(unsigned long) &MEM[(void *)&foo + 9B] - (unsigned long) ((const char *) &foo.ascii_ + 1)

from here:

#1  0x000000000134f088 in number_of_iterations_ne (loop=0x7ffff5534550,
    type=<pointer_type 0x7ffff569f3f0>, iv=0x7fffffffd4b0,
    final=<addr_expr 0x7ffff51b4620>, niter=0x7fffffffd5e0,
    exit_must_be_taken=true, bnds=0x7fffffffd3d0)
    at /space/rguenther/src/svn/gcc-7-branch/gcc/tree-ssa-loop-niter.c:991
991                            fold_convert (niter_type, iv->base));
(gdb) l
986       else
987         {
988           s = fold_convert (niter_type, iv->step);
989           c = fold_build2 (MINUS_EXPR, niter_type,
990                            fold_convert (niter_type, final),
991                            fold_convert (niter_type, iv->base));

where

final: &MEM[(void *)&foo + 9B]
iv->base: (const char *) &foo.ascii_ + 1

one could either try to enhance the associate: case of fold_binary or
use tree-affine above, like with

Index: gcc/tree-ssa-loop-niter.c
===================================================================
--- gcc/tree-ssa-loop-niter.c   (revision 245276)
+++ gcc/tree-ssa-loop-niter.c   (working copy)
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.
 #include "tree-chrec.h"
 #include "tree-scalar-evolution.h"
 #include "params.h"
+#include "tree-affine.h"

 /* The maximum number of dominator BBs we search for conditions
@@ -986,9 +987,12 @@ number_of_iterations_ne (struct loop *lo
   else
     {
       s = fold_convert (niter_type, iv->step);
-      c = fold_build2 (MINUS_EXPR, niter_type,
-                      fold_convert (niter_type, final),
-                      fold_convert (niter_type, iv->base));
+      aff_tree aff_ivbase, aff_final;
+      tree_to_aff_combination (iv->base, niter_type, &aff_ivbase);
+      tree_to_aff_combination (final, niter_type, &aff_final);
+      aff_combination_scale (&aff_ivbase, -1);
+      aff_combination_add (&aff_ivbase, &aff_final);
+      c = aff_combination_to_tree (&aff_ivbase);
     }

   mpz_init (max);

though this is horribly incomplete (fixes the testcase though) and if, then
niter analysis should use aff_trees throughout (not re-build trees for the
intermediate stuff).

With that patch we immediately get '9' as the number of iterations and thus
ldist produces

  <bb 2> [10.00%]:
  _18 = buf__9(D) + ptr_6(D);
  __builtin_memcpy (_18, &foo, 9);
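
For readers unfamiliar with why an affine representation helps here, the following is a rough standalone sketch of the idea in plain C. It is not GCC code and does not use aff_tree: every toy_* name is made up for illustration, symbols are plain integer ids, and the offsets 1 and 9 relative to &foo assume (hypothetically) that foo.ascii_ sits at offset 0, which the comment above does not state. It only mirrors the scale-by-minus-one-then-add sequence from the patch and shows the shared symbolic term cancelling, leaving a constant. How that constant relates to the reported iteration count is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an affine combination:
     offset + sum (coef[i] * sym[i])
   This is NOT GCC's aff_tree; symbols are plain integer ids.  */
#define TOY_MAX_ELTS 8

struct toy_aff
{
  long offset;
  size_t n;
  int sym[TOY_MAX_ELTS];
  long coef[TOY_MAX_ELTS];
};

static void
toy_aff_init (struct toy_aff *a, long offset)
{
  a->offset = offset;
  a->n = 0;
}

/* Add COEF * SYM, merging with an existing term for the same symbol
   and dropping the term if its coefficient becomes zero.  */
static void
toy_aff_add_elt (struct toy_aff *a, int sym, long coef)
{
  for (size_t i = 0; i < a->n; i++)
    if (a->sym[i] == sym)
      {
        a->coef[i] += coef;
        if (a->coef[i] == 0)
          {
            /* Term cancelled: move the last term into its slot.  */
            a->n--;
            a->sym[i] = a->sym[a->n];
            a->coef[i] = a->coef[a->n];
          }
        return;
      }
  if (coef == 0)
    return;
  assert (a->n < TOY_MAX_ELTS);
  a->sym[a->n] = sym;
  a->coef[a->n] = coef;
  a->n++;
}

/* Multiply the whole combination by K.  */
static void
toy_aff_scale (struct toy_aff *a, long k)
{
  a->offset *= k;
  if (k == 0)
    a->n = 0;
  for (size_t i = 0; i < a->n; i++)
    a->coef[i] *= k;
}

/* DST += SRC.  */
static void
toy_aff_add (struct toy_aff *dst, const struct toy_aff *src)
{
  dst->offset += src->offset;
  for (size_t i = 0; i < src->n; i++)
    toy_aff_add_elt (dst, src->sym[i], src->coef[i]);
}

/* Return 1 and store the value if A has no symbolic terms left.  */
static int
toy_aff_is_constant (const struct toy_aff *a, long *value)
{
  if (a->n != 0)
    return 0;
  *value = a->offset;
  return 1;
}

/* final - base, with final = &foo + 9 and base = &foo + 1
   (offsets are illustrative assumptions, see the text above).  */
long
toy_demo_difference (void)
{
  enum { SYM_FOO = 1 };
  struct toy_aff base, final;
  long c = 0;

  toy_aff_init (&base, 1);
  toy_aff_add_elt (&base, SYM_FOO, 1);
  toy_aff_init (&final, 9);
  toy_aff_add_elt (&final, SYM_FOO, 1);

  toy_aff_scale (&base, -1);   /* base := -base */
  toy_aff_add (&base, &final); /* base := final - base */

  int ok = toy_aff_is_constant (&base, &c);
  assert (ok);
  (void) ok;
  return c;
}
```

Calling toy_demo_difference () yields 8 under the assumed offsets: the &foo terms cancel during the add, so no symbolic remainder is left to simplify afterwards. A tree built directly as a MINUS_EXPR of the two converted operands has to rely on the folder to discover the same cancellation.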