http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50955
--- Comment #11 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-01-31 14:02:20 UTC ---
(In reply to comment #4)
> It looks like IVOPTs fails to consider a candidate for the use in question
> and thus, after choosing the final IV set ends up rewriting that use into
> this stupid form (instead of using candidate 0).  The use is
>
> use 3
>   generic
>   in statement vect_p.60_223 = PHI <vect_p.60_224(16), vect_p.63_222(5)>
>
>   at position
>   type vector(8) unsigned char *
>   base batmp.61_221 + 1
>   step 8
>   base object (void *) batmp.61_221
>   is a biv
>   related candidates
>
> but costs say:
>
> Use 3:
>   cand  cost    compl.  depends on
>   0     8       1       inv_expr:0
>   1     4       1       inv_expr:1
>   2     4       1       inv_expr:0
>   3     4       1       inv_expr:1
>   4     4       1       inv_expr:2
>
> Initial set of candidates:
>   cost: 24 (complexity 4)
>   cand_cost: 10
>   cand_use_cost: 10 (complexity 4)
>   candidates: 0, 4
>    use:0 --> iv_cand:4, cost=(2,1)
>    use:1 --> iv_cand:4, cost=(2,1)
>    use:2 --> iv_cand:4, cost=(2,1)
>    use:3 --> iv_cand:4, cost=(4,1)
>    use:4 --> iv_cand:0, cost=(0,0)
>   invariants 7
>
> Selected IV set:
> candidate 0 (important)
>   var_before ivtmp.107_150
>   var_after ivtmp.107_256
>   incremented before exit test
>   type unsigned int
>   base 0
>   step 1
> candidate 4 (important)
>   var_before ivtmp.110_241
>   var_after ivtmp.110_146
>   incremented before exit test
>   type unsigned int
>   base (unsigned int) (&p1 + 8)
>   step 8
>   base object (void *) &p1
>
> so expressing use 3 with candidate 4 is cheaper than with candidate 0...
>
> Now, for address-uses we have
>
>   if (address_p)
>     {
>       /* Do not try to express address of an object with computation based
>          on address of a different object.  This may cause problems in rtl
>          level alias analysis (that does not expect this to be happening,
>          as this is illegal in C), and would be unlikely to be useful
>          anyway.  */
>       if (use->iv->base_object
>           && cand->iv->base_object
>           && !operand_equal_p (use->iv->base_object, cand->iv->base_object,
> 0))
>         return infinite_cost;
>
> in cost calculation, but in this case it's a nonlinear use, and we have
>
> (gdb) call debug_generic_expr (use->iv->base_object)
> (void *) batmp.61_221
> (gdb) call debug_generic_expr (cand->iv->base_object)
> (void *) &p1

So one could extend the address_p case to cover all address IVs, like with

Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c  (revision 183757)
+++ gcc/tree-ssa-loop-ivopts.c  (working copy)
@@ -4048,7 +4048,11 @@ get_computation_cost_at (struct ivopts_d
       return infinite_cost;
     }

-  if (address_p)
+  if (address_p
+      || (use->iv->base_object
+          && cand->iv->base_object
+          && POINTER_TYPE_P (TREE_TYPE (use->iv->base_object))
+          && POINTER_TYPE_P (TREE_TYPE (cand->iv->base_object))))
     {
       /* Do not try to express address of an object with computation based
          on address of a different object.  This may cause problems in rtl

which avoids the issue for this testcase at the cost of choosing three IVs
instead of two:

Selected IV set:
candidate 0 (important)
  var_before ivtmp.107_152
  var_after ivtmp.107_258
  incremented before exit test
  type unsigned int
  base 0
  step 1
candidate 4 (important)
  var_before ivtmp.110_243
  var_after ivtmp.110_148
  incremented before exit test
  type unsigned int
  base (unsigned int) (&p1 + 8)
  step 8
  base object (void *) &p1
candidate 11 (important)
  var_before ivtmp.115_259
  var_after ivtmp.115_251
  incremented before exit test
  type unsigned int
  base (unsigned int) (batmp.61_223 + 1)
  step 8
  base object (void *) batmp.61_223

IVOPTs does not consider using candidate 0 for use 3:

Use 3:
  cand  cost    compl.  depends on
  0     8       1       inv_expr:0
  2     4       1       inv_expr:0
  11    0       0
  12    0       0
  13    4       1
  14    8       1       inv_expr:0
  15    8       1       inv_expr:1
  18    4       1       inv_expr:2
  19    4       1       inv_expr:0
  22    4       1       inv_expr:2
  23    4       1       inv_expr:0
  26    4       1       inv_expr:2
  27    4       1       inv_expr:0
  28    4       1

as it seems to favor adding another IV here.  It seems to even generate
better code, but it's a pretty big hammer ...
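
For readers who do not have the IVOPTs sources at hand, the effect of the widened condition can be sketched with a small toy model in C. This is not GCC code. The names toy_iv, toy_computation_cost and TOY_INFINITE_COST are made up for illustration, base objects are modelled as plain strings compared with strcmp (the real check uses operand_equal_p on trees), and the is_pointer flag stands in for POINTER_TYPE_P (TREE_TYPE (...)).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an induction variable description.
   base_object is NULL when no base object is known.  */
struct toy_iv
{
  const char *base_object;
  bool is_pointer;   /* models POINTER_TYPE_P on the base object's type */
};

/* Hypothetical marker for "this use cannot be expressed by this candidate".  */
#define TOY_INFINITE_COST INT_MAX

/* Toy version of the check discussed above.  raw_cost is the cost the
   use/candidate pair would otherwise get.  With the proposed change the
   base-object test also fires when address_p is false, as long as both
   IVs have pointer-typed base objects.  */
int
toy_computation_cost (const struct toy_iv *use, const struct toy_iv *cand,
                      bool address_p, int raw_cost)
{
  assert (use != NULL && cand != NULL);

  bool both_pointer_bases = (use->base_object
                             && cand->base_object
                             && use->is_pointer
                             && cand->is_pointer);

  if (address_p || both_pointer_bases)
    {
      /* Refuse to express an address via a different object's address.  */
      if (use->base_object
          && cand->base_object
          && strcmp (use->base_object, cand->base_object) != 0)
        return TOY_INFINITE_COST;
    }

  return raw_cost;
}
```

In this model, a use based on batmp.61_221 paired with a candidate based on &p1 gets TOY_INFINITE_COST even when address_p is false, a candidate with the same base object keeps its raw cost, and a candidate without a base object (such as candidate 0 with base 0) is unaffected by the check.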