On Tue, Jun 17, 2014 at 4:11 AM, Zhenqiang Chen <zhenqiang.c...@linaro.org> wrote: > Hi, > > For some large constant, ports like ARM, need one more instructions to > operate it. e.g > > #define MASK 0xfe00ff > void maskdata (int * data, int len) > { > int i = len; > for (; i > 0; i -= 2) > { > data[i] &= MASK; > data[i + 1] &= MASK; > } > } > > Need two instructions for each AND operation: > > and r3, r3, #16711935 > bic r3, r3, #65536 > > If we keep the MASK in a register, loop2_invariant pass can hoist it > out the loop. And it can be shared by different references. > > So the patch skips constant propagation if it makes INSN's cost higher.
So cprop undos invariant motions work here? Should we make sure we add a REG_EQUAL note when not propagating? > Bootstrap and no make check regression on X86-64 and ARM Chrome book. > > OK for trunk? > > Thanks! > -Zhenqiang > > ChangeLog: > 2014-06-17 Zhenqiang Chen <zhenqiang.c...@linaro.org> > > * cprop.c (try_replace_reg): Check cost for constants. > > diff --git a/gcc/cprop.c b/gcc/cprop.c > index aef3ee8..c9cf02a 100644 > --- a/gcc/cprop.c > +++ b/gcc/cprop.c > @@ -733,6 +733,14 @@ try_replace_reg (rtx from, rtx to, rtx insn) > rtx src = 0; > int success = 0; > rtx set = single_set (insn); > + int old_cost = 0; > + bool copy_p = false; > + bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn)); > + > + if (set && SET_SRC (set) && REG_P (SET_SRC (set))) > + copy_p = true; > + else > + old_cost = set_rtx_cost (set, speed); Looks bogus for set == NULL? Also what about register pressure? I think this kind of change needs wider testing as RTX costs are usually not fully implemented and you introduce a new use kind (or is it already used elsewhere in this way to compute cost difference of a set with s/reg/const?). What kind of performance difference do you see? Thanks, Richard. > /* Usually we substitute easy stuff, so we won't copy everything. > We however need to take care to not duplicate non-trivial CONST > @@ -740,6 +748,20 @@ try_replace_reg (rtx from, rtx to, rtx insn) > to = copy_rtx (to); > > validate_replace_src_group (from, to, insn); > + > + /* For CONSTANT_P (TO), loop2_invariant pass might hoist it out the loop. > + And it can be shared by different references. So skip propagation if > + it makes INSN's rtx cost higher. */ > + if (set && !copy_p && CONSTANT_P (to)) > + { > + int new_cost = set_rtx_cost (set, speed); > + if (new_cost > old_cost) > + { > + cancel_changes (0); > + return false; > + } > + } > + > if (num_changes_pending () && apply_change_group ()) > success = 1;