HAO CHEN GUI <[email protected]> writes:
> Hi,
> This patch replaces rtx_cost with insn_cost in forward propagation.
> In the PR, a constant vector should be propagated to replace a
> pseudo in a store insn when we know it's a duplicated constant vector.
> Doing so reduces the insn cost but not the rtx cost. In this case, the
> cost is determined by the destination operand (memory or pseudo), so
> rtx cost can't help.
>
> The test case is added in the second target specific patch.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643995.html
>
> Compared to the previous version, the main change is not to do the
> substitution if either the new or the old insn cost is zero. A zero
> cost means the cost is unknown.
>
> Previous version
> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643994.html
>
> Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions. Is it OK for the trunk?
>
> ChangeLog
> fwprop: Replace set_src_cost with insn_cost in try_fwprop_subst_pattern
>
> gcc/
> * fwprop.cc (try_fwprop_subst_pattern): Replace set_src_cost with
> insn_cost.
Thanks for doing this. It's definitely the right direction, but:
> patch.diff
> diff --git a/gcc/fwprop.cc b/gcc/fwprop.cc
> index cb6fd6700ca..184a22678b7 100644
> --- a/gcc/fwprop.cc
> +++ b/gcc/fwprop.cc
> @@ -470,21 +470,19 @@ try_fwprop_subst_pattern (obstack_watermark &attempt,
> insn_change &use_change,
> redo_changes (0);
> }
>
> - /* ??? In theory, it should be better to use insn costs rather than
> - set_src_costs here. That would involve replacing this code with
> - change_is_worthwhile. */
...as hinted at in the comment, rtl-ssa already has a routine for
insn_cost-based calculations. It has two (supposed) advantages:
it caches the old costs, and it takes execution frequency into
account when optimising for speed.
The comment is out of date though. The name of the routine is
changes_are_worthwhile rather than change_is_worthwhile. Could you
try using that instead?
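
To make the frequency point concrete, here is a rough standalone model
of the kind of comparison I mean. It is only a sketch, not the rtl-ssa
code: toy_change and toy_changes_are_worthwhile are hypothetical names,
the frequencies are made-up numbers, and the "zero means unknown" rule
is carried over from your patch rather than from rtl-ssa.

```cpp
#include <vector>

// Hypothetical stand-in for one instruction change; not an rtl-ssa type.
struct toy_change
{
  unsigned old_cost;   // cost of the original insn, 0 = unknown
  unsigned new_cost;   // cost of the rewritten insn, 0 = unknown
  double frequency;    // relative execution frequency of the insn's block
};

// Return true if the changes, taken together, do not increase the cost.
// When optimising for speed, each cost is weighted by how often its
// block executes; when optimising for size, the raw costs are summed.
bool
toy_changes_are_worthwhile (const std::vector<toy_change> &changes,
                            bool speed_p)
{
  double old_total = 0, new_total = 0;
  for (const toy_change &c : changes)
    {
      // Assumption taken from the patch: a zero cost means "unknown",
      // so refuse to judge the change at all.
      if (c.old_cost == 0 || c.new_cost == 0)
        return false;
      double weight = speed_p ? c.frequency : 1.0;
      old_total += c.old_cost * weight;
      new_total += c.new_cost * weight;
    }
  return new_total <= old_total;
}
```

With a hot change that gets more expensive and a cold change that gets
cheaper, the size comparison can accept a group that the speed
comparison rejects, which is the distinction a plain per-insn
comparison misses.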
Richard
> bool ok = recog (attempt, use_change);
> if (ok && !prop.changed_mem_p () && !use_insn->is_asm ())
> - if (rtx use_set = single_set (use_rtl))
> + if (single_set (use_rtl))
> {
> bool speed = optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_rtl));
> + auto new_cost = insn_cost (use_rtl, speed);
> temporarily_undo_changes (0);
> - auto old_cost = set_src_cost (SET_SRC (use_set),
> - GET_MODE (SET_DEST (use_set)), speed);
> + /* Invalidate recog data. */
> + INSN_CODE (use_rtl) = -1;
> + auto old_cost = insn_cost (use_rtl, speed);
> redo_changes (0);
> - auto new_cost = set_src_cost (SET_SRC (use_set),
> - GET_MODE (SET_DEST (use_set)), speed);
> - if (new_cost > old_cost
> + if (new_cost == 0 || old_cost == 0
> + || new_cost > old_cost
> || (new_cost == old_cost && !prop.likely_profitable_p ()))
> {
> if (dump_file)