On Thu, Jan 6, 2022 at 11:39 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This is the third iteration of a patch to perceive MULT_HIGHPART_EXPR
> in the middle-end.  As they say "the third time's a charm".  The first
> version implemented this in match.pd, which was considered too early.
> https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551316.html
> The second version attempted to do this during RTL expansion, and was
> considered to be too late in the middle-end.
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576922.html
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576923.html
>
> This latest version incorporates Richard Biener's feedback/suggestion
> to perceive MULT_HIGHPART_EXPR in one of the "instruction selection
> passes", specifically tree-ssa-math-opts, where the recognition of
> highpart multiplications takes place in the same pass as widening
> multiplications.
>
> With each rewrite, the patch is also getting more aggressive in the
> set of widening multiplications that it recognizes as highpart multiplies.
> Currently any widening multiplication followed by a right shift (either
> signed or unsigned) by a bit count sufficient to eliminate the lowpart
> is recognized.  The result of this shift doesn't need to be truncated.
> As previously, this patch confirms the target provides a suitable
> optab before introducing the MULT_HIGHPART_EXPR.  This is the reason
> the testcase is restricted to x86_64, as this pass doesn't do anything
> on some platforms, but x86_64 should be sufficient to confirm that the
> pass is working/continues to work.
>
> This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap
> and make -k check (both with and without --target_board='unix{-m32}')
> with no new regressions.  Ok for mainline?

Few nits:

+static bool
+convert_mult_to_highpart (gimple *stmt, gimple_stmt_iterator *gsi)
+{
+  tree lhs = gimple_assign_lhs (stmt);

since you assume 'stmt' is a GIMPLE assignment please statically
type it as 'gassign *'.

+  gimple *def = SSA_NAME_DEF_STMT (sarg0);
+  if (!is_gimple_assign (def))
+    return false;

could be written as

    gassign *def = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (sarg0));
    if (!def)
      return false;

as well to make the followup code cheaper.

+  tree mop1 = gimple_assign_rhs1 (def);
+  tree mop2 = gimple_assign_rhs2 (def);
+  tree optype = TREE_TYPE (mop1);
+  bool unsignedp = TYPE_UNSIGNED (optype);
+  unsigned int prec = TYPE_PRECISION (optype);
+
+  if (optype != TREE_TYPE (mop2)

I think mop1 and mop2 have to be compatible types (the tree-cfg.c
GIMPLE verification only tests for same precision it seems but tree.def
says they are of type T1).  That said, I think optype != TREE_TYPE (mop2)
is superfluous and too conservative at it.  Bonus points for improving the
WIDEN_MULT_EXPR IL verification.  I hope the signs of the result and
the operands are the same though neiher does tree.def specify that nor
tree-cfg.c verify it.

+  /* The above transformation often creates useless type conversions, i.e.
+     when followed by a truncation, that aren't cleaned up elsewhere.  */
+  gimple *use_stmt;
+  imm_use_iterator use_iter;
+  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, lhs)
+    if (gimple_assign_cast_p (use_stmt))
+      {

I don't like this much but I see how this looks like a useful thing to do.
Since we're close to RTL expansion it might not matter much or does
this have an effect on final code generation?  Implementation wise
I'd rather have sth like the simple_dce_from_worklist helper - have
a helper that folds use stmts of SSA names whose definition stmt
we changed in a worklist manner.  Ideally this would do what
forwprop does on a stmt, and ideally ordered in a way that if
we fold two of a stmts uses defs we only fold that stmt once after
we've folded both use defs.

I'm not sure iterating with FOR_EACH_IMM_USE_STMT and at
the same time fiddling with immediate uses of that def is OK,
IIRC it can easily break the iteration scheme which applies sorting
and a marker.  So to fix that you need some kind of worklist
anyhow which means you could do a more simplistic
fold_stmt () on that.

If the change works good enough even w/o the use folding the
patch looks good to me with that stripped.

Thanks,
Richard.

>
> 2022-01-06  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * tree-ssa-math-opts.c (struct widen_mul_stats): Add a
>         highpart_mults_inserted field.
>         (convert_mult_to_highpart): New function to convert right shift
>         of a widening multiply into a MULT_HIGHPART_EXPR.
>         (math_opts_dom_walker::after_dom_children) [RSHIFT_EXPR]:
>         Call new convert_mult_to_highpart function.
>         (pass_optimize_widening_mul::execute): Add a statistics counter
>         for tracking "highpart multiplications inserted" events.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/mult-highpart.c: New test case.
>
>
> Thanks in advance,
> Roger
> --
>

Reply via email to