On 16.07.2017 00:51, Segher Boessenkool wrote:
Hi!

On Wed, Jul 12, 2017 at 03:15:09PM +0200, Georg-Johann Lay wrote:
the current cost computations in rtlanal.c and maybe other places
suffer from the fact that they are hiding parts of the expressions
from the back-end, like SET_DESTs of single_set or the anatomy of
PARALELLs.

Would it be in order to have a hook like the one attached?

I am aware of that, in an ideal world, there wouldn't be more
than one hook to get rtx costs.  But well...

The number of hooks is not a problem.  The overall complexity of this
interface (between the backends and optimisers; a group of related
hooks) _is_ a problem.  Also, the interface should be good for all
targets, easy to use, do one thing and do it well.

The interface is easy and straight forward, but the new hook might not
be as convenient as rtx_costs.  For example, testing for single_set is
a bit tedious when you don't have an insn.

There are situations where rtx_costs is called with outer_code = INSN;
cf. set_rtx_cost.

Using set_src_cost instead of a new hook that gracefully degrades
might make a hell break loose when rtx_costs just return some penalty
costs for unknown stuff instead of "0".

Therefore, a new hook is easier to use and understand, but it might
be confusing in what situations rtx_costs with outer_code = INSN is
used and in what situations patter_costs would be used.

Also not that set_src_cost is used incorrectly for set(zero_extract):
rtl.h states that it does "Return the cost of moving X into a register".
In rtlanal::insn_rtx_cost() however, this function is called for
*any* SETs on the SET_SRC, even if SET_DEST is not a register.

He might fix this by filtering out destinations that are zero_extract.
Subregs might be mem or regs, and the current implementation looks
good.

Moreover, insn_rtx_costs could do something like

rtx_cost (pat, VOIDmode, INSN, 4, speed_p)

when it sees a PARALLEL with more than 1 SET.

Currently we have rtx_costs, which computes an estimated cost for any
expression.  In most cases what we want to compute is the cost of an
instruction though!  It can be a single set (in which case the
expression cost is a reasonable approximation), but it also can be
something else (like, a parallel).  Also, on many targets at least,
it is *much* easier to compute the cost knowing that this is a valid
instruction.

So I argue that we want to have a hook for insn cost.

Now what should it take as input?  An rtx_insn, or just the pattern
(as insn_rtx_cost does)?  And if an rtx_insn, should that than be an
rtx_insn that is already linked into the insn chain (so we can see what
other insns generate its inputs, and which of its outputs are unused,
etc.)?

IMO, if we have an insn then we could also pass it, it doesn't cost
anything.  However I don't know if the insn might have some strange
properties like an INSN_CODE that doesn't reflect the actual pattern.
Ans if reg_notes might be helpful and correct resp. up to date.

It should be in order to run the hook from within a sequence, hence
"insn chain" might not be what the user expects, i.e. prev_insn might
be NULL even though the current function already has some insns.


One comment about your patch:

+/* A magic value to be returned by TARGET_INSN_COSTS to indicate that
+   the costs are not known or too complicated to work out there.  */
+#define INSN_COSTS_UNKNOWN (-1234567)

Why not just -1?  And is 0 really so terrible; in the extremely rare
case we really want cost 0, it won't hurt much saying "unknown", as we
do currently.  For "hook unimplemented", just set the hook to NULL.

Segher

Using "0" is really bad design.  It's not clear when you are reading the
sources that it is some magic value.  Some identifier to refer to such
a value is much better.  And 0 is also bad because there might be
situations where 0 is a legal cost.  There might even be situations
where you want negative cost in order to prefer a specific pattern
re. some alternative (cost functions are not always 100% correct and
are expected costs as they run before reg alloc etc.)

My second proposal avoided magic value altogether and uses a bool*
instead.  My proposal below is similar, but is passes the cost
in a int*.  As the call site will have to check whether the hook
computed the costs, it might be more convenient to return whether
or not the hook came up with the costs or not.

Johann

Index: target.def
===================================================================
--- target.def	(revision 250090)
+++ target.def	(working copy)
@@ -3558,6 +3558,39 @@ DEFHOOKPOD
  appropriately.",
  unsigned int, INVALID_REGNUM)
 
+/* Compute the costs of an insn pattern. */
+DEFHOOK
+(pattern_costs,
+"This target hook describes the relative costs of an insn pattern.\n\
+\n\
+@var{pattern} is the pattern of an insn or an rtx that is supposed\n\
+to be used as the pattern of an insn the remainder of the compilation.\n\
+\n\
+If the hook computed the cost, set @var{*costs} to the estimated costs\n\
+and return true.  If the hook implementation choses not to compute the\n\
+costs for some reason, return false.  This directs the middle-end to use\n\
+other approaches to get the respective costs like calling\n\
+@code{TARGET_RTX_COSTS} for sub-rtxes of @var{pattern}.\n\
+\n\
+When optimizing for execution speed, i.e.@: when @var{speed} is\n\
+true, this hook should be used to estimate the relative execution cost\n\
+of the pattern, and the size cost of the pattern if @var{speed} is false.\n\
+\n\
+Use this pattern if a @code{SET_DEST} is needed to calculate the correct\n\
+costs or when you want to see the whole of a @code{PARALLEL} and not only\n\
+parts of it. Notice that for a @code{single_set} you might see a\n\
+@code{PARALLEL} @var{pattern} that contains a @code{SET} together with\n\
+@code{COBBER}s.\n\
+\n\
+In implementing this hook, you can use @code{COSTS_N_INSNS (@var{n})}\n\
+to specify a cost equal to @var{n} instructions.\n\
+\n\
+Don't implement this hook if it would always return false.\n\
+Even if this hook handles all cases, you still need to implement\n\
+@code{TARGET_RTX_COSTS}.",
+ bool, (rtx pattern, bool speed, int *costs),
+ default_pattern_costs)
+
 /* Compute a (partial) cost for rtx X.  Return true if the complete
    cost has been computed, and false if subexpressions should be
    scanned.  In either case, *TOTAL contains the cost result.  */
Index: targhooks.c
===================================================================
--- targhooks.c	(revision 250090)
+++ targhooks.c	(working copy)
@@ -2108,4 +2108,10 @@ default_excess_precision (enum excess_pr
   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT;
 }
 
+bool
+default_pattern_costs (rtx, bool, int*)
+{
+  return false;
+}
+
 #include "gt-targhooks.h"
Index: targhooks.h
===================================================================
--- targhooks.h	(revision 250090)
+++ targhooks.h	(working copy)
@@ -264,4 +264,6 @@ extern unsigned int default_min_arithmet
 extern enum flt_eval_method
 default_excess_precision (enum excess_precision_type ATTRIBUTE_UNUSED);
 
+extern bool default_pattern_costs (rtx, bool, int*);
+
 #endif /* GCC_TARGHOOKS_H */
Index: rtlanal.c
===================================================================
--- rtlanal.c	(revision 250090)
+++ rtlanal.c	(working copy)
@@ -5263,6 +5263,9 @@ insn_rtx_cost (rtx pat, bool speed)
   int i, cost;
   rtx set;
 
+  if (targetm.pattern_costs (pat, speed, &cost))
+    return cost;
+
   /* Extract the single set rtx from the instruction pattern.  We
      can't use single_set since we only have the pattern.  We also
      consider PARALLELs of a normal set and a single comparison.  In
@@ -5318,6 +5321,13 @@ seq_cost (const rtx_insn *seq, bool spee
 
   for (; seq; seq = NEXT_INSN (seq))
     {
+      int pattern_costs;
+      if (INSN_P (seq)
+	  && targetm.pattern_costs (PATTERN (seq), speed, &pattern_costs))
+	{
+	  cost += pattern_costs;
+	  continue;
+	}
       set = single_set (seq);
       if (set)
         cost += set_rtx_cost (set, speed);
Index: doc/tm.texi.in
===================================================================
--- doc/tm.texi.in	(revision 250123)
+++ doc/tm.texi.in	(working copy)
@@ -4790,6 +4790,8 @@ @samp{fold_range_test ()} is optimal.  T
 
 @hook TARGET_OPTAB_SUPPORTED_P
 
+@hook TARGET_PATTERN_COSTS
+
 @hook TARGET_RTX_COSTS
 
 @hook TARGET_ADDRESS_COST

Reply via email to