https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66565
Bug ID: 66565
Summary: Problems and limitations of GCC cost metrics and ways to improve
         the situation
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
CC: dje at gcc dot gnu.org
Target Milestone: ---

In preparation for the GNU Cauldron 2015 costs BoF, here are some talking
points gathered from various GCC developers. Feel free to add your own
gripes, proposals and concerns.

* We have many different types of costs: rtx costs, branch costs, address
costs, vector costs etc. that measure different things in different units.
This causes problems when they are compared against each other.
Example: IVOPTs calculates complex profitability metrics and uses both rtx
and address costs, mixing and adding them together. A change to either can
cause unexpected variations in the generated code.

* Costs are frequently compared against magic constants that were picked
for a particular testcase on a particular target.
Example: create_new_invariant in loop-invariant.c compares an address cost
to 3 because it helped bwaves on x86.

* Implementing rtx costs in a backend usually involves writing a
dumbed-down recog that pattern matches on the rtxes. This duplicates logic.

* The midend frequently builds bogus rtxes that will never match anything
and then asks the backend to assign them a cost. This unnecessarily
complicates backend rtx cost implementations and is hard to reason about
when all the midend really wants is a measure of 'complexity'.

* rtx costs need to be able to express whether they are being used to
calculate the result delay or to understand how long it is before another
(related, for some definition of related) instruction can be issued.
Furthermore, the result delay may depend on the context in which the
result is used (e.g. due to pipeline forwarding). I'm not sure whether it
is necessary to distinguish between early availability of a result and
late use of a result.

* Branch costs really need to capture how predictable a branch is, not how
strongly taken or not taken it is. With modern branch predictors a branch
can be neither strongly taken nor strongly not taken yet be entirely
predictable (e.g. if it alternates precisely between the two states). It's
true that traditional static profiling tools do not do a good job of
identifying the pattern of taken/not-taken that a single branch follows,
let alone whether the branch is taken often enough, with enough temporal
locality, to be likely predictable at all.

* "Cost" is too vacuous a term, with too overloaded a meaning and too low
a resolution for the queries we actually want to make. The midend always
wants a cost so that it can compare two strategies and pick the better
one. The proliferation and misunderstanding of cost units is a side effect
of this. So we really have to move to something like:

  target_should_inline_call (<interesting call information from the pass>)
  target_should_ifconvert (<gimple sequences for before/after transformation>)
  target_combine_is_profitable_p (<before, after rtx>)
  target_should_reconstruct_value_p (<candidate rtx to either build and
    keep in registers or reconstruct in the loop>)
  target_choose_preferred_addressing_mode (<list of candidate addressing
    modes for a MEM>)

where we ask specific and meaningful questions of the target, rather than
asking for a guess at costs and using that. A rough sketch of such an
interface follows.
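A very rough sketch, purely hypothetical: the hook names come from the
list above, but the signatures and payload types are invented here for
illustration, and nothing like this exists in GCC today:

    #include <vector>

    // Placeholder payload types; in reality these would be the passes'
    // own data structures (call-site info, GIMPLE sequences, rtxes).
    struct inline_call_info {};
    struct gimple_seq_view {};
    struct rtx_view {};
    struct addr_mode_candidate {};

    // Each hook answers one concrete question.  No unitless "cost"
    // crosses the boundary between the pass and the backend.
    struct target_decisions
    {
      virtual bool should_inline_call (const inline_call_info &call) = 0;
      virtual bool should_ifconvert (const gimple_seq_view &before,
                                     const gimple_seq_view &after) = 0;
      virtual bool combine_is_profitable_p (const rtx_view &before,
                                            const rtx_view &after) = 0;
      virtual bool should_reconstruct_value_p (const rtx_view &candidate) = 0;
      // Returns an index into the candidate list.
      virtual unsigned choose_preferred_addressing_mode
        (const std::vector<addr_mode_candidate> &candidates) = 0;
      virtual ~target_decisions () {}
    };

The point of the shape, whatever the exact signatures turn out to be, is
that the answer stays in the backend: the pass never sees a number it
might be tempted to add to something else.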
The difficult argument to make will be that we have to be radical about
this and be prepared to regress many targets in the short term, for the
sake of ending up with a maintainable and understandable design for making
these decisions. Otherwise we'll sink years of engineering time into
trying to figure out the meaning of the number 3, and we'll design
something very poor that merely tries to emulate the old behaviour.

* Can we use the (C++) type system to:
  (a) prevent costs from being compared to things that aren't costs, like
      magic numbers;
  (b) prevent comparisons between costs of different units;
  (c) enable multiple axes, i.e. throughput, latency, code size?
A sketch of what this might look like follows at the end of this report.

* The documentation and guidance on what costs mean is poor, and it
probably cannot be improved while costs have these ambiguous, inconsistent
uses.
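As an illustrative, self-contained sketch (not existing GCC code) of how
(a), (b) and (c) might be enforced: each cost unit gets its own type,
construction from a raw number must be explicit, and comparisons only
exist within a single unit.

    template <typename Unit>
    class cost
    {
      int value;
    public:
      // Explicit constructor: a bare magic number cannot silently
      // become a cost (point (a)).
      constexpr explicit cost (int v) : value (v) {}

      // Comparison and addition only exist between costs of the same
      // unit (point (b)).
      friend constexpr bool operator< (cost a, cost b)
      { return a.value < b.value; }
      friend constexpr cost operator+ (cost a, cost b)
      { return cost (a.value + b.value); }
    };

    // Separate axes instead of one overloaded number (point (c)).
    struct latency_unit {};
    struct throughput_unit {};
    struct size_unit {};

    typedef cost<latency_unit>    latency_cost;
    typedef cost<throughput_unit> throughput_cost;
    typedef cost<size_unit>       size_cost;

    int main ()
    {
      latency_cost a (2), b (5);
      size_cost s (4);

      bool ok = a < b;       // OK: same unit
      // bool bad1 = a < s;  // error: different units do not compare
      // bool bad2 = a < 3;  // error: no comparison with magic numbers
      return ok ? 0 : 1;
    }

With something like this, the IVOPTs-style mixing of rtx and address
costs described above would at least fail to compile until someone wrote
a deliberate, reviewable conversion between the two units.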