Re: [PATCH][combine] PR rtl-optimization/68651 Try changing rtx from (r + r) to (r << 1) to aid recognition

Bernd Schmidt Tue, 15 Dec 2015 05:22:48 -0800

On 12/14/2015 01:25 PM, Kyrill Tkachov wrote:

For this PR I want to teach combine to deal with unrecognisable patterns
that contain a sub-expression like
(x + x) by transforming it into (x << 1) and trying to match the result.
This is because some instruction
sets like arm and aarch64 can combine shifts with other arithmetic
operations or have shifts in their RTL representation
of more complex operations (like the aarch64 UBFIZ instruction which can
be expressed as a zero_extend+ashift pattern).


Due to a change in rtx costs for -mcpu=cortex-a53 in GCC 5 we no longer
expand an expression like x * 2 as x << 1
but rather as x + x, which hurts combination opportunities due to this
deficiency.

This patch addresses the issue in the recog_for_combine function in
combine.c in a similar way to the change_zero_ext
trick. That is, if recog_for_combine fails to match a pattern it
replaces all instances of x + x in the
rtx with x << 1 and tries again.
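
To make the retry idea concrete, here is a minimal, self-contained sketch on a toy expression tree. It is not the patch and does not use GCC's rtx API: every name in it (toy_expr, toy_recog, toy_plus_to_shift, toy_recog_with_retry) is hypothetical, and the toy recognizer accepts only one add-with-shifted-operand shape. Unlike what a real implementation would presumably do, the sketch leaves the rewrite in place even when the second attempt fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; hypothetical stand-ins for RTL codes.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_PLUS, TOY_ASHIFT };

struct toy_expr {
  enum toy_code code;
  int value;                /* register number or constant value */
  struct toy_expr *op[2];   /* operands of TOY_PLUS / TOY_ASHIFT */
};

/* Shared constant 1, used as the shift amount.  */
static struct toy_expr toy_one = { TOY_CONST, 1, { NULL, NULL } };

/* Structural equality; plays the role rtx_equal_p plays for real RTL.  */
static bool toy_equal_p (const struct toy_expr *a, const struct toy_expr *b)
{
  if (a->code != b->code)
    return false;
  if (a->code == TOY_REG || a->code == TOY_CONST)
    return a->value == b->value;
  return toy_equal_p (a->op[0], b->op[0]) && toy_equal_p (a->op[1], b->op[1]);
}

/* Rewrite every (plus x x) inside E into (ashift x 1), in place.
   Returns the number of rewrites performed.  */
static int toy_plus_to_shift (struct toy_expr *e)
{
  assert (e != NULL);
  if (e->code == TOY_REG || e->code == TOY_CONST)
    return 0;
  int n = toy_plus_to_shift (e->op[0]) + toy_plus_to_shift (e->op[1]);
  if (e->code == TOY_PLUS && toy_equal_p (e->op[0], e->op[1]))
    {
      e->code = TOY_ASHIFT;
      e->op[1] = &toy_one;
      n++;
    }
  return n;
}

/* Toy "target": only (plus reg (ashift reg const)) is recognised.  */
static bool toy_recog (const struct toy_expr *e)
{
  return e->code == TOY_PLUS
         && e->op[0]->code == TOY_REG
         && e->op[1]->code == TOY_ASHIFT
         && e->op[1]->op[0]->code == TOY_REG
         && e->op[1]->op[1]->code == TOY_CONST;
}

/* Try to match as-is; on failure rewrite x + x to x << 1 and retry once.  */
static bool toy_recog_with_retry (struct toy_expr *e)
{
  if (toy_recog (e))
    return true;
  if (toy_plus_to_shift (e) == 0)
    return false;           /* nothing changed, so a retry cannot help */
  return toy_recog (e);
}
```

With this toy target, (plus r1 (plus r2 r2)) fails the first match, is rewritten to (plus r1 (ashift r2 1)), and matches on the second attempt.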

This way I've been able to get combine to more aggressively generate the
arithmetic+shift forms of instructions for
-mcpu=cortex-a53 on aarch64 as well as instructions like ubfiz and sbfiz
that contain shift-by-immediate sub-expressions.

This patch shouldn't affect rtxes that already match, so it should have
no fallout on other cases.

I'm somewhat undecided on this. If we keep adding cases to this
mechanism, the run time costs will eventually add up (we'll iterate over
the pattern over and over again if it doesn't match, which is the normal
case in combine), and we're still not testing combinations of these
replacements.

I wonder if it would be possible to have genrecog write a special
recognizer that can identify cases where a pattern would match if it was
changed. Something along the lines of


recog_for_combine (..., vec<..> *replacements)
{
....
  /* Trying to recognize a shift.  */
  if (GET_CODE (x) == PLUS && rtx_equal_p (XEXP (x, 0), XEXP (x, 1)))
    replacements->safe_push (...);
}

Seems like it would be more efficient and more flexible.
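
As a rough illustration of the "record replacements instead of rewriting and retrying" idea, here is a small self-contained sketch on a toy tree. It does not model genrecog or GCC's vec type, and it does not decide whether a replacement would actually make the pattern match: node, replacement_list and collect_shift_candidates are all hypothetical names. It only shows a single walk that notes each (plus x x) site and leaves the pattern untouched, so the caller can decide which replacements, or combinations of them, to try.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; hypothetical stand-ins for RTL codes.  */
enum node_code { N_REG, N_CONST, N_PLUS };

struct node {
  enum node_code code;
  int value;            /* register number or constant value */
  struct node *op[2];   /* operands of N_PLUS */
};

#define MAX_REPLACEMENTS 8

/* Fixed-size stand-in for the vec<..> of replacements.  */
struct replacement_list {
  struct node *site[MAX_REPLACEMENTS];
  size_t count;
};

/* Structural equality; plays the role rtx_equal_p plays for real RTL.  */
static bool node_equal_p (const struct node *a, const struct node *b)
{
  if (a->code != b->code)
    return false;
  if (a->code != N_PLUS)
    return a->value == b->value;
  return node_equal_p (a->op[0], b->op[0]) && node_equal_p (a->op[1], b->op[1]);
}

/* Walk X once and record every (plus x x) node as a candidate for the
   x << 1 replacement.  The tree itself is not modified.  Candidates
   beyond MAX_REPLACEMENTS are silently dropped in this sketch.  */
static void collect_shift_candidates (struct node *x,
                                      struct replacement_list *out)
{
  assert (x != NULL && out != NULL);
  if (x->code != N_PLUS)
    return;
  if (node_equal_p (x->op[0], x->op[1]) && out->count < MAX_REPLACEMENTS)
    out->site[out->count++] = x;
  collect_shift_candidates (x->op[0], out);
  collect_shift_candidates (x->op[1], out);
}
```

For (plus r1 (plus r2 r2)) the walk records exactly one site, the inner plus, and the pattern is visited only once.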


Bernd
