Re: [PATCH][combine] PR rtl-optimization/68651 Try changing rtx from (r + r) to (r << 1) to aid recognition

Kyrill Tkachov Wed, 16 Dec 2015 09:01:37 -0800


On 16/12/15 12:18, Bernd Schmidt wrote:

On 12/15/2015 05:21 PM, Kyrill Tkachov wrote:

Then for the shift pattern in the MD file we'd have to dynamically
select the scheduling type depending on whether or not the shift
amount is 1 and the costs line up?


Yes. This isn't unusual; take a look at i386.md, where you have a lot of
switches on attr type to decide which string to print.


I'm just worried that if we take this idea to its logical conclusion, we
have to add a new canonicalisation rule:
"all (plus x x) expressions shall be expressed as (ashift x 1)".
Such a rule seems too specific to me, and all targets would have to
special-case it in their MD patterns and costs if they ever wanted to
treat an add and a shift differently.
In this particular case we'd have to conditionalise the scheduling string
selection on a particular CPU tuning and the shift amount, which will make
the pattern much harder to read.
To implement this properly we'd also have to

The price we pay when trying these substitutions is an iteration over
the rtx with FOR_EACH_SUBRTX_PTR. recog gets called only if that
iteration actually performed a substitution of x + x into x << 1. Is
that too high a price to pay? (I'm not familiar with the performance
characteristics of the FOR_EACH_SUBRTX machinery.)
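
To make the shape of that walk concrete, here is a minimal sketch in
Python rather than GCC's own C++. It is a toy model, not the patch: an rtx
is assumed to be a nested tuple, and the names plus_to_shift and
recog_after_rewrite are hypothetical. It only illustrates the idea that
the second recognition attempt happens when, and only when, the walk
actually changed something.

```python
# Toy model, NOT GCC code: an rtx is a nested tuple such as
# ("plus", ("reg", 1), ("reg", 1)). All names here are hypothetical.

def plus_to_shift(rtx):
    """Return (new_rtx, changed), rewriting every (plus x x) as (ashift x 1)."""
    if not isinstance(rtx, tuple):
        return rtx, False  # leaf operand such as a register number
    code, *ops = rtx
    changed = False
    new_ops = []
    for op in ops:
        new_op, op_changed = plus_to_shift(op)
        new_ops.append(new_op)
        changed |= op_changed
    if code == "plus" and len(new_ops) == 2 and new_ops[0] == new_ops[1]:
        return ("ashift", new_ops[0], ("const_int", 1)), True
    return (code, *new_ops), changed


def recog_after_rewrite(rtx, recog):
    """Call recog on the rewritten form only if a substitution happened."""
    candidate, changed = plus_to_shift(rtx)
    if changed and recog(candidate):
        return candidate
    return None  # nothing was rewritten, or the rewritten form did not match
```

In this model the cost of a failed attempt is one traversal of the
expression; the recogniser is not invoked at all when no (plus x x) is
found.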


It depends on how many of these transforms we are going to try; it also feels 
very hackish, trying to work around the core design of the combiner. IMO it 
would be better for machine descriptions to work with the pass rather than 
against it.


Perhaps I'm lacking the historical context, but what is the core design
of the combiner?
Why should the backend have to jump through these hoops if it already
communicates to the midend (through correct rtx costs) that a shift is
more expensive than a plus?
I'd be more inclined to agree that this is perhaps a limitation in recog
rather than combine, but still not a backend problem.

If you can somehow arrange for the (plus x x) to be turned into a shift while 
substituting that might be yet another approach to try.


I did investigate where else we could make this transformation.
For the zero_extend+shift case (the ubfiz instruction from the testcase
in my original submission) we could fix this by modifying make_extraction
to convert its argument from (plus x x) to a shift, since in that context
shifts are undoubtedly more likely to simplify with the various
extraction operations that it's trying to perform.

That leaves the other case (orr + shift), where converting to a shift
isn't a simplification in any way, but the backend happens to have an
instruction that matches the combined orr+shift form. There we want to
perform the transformation purely to aid recognition, not out of any
simplification considerations. That's what I'm trying to figure out how
to do now.

However, we want to avoid doing it unconditionally, because if we have
just a simple set of y = x + x we want to leave it as a plus rather than
a shift because it's cheaper on that target.
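
As a rough illustration of the behaviour I'm after, here is a toy sketch
in Python (not GCC code). The cost table, the pretend recogniser and the
name pick_form are all made up for the example; the only thing taken from
the discussion is that a plus is assumed cheaper than a shift on the
target. Among the forms the target recognises, the cheapest one wins, so
a bare x + x stays a plus while the orr case ends up using the shift.

```python
# Toy model, NOT GCC code: rtxes are nested tuples, and the costs and
# accepted patterns below are invented purely for illustration.

TOY_COST = {"plus": 1, "ashift": 2, "ior": 1}  # assumed: plus cheaper than shift


def toy_cost(rtx):
    """Sum the toy cost of every operator in the expression."""
    if not isinstance(rtx, tuple):
        return 0
    code, *ops = rtx
    return TOY_COST.get(code, 0) + sum(toy_cost(op) for op in ops)


def toy_recog(rtx):
    """Pretend target: accepts a bare add, a bare shift, and orr-with-shift."""
    code = rtx[0]
    if code in ("plus", "ashift"):
        return all(op[0] in ("reg", "const_int") for op in rtx[1:])
    if code == "ior":
        return rtx[1][0] == "ashift" and rtx[2][0] == "reg"
    return False


def pick_form(candidates, recog=toy_recog, cost=toy_cost):
    """Return the cheapest candidate the target recognises, or None."""
    recognised = [c for c in candidates if recog(c)]
    return min(recognised, key=cost) if recognised else None
```

Passing both the plus form and the shift form of y = x + x keeps the plus,
because both match and the plus is cheaper; passing both forms of the orr
expression selects the shift form, because it is the only one that
matches.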

Thanks,
Kyrill


Bernd
