On Mon, Sep 23, 2019 at 12:56 PM Andrew Stubbs <andrew_stu...@mentor.com> wrote:
>
> Hi All,
>
> I'm trying to figure out how to prevent LRA selecting alternatives that
> result in values being copied from A to B for one instruction, and then
> immediately back from B to A again, when there are apparently more
> sensible alternatives available.
>
> I have an insn with the following pattern (simplified here):
>
>    [(set (match_operand:DI 0 "register_operand" "=Sg,v")
>          (ashift:DI
>            (match_operand:DI 1 "gcn_alu_operand" " Sg,v")
>            (match_operand:SI 2 "gcn_alu_operand" " Sg,v")))
>     (clobber (match_scratch:BI 3 "=cs,X"))]
>
> There are two lshl instructions; one for scalar registers and one for
> vector registers. The vector here has only a single element, so the two
> are equivalent, but we need to pick one.
>
> This operation works for both register files, but there are other
> operations that exist only on one side or the other, so we want those to
> determine in which register file the values are allocated.
>
> Unfortunately, the compiler (almost?) exclusively selects the second
> alternative, even when this means moving the values from one register
> file to the other, and then back again.
>
> The problem is that the scalar instruction clobbers the CC register,
> which results in a "reject++" for that alternative in the LRA dump.
>
> I can fix this by disparaging the second alternative in the pattern:
>
>     (clobber (match_scratch:BI 3 "=cs,?X"))
>
> This appears to do the right thing. I can now see both kinds of shift
> appearing in the assembly dumps.
>
> But that does "reject+=6", which makes me worry that the balance has now
> shifted too far the other way.
>
> Does this make sense?
>
>     (clobber (match_scratch:BI 3 "=^cs,?X"))
>
> Is there a better way to discourage the copies? Perhaps without editing
> all the patterns?
>
> What I want is for the two alternatives to appear equal when the CC
> register is not live, and when CC is live for LRA to be able to choose
> between reloading CC or switching to the other alternative according to
> the situation, not on the pattern alone.
>
> Thanks in advance.

I faced a similar situation for PR91154, where on x86 there's a vector
min/max but no scalar one, and the vector min/max is faster than the
conditional move sequence on the integer side. In the end the extra
freedom for the RA is bad, since it doesn't make a global decision on the
initial regset preference. So - don't do it.

In a simpler case such as yours it would indeed be nice if it
just worked...

Richard.

>
> Andrew
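
The reject arithmetic described in the quoted message can be illustrated with a toy model. This is only a sketch and not GCC code: it assumes that, when neither alternative needs any reloads, the alternative with the lowest accumulated reject value wins, and it ignores every other cost LRA considers. The two constants are the figures quoted from the LRA dump in the thread; the function and variable names are hypothetical.

```python
# Toy model of the reject bookkeeping discussed in the thread.
# NOT GCC code: names are hypothetical, and the "lowest reject wins"
# rule is a simplifying assumption that ignores reloads and other costs.

SCRATCH_CLOBBER_REJECT = 1  # the "reject++" reported for the CC clobber
QUESTION_MARK_REJECT = 6    # the "reject+=6" reported for the '?' modifier


def pick_alternative(rejects):
    """Return the name of the alternative with the smallest reject value.

    `rejects` maps an alternative name to its accumulated reject value.
    On a tie, the alternative listed first is returned.
    """
    return min(rejects, key=rejects.get)


# Original pattern "=cs,X": only the scalar alternative is penalised.
original = {
    "scalar": SCRATCH_CLOBBER_REJECT,
    "vector": 0,
}

# Pattern with "=cs,?X": the vector alternative is now penalised by '?'.
with_question_mark = {
    "scalar": SCRATCH_CLOBBER_REJECT,
    "vector": QUESTION_MARK_REJECT,
}

choice_original = pick_alternative(original)
choice_question = pick_alternative(with_question_mark)
print("original pattern picks:", choice_original)
print("with '?' on X picks:   ", choice_question)
```

Under these assumptions the unmodified pattern always favours the vector alternative (0 against 1), while adding '?' flips the preference to the scalar alternative (1 against 6). Neither variant makes the two alternatives look equal, which is the imbalance the original question is concerned about.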