On 12/08/14 18:03, Kyrill Tkachov wrote: > > On 12/08/14 16:16, Kyrill Tkachov wrote: >> On 12/08/14 16:11, Kyrill Tkachov wrote: >>> On 12/08/14 15:22, Jeff Law wrote: >>>> On 08/12/14 04:31, Kyrill Tkachov wrote: >>>>> On 12/08/14 10:39, Richard Biener wrote: >>>>>> On Mon, Aug 11, 2014 at 9:56 PM, Jeff Law <l...@redhat.com> wrote: >>>>>>> On 08/11/14 07:41, Kyrill Tkachov wrote: >>>>>>>> I haven't been able to get combine to match the comparison+xor+neg+plus >>>>>>>> RTL and it seems like it would be just a workaround to undo the >>>>>>>> tree-level transformation. >>>>>>> Yea, it'd just be a workaround, but it's probably the easiest way to >>>>>>> deal >>>>>>> with this problem. Can you describe in further detail why you >>>>>>> weren't able >>>>>>> to get this to work? >>>>>> Too many instructions to combine I guess. You might want to add >>>>>> intermediate "combine" insn-and-splits. If that's still a no-go then >>>>>> read on. >>>> My guess was too many insns as well.. But that's often solvable. >>>> >>>>> From the combine dump I can see that it tried to combine up to: >>>>> (set (reg/i:SI 0 x0) >>>>> (plus:SI (xor:SI (neg:SI (reg:SI 84 [ D.2565 ])) >>>>> (reg:SI 73 [ D.2564 ])) >>>>> (reg:SI 84 [ D.2565 ]))) >>>> And did it find a match for this? What happens if (just for testing >>>> purposes), you create a pattern for this? Does combine then try >>>> something even more complex, possibly getting your conditional negation? >>> I managed to get combine to recognise this pattern: >>> (set (match_operand:GPI 0 "register_operand" "=r") >>> (plus:GPI (xor:GPI (neg:GPI (match_operand:GPI 1 >>> "register_operand" "r")) >>> (match_operand:GPI 2 "register_operand" "r")) >>> (match_dup 1))) >>> >>> Now what I need is for operand 1 to instead be a cc_reg comparison, but >>> when I change operand 1 to this pattern: >>> >>> (set (match_operand:GPI 0 "register_operand" "=r") >>> (plus:GPI (xor:GPI (neg:GPI (match_operator:GPI 1 >>> "aarch64_comparison_operator" >>> [(match_operand:CC 2 "cc_register" "") >>> (const_int 0)])) >>> (match_operand:GPI 3 "register_operand" "r")) >>> (match_dup 1))) >>> >>> This doesn't match. Is there any way to express this in a combineable >>> pattern? >> argh, Thunderbird enforced the 80 character limit... >> >> The pattern that matched was: >> (set (match_operand:GPI 0 "register_operand" "=r") >> (plus:GPI >> (xor:GPI >> (neg:GPI (match_operand:GPI 1 "register_operand" "r")) >> (match_operand:GPI 2 "register_operand" "r")) >> (match_dup 1))) > > If I match that to a define_and_split and split it into two sets: > (set (reg1) (neg reg2)) > (set (plus (xor (reg1) (reg3)) > (reg2))) > > and then add a define_insn with the full pattern: > > (set (match_operand:GPI 0 "register_operand" "=r") > (plus:GPI > (xor:GPI > (neg:GPI > (match_operator:GPI 1 "aarch64_comparison_operator" > [(match_operand:CC 2 "cc_register" "") > (const_int 0)])) > (match_operand:GPI 3 "register_operand" "r")) > (match_dup 1))) > > Then this does manage to match the full thing that I want. But I had to > add another define_split to break up the plus+xor Frankenstein's monster > back into two separate insns for the cases where it picks up > non-conditional-negate patterns. > > Does that seem reasonable or too much of a hack? Any plus+xor rtxs that > get left after combine should be split up again relatively quickly in > split1 and shouldn't inhibit further optimisation too badly, no? > >
The problem with the frankenmonster patterns is that they tend to proliferate into the machine description, and before you know where you are the back-end is full of them. Furthermore, they are very sensitive to the greedy first-match nature of combine: a better, later, combination is missed because a less good, earlier, optimization matched. If the first insn in the sequence is merged into an earlier instruction then you can end up with a junk sequence that completely fails to simplify. That ends up with super-frankenmonster patterns to deal with all the subcases and the problems grow exponentially from there. I really do think that the best solution would be to try and catch this during expand if possible and generate the right pattern from the start; then you don't risk combine failing to come to the rescue after several intermediate transformations have taken place. > Kyrill >> >> And the one that failed to combine (with operand 1 substituted for a >> cc_reg comparison) was: >> >> (set (match_operand:GPI 0 "register_operand" "=r") >> (plus:GPI >> (xor:GPI >> (neg:GPI >> (match_operator:GPI 1 "aarch64_comparison_operator" >> [(match_operand:CC_ZERO 2 "cc_register" "") >> (const_int 0)])) >> (match_operand:GPI 3 "register_operand" "r")) >> (match_dup 1))) >> >>> Thanks, >>> Kyrill >>> >>> >>>>> On the other hand, I did manage to write a peephole2 that detected the >>>>> sequence of compare+neg+xor+plus and transformed it into the >>>>> if_then_else form that our current conditional negation pattern has, >>>>> although I'm not sure how flexible this is. >>>> Probably not very. We really should be looking at combine. In fact, I >>>> would argue that we should be looking at combine regardless of whether >>>> or not we twiddle expansion as humans or machine generated code could >>>> look like this... >>>> >>>> jeff >>>> >>>> >>> >> >> > > >