Alex Coplan <alex.cop...@arm.com> writes: >> At the risk of feature creep :-) a useful third pattern could be >> to combine a zero-extended operator result with an existing DImode value. >> In that case, the existing DImode value really can be "rZ" and should >> always be in the "else" arm of the if_then_else. E.g.: >> >> unsigned long long >> f(int a, unsigned long b, unsigned c) >> { >> return a ? b : ~c; >> } >> >> unsigned long long >> g(int a, unsigned c) >> { >> return a ? 0 : ~c; >> } >> >> But that's definitely something that can be left for later. > > Hm. What sequence would you like to see for the function f here with the > argument already in DImode? I don't think we can do it with just a CMP + CSINV > like in the other test cases because you end up wanting to access b as x1 and > c > as w2 which is not allowed.
Yeah, I was hoping for something like... > Indeed, clang generates a MVN + CSEL sequence where the CSEL operates on the > 64-bit registers: > > f: > mvn w8, w2 > cmp w0, #0 > csel x0, x8, x1, eq > ret ...this rather than the 4-insn (+ret) sequence that we currently generate. So it would have been a define_insn_and_split that handles the zero case directly but splits into the "optimal" two-instruction sequence for registers. But I guess the underlying problem is instead that we don't have a pattern for (zero_extend:DI (not:SI ...)). Adding that would definitely be a better fix. > However, I agree that the restricted case where the else arm is (const_int 0) > is > a worthwhile optimization (that wasn't caught by the previous patterns due to > const_ints never being zero_extended as you mentioned). I've added a pattern > and > tests for this case in the updated patch. Looks good, just a couple of minor things: > 2020-04-30 Alex Coplan <alex.cop...@arm.com> > > * config/aarch64/aarch64.md (*csinv3_utxw_insn1): New. > (*csinv3_uxtw_insn2): New. > (*csinv3_uxtw_insn3): New. ChangeLog trivia, but these last two lines should only be indented by a tab. Hopefully one day we'll finally ditch this format and stop having to quibble over such minor formatting stuff... > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index c7c4d1dd519..37d651724ad 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -4390,6 +4390,45 @@ > [(set_attr "type" "csel")] > ) > > +(define_insn "*csinv3_uxtw_insn1" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (if_then_else:DI > + (match_operand 1 "aarch64_comparison_operation" "") > + (zero_extend:DI > + (match_operand:SI 2 "register_operand" "r")) > + (zero_extend:DI > + (NEG_NOT:SI (match_operand:SI 3 "register_operand" "r")))))] > + "" > + "cs<neg_not_cs>\\t%w0, %w2, %w3, %m1" > + [(set_attr "type" "csel")] > +) > + > +(define_insn "*csinv3_uxtw_insn2" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (if_then_else:DI > + (match_operand 1 "aarch64_comparison_operation" "") > + (zero_extend:DI > + (NEG_NOT:SI (match_operand:SI 2 "register_operand" "r"))) > + (zero_extend:DI > + (match_operand:SI 3 "register_operand" "r"))))] > + "" > + "cs<neg_not_cs>\\t%w0, %w3, %w2, %M1" > + [(set_attr "type" "csel")] > +) > + > +(define_insn "*csinv3_uxtw_insn3" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (if_then_else:DI > + (match_operand 1 "aarch64_comparison_operation" "") > + (zero_extend:DI > + (NEG_NOT:SI (match_operand:SI 2 "register_operand" "r"))) > + (const_int 0)))] > + "" > + "cs<neg_not_cs>\\t%w0, wzr, %w2, %M1" > + [(set_attr "type" "csel")] > +) > + > + Usually there's just one blank line between patterns, even if the patterns aren't naturally grouped. No need to repost just for that. I'll push with those changes once stage 1 opens. Thanks, Richard