> +;; A conjucate is a rotation of 180* around the argand plane, or * I.
Hmm, but a complex conjugate is a reflection around the real axis rather than a rotation. Also, 180 degrees around the Argand plane is * -1 rather than * I. So… > +(define_int_attr rot_op [(UNSPEC_FCMLS "") > + (UNSPEC_FCMLS180 "_conj") > + (UNSPEC_FCMLA "") > + (UNSPEC_FCMLA180 "_conj") > + (UNSPEC_FCMUL "") > + (UNSPEC_FCMUL180 "_conj") > + (UNSPEC_CMLS "") > + (UNSPEC_CMLA "") > + (UNSPEC_CMLA180 "_conj") > + (UNSPEC_CMUL "") > + (UNSPEC_CMUL180 "_conj")]) > + > +(define_int_attr rotsplit1 [(UNSPEC_FCMLA "0") > + (UNSPEC_FCMLA180 "0") > + (UNSPEC_FCMUL "0") > + (UNSPEC_FCMUL180 "0") > + (UNSPEC_FCMLS "270") > + (UNSPEC_FCMLS180 "90")]) > + > +(define_int_attr rotsplit2 [(UNSPEC_FCMLA "90") > + (UNSPEC_FCMLA180 "270") > + (UNSPEC_FCMUL "90") > + (UNSPEC_FCMUL180 "270") > + (UNSPEC_FCMLS "180") > + (UNSPEC_FCMLS180 "180")]) I think it would be worth clarifying what the patterns do. AIUI, the effect of the MUL180 and MLA180 “rot_op” cases as written in the patch is to multiply the conjugate of the first operand by the second operand, whereas the documentation in the earlier patches implied that they should multiply the conjugate of the second operand by the first. But maybe I've got this wrong. operands to the pattern: a + bi, c + di conjugate of first operand * second operand [A]: (a - bi) * (c + di) = (ac + bd) + (ad - bc)i first operand * conjugate of second operand [B]: (a + bi) * (c - di) = (ac + bd) + (bc - ad)i the FCMUL180 sequence chosen by the patch: 0: a * (c + di) = ac + adi 270: b * (d - ci) = bd - bci which gives A rather than B. > +;; The complex mla/mls operations always need to expand to two instructions. > +;; The first operation does half the computation and the second does the > +;; remainder. Because of this, expand early. > +(define_expand "cml<fcmac1><rot_op><mode>4" > + [(set (match_operand:VHSDF 0 "register_operand") > + (plus:VHSDF (match_operand:VHSDF 1 "register_operand") > + (unspec:VHSDF [(match_operand:VHSDF 2 "register_operand") > + (match_operand:VHSDF 3 "register_operand")] > + FCMLA_OP)))] > + "TARGET_COMPLEX" > +{ > + emit_insn (gen_aarch64_fcmla<rotsplit1><mode> (operands[0], operands[1], > + operands[2], operands[3])); > + emit_insn (gen_aarch64_fcmla<rotsplit2><mode> (operands[0], operands[0], > + operands[2], operands[3])); > + DONE; > +}) The interface doesn't guarantee that operands[0] is distinct from the inputs, so I think the result of the first instruction should be a temporary register. Thanks, Richard