On 6/7/26 17:05, [email protected] wrote:
+;; BMI2 MULX highpart-only pattern. Uses MULX to get only the high part,
+;; discarding the low part into a scratch register. This avoids the
+;; mov from rdx after mulq when only the high part is needed.
+(define_insn "*bmi2_umul<mode>3_highpart"
+  [(set (match_operand:DWIH 0 "register_operand" "=r")
+        (umul_highpart:DWIH
+          (match_operand:DWIH 1 "register_operand" "d")
+          (match_operand:DWIH 2 "nonimmediate_operand" "rm")))
+   (clobber (match_scratch:DWIH 3 "=r"))]
+  "TARGET_BMI2"
+  "mulx\t{%2, %3, %0|%0, %3, %2}"
+  [(set_attr "type" "imulx")
+   (set_attr "prefix" "vex")
+   (set_attr "mode" "<MODE>")])
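
The Intel SDM entry for MULX says of the two destination operands:

"If the first and second operand are identical, it will contain the high half of the
multiplication result."

Which suggests you don't need the scratch at all: name the same register for both
destinations, i.e. "%2, %0, %0" in AT&T syntax.
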
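
For concreteness, an untested sketch of the pattern with the scratch dropped. It
assumes everything else stays as posted: only the clobber goes away and the output
template names %0 for both destinations, with the Intel-syntax alternative adjusted
to match.

```
;; Sketch only, not tested.  Same pattern as posted, minus the scratch.
(define_insn "*bmi2_umul<mode>3_highpart"
  [(set (match_operand:DWIH 0 "register_operand" "=r")
        (umul_highpart:DWIH
          (match_operand:DWIH 1 "register_operand" "d")
          (match_operand:DWIH 2 "nonimmediate_operand" "rm")))]
  "TARGET_BMI2"
  ;; Both destinations are %0, so only the high half is kept.
  "mulx\t{%2, %0, %0|%0, %0, %2}"
  [(set_attr "type" "imulx")
   (set_attr "prefix" "vex")
   (set_attr "mode" "<MODE>")])
```
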
r~