Re: [PATCH] RISC-V: Add pattern for vector-scalar multiply-add/sub [PR119100]

Robin Dapp Thu, 27 Mar 2025 15:29:22 -0700

Hi Paul-Antoine,

This pattern enables the combine pass to merge a vec_duplicate into a plus-mult
or minus-mult RTL instruction.


Before this patch, we have two instructions, e.g.:
  vfmv.v.f        v6,fa0
  vfmadd.vv       v9,v6,v7

After, we get only one:
  vfmadd.vf       v9,fa0,v7
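
For illustration only, a loop of roughly the following shape is the kind of
source where a loop-invariant scalar feeds a vector multiply-add. The function
name and signature are hypothetical and not taken from the patch or from
503.bwaves_r, and whether a given compiler emits vfmadd.vf for it depends on
the flags and target.

```c
#include <stddef.h>

/* Hypothetical example, not part of the patch: a multiply-add in which
   the scalar `a` is loop-invariant.  When vectorized, `a` must either be
   broadcast into a vector register (vfmv.v.f) or be consumed directly
   by a vector-scalar instruction (vfmadd.vf).  */
void
scaled_madd (double *restrict y, const double *restrict x, double a,
             size_t n)
{
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```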

On SPEC2017's 503.bwaves_r, depending on the workload, the reduction in dynamic
instruction count varies from -4.66% to -4.75%.

The general issue with this kind of optimization (we have discussed it a few
times already) is that, depending on the uarch, we want the local combine
optimization that you show but not the fwprop/late-combine one where we
propagate a vector broadcast into a loop.

So IMHO in order to continue with this and similar patterns we need at least
accompanying rtx_cost handling that would allow us to tune per uarch.
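
To make the trade-off concrete, here is a toy model of the decision. It is a
sketch under assumptions, not GCC code: the struct, its fields and the function
name are all hypothetical, and the numbers a real uarch would plug in are not
given in this thread. It assumes a hoisted broadcast is paid once while the
multiply-add runs on every iteration.

```c
#include <stdbool.h>

/* Hypothetical per-uarch costs; none of these names exist in GCC.  */
struct toy_tune
{
  int vv_cost;        /* vector-vector multiply-add */
  int vf_cost;        /* vector-scalar multiply-add */
  int broadcast_cost; /* scalar-to-vector broadcast */
};

/* Return true if folding the broadcast into the multiply-add is no more
   expensive than keeping the two instructions separate.  `hoisted` is
   true when the broadcast sits outside the loop and is paid once, while
   the multiply-add executes `iters` times.  */
bool
toy_fold_profitable (const struct toy_tune *t, bool hoisted, long iters)
{
  long broadcasts = hoisted ? 1 : iters;
  long separate = broadcasts * t->broadcast_cost + iters * t->vv_cost;
  long fused = iters * t->vf_cost;
  return fused <= separate;
}
```

With a .vf form that is slightly more expensive than .vv, this model accepts
the fold when the broadcast is adjacent to its use and rejects it when the
broadcast was already hoisted out of the loop.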

Pan Li sent a similar patch for vadd.vv/vadd.vx, I think in November, and I
believe he intended to continue when stage 1 opens.

An outstanding question is how to distinguish the combine case from the
late-combine case. I haven't yet thought about that in detail.


--
Regards
Robin
