RE: [PATCH v1 1/3] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx

2024-11-27 Thread Li, Pan2
> I see, didn't aware of that. I am not sure if we need to consider vsetvl here? > As there are extra 2 insn here. I wouldn't consider it as it's outside of the loop. What matters is latency inside the loop. > I see, n

Re: [PATCH v1 1/3] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx

2024-11-27 Thread Jeff Law
On 11/27/24 5:48 AM, Robin Dapp wrote: This patch would like to combine the vec_duplicate + vadd.vv to the vadd.vx. From example as below: I think we concluded a while ago that we don't want this turned on universally. For the example/tests you provide it will be a de-optimization on any ua

Re: [PATCH v1 1/3] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx

2024-11-27 Thread Robin Dapp
> I see, didn't aware of that. I am not sure if we need to consider vsetvl here? > As there are extra 2 insn here. I wouldn't consider it as it's outside of the loop. What matters is latency inside the loop. > I see, need to consider the cost here. Any example I can reference? Sorry I > haven't

RE: [PATCH v1 1/3] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx

2024-11-27 Thread Li, Pan2
s. Pan -----Original Message----- From: Robin Dapp Sent: Wednesday, November 27, 2024 8:48 PM To: Li, Pan2; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; Robin Dapp Subject: Re: [PATCH v1 1/3] RISC-V: Combine vec_duplicate + vadd.vv to va

Re: [PATCH v1 1/3] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx

2024-11-27 Thread Robin Dapp
> This patch would like to combine the vec_duplicate + vadd.vv to the > vadd.vx. From example as below: I think we concluded a while ago that we don't want this turned on universally. For the example/tests you provide it will be a de-optimization on any uarch that has non-zero GPR -> VR latency.

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx

2024-11-27 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vadd.vv to the vadd.vx. From example as below: #define DEF_VX_BINARY(T, OP)\ void\ test_vx_binary (T * restrict out, T
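
To make the trade-off discussed in this thread concrete, here is a minimal sketch of the kind of loop involved. It is not the patch's test case: the macro above is truncated in this listing, so the function name add_scalar_i32 and its signature are hypothetical. When such a loop is vectorized for RVV, the scalar x can either be broadcast once into a vector register before the loop (a vec_duplicate, typically a vmv.v.x) and added with vadd.vv, or be passed straight from a general-purpose register on every iteration with vadd.vx. The second form saves the broadcast instruction and a vector register, but, as noted in the replies, it may be slower on a microarchitecture where moving a value from a GPR to the vector unit has non-zero latency, because that cost moves inside the loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not taken from the patch: add the scalar x to
   every element of in[] and store the result in out[].

   A vectorizing compiler may handle x in one of two ways:
     - broadcast x to a vector register once, outside the loop, and use
       a vector-vector add (vadd.vv) inside the loop; or
     - use a vector-scalar add (vadd.vx) inside the loop, reading x
       from a general-purpose register on each iteration.
   Which one is faster depends on the target's GPR -> VR latency.  */
void
add_scalar_i32 (int32_t *restrict out, const int32_t *restrict in,
                int32_t x, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = in[i] + x;
}
```

Comparing the assembly generated for a loop like this with and without the combine is one way to see which form a given compiler build picks; the exact options needed depend on the toolchain and are not shown here.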