[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx
> I see, I wasn't aware of that. I am not sure whether we need to consider vsetvl
> here, as there are 2 extra insns.
I wouldn't consider it as it's outside of the loop. What matters is latency
inside the loop.
> I see, I need to consider the cost here. Is there any example I can reference? Sorry, I
> haven't
s.
Pan
-----Original Message-----
From: Robin Dapp
Sent: Wednesday, November 27, 2024 8:48 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; Robin
Dapp
Subject: Re: [PATCH v1 1/3] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx
> This patch combines vec_duplicate + vadd.vv into vadd.vx.
> For example:
I think we concluded a while ago that we don't want this turned on universally.
For the example/tests you provide it will be a de-optimization on any uarch
that has non-zero GPR -> VR latency.
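
To make the latency argument concrete, here is a minimal sketch of the kind of loop under discussion. The function name add_scalar and the register names in the comment are hypothetical and only for illustration; the assembly shapes are approximate and omit vsetvli and pointer updates. It is not the patch's test case.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical kernel: add a loop-invariant scalar to every element.
 *
 * Shape A (broadcast hoisted out of the loop, roughly):
 *     vmv.v.x  v1, a2          # once, before the loop
 *   loop:
 *     vle32.v  v2, (a1)
 *     vadd.vv  v2, v2, v1
 *     vse32.v  v2, (a0)
 *
 * Shape B (after combining vec_duplicate into the add, roughly):
 *   loop:
 *     vle32.v  v2, (a1)
 *     vadd.vx  v2, v2, a2      # scalar operand read from a GPR each iteration
 *     vse32.v  v2, (a0)
 *
 * On a uarch with non-zero GPR -> VR latency, shape B pays that latency
 * inside the loop, while shape A pays it once outside.
 */
void add_scalar(int32_t *restrict out, const int32_t *restrict in,
                int32_t x, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] + x;
}
```

Shape B frees a vector register and drops the instructions before the loop, so which shape wins depends on the target's costs rather than on instruction count alone.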
From: Pan Li
This patch combines vec_duplicate + vadd.vv into vadd.vx.
For example:
#define DEF_VX_BINARY(T, OP)\
void\
test_vx_binary (T * restrict out, T