Hi,
On Wed, Mar 12, 2025 at 03:55:47PM +0000, Paolo Savini wrote:
> This commit improves the performance of QEMU when emulating strided vector
> loads and stores by substituting the call for the helper function with the
> generation of equivalent TCG operations.
>
> Signed-off-by: Paolo Savini <paolo.sav...@embecosm.com>
> Reviewed-by: Daniel Henrique Barboza <dbarb...@ventanamicro.com>
> ---
>  target/riscv/insn_trans/trans_rvv.c.inc | 323 ++++++++++++++++++++----
>  1 file changed, 273 insertions(+), 50 deletions(-)

This recent QEMU patch broke the RISC-V vector-optimized ChaCha20 code
in the Linux kernel. I reduced the reproducer to the following function,
whose behavior changed:

rvv_test_func:
        vsetivli        zero, 1, e32, m1, ta, ma
        li              t0, 64
        vlsseg8e32.v    v0, (a0), t0
        addi            a0, a0, 32
        vlsseg8e32.v    v8, (a0), t0
        vssseg8e32.v    v0, (a1), t0
        addi            a1, a1, 32
        vssseg8e32.v    v8, (a1), t0
        ret

Before this patch, it copied 64 bytes from the buffer at a0 to the buffer
at a1. After this patch, the bytes at offsets 32..47 are also incorrectly
copied to offsets 16..31.
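
As a reference point, the expected behavior of the reproducer can be modeled
in plain C. This is only a sketch, assuming the usual reading of the RVV
segment load/store semantics (field f of element i lives at
base + i * stride + f * 4 for e32). It is not QEMU's implementation, and all
names in it (vregs_model, model_vlsseg8e32, model_vssseg8e32,
model_rvv_test_func, model_copies_64_bytes) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical software model; MAX_VL is an arbitrary bound for the sketch. */
enum { NF = 8, EEW_BYTES = 4, NUM_VREGS = 32, MAX_VL = 4 };

typedef struct {
    uint32_t elem[NUM_VREGS][MAX_VL];
} vregs_model;

/* Models "vlsseg8e32.v vd, (base), stride" for the first vl elements. */
void model_vlsseg8e32(vregs_model *v, int vd, const uint8_t *base,
                      size_t stride, int vl)
{
    for (int i = 0; i < vl; i++)
        for (int f = 0; f < NF; f++)
            memcpy(&v->elem[vd + f][i],
                   base + i * stride + f * EEW_BYTES, EEW_BYTES);
}

/* Models "vssseg8e32.v vs, (base), stride" for the first vl elements. */
void model_vssseg8e32(const vregs_model *v, int vs, uint8_t *base,
                      size_t stride, int vl)
{
    for (int i = 0; i < vl; i++)
        for (int f = 0; f < NF; f++)
            memcpy(base + i * stride + f * EEW_BYTES,
                   &v->elem[vs + f][i], EEW_BYTES);
}

/* Mirrors rvv_test_func instruction by instruction. */
void model_rvv_test_func(const uint8_t *a0, uint8_t *a1)
{
    vregs_model v;
    const int vl = 1;       /* vsetivli zero, 1, e32, m1, ta, ma */
    const size_t t0 = 64;   /* li t0, 64 */

    memset(&v, 0, sizeof v);
    model_vlsseg8e32(&v, 0, a0, t0, vl);
    a0 += 32;
    model_vlsseg8e32(&v, 8, a0, t0, vl);
    model_vssseg8e32(&v, 0, a1, t0, vl);
    a1 += 32;
    model_vssseg8e32(&v, 8, a1, t0, vl);
}

/* Returns 1 if the model copies all 64 bytes unchanged, 0 otherwise. */
int model_copies_64_bytes(void)
{
    uint8_t src[64], dst[64];

    for (int i = 0; i < 64; i++) {
        src[i] = (uint8_t)(i + 1);
        dst[i] = 0;
    }
    model_rvv_test_func(src, dst);
    return memcmp(src, dst, sizeof src) == 0;
}
```

With vl = 1, only element 0 of each register is accessed, so the stride in t0
never contributes to an address. Each segment instruction therefore touches
exactly 8 * 4 = 32 contiguous bytes, and the four instructions together amount
to a plain 64-byte copy. Comparing the destination buffer under QEMU against
this model should show which offsets diverge.
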
Please fix this, or else revert the patch.
- Eric