Hi Eric,

Thanks for the bug report and the simple reproducer.

Paolo, I'll send a revert, both because we don't want to keep Linux broken and
to give you more time to fix the patch properly. You can then re-send it as a
v3 to the ML.


Thanks,

Daniel


On 7/10/25 2:28 AM, Eric Biggers wrote:
Hi,

On Wed, Mar 12, 2025 at 03:55:47PM +0000, Paolo Savini wrote:
This commit improves the performance of QEMU when emulating strided vector
loads and stores by substituting the call for the helper function with the
generation of equivalent TCG operations.

Signed-off-by: Paolo Savini <paolo.sav...@embecosm.com>
Reviewed-by: Daniel Henrique Barboza <dbarb...@ventanamicro.com>
---
  target/riscv/insn_trans/trans_rvv.c.inc | 323 ++++++++++++++++++++----
  1 file changed, 273 insertions(+), 50 deletions(-)

This recent QEMU patch broke the RISC-V vector optimized ChaCha20 code
in the Linux kernel.  I simplified the reproducer to the following,
which had its behavior changed:

rvv_test_func:
        vsetivli        zero, 1, e32, m1, ta, ma
        li              t0, 64

        vlsseg8e32.v    v0, (a0), t0
        addi            a0, a0, 32
        vlsseg8e32.v    v8, (a0), t0

        vssseg8e32.v    v0, (a1), t0
        addi            a1, a1, 32
        vssseg8e32.v    v8, (a1), t0
        ret

Before this patch, it copied 64 bytes from a0 to a1.  After this patch,
the bytes at 32..47 also incorrectly get copied to 16..31.
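
For reference, the expected result can be checked against a small model of
the segment semantics. This is only a sketch in Python, not QEMU code: the
helper names (vlsseg_e32, vssseg_e32) and the flat bytearray standing in for
guest memory are assumptions made for illustration. It assumes the usual
reading of the strided segment instructions, where segment i starts at
base + i * stride and its nf fields are contiguous in memory. With vl = 1
the stride never comes into play, so each load/store pair moves 32
contiguous bytes.

```python
def vlsseg_e32(mem, base, stride, nf, vl):
    """Hypothetical model of vlsseg<nf>e32.v: returns nf registers,
    each holding vl 32-bit elements (kept as 4-byte strings)."""
    regs = [[None] * vl for _ in range(nf)]
    for i in range(vl):                  # segment (element) index
        seg = base + i * stride          # byte address of segment i
        for f in range(nf):              # field f goes to register f
            addr = seg + 4 * f
            regs[f][i] = bytes(mem[addr:addr + 4])
    return regs


def vssseg_e32(mem, base, stride, regs, vl):
    """Hypothetical model of vssseg<nf>e32.v: inverse of the load above."""
    for i in range(vl):
        seg = base + i * stride
        for f, reg in enumerate(regs):
            addr = seg + 4 * f
            mem[addr:addr + 4] = reg[i]


def rvv_test_func(mem, a0, a1):
    """Mirrors the reproducer: vl = 1, stride = 64, two 8-field segments."""
    vl, stride = 1, 64
    v0 = vlsseg_e32(mem, a0, stride, 8, vl)        # v0..v7
    v8 = vlsseg_e32(mem, a0 + 32, stride, 8, vl)   # v8..v15
    vssseg_e32(mem, a1, stride, v0, vl)
    vssseg_e32(mem, a1 + 32, stride, v8, vl)


# Source bytes 0..63 at offset 0, zeroed destination at offset 64.
src = bytes(range(64))
mem = bytearray(src) + bytearray(64)
rvv_test_func(mem, 0, 64)
assert bytes(mem[64:128]) == src   # a plain 64-byte copy
```

Under this model the destination is byte-for-byte identical to the source,
which matches the behavior before the patch.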

Please fix this, or else revert the patch.

- Eric

