Hi Eric, Daniel,

Thanks for bringing this up and for the reproducer, Eric. I agree we need to revert. Thanks, Daniel, for taking care of this. I should have time this month to fix this and send a new version.


Best wishes,

Paolo

On 7/10/25 10:54, Daniel Henrique Barboza wrote:
Hi Eric,


Thanks for the bug report and the simple reproducer.

Paolo, I'll send a revert, both because we don't want to keep Linux broken and to give you more time to fix the patch properly. You can then re-send it as a v3 to the ML.


Thanks,

Daniel


On 7/10/25 2:28 AM, Eric Biggers wrote:
Hi,

On Wed, Mar 12, 2025 at 03:55:47PM +0000, Paolo Savini wrote:
This commit improves the performance of QEMU when emulating strided vector loads and stores by replacing the call to the helper function with the generation of equivalent TCG operations.

Signed-off-by: Paolo Savini <paolo.sav...@embecosm.com>
Reviewed-by: Daniel Henrique Barboza <dbarb...@ventanamicro.com>
---
  target/riscv/insn_trans/trans_rvv.c.inc | 323 ++++++++++++++++++++----
  1 file changed, 273 insertions(+), 50 deletions(-)

This recent QEMU patch broke the RISC-V vector optimized ChaCha20 code
in the Linux kernel.  I simplified the reproducer to the following,
which had its behavior changed:

rvv_test_func:
    vsetivli        zero, 1, e32, m1, ta, ma
    li              t0, 64

    vlsseg8e32.v    v0, (a0), t0
    addi            a0, a0, 32
    vlsseg8e32.v    v8, (a0), t0

    vssseg8e32.v    v0, (a1), t0
    addi            a1, a1, 32
    vssseg8e32.v    v8, (a1), t0
    ret

Before this patch, it copied 64 bytes from a0 to a1.  After this patch,
the bytes at 32..47 also incorrectly get copied to 16..31.
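
For anyone who wants to check the expected result without hardware, here is a rough Python model of what the reproducer should do. It is only a sketch of the strided segment load/store addressing as I understand it from the vector spec (element i, field f lives at base + i*stride + f*4 for e32), with vl=1 and nf=8 as in the assembly above. The function names are made up for this illustration and have nothing to do with QEMU's code, and the last helper just restates the symptom described above rather than modelling the cause of the bug.

```python
def strided_segment_load(mem, base, stride, vl, nf, eew_bytes):
    """Model vlsseg<nf>e<eew>.v: return regs[field][element] as bytes."""
    regs = [[None] * vl for _ in range(nf)]
    for i in range(vl):              # one segment per element index
        for f in range(nf):          # fields within a segment are contiguous
            addr = base + i * stride + f * eew_bytes
            regs[f][i] = bytes(mem[addr:addr + eew_bytes])
    return regs


def strided_segment_store(mem, base, stride, vl, regs, eew_bytes):
    """Model vssseg<nf>e<eew>.v: write regs[field][element] back to memory."""
    for i in range(vl):
        for f, reg in enumerate(regs):
            addr = base + i * stride + f * eew_bytes
            mem[addr:addr + eew_bytes] = reg[i]


def rvv_test_func(src):
    """Hypothetical model of the reproducer: vl=1, e32, stride 64 bytes."""
    src = bytearray(src)
    dst = bytearray(64)
    v0_v7 = strided_segment_load(src, 0, 64, 1, 8, 4)    # bytes 0..31
    v8_v15 = strided_segment_load(src, 32, 64, 1, 8, 4)  # bytes 32..63
    strided_segment_store(dst, 0, 64, 1, v0_v7, 4)
    strided_segment_store(dst, 32, 64, 1, v8_v15, 4)
    return bytes(dst)


def reported_symptom(src):
    """Output as described in this report (not a model of the bug itself)."""
    out = bytearray(src[:64])
    out[16:32] = src[32:48]          # bytes 32..47 also land at 16..31
    return bytes(out)


src = bytes(range(64))
print(rvv_test_func(src) == src)         # expected behaviour: plain 64-byte copy
print(reported_symptom(src) == src)      # the reported output differs
```

Since vl is 1, the stride in t0 never contributes to an address, so each load/store pair simply moves 32 contiguous bytes.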

Please fix this, or else revert the patch.

- Eric

