Hi Jin,
I apologize for the delayed response. I spent quite a bit of time trying to
reproduce
the case, and given the passage of time, it wasn't easy to refine the testing.
Fortunately, you can see the results here.
https://godbolt.org/z/Mc8veW7oT
Using GCC version 14.2.0 should allow you to replicate the issue. If all goes as
expected, you will encounter a "Segmentation fault (core dumped)."
By disassembling the binary, you'll notice the presence of "vsetvli
zero,zero,e32,m4,ta,ma",
which is where the problem lies, just as I mentioned previously.
Thanks for the full example, this is helpful but it still required some more
digging on my side.
I realize now how you came to your conclusion and why you wrote the test that
way.
In QEMU there is a segfault (but no illegal instruction) on
vloxei16.v v8,(t2),v8.
I'm not really sure why yet because vtype and vl are OK. Most likely we
corrupted something before.
On the BPI, however, the more useful information is a SIGILL on
vfmv.s.f v12,fa5.
That's because we do
vsetvli zero,zero,e16,m4,ta,ma
...
vsetvli zero,zero,e8,m2,ta,ma
...
vsetvli zero,zero,e32,m4,ta,ma
The last vsetvl changes the SEW without adjusting LMUL, so changing SEW/LMUL
and VLMAX which is not allowed for vsetvl zero, zero, ...
Subsequently, VILL is set and we SIGILL on the next vector instruction.
So I'd say QEMU should emit a SIGILL here. I'm preparing a QEMU patch for
that.
With your patch the last vsetvl becomes
vsetvli zero,a5,e32,m4,ta,ma
When setting a new VL it is permitted to change SEW/LMUL even though it's not
desirable as we don't need to change VL.
With my patch the vsetvl becomes
vsetvli zero,zero,e32,m8,ta,ma
which doesn't change SEW/LMUL or VL and I think that's what we want.
As summary: Before your patch we only changed SEW which caused a VLMAX-changing
vsetvl, your patch works around that by adjusting the ratio without touching
LMUL. But that still leaves us with PR117955 where the ratio is "correct" but
invalid.
Given all this, I think the only way to fix this is to re-use/copy the vsetvl
information from either prev or next (as in my patch).
I'm going to change bug-10.c into a run test, add the test you last provided as
a run test as well as well as your reduced test from PR117955.
--
Regards
Robin