https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438
--- Comment #4 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Oh, I see what you mean.
I think it may not be a valid optimization.
Consider the following code:
.L3:
vsetvli a5,a0,e32,m1,ta,ma
slli a4,a5,2
vle32.v v1,0(a1)
sub a0,a0,a5
vadd.vv v1,v1,v2
vse32.v v1,0(a2)
add a1,a1,a4
vsetvli a5,zero,e32,m1,ta,ma ---> seems redundant
add a2,a2,a4
vadd.vv v2,v2,v4
bne a0,zero,.L3
Suppose VLMAX = 8 elements and a0 is 13 going into the last 2 iterations.
If we remove the VLMAX vsetvli, which seems redundant, we may have issues on
some hardware.
With 13 elements left, the hardware can choose to process 6 elements in the
second-to-last iteration and 7 elements in the last iteration.
The result of the VLMAX vadd.vv (the update of v2) is used by the next
iteration, NOT the current iteration.
Without the VLMAX vsetvli, that vadd.vv would only produce 6 elements for the
last iteration, which needs 7 elements.
That would cause a bug. So, it is not a valid optimization...
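
To make the scenario concrete, here is a small Python model of the v2 update. It is only a sketch, not GCC or hardware code. The names (run, vadd_vv, STALE) are made up for illustration, the 6/7 split is the one assumed in the example above, and tail lanes are modelled as simply "not holding the updated value", which is an assumption covering both outcomes the tail-agnostic policy allows (old value kept, or lane overwritten).

```python
VLMAX = 8      # elements per vector register in the example (e32, m1)
STALE = None   # marker: lane does not hold the updated value

def vadd_vv(a, b, vl):
    """Model of vadd.vv: only lanes below vl receive a valid result."""
    return [a[i] + b[i] if i < vl and a[i] is not STALE else STALE
            for i in range(VLMAX)]

def run(vls, keep_vlmax_vsetvl):
    """Return the iterations whose first vadd.vv reads a stale v2 lane.

    vls is the hypothetical sequence of vl values chosen by
    'vsetvli a5,a0,...' in successive iterations.
    """
    v2 = list(range(VLMAX))   # vector carried across iterations
    v4 = [VLMAX] * VLMAX      # per-iteration step
    bad = []
    for it, vl in enumerate(vls):
        # vadd.vv v1,v1,v2 consumes lanes 0..vl-1 of v2.
        if any(v2[i] is STALE for i in range(vl)):
            bad.append(it)
        # vadd.vv v2,v2,v4 runs with VLMAX only if the second vsetvli is kept.
        update_vl = VLMAX if keep_vlmax_vsetvl else vl
        v2 = vadd_vv(v2, v4, update_vl)
    return bad

print(run([6, 7], keep_vlmax_vsetvl=True))
print(run([6, 7], keep_vlmax_vsetvl=False))
```

With the VLMAX vsetvli kept, no iteration reads a stale lane. With it removed, the last iteration (vl = 7) reads lane 6 of v2, which the previous iteration (vl = 6) never updated.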