"juzhe.zh...@rivai.ai" <juzhe.zh...@rivai.ai> writes:
> Before this patch:
> foo:
>         ble     a2,zero,.L5
>         csrr    a3,vlenb
>         srli    a4,a3,2
> .L3:
>         minu    a5,a2,a4
>         vsetvli zero,a5,e32,m1,ta,ma
>         vle32.v v2,0(a1)
>         vle32.v v1,0(a0)
>         vsetvli t1,zero,e32,m1,ta,ma
>         vadd.vv v1,v1,v2
>         vsetvli zero,a5,e32,m1,ta,ma
>         vse32.v v1,0(a0)
>         add     a1,a1,a3
>         add     a0,a0,a3
>         sub     a2,a2,a5
>         bne     a2,zero,.L3
> .L5:
>         ret
>
> After this patch:
>
> foo:
>         ble     a2,zero,.L5
>         csrr    a3,vlenb
>         srli    a4,a3,2
>         neg     a7,a4           -->>>additional instruction
> .L3:
>         minu    a5,a2,a4
>         vsetvli zero,a5,e32,m1,ta,ma
>         vle32.v v2,0(a1)
>         vle32.v v1,0(a0)
>         vsetvli t1,zero,e32,m1,ta,ma
>         mv      a6,a2           -->>>additional instruction
>         vadd.vv v1,v1,v2
>         vsetvli zero,a5,e32,m1,ta,ma
>         vse32.v v1,0(a0)
>         add     a1,a1,a3
>         add     a0,a0,a3
>         add     a2,a2,a7
>         bgtu    a6,a4,.L3
> .L5:
>         ret
>
> There is 1 more instruction in the preheader and 1 more instruction in
> the loop.  But I think it's OK for RVV since we will definitely be using
> SELECT_VL, so this issue will be gone.

But what about cases where you won't be using SELECT_VL, such as SLP?

Richard
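
For reference, the difference between the two loop-control sequences quoted above can be modelled in scalar C. This is only a rough sketch, not code from the patch: the function names add_before and add_after, the step parameter (standing in for a4, the number of 32-bit elements per vector) and the returned iteration count are all assumptions made for illustration. The "before" form subtracts the active length and tests for zero; the "after" form adds a loop-invariant negated step and compares the saved old count against the step, which is where the extra neg and mv come from.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, mirrors the minu instruction. */
static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* "Before" form: sub a2,a2,a5 ; bne a2,zero,.L3
   Returns the number of loop iterations executed. */
size_t add_before(int32_t *a, const int32_t *b, size_t n, size_t step)
{
    size_t iters = 0, base = 0, remaining = n;
    if (n == 0)
        return 0;                             /* ble a2,zero,.L5 */
    do {
        size_t vl = min_sz(remaining, step);  /* minu a5,a2,a4 */
        for (size_t i = 0; i < vl; i++)       /* vle32/vadd/vse32 */
            a[base + i] += b[base + i];
        base += step;                         /* add a0/a1 by vlenb */
        remaining -= vl;                      /* sub a2,a2,a5 */
        iters++;
    } while (remaining != 0);                 /* bne a2,zero,.L3 */
    return iters;
}

/* "After" form: neg a7,a4 ; mv a6,a2 ; add a2,a2,a7 ; bgtu a6,a4,.L3
   The counter may wrap on the last iteration; it is not read again. */
size_t add_after(int32_t *a, const int32_t *b, size_t n, size_t step)
{
    size_t iters = 0, base = 0, remaining = n;
    if (n == 0)
        return 0;
    size_t neg_step = (size_t)0 - step;       /* neg a7,a4 (preheader) */
    do {
        size_t vl = min_sz(remaining, step);
        for (size_t i = 0; i < vl; i++)
            a[base + i] += b[base + i];
        base += step;
        size_t old = remaining;               /* mv a6,a2 */
        remaining += neg_step;                /* add a2,a2,a7 */
        iters++;
        if (old <= step)                      /* bgtu a6,a4,.L3 not taken */
            break;
    } while (1);
    return iters;
}
```

Both functions run the same number of iterations and produce the same results for any n and any non-zero step; the second just carries one more live value across the loop body.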