It looks like there is still a missed vsetvli optimization opportunity in the example?

> After this patch:
> foo:
>         lui     a5,%hi(.LC0)
>         flw     fa0,%lo(.LC0)(a5)
>         ble     a1,zero,.L4
> .L3:
>         vsetvli a5,a1,e32,m1,ta,ma
>         vle32.v v1,0(a0)
>         slli    a4,a5,2
>         vsetivli        zero,1,e32,m1,ta,ma

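This vsetivli looks redundant: vfmv.s.f could just run under the earlier
"vsetvli a5,a1,e32,m1,ta,ma", since vl is nonzero inside the loop and
SEW/LMUL are the same.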

>         sub     a1,a1,a5
>         vfmv.s.f        v2,fa0
>         add     a0,a0,a4
>         vsetvli zero,a5,e32,m1,ta,ma

And then this can be removed too.

>         vfredosum.vs    v1,v1,v2
>         vfmv.f.s        fa0,v1
>         bne     a1,zero,.L3
>         ret
> .L4:
>         ret
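
To make the suggestion concrete, this is roughly the code I would hope to
see once both redundant vsetvli/vsetivli instructions are gone. It is a
hand-edited sketch of the output quoted above, not actual compiler output,
and it assumes that vfmv.s.f only needs vl > 0 and that the tail of v2 does
not matter here (the policy is already ta):

foo:
        lui     a5,%hi(.LC0)
        flw     fa0,%lo(.LC0)(a5)
        ble     a1,zero,.L4
.L3:
        vsetvli a5,a1,e32,m1,ta,ma      # the only vsetvli in the loop
        vle32.v v1,0(a0)
        slli    a4,a5,2
        sub     a1,a1,a5
        vfmv.s.f        v2,fa0          # vl = a5 > 0, so element 0 is written
        add     a0,a0,a4
        vfredosum.vs    v1,v1,v2        # same vl/vtype as the load
        vfmv.f.s        fa0,v1
        bne     a1,zero,.L3
        ret
.L4:
        ret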
