https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112326

--- Comment #1 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pan Li <pa...@gcc.gnu.org>:

https://gcc.gnu.org/g:5ea3c039b784b0676323243940fd9916b1f6d540

commit r14-5092-g5ea3c039b784b0676323243940fd9916b1f6d540
Author: Juzhe-Zhong <juzhe.zh...@rivai.ai>
Date:   Fri Nov 3 08:36:03 2023 +0800

    RISC-V: Fix redundant vsetvl in fixed-vlmax vectorized codes[PR112326]

    With compile option --param=riscv-autovec-preference=fixed-vlmax, we have
    redundant AVL/VL toggling:

            vsetvli a5,a3,e8,mf4,ta,ma -> should be changed into e32m1
            vle32.v v1,0(a1)
            vle32.v v2,0(a0)
            vsetivli        zero,4,e32,m1,ta,ma -> redundant
            slli    a2,a5,2
            vadd.vv v1,v1,v2
            sub     a3,a3,a5
            vsetvli zero,a5,e32,m1,ta,ma -> redundant
            vse32.v v1,0(a4)
            add     a0,a0,a2
            add     a1,a1,a2
            add     a4,a4,a2
            bne     a3,zero,.L3

    The root cause is that we simplify the AVL into an immediate AVL too early
    in the FIXED-VLMAX case. The later avlprop pass then fails to propagate
    the AVL generated by (SELECT_VL/vsetvl VL, AVL) into the normal RVV
    instructions.

    So we need to remove the immediate AVL simplification from the 'expand'
    stage.

    After this patch:

            vsetvli a5,a3,e32,m1,ta,ma
            slli    a2,a5,2
            vle32.v v1,0(a1)
            vle32.v v2,0(a0)
            sub     a3,a3,a5
            vadd.vv v1,v1,v2
            vse32.v v1,0(a4)
            add     a0,a0,a2
            add     a1,a1,a2
            add     a4,a4,a2
            bne     a3,zero,.L3
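
The two vle32.v loads, the vadd.vv and the vse32.v store suggest an element-wise 32-bit integer add. A minimal sketch of a loop of that shape follows. The function name, signature and use of restrict are assumptions for illustration; this is not the content of pr112326.c, and the exact assembly depends on the compiler version and on flags beyond the --param quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reproducer shape: element-wise 32-bit add over n elements.
   The name vec_add and its signature are illustrative only.  */
void
vec_add (int32_t *restrict out, const int32_t *restrict a,
         const int32_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] + b[i];
}
```

In a length-controlled vector loop of this kind, each iteration asks vsetvli how many elements to process (a5 above), then advances the pointers by that many elements and decrements the remaining count.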

    With the simplification removed from 'expand', the following case must
    still be handled:

    #include <stdint.h>

    typedef int8_t vnx2qi __attribute__ ((vector_size (2)));

    __attribute__ ((noipa)) void
    f_vnx2qi (int8_t a, int8_t b, int8_t *out)
    {
      vnx2qi v = {a, b};
      *(vnx2qi *) out = v;
    }

    We should use vsetivli zero,2 instead of vsetvl a5,zero.
    This simplification is now done in the avlprop pass, which is also
    included in this patch, to avoid regressing such cases.
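
The immediate form is only available when the element count is known at compile time and fits the 5-bit unsigned AVL field of vsetivli (0 to 31). A minimal sketch of that check follows. The helper names are hypothetical and are not the GCC functions listed in the ChangeLog.

```c
#include <stdbool.h>
#include <stdint.h>

/* vsetivli encodes its AVL in a 5-bit unsigned immediate field.  */
#define VSETIVLI_MAX_AVL 31u

/* Hypothetical helper: number of elements in a fixed-size vector type,
   e.g. vector_size (2) of int8_t gives 2.  */
static uint64_t
fixed_vector_nunits (uint64_t vector_bytes, uint64_t elem_bytes)
{
  return vector_bytes / elem_bytes;
}

/* Hypothetical helper: true when a compile-time-known element count can
   be used directly as the immediate AVL of vsetivli.  */
static bool
fits_imm_avl (uint64_t nunits)
{
  return nunits <= VSETIVLI_MAX_AVL;
}
```

For the vnx2qi example, fixed_vector_nunits (2, 1) is 2, which fits, so the store can use vsetivli zero,2.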

            PR target/112326

    gcc/ChangeLog:

            * config/riscv/riscv-avlprop.cc (get_insn_vtype_mode): New
            function.
            (simplify_replace_vlmax_avl): Ditto.
            (pass_avlprop::execute): Add immediate AVL simplification.
            * config/riscv/riscv-protos.h (imm_avl_p): Rename.
            * config/riscv/riscv-v.cc (const_vlmax_p): Ditto.
            (imm_avl_p): Ditto.
            (emit_vlmax_insn): Adapt for new interface name.
            * config/riscv/vector.md (mode_idx): New attribute.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/autovec/pr112326.c: New test.
