On 11/8/23 23:08, pan2...@intel.com wrote:
From: Pan Li <pan2...@intel.com>

Update in v2:
* Move vector type support to get_stored_val.

Original log:

This patch would like to allow the vector mode in the
get_stored_val in the DSE. It is valid for the read
rtx if and only if the read bitsize is less than the
stored bitsize.

Given below example code with
--param=riscv-autovec-preference=fixed-vlmax.

vuint8m1_t test () {
   uint8_t arr[32] = {
     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
   };

   return __riscv_vle8_v_u8m1(arr, 32);
}

Before this patch:
test:
   lui     a5,%hi(.LANCHOR0)
   addi    sp,sp,-32
   addi    a5,a5,%lo(.LANCHOR0)
   li      a3,32
   vl2re64.v       v2,0(a5)
   vsetvli zero,a3,e8,m1,ta,ma
   vs2r.v  v2,0(sp)             <== Unnecessary store to stack
   vle8.v  v1,0(sp)             <== Ditto
   vs1r.v  v1,0(a0)
   addi    sp,sp,32
   jr      ra

After this patch:
test:
   lui     a5,%hi(.LANCHOR0)
   addi    a5,a5,%lo(.LANCHOR0)
   li      a4,32
   addi    sp,sp,-32
   vsetvli zero,a4,e8,m1,ta,ma
   vle8.v  v1,0(a5)
   vs1r.v  v1,0(a0)
   addi    sp,sp,32
   jr      ra

Below tests are passed within this patch:

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression test.

        PR target/111720

gcc/ChangeLog:

        * dse.cc (get_stored_val): Allow vector mode if the read
        bitsize is less than stored bitsize.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
        * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
We're always getting the lowpart here AFAICT and it appears that all the right thing should happen if gen_lowpart_common fails (it returns NULL, which bubbles up and is the right return value from get_stored_val if it can't be optimized).

Did you want to use known_le so that you'd pick up the case when the two modes are the same size? Or was known_lt the test you really wanted (and if so, why).


OK using known_lt, or known_le. If you decide to change to known_le, you'll need to bootstrap & regression test again on x86.



jeff

Reply via email to