Hi, after digging deeply into this issue,
I figured out what is happening. This is the V3 patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/623052.html
There is a comprehensive explanation in the commit log.
juzhe.zh...@rivai.ai
From: Kito Cheng
Date: 2023-06-28 11:16
To: juzhe.zh...@rivai
Hi, Richi.
After digging into the code and experimenting:
https://godbolt.org/z/hMf5nsPeK
This example uses VNx8QI; GCC works fine for RVV using a 1-bit compact mask.
ADJUST_PRECISION (VNx1BI, riscv_v_adjust_precision (VNx1BImode, 1));
ADJUST_PRECISION (VNx2BI, riscv_v_adjust_precision (VNx2BImode, 2));
Hi, Richi.
Thanks for taking care of this issue.
From my observation, VNx2BI uses a 2-bit mask: 00 = 0, 01 = 1.
VNx4BI uses a 4-bit mask: 0000 = 0, 0001 = 1.
This works perfectly for ARM SVE, since that is the layout of the ARM mask register.
However, RVV is al
The difference between v1 and v2 is the compact mask generation:
v1 :
+rtx
+rvv_builder::compact_mask () const
+{
+ /* Use the container mode with SEW = 8 and LMUL = 1. */
+ unsigned container_size
+= MAX (CEIL (npatterns (), 8), BYTES_PER_RISCV_VECTOR.to_constant () / 8);
+ machine_mode
I have commented in the commit log:
Before this patch, the mask is:
.LC1:
.byte 68 > 0b01000100
However, this is incorrect for RVV, since RVV always uses a 1-bit compact mask.
After this patch:
.LC1:
.byte 10 > 0b1010
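
To make the two byte values above easier to check by hand, here is a minimal sketch of the packing arithmetic. It is not GCC code: pack_mask is a hypothetical helper, and the mask contents {0, 1, 0, 1} are an assumption inferred from the two constants (68 and 10), not something stated in the patch. It also assumes element 0 sits in the least significant bits and that each element's value lives in the lowest bit of its slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only (not part of GCC).
   Pack n boolean mask elements into an integer, giving each element
   bits_per_elem bits.  Element 0 occupies the least significant slot,
   and an active element sets only the lowest bit of its slot.  */
static uint32_t
pack_mask (const int *elems, size_t n, unsigned bits_per_elem)
{
  /* Keep the sketch simple: the packed result must fit in 32 bits.  */
  assert (bits_per_elem >= 1 && n * bits_per_elem <= 32);

  uint32_t packed = 0;
  for (size_t i = 0; i < n; i++)
    if (elems[i])
      packed |= (uint32_t) 1 << (i * bits_per_elem);
  return packed;
}
```

Under those assumptions, with elems = {0, 1, 0, 1}, pack_mask (elems, 4, 2) gives 68 (0b01000100, the 2-bit-per-element layout) and pack_mask (elems, 4, 1) gives 10 (0b1010, the 1-bit compact layout).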
juzhe.zh...@rivai.ai
From: Kito Cheng
Date: 2023-