
RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

2023-10-05 Thread Li, Pan2

Thanks Jeff and Robin for the comments, and sorry for the late reply.

> Conceptually the rounding mode is just a property.  The call, in effect, 
> should demand a "normal" rounding mode and set the rounding mode to 
> unknown if I understand how this is supposed to work.  If my 
> understanding is wrong, then maybe that's where we should start -- with 
> a good description of the problem ;-)

I think we are on the same page about how it works. I may need to take a look at
how x86 takes care of this.

> That's probably dead code at this point.  IIRC rth did further work in 
> this space because inserting in the end of the block with the abnormal 
> edge isn't semantically correct.

> It's been 20+ years, but IIRC he adjusted the PRE bitmaps so that we 
> never would need to do an insertion on an abnormal edge.  Search for 
> EDGE_ABNORMAL in gcse.cc.

That code is quite old; I will take a look at the EDGE_ABNORMAL case.

> Having said that, it looks like Pan's patch just tries to move some of
> the dirty work from the backend to the mode-switching pass by making it
> easier to do something after a call.  I believe I asked for that back in
> one of the reviews even?

Yes, that is what I would like to do in this patch, as a follow-up to some of
Robin's earlier comments.

Pan

-----Original Message-----
From: Robin Dapp
Sent: Monday, October 2, 2023 4:26 PM
To: Jeff Law; Li, Pan2; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai; Wang, Yanzhang; kito.ch...@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook

> Conceptually the rounding mode is just a property.  The call, in
> effect, should demand a "normal" rounding mode and set the rounding
> mode to unknown if I understand how this is supposed to work.  If my
> understanding is wrong, then maybe that's where we should start --
> with a good description of the problem ;-)

That's also what I struggled with last time this was discussed.

Normally, mode switching is used to switch to a requested mode for
an insn or a call and potentially switch back afterwards.

For those riscv intrinsics that specify a variable, non-default rounding
mode we have two options:

- Save and restore before and after each mode-changing intrinsic
  (fegetround old_rounding
   fesetround new_rounding
   actual instruction
   fesetround old_rounding)

- Have mode switching do it for us (lazily) to avoid most of the
storing of the old rounding mode by storing an (e.g.) function-level
rounding-mode backup value.  The backup value is used to lazily
restore the currently valid rounding mode.

The problem with this now is that whenever fesetround gets called
our backup is outdated.  Therefore we need to update our backup after
each function call (as fesetround can of course be present anywhere)
and this is where most of the complications come from.
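
The two options, and the stale-backup problem, can be sketched with a toy C model. This is only an illustration under stated assumptions: the variable `frm` stands in for the hardware rounding-mode register, and every name here (`frm`, `backup`, `eager_op`, `lazy_sequence`, `callee_sets_rounding`) is hypothetical and not taken from GCC or the RISC-V backend.

```c
#include <stdbool.h>

/* Toy stand-in for the dynamic rounding-mode register.  Hypothetical
   model; a real program would go through fegetround/fesetround.  */
int frm = 0;

/* Function-level backup used by the lazy scheme.  */
int backup = 0;

/* Option 1: save and restore around every mode-changing operation.  */
int eager_op (int new_rounding)
{
  int old_rounding = frm;   /* fegetround old_rounding */
  frm = new_rounding;       /* fesetround new_rounding */
  int used = frm;           /* the "actual instruction" observes the mode */
  frm = old_rounding;       /* fesetround old_rounding */
  return used;
}

/* A callee that changes the rounding mode, as fesetround would.  */
void callee_sets_rounding (int mode)
{
  frm = mode;
}

/* Option 2: take one backup, switch modes, restore lazily at the end.
   Returns the mode visible after the restore.  When refresh_backup is
   false the backup is stale and the callee's change is lost.  */
int lazy_sequence (int callee_mode, bool refresh_backup)
{
  frm = 0;
  backup = frm;                       /* function-level backup */
  callee_sets_rounding (callee_mode); /* call that may change the mode */
  if (refresh_backup)
    backup = frm;                     /* update needed after each call */
  frm = 3;                            /* intrinsic with its own mode */
  frm = backup;                       /* lazy restore */
  return frm;
}
```

In this model, `lazy_sequence (2, false)` ends with the pre-call mode instead of the one the callee selected, which is the clobbering effect described above; passing `true` refreshes the backup after the call and preserves the callee's choice.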

So in that case the callee _does_ impact the caller via the backup
clobbering.  That was one of my complaints about the whole procedure
last time.  Besides, I didn't see the need for those intrinsics
anyway and would much rather have explicit fesetround calls but well :)

Having said that, it looks like Pan's patch just tries to move some of
the dirty work from the backend to the mode-switching pass by making it
easier to do something after a call.  I believe I asked for that back in
one of the reviews even?

Regards
 Robin

RE: [PATCH] RISC-V: Remove @ of vec_series

2023-10-05 Thread Li, Pan2

Committed, thanks Jeff and Robin.

Pan

-----Original Message-----
From: Jeff Law
Sent: Wednesday, October 4, 2023 11:40 PM
To: Robin Dapp; Juzhe-Zhong; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; kito.ch...@sifive.com
Subject: Re: [PATCH] RISC-V: Remove @ of vec_series



On 10/4/23 09:06, Robin Dapp wrote:
> I'm currently in the process of removing some unused @s.
> This is OK.
Agreed.  And if you or Juzhe have other @ cases that are unused, such 
changes should be considered pre-approved.

Jeff

RE: [PATCH v1] RISC-V: Update comments for FP rounding related autovec

2023-10-05 Thread Li, Pan2

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Friday, October 6, 2023 11:09 AM
To: Li, Pan2 
Cc: GCC Patches; 钟居哲; Wang, Yanzhang
Subject: Re: [PATCH v1] RISC-V: Update comments for FP rounding related autovec

LGTM

<pan2...@intel.com> wrote on Friday, October 6, 2023 at 10:39:
From: Pan Li <pan2...@intel.com>

Some comments are out of date; this patch fixes them.

gcc/ChangeLog:

* config/riscv/autovec.md: Update comments.

Signed-off-by: Pan Li <pan2...@intel.com>
---
 gcc/config/riscv/autovec.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 056f2c352f6..53e9d34eea1 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2229,12 +2229,16 @@ (define_expand "avg3_ceil"
 })

 ;; -
-;;  [FP] Math.h.
+;;  [FP] Rounding.
 ;; -
 ;; Includes:
 ;; - ceil/ceilf
 ;; - floor/floorf
 ;; - nearbyint/nearbyintf
+;; - rint/rintf
+;; - round/roundf
+;; - trunc/truncf
+;; - roundeven/roundevenf
 ;; -
 (define_expand "ceil2"
   [(match_operand:V_VLSF 0 "register_operand")
--
2.34.1

RE: [PATCH v1] RISC-V: Bugfix for legitimize address PR/111634

2023-10-06 Thread Li, Pan2

Thanks Jeff, committed with a better ChangeLog as you suggested.

Pan

-----Original Message-----
From: Jeff Law
Sent: Saturday, October 7, 2023 12:53 PM
To: Li, Pan2; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang; kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Bugfix for legitimize address PR/111634



On 10/6/23 22:49, pan2...@intel.com wrote:
> From: Pan Li 
> 
> Given the RTL below:
> 
> (plus:DI (mult:DI (reg:DI 138 [ g.4_6 ])
>                   (const_int 8 [0x8]))
>          (lo_sum:DI (reg:DI 167)
>                     (symbol_ref:DI ("f") [flags 0x86] <var_decl 0x7fa96ea1cc60 f>)))
> 
> When handling the (plus (plus (mult (a) (mem_shadd_constant)) (fp)) (C)) case,
> fp will be the lo_sum operand shown above. We assume that fp is a REG, but
> here it is not. This causes an ICE when building with the option
> --enable-checking=rtl.
> 
> This patch fixes it by adding a REG_P check to ensure the operand
> is a register. The test case gcc/testsuite/gcc.dg/pr109417.c covers this
> fix when built with --enable-checking=rtl.
> 
>   PR target/111634
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv.cc (riscv_legitimize_address): Bugfix.
OK, though the ChangeLog entry could be better.  Perhaps

* config/riscv/riscv.cc (riscv_legitimize_address): Ensure
object is a REG before extracting its register number.
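
The idea behind the fix can be illustrated with a toy model in C. This is a sketch only: `toy_rtx`, `toy_reg_p` and `toy_get_regno` are hypothetical stand-ins for GCC's RTL objects, `REG_P` and `REGNO`, and none of this is the code from riscv.cc.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for an RTL expression.  */
enum toy_code { TOY_REG, TOY_LO_SUM };

struct toy_rtx
{
  enum toy_code code;
  unsigned regno;           /* meaningful only when code == TOY_REG */
};

/* Analogue of REG_P: is this object a register?  */
bool toy_reg_p (const struct toy_rtx *x)
{
  return x->code == TOY_REG;
}

/* Store the register number only if X is a register.  Reading regno
   from a TOY_LO_SUM object would be the kind of access that the
   checking build rejects.  */
bool toy_get_regno (const struct toy_rtx *x, unsigned *regno)
{
  if (!toy_reg_p (x))
    return false;
  *regno = x->regno;
  return true;
}
```

With this shape, a lo_sum operand simply fails the check and the caller falls back to its generic path instead of reading a field that does not exist for that object.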


Jeff

RE: [PATCH] RISC-V: Enable more tests of "vect" for RVV

2023-10-07 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law
Sent: Saturday, October 7, 2023 10:48 PM
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; kito.ch...@sifive.com; rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Enable more tests of "vect" for RVV



On 10/7/23 01:04, Juzhe-Zhong wrote:
> This patch enables almost full coverage of the vectorization tests for RVV,
> except for the following tests (not enabled yet):
> 
> 1. Will enable soon:
> 
> check_effective_target_vect_call_lrint
> check_effective_target_vect_call_btrunc
> check_effective_target_vect_call_btruncf
> check_effective_target_vect_call_ceil
> check_effective_target_vect_call_ceilf
> check_effective_target_vect_call_floor
> check_effective_target_vect_call_floorf
> check_effective_target_vect_call_lceil
> check_effective_target_vect_call_lfloor
> check_effective_target_vect_call_nearbyint
> check_effective_target_vect_call_nearbyintf
> check_effective_target_vect_call_round
> check_effective_target_vect_call_roundf
> 
> 2. Not sure whether we will need to enable these:
> 
> check_effective_target_vect_complex_*
> check_effective_target_vect_simd_clones
> check_effective_target_vect_bswap
> check_effective_target_vect_widen_shift
> check_effective_target_vect_widen_mult_*
> check_effective_target_vect_widen_sum_*
> check_effective_target_vect_unpack
> check_effective_target_vect_interleave
> check_effective_target_vect_extract_even_odd
> check_effective_target_vect_pack_trunc
> check_effective_target_vect_check_ptrs
> check_effective_target_vect_sdiv_pow2_si
> check_effective_target_vect_usad_*
> check_effective_target_vect_udot_*
> check_effective_target_vect_sdot_*
> check_effective_target_vect_gather_load_ifn
> 
> After this patch, we will have the following additional FAILs:
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1161.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1161.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s124.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s124.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1279.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1279.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s161.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s161.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s253.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s253.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s271.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s271.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s2711.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s2711.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s2712.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s2712.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s272.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s272.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s273.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s273.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s274.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s274.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s276.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s276.c scan-tree-dump vect "vectorized 1 
> loops"

RE: [PATCH] TEST: Fix XPASS of TSVC testsuites for RVV

2023-10-07 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law
Sent: Saturday, October 7, 2023 10:44 PM
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de
Subject: Re: [PATCH] TEST: Fix XPASS of TSVC testsuites for RVV



On 10/7/23 03:23, Juzhe-Zhong wrote:
> Fix the following XPASS FAILs of TSVC for RVV:
> 
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1161.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1161.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s124.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s124.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1279.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1279.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s161.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s161.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s253.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s253.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s271.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s271.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s2711.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s2711.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s2712.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s2712.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s272.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s272.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s273.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s273.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s274.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s274.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s276.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s276.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s278.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s278.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s279.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s279.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s3111.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s3111.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s353.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s353.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s441.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s441.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s443.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-s443.c scan-tree-dump vect "vectorized 1 
> loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-vif.c -flto -ffat-lto-objects  
> scan-tree-dump vect "vectorized 1 loops"
> XPASS: gcc.dg/vect/tsvc/vect-tsvc-vif.c scan-tree-dump vect "vectorized 1 
> loops"
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/tsvc/vect-tsvc-s11

RE: [PATCH] RISC-V: add static-pie support

2023-10-07 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law
Sent: Sunday, October 8, 2023 12:13 AM
To: Wang, Yanzhang; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@sifive.com; Li, Pan2
Subject: Re: [PATCH] RISC-V: add static-pie support



On 10/7/23 05:32, yanzhang.w...@intel.com wrote:
> From: Yanzhang Wang 
> 
> We only need to pass options to the linker when static-pie is passed.
> There is another patch to enable static-pie in glibc, and we need to
> enable it in GCC first.
> 
> gcc/ChangeLog:
> 
>   * config/riscv/linux.h: Pass the static-pie specific options to
> the linker.
OK.
jeff

RE: [PATCH v1] RISC-V: Refine bswap16 auto vectorization code gen

2023-10-09 Thread Li, Pan2

Sure thing, will send V2 for this change.

Pan

From: juzhe.zh...@rivai.ai
Sent: Monday, October 9, 2023 5:04 PM
To: Li, Pan2; gcc-patches
Cc: Li, Pan2; Wang, Yanzhang; kito.cheng
Subject: Re: [PATCH v1] RISC-V: Refine bswap16 auto vectorization code gen

Remove these functions:


+static void
+emit_vec_sll_scalar (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode)
+{
+  rtx sll_ops[] = {op_0, op_1, op_2};
+  insn_code icode = code_for_pred_scalar (ASHIFT, vec_mode);
+
+  emit_vlmax_insn (icode, BINARY_OP, sll_ops);
+}
+
+static void
+emit_vec_srl_scalar (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode)
+{
+  rtx srl_ops[] = {op_0, op_1, op_2};
+  insn_code icode = code_for_pred_scalar (LSHIFTRT, vec_mode);
+
+  emit_vlmax_insn (icode, BINARY_OP, srl_ops);
+}
+
+static void
+emit_vec_or (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode)
+{
+  rtx or_ops[] = {op_0, op_1, op_2};
+  insn_code icode = code_for_pred (IOR, vec_mode);
+
+  emit_vlmax_insn (icode, BINARY_OP, or_ops);
+}
+

Instead,

For sll, you should use:
  rtx tmp
    = expand_binop (Pmode, ashl_optab, op_1,
                    gen_int_mode (8, Pmode), NULL_RTX, 0,
                    OPTAB_DIRECT);

For srl, you should use:
  rtx tmp
    = expand_binop (Pmode, lshiftrt_optab, op_1,
                    gen_int_mode (8, Pmode), NULL_RTX, 0,
                    OPTAB_DIRECT);

For or, you should use:
  expand_binop (Pmode, ior_optab, tmp, dest, NULL_RTX, 0,
                OPTAB_DIRECT);


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-09 16:51
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>; pan2.li <pan2...@intel.com>; yanzhang.wang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Refine bswap16 auto vectorization code gen
From: Pan Li <pan2...@intel.com>

This patch refines the code gen for bswap16.

We will have a VEC_PERM_EXPR after RTL expand when invoking
__builtin_bswap. It generates about 9 instructions in the
loop as below, no matter whether it is bswap16, bswap32 or bswap64.

  .L2:
1 vle16.v v4,0(a0)
2 vmv.v.x v2,a7
3 vand.vv v2,v6,v2
4 slli a2,a5,1
5 vrgatherei16.vv v1,v4,v2
6 sub a4,a4,a5
7 vse16.v v1,0(a3)
8 add a0,a0,a2
9 add a3,a3,a2
  bne a4,zero,.L2

But for bswap16 we may have an even simpler code gen, which
has only 7 instructions in the loop as below.

  .L5:
1 vle8.v  v2,0(a5)
2 addi a5,a5,32
3 vsrl.vi v4,v2,8
4 vsll.vi v2,v2,8
5 vor.vv  v4,v4,v2
6 vse8.v  v4,0(a4)
7 addi a4,a4,32
  bne a5,a6,.L5

Unfortunately, this approach makes the instruction count in the loop
grow to 13 and 24 for bswap32 and bswap64, respectively. Thus, we refine
the code gen for bswap16 only, and leave both bswap32 and bswap64
as is.
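
The shift-and-or form used for bswap16 can be written in scalar C as a minimal sketch. The function names `bswap16_shift_or` and `bswap16_array` are hypothetical and chosen for illustration; they are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of one 16-bit element with two shifts and an OR,
   the scalar analogue of vsrl.vi / vsll.vi / vor.vv.  */
uint16_t bswap16_shift_or (uint16_t x)
{
  return (uint16_t) ((x >> 8) | (x << 8));
}

/* Apply the swap to n elements.  A loop of roughly this shape is the
   kind of input the auto-vectorizer sees for bswap16.  */
void bswap16_array (uint16_t *dst, const uint16_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = bswap16_shift_or (src[i]);
}
```

For a 16-bit element the swap needs a single pair of shifts and one OR, which is why this form stays short for bswap16.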

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vec_sll_scalar): New help func
impl for emit vsll.vi/vsll.vx
(emit_vec_srl_scalar): Likewise for vsrl.vi/vsrl.vx.
(emit_vec_or): Likewise for vor.vv.
(shuffle_bswap_pattern): New func impl for shuffle bswap.
(expand_vec_perm_const_1): Add shuffle bswap pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/perm-4.c: Adjust checker.
* gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/bswap16-0.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv-v.cc   | 117 ++
.../riscv/rvv/autovec/unop/bswap16-0.c|  17 +++
.../riscv/rvv/autovec/unop/bswap16-run-0.c|  44 +++
.../riscv/rvv/autovec/vls/bswap16-0.c |  34 +
.../gcc.target/riscv/rvv/autovec/vls/perm-4.c |   4 +-
5 files changed, 214 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/bswap16-0.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 23633a2a74d..3e3b5f2e797 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -878,6 +878,33 @@ emit_vlmax_decompress_insn (rtx target, rtx op0, rtx op1, rtx mask)
   emit_vlmax_masked_gather_mu_insn (target, op1, sel, mask);
}
+static void
+emit_vec_sll_scalar (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode)
+{
+  rtx sll_ops[] = {op_0, op_1, op_2};
+  insn_code icode = code_for_pred_scalar (ASHIFT, vec_mode);
+
+  emit_vlmax_insn (icode, BINARY_OP, sll_ops);
+}
+
+static void
+emit_vec_srl_scalar (rtx op_0, rtx op_1, rtx op_2, machine_mode vec_mode)
+{
+  rtx srl_ops[] = {op_0, op_1, op_2};
+  insn_code icode = code_for_pred_scalar (LSHIFTRT, vec_mode);
+
+  emit_vlmax_insn (icode, BINARY_OP, srl_ops);
+}
+
+static void
+emit_vec_or (rtx op_0, rtx o

RE: [PATCH] RISC-V Regression test: Fix FAIL of pr45752.c for RVV

2023-10-09 Thread Li, Pan2

Committed, thanks Richard.

Pan

-----Original Message-----
From: Richard Biener
Sent: Monday, October 9, 2023 9:07 PM
To: Juzhe-Zhong
Cc: gcc-patches@gcc.gnu.org; jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V Regression test: Fix FAIL of pr45752.c for RVV

On Mon, 9 Oct 2023, Juzhe-Zhong wrote:

> RVV uses load_lanes with stride = 5 to vectorize this case with
> -fno-vect-cost-model instead of SLP.

OK

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/pr45752.c: Adapt dump check for targets that support
>   load_lanes with stride = 5.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/pr45752.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/pr45752.c b/gcc/testsuite/gcc.dg/vect/pr45752.c
> index e8b364f29eb..3c87d9b04fc 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr45752.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr45752.c
> @@ -159,4 +159,4 @@ int main (int argc, const char* argv[])
>  
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
>  /* { dg-final { scan-tree-dump-times "gaps requires scalar epilogue loop" 0 "vect" } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" {target { ! { vect_load_lanes && vect_strided5 } } } } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH v2] RISC-V: Refine bswap16 auto vectorization code gen

2023-10-09 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai
Sent: Monday, October 9, 2023 9:11 PM
To: Li, Pan2; gcc-patches
Cc: Li, Pan2; Wang, Yanzhang; kito.cheng
Subject: Re: [PATCH v2] RISC-V: Refine bswap16 auto vectorization code gen

LGTM now.

Thanks.


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-09 21:09
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>; pan2.li <pan2...@intel.com>; yanzhang.wang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v2] RISC-V: Refine bswap16 auto vectorization code gen
From: Pan Li <pan2...@intel.com>

Update in v2

* Remove emit helper functions.
* Take expand_binop instead.

Original log:

This patch refines the code gen for bswap16.

We will have a VEC_PERM_EXPR after RTL expand when invoking
__builtin_bswap. It generates about 9 instructions in the
loop as below, no matter whether it is bswap16, bswap32 or bswap64.

  .L2:
1 vle16.v v4,0(a0)
2 vmv.v.x v2,a7
3 vand.vv v2,v6,v2
4 slli a2,a5,1
5 vrgatherei16.vv v1,v4,v2
6 sub a4,a4,a5
7 vse16.v v1,0(a3)
8 add a0,a0,a2
9 add a3,a3,a2
  bne a4,zero,.L2

But for bswap16 we may have an even simpler code gen, which
has only 7 instructions in the loop as below.

  .L5:
1 vle8.v  v2,0(a5)
2 addi a5,a5,32
3 vsrl.vi v4,v2,8
4 vsll.vi v2,v2,8
5 vor.vv  v4,v4,v2
6 vse8.v  v4,0(a4)
7 addi a4,a4,32
  bne a5,a6,.L5

Unfortunately, this approach makes the instruction count in the loop
grow to 13 and 24 for bswap32 and bswap64, respectively. Thus, we refine
the code gen for bswap16 only, and leave both bswap32 and bswap64
as is.
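
The byte permutation that a byte swap corresponds to can be recognized with a simple index check. The sketch below assumes a plain array of selector indices rather than GCC's permutation-index class, and `is_bswap_selector` is a hypothetical helper; it only mirrors the shape of the series check that shuffle_bswap_pattern performs in the diff that follows (diff = perm[0], step = diff + 1).

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if sel[0..n) reverses every group of `step` consecutive
   indices, e.g. {1,0,3,2,...} for step 2.  The group width is derived
   from the first selector element: diff = sel[0], step = diff + 1.  */
bool is_bswap_selector (const unsigned *sel, size_t n)
{
  if (n == 0 || sel[0] == 0)
    return false;

  size_t diff = sel[0];
  size_t step = diff + 1;
  if (n % step != 0)
    return false;

  for (size_t i = 0; i < n; i++)
    {
      size_t group = i / step * step;   /* first index of this group */
      size_t offset = i % step;         /* position inside the group */
      if (sel[i] != group + (diff - offset))
        return false;
    }
  return true;
}
```

For byte elements, a selector of {1, 0, 3, 2, ...} gives step 2, which is the 16-bit case the patch accepts; wider groups match the same check but are left to the existing vrgather path.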

gcc/ChangeLog:

* config/riscv/riscv-v.cc (shuffle_bswap_pattern): New func impl
for shuffle bswap.
(expand_vec_perm_const_1): Add handling for shuffle bswap pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/perm-4.c: Adjust checker.
* gcc.target/riscv/rvv/autovec/unop/bswap16-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/bswap16-0.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv-v.cc   | 91 +++
.../riscv/rvv/autovec/unop/bswap16-0.c| 17 
.../riscv/rvv/autovec/unop/bswap16-run-0.c| 44 +
.../riscv/rvv/autovec/vls/bswap16-0.c | 34 +++
.../gcc.target/riscv/rvv/autovec/vls/perm-4.c |  4 +-
5 files changed, 188 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/bswap16-run-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/bswap16-0.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 23633a2a74d..c72e411f125 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3030,6 +3030,95 @@ shuffle_decompress_patterns (struct expand_vec_perm_d *d)
   return true;
}
+static bool
+shuffle_bswap_pattern (struct expand_vec_perm_d *d)
+{
+  HOST_WIDE_INT diff;
+  unsigned i, size, step;
+
+  if (!d->one_vector_p || !d->perm[0].is_constant (&diff) || !diff)
+return false;
+
+  step = diff + 1;
+  size = step * GET_MODE_UNIT_BITSIZE (d->vmode);
+
+  switch (size)
+{
+case 16:
+  break;
+case 32:
+case 64:
+  /* We will have VEC_PERM_EXPR after rtl expand when invoking
+ __builtin_bswap. It will generate about 9 instructions in
+ loop as below, no matter it is bswap16, bswap32 or bswap64.
+.L2:
+ 1 vle16.v v4,0(a0)
+ 2 vmv.v.x v2,a7
+ 3 vand.vv v2,v6,v2
+ 4 slli a2,a5,1
+ 5 vrgatherei16.vv v1,v4,v2
+ 6 sub a4,a4,a5
+ 7 vse16.v v1,0(a3)
+ 8 add a0,a0,a2
+ 9 add a3,a3,a2
+bne a4,zero,.L2
+
+ But for bswap16 we may have a even simple code gen, which
+ has only 7 instructions in loop as below.
+.L5
+ 1 vle8.v  v2,0(a5)
+ 2 addi a5,a5,32
+ 3 vsrl.vi v4,v2,8
+ 4 vsll.vi v2,v2,8
+ 5 vor.vv  v4,v4,v2
+ 6 vse8.v  v4,0(a4)
+ 7 addi a4,a4,32
+bne a5,a6,.L5
+
+ Unfortunately, the instructions in loop will grow to 13 and 24
+ for bswap32 and bswap64. Thus, we will leverage vrgather (9 insn)
+ for both the bswap64 and bswap32, but take shift and or (7 insn)
+ for bswap16.
+   */
+default:
+  return false;
+}
+
+  for (i = 0; i < step; i++)
+if (!d->perm.series_p (i, step, diff - i, step))
+  return false;
+
+  if (d->testing_p)
+return true;
+
+  machine_mode vhi_mode;
+  poly_uint64 vhi_nunits = exact_div (GET_MODE_NUNITS (d->vmode), 2);
+
+  if (!get_vector_mode (HImode, vhi_nunits).exists (&vhi_mode))
+return false;
+
+  /* Step-1: Move op0 to src with VHI mode.  */
+  rtx src = gen_reg_rtx (vhi_mode);
+  emit_move_insn (src, gen_lowpart (vhi_mode, d

RE: [PATCH V2] RISC-V: Support movmisalign of RVV VLA modes

2023-10-09 Thread Li, Pan2

Committed, thanks Robin.

Pan

-----Original Message-----
From: Robin Dapp
Sent: Monday, October 9, 2023 9:54 PM
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; kito.ch...@gmail.com; kito.ch...@sifive.com; jeffreya...@gmail.com
Subject: Re: [PATCH V2] RISC-V: Support movmisalign of RVV VLA modes

Thanks, for now this LGTM.

Regards
 Robin

RE: [PATCH] RISC-V Regression test: Adapt SLP tests like ARM SVE

2023-10-09 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law
Sent: Monday, October 9, 2023 9:49 PM
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de
Subject: Re: [PATCH] RISC-V Regression test: Adapt SLP tests like ARM SVE



On 10/9/23 07:37, Juzhe-Zhong wrote:
> Like ARM SVE, RVV is vectorizing these 2 cases in the same way.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/slp-23.c: Add RVV like ARM SVE.
>   * gcc.dg/vect/slp-perm-10.c: Ditto.
OK
jeff

RE: [PATCH] RISC-V Regression test: Fix FAIL of slp-reduc-4.c for RVV

2023-10-09 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law
Sent: Monday, October 9, 2023 9:52 PM
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de
Subject: Re: [PATCH] RISC-V Regression test: Fix FAIL of slp-reduc-4.c for RVV



On 10/9/23 07:41, Juzhe-Zhong wrote:
> RVV vectorizes this case with stride8 load_lanes.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/slp-reduc-4.c: Adapt test for stride8 load_lanes.
OK.  Similar question as my last ack.  Do we want a follow-up here which 
tests the .vect dump for the ! { vect_load_lanes && vect_strided8 } case?

jeff

RE: [PATCH] RISC-V Regression test: Fix FAIL of slp-12a.c

2023-10-09 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law
Sent: Monday, October 9, 2023 9:53 PM
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de
Subject: Re: [PATCH] RISC-V Regression test: Fix FAIL of slp-12a.c



On 10/9/23 07:35, Juzhe-Zhong wrote:
> This case is vectorized by stride8 load_lanes.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/slp-12a.c: Adapt for stride 8 load_lanes.
OK.  Same question as last two ACKs.

jeff

RE: [PATCH] RISC-V Regression tests: Fix FAIL of pr97832* for RVV

2023-10-09 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law
Sent: Monday, October 9, 2023 9:53 PM
To: Juzhe-Zhong; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de
Subject: Re: [PATCH] RISC-V Regression tests: Fix FAIL of pr97832* for RVV



On 10/9/23 07:15, Juzhe-Zhong wrote:
> These cases are vectorized by vec_load_lanes with strided = 8 instead of SLP
> with -fno-vect-cost-model.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/pr97832-2.c: Adapt dump check for targets that support
>   load_lanes with stride = 8.
>   * gcc.dg/vect/pr97832-3.c: Ditto.
>   * gcc.dg/vect/pr97832-4.c: Ditto.
OK.  Same question as last 3 acks.

jeff

RE: [PATCH] RISC-V Regression test: Fix slp-perm-4.c FAIL for RVV

2023-10-09 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law
Sent: Monday, October 9, 2023 10:28 PM
To: juzhe.zhong
Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de
Subject: Re: [PATCH] RISC-V Regression test: Fix slp-perm-4.c FAIL for RVV



On 10/9/23 08:21, juzhe.zhong wrote:
> Do you mean adding a check for whether it is vectorized or not?
Yes.

> 
> Sounds reasonable, I can add that in another patch.
Sounds good.  Thanks.

jeff

RE: [PATCH] RISC-V: Add available vector size for RVV

2023-10-09 Thread Li, Pan2

Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Tuesday, October 10, 2023 11:20 AM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; jeffreya...@gmail.com; 
rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Add available vector size for RVV

LGTM

On Mon, Oct 9, 2023 at 4:23 PM Juzhe-Zhong  wrote:
>
> For RVV, VLS modes are enabled according to TARGET_MIN_VLEN,
> from M1 to M8.
>
> For example, when TARGET_MIN_VLEN = 128 bits, we enable
> 128/256/512/1024 bits VLS modes.
>
> This patch fixes the following FAILs:
> FAIL: gcc.dg/vect/bb-slp-subgroups-2.c -flto -ffat-lto-objects  
> scan-tree-dump-times slp2 "optimized: basic block" 2
> FAIL: gcc.dg/vect/bb-slp-subgroups-2.c scan-tree-dump-times slp2 "optimized: 
> basic block" 2
>
> gcc/testsuite/ChangeLog:
>
> * lib/target-supports.exp: Add 256/512/1024
>
> ---
>  gcc/testsuite/lib/target-supports.exp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index af52c38433d..dc366d35a0a 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -8881,7 +8881,7 @@ proc available_vector_sizes { } {
> lappend result 4096 2048 1024 512 256 128 64 32 16 8 4 2
>  } elseif { [istarget riscv*-*-*] } {
> if { [check_effective_target_riscv_v] } {
> -   lappend result 0 32 64 128
> +   lappend result 0 32 64 128 256 512 1024
> }
> lappend result 128
>  } else {
> --
> 2.36.3
>
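The sizing rule in the patch description (VLS modes from M1 to M8, scaled from TARGET_MIN_VLEN) can be sketched in C as below. This is only an illustration, assuming each LMUL step doubles the vector size; the helper name vls_sizes_from_min_vlen is hypothetical and is not part of GCC or its testsuite.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a GCC API): list the VLS vector sizes, in bits,
   for LMUL = 1, 2, 4 and 8, given the minimum VLEN in bits.  */
size_t
vls_sizes_from_min_vlen (unsigned min_vlen, unsigned sizes[4])
{
  assert (min_vlen > 0);
  size_t n = 0;
  for (unsigned lmul = 1; lmul <= 8; lmul *= 2)
    sizes[n++] = min_vlen * lmul;   /* M1, M2, M4, M8 */
  return n;
}
```

With min_vlen = 128 the helper fills in 128, 256, 512 and 1024, which are the sizes the patched available_vector_sizes list carries for riscv_v.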

RE: [PATCH] RISC-V Regression: Fix FAIL of pr65947-8.c for RVV

2023-10-10 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Tuesday, October 10, 2023 9:24 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de
Subject: Re: [PATCH] RISC-V Regression: Fix FAIL of pr65947-8.c for RVV



On 10/10/23 06:55, Juzhe-Zhong wrote:
> This test exercises the fold_extract_last pattern, so it is more reasonable to use
> vect_fold_extract_last instead of listing specific targets.
> 
> This is the vect_fold_extract_last property:
> proc check_effective_target_vect_fold_extract_last { } {
>  return [expr { [check_effective_target_aarch64_sve]
>  || [istarget amdgcn*-*-*]
>  || [check_effective_target_riscv_v] }]
> }
> 
> It includes ARM SVE, GCN and RVV.
> 
> It matches exactly what we want, and it is more reasonable and easier to maintain.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/pr65947-8.c: Use vect_fold_extract_last.
OK
jeff
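
For readers unfamiliar with the pattern named above, the scalar meaning of a fold-extract-last operation can be sketched as follows: return the last element whose mask entry is set, or a fallback value when none is set. This is a plain C model for illustration only; the function name is hypothetical and this is not the code in pr65947-8.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of fold-extract-last: the last VALS[i] with
   MASK[i] nonzero, or FALLBACK if no mask entry is set.  */
int
fold_extract_last_model (int fallback, const unsigned char *mask,
                         const int *vals, size_t n)
{
  assert (n == 0 || (mask != NULL && vals != NULL));
  int result = fallback;
  for (size_t i = 0; i < n; i++)
    if (mask[i])
      result = vals[i];   /* a later active element overrides an earlier one */
  return result;
}
```

A target that provides this operation natively can handle such conditional "last value" reductions in one step, which is why a single effective-target keyword is a better fit than a list of target triplets.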

RE: [PATCH] RISC-V Regression: Make match patterns more accurate

2023-10-10 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Tuesday, October 10, 2023 9:47 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de; rdapp@gmail.com
Subject: Re: [PATCH] RISC-V Regression: Make match patterns more accurate



On 10/9/23 20:47, Juzhe-Zhong wrote:
> This patch fixes the following 2 FAILs in the RVV regression, since the check is not 
> accurate.
> 
> It's inspired by Robin's previous patch:
> https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac...@gmail.com/
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/no-scevccp-outer-7.c: Adjust regex pattern.
>   * gcc.dg/vect/no-scevccp-vect-iv-3.c: Ditto.
OK.   We might see other ports flipping to a pass if they were 
exhibiting the same behavior with failing to vectorize with the first 
selected type, but passing on the second type.

Jeff

RE: [PATCH] RISC-V Regression: Fix FAIL of predcom-2.c

2023-10-10 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Tuesday, October 10, 2023 9:49 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de
Subject: Re: [PATCH] RISC-V Regression: Fix FAIL of predcom-2.c



On 10/9/23 20:58, Juzhe-Zhong wrote:
> Like GCN, add -fno-tree-vectorize.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/predcom-2.c: Add riscv.
OK.
jeff

RE: [PATCH] RISC-V Regression: Fix FAIL of vect-multitypes-16.c for RVV

2023-10-10 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Tuesday, October 10, 2023 11:29 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de
Subject: Re: [PATCH] RISC-V Regression: Fix FAIL of vect-multitypes-16.c for RVV



On 10/10/23 08:49, Juzhe-Zhong wrote:
> As Richard suggested: 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632288.html
> 
> Add vect_ext_char_longlong to fix FAIL for RVV.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-multitypes-16.c: Adapt check for RVV.
>   * lib/target-supports.exp: Add vect_ext_char_longlong property.
OK
jeff

RE: [PATCH] RISC-V Regression: Make pattern match more accurate of vect-live-2.c

2023-10-10 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Tuesday, October 10, 2023 11:26 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de
Subject: Re: [PATCH] RISC-V Regression: Make pattern match more accurate of 
vect-live-2.c



On 10/10/23 08:57, Juzhe-Zhong wrote:
> Like previous patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632400.html
> https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac...@gmail.com/
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-live-2.c: Make pattern match more accurate.
OK
jeff

RE: [PATCH V3] RISC-V: Fix incorrect index(offset) of gather/scatter

2023-10-11 Thread Li, Pan2

Committed, thanks Robin.

Pan

-----Original Message-----
From: Robin Dapp  
Sent: Wednesday, October 11, 2023 5:56 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; kito.ch...@gmail.com; kito.ch...@sifive.com; 
jeffreya...@gmail.com
Subject: Re: [PATCH V3] RISC-V: Fix incorrect index(offset) of gather/scatter

LGTM, thanks.

Regards
 Robin

RE: [PATCH v1] RISC-V: Support FP irintf auto vectorization

2023-10-11 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Thursday, October 12, 2023 10:02 AM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Support FP irintf auto vectorization

LGTM. Thanks.


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-12 09:52
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>;
pan2.li <pan2...@intel.com>;
yanzhang.wang <yanzhang.w...@intel.com>;
kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Support FP irintf auto vectorization
From: Pan Li <pan2...@intel.com>

This patch adds auto-vectorization support for the FP irintf.

* int irintf (float)

Because the vectorizer only allows data types of the same size,
the standard name lrintmn2 only acts on SF => SI.

Given we have code like:

void
test_irintf (int *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_irintf (in[i]);
}

Before this patch:
.L3:
  ...
  flw  fa5,0(a1)
  fcvt.w.s a5,fa5,dyn
  sw   a5,-4(a0)
  ...
  bne  a1,a4,.L3

After this patch:
.L3:
  ...
  vle32.v v1,0(a1)
  vfcvt.x.f.v v1,v1
  vse32.v v1,0(a0)
  ...
  bne a2,zero,.L3

The remaining cases, such as DF => SI and HF => SI, will be covered by the hook
TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.
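
As a rough scalar reference for the SF => SI case described above, the loop can be modelled in portable C without calling into libm. This is a sketch under stated assumptions, not the code from the patch: it assumes the default dynamic rounding mode (round to nearest, ties to even), it only accepts inputs with |x| < 2^23, and the names irintf_ref and irintf_loop_ref are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The vectorizer needs source and destination elements of equal width,
   which holds for float => int on the usual ABIs.  */
_Static_assert (sizeof (float) == sizeof (int),
                "SF => SI needs equal element widths");

/* Hypothetical model of irintf under round-to-nearest-even,
   valid for |x| < 2^23 only.  */
static int
irintf_ref (float x)
{
  assert (x > -8388608.0f && x < 8388608.0f);
  int t = (int) x;                    /* truncates toward zero */
  float frac = x - (float) t;         /* exact in this range */
  float mag = frac < 0.0f ? -frac : frac;
  int step = x < 0.0f ? -1 : 1;
  if (mag > 0.5f)
    return t + step;
  if (mag < 0.5f)
    return t;
  return (t % 2 == 0) ? t : t + step; /* tie: pick the even neighbour */
}

/* Same shape as test_irintf in the commit message.  */
void
irintf_loop_ref (int *out, const float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = irintf_ref (in[i]);
}
```

A model like this can serve as an oracle when comparing the scalar and vectorized outputs for small inputs; it says nothing about out-of-range values, infinities or NaN.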

gcc/ChangeLog:

* config/riscv/autovec.md (lrint2): Rename from.
(lrint2): Rename to.
* config/riscv/vector-iterators.md: Rename and remove TARGET_64BIT.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-irint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-irint-0.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/autovec.md   |  9 ++-
gcc/config/riscv/vector-iterators.md  | 74 +--
.../riscv/rvv/autovec/unop/math-irint-0.c | 14 
.../riscv/rvv/autovec/unop/math-irint-run-0.c | 63 
.../riscv/rvv/autovec/vls/math-irint-0.c  | 30 
5 files changed, 149 insertions(+), 41 deletions(-)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-0.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-irint-0.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index dc76a01d82c..c3a51e22ceb 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2240,6 +2240,7 @@ (define_expand "avg3_ceil"
;; - trunc/truncf
;; - roundeven/roundevenf
;; - lrint/lrintf
+;; - irintf
;; -
(define_expand "ceil2"
   [(match_operand:V_VLSF 0 "register_operand")
@@ -2311,12 +2312,12 @@ (define_expand "roundeven2"
   }
)
-(define_expand "lrint2"
-  [(match_operand: 0 "register_operand")
-   (match_operand:V_VLS_FCONVERTL 1 "register_operand")]
+(define_expand "lrint2"
+  [(match_operand:0 "register_operand")
+   (match_operand:V_VLS_FCONVERT_I_L_LL 1 "register_operand")]
   "TARGET_VECTOR && !flag_trapping_math && !flag_rounding_math"
   {
-riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
+riscv_vector::expand_vec_lrint (operands[0], operands[1], mode, 
mode);
 DONE;
   }
)
diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index bb0c46ea30a..96ddd34c958 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3281,8 +3281,8 @@ (define_mode_attr vnnconvert [
   (V512DI "v512hf")
])
-;; L indicates convert to long
-(define_mode_attr VLCONVERT [
+;; Convert to int, long and long long
+(define_mode_attr V_I_L_LL_CONVERT [
   (RVVM8SF "RVVM8SI") (RVVM4SF "RVVM4SI") (RVVM2SF "RVVM2SI")
   (RVVM1SF "RVVM1SI") (RVVMF2SF "RVVMF2SI")
@@ -3298,7 +3298,7 @@ (define_mode_attr VLCONVERT [
   (V512DF "V512DI")
])
-(define_mode_attr vlconvert [
+(define_mode_attr v_i_l_ll_convert [
   (RVVM8SF "rvvm8si") (RVVM4SF "rvvm4si") (RVVM2SF "rvvm2si")
   (RVVM1SF "rvvm1si") (RVVMF2SF "rvvmf2si")
@@ -3314,40 +3314,40 @@ (define_mode_attr vlconvert [
   (V512DF "v512di")
])
-(define_mode_iterator V_VLS_FCONVERTL [
-  (RVVM8SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVM4SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")
-  (RVVM2SF "TARGET_VECTOR_ELEN_FP_32 && !TARGET_64BIT")

RE: [PATCH v1] RISC-V: Support FP llrint auto vectorization

2023-10-11 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Thursday, October 12, 2023 11:34 AM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Support FP llrint auto vectorization

LGTM


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-12 11:28
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>;
pan2.li <pan2...@intel.com>;
yanzhang.wang <yanzhang.w...@intel.com>;
kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Support FP llrint auto vectorization
From: Pan Li <pan2...@intel.com>

This patch covers auto-vectorization of the FP llrint.

* long long llrint (double)

From the standard name's perspective this is the DF => DI conversion,
which has been covered by previous patch(es). Thus, this patch only adds
some test cases.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/test-math.h: Add type int64_t.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llrint-0.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
.../riscv/rvv/autovec/unop/math-llrint-0.c| 14 +
.../rvv/autovec/unop/math-llrint-run-0.c  | 63 +++
.../riscv/rvv/autovec/unop/test-math.h|  2 +
.../riscv/rvv/autovec/vls/math-llrint-0.c | 30 +
4 files changed, 109 insertions(+)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llrint-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
new file mode 100644
index 000..2d90d232ba1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llrint:
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+*/
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llrint)
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
new file mode 100644
index 000..6b69f5568e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
@@ -0,0 +1,63 @@
+/* { dg-do run { target { riscv_v && rv64 } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model 
-ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+int64_t out[ARRAY_SIZE];
+int64_t ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llrint)
+TEST_ASSERT (int64_t)
+
+TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llrint (1.2), 1)
+TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llrint (-1.2), 2)
+TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llrint (0.5), 3)
+TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llrint (-0.5), 4)
+TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llrint (0.1), 5)
+TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llrint (-0.1), 6)
+TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llrint (3.0), 7)
+TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llrint (-3.0), 8)
+TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llrint 
(4503599627370495.5), 9)
+TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llrint 
(4503599627370497.0), 10)
+TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llrint 
(-4503599627370495.5), 11)
+TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llrint 
(-4503599627370496.0), 12)
+TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llrint (-0.0), 13)
+TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llrint (-0.0), 14)
+TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llrint 
(9223372036854774784.0), 15)
+TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, __builtin_llrint 
(9223372036854775808.0), 16)
+TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llrint 
(-9223372036854775808.0), 17)
+TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, __builtin_llrint 
(-9223372036854777856.0), 18)
+TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llrint 
(__builtin_

RE: [PATCH v1] RISC-V: Support FP llrint auto vectorization

2023-10-11 Thread Li, Pan2

Sorry for the confusion here.

When implementing llrint after lrint, I realized that llrint (DF => DI) is 
already supported by lrint from the previous patch(es),
because they share the same standard name as well as the mode iterator.

Thus, I had 2 options for naming the patch.

1. Only mention the test cases for llrint.
2. Name it as support, similar to lrint.

After some consideration of situations like searching the git logs, I 
chose option 2 and added some description
as well.

Finally, are there any best practices for this case? Thanks again for the comments.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Thursday, October 12, 2023 1:05 PM
To: Li, Pan2 
Cc: juzhe.zh...@rivai.ai; gcc-patches ; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Support FP llrint auto vectorization

Did I miss something? the title says support but it seems only testcase??


RE: [PATCH v1] RISC-V: Support FP llrint auto vectorization

2023-10-12 Thread Li, Pan2

Sure thing,  thanks a lot and will follow the guidance.

Pan

From: Kito Cheng 
Sent: Thursday, October 12, 2023 10:42 PM
To: Li, Pan2 
Cc: 钟居哲 ; gcc-patches ; Wang, 
Yanzhang 
Subject: Re: [PATCH v1] RISC-V: Support FP llrint auto vectorization

I would prefer the first approach, since the patch makes no changes other than 
adding test cases; otherwise it might confuse other people.



RE: [PATCH v1] RISC-V: Leverage stdint-gcc.h for RVV test cases

2023-10-12 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Friday, October 13, 2023 10:26 AM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Leverage stdint-gcc.h for RVV test cases

LGTM.


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-13 10:22
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>;
pan2.li <pan2...@intel.com>;
yanzhang.wang <yanzhang.w...@intel.com>;
kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Leverage stdint-gcc.h for RVV test cases
From: Pan Li <pan2...@intel.com>

Leverage stdint-gcc.h for the int64_t type instead of a local typedef;
otherwise the typedef may conflict with stdint-gcc.h elsewhere.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: Include
stdint-gcc.h for int types.
* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/test-math.h: Remove int64_t
typedef.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c | 1 +
.../gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c   | 1 +
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h | 2 --
3 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
index 2d90d232ba1..4bf125f8cc8 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c
@@ -2,6 +2,7 @@
/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
/* { dg-final { check-function-bodies "**" "" } } */
+#include <stdint-gcc.h>
#include "test-math.h"
/*
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
index 6b69f5568e9..409175a8dff 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c
@@ -1,6 +1,7 @@
/* { dg-do run { target { riscv_v && rv64 } } } */
/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model 
-ffast-math" } */
+#include <stdint-gcc.h>
#include "test-math.h"
#define ARRAY_SIZE 128
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h
index 3867bc50a14..a1c9d55bd48 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/test-math.h
@@ -68,8 +68,6 @@
#define FRM_RMM 4
#define FRM_DYN 7
-typedef long long int64_t;
-
static inline void
set_rm (unsigned rm)
{
--
2.34.1
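
The conflict mentioned in the commit message arises because int64_t need not be long long: on LP64 targets, such as rv64 with -mabi=lp64d, the header typically defines it as long, so a second "typedef long long int64_t;" would redeclare the name with a different type. A small C11 sketch that reports which type the header picked is shown below; the function name is hypothetical, and the standard <stdint.h> is used here in place of stdint-gcc.h so that the snippet builds with any hosted compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: name the type that int64_t is a typedef for.  */
const char *
int64_underlying_type (void)
{
  assert (sizeof (int64_t) == 8);
  return _Generic ((int64_t) 0,
                   long: "long",
                   long long: "long long",
                   default: "other");
}
```

If this returns "long", a local typedef to long long in a shared test header would clash as soon as any test also pulled in the system definition, which is the situation the patch avoids.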

RE: [PATCH v1] RISC-V: Add test for FP iroundf auto vectorization

2023-10-12 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zhong 
Sent: Friday, October 13, 2023 1:39 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; Li, Pan2 ; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Add test for FP iroundf auto vectorization

lgtm
---- Replied Message ----
From: pan2...@intel.com
Date: 10/13/2023 13:33
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai, pan2...@intel.com, yanzhang.w...@intel.com, kito.ch...@gmail.com
Subject: [PATCH v1] RISC-V: Add test for FP iroundf auto vectorization

RE: [PATCH v1] RISC-V: Add test for FP llround auto vectorization

2023-10-12 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Friday, October 13, 2023 2:19 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Add test for FP llround auto vectorization

OK



juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-13 14:15
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>;
pan2.li <pan2...@intel.com>;
yanzhang.wang <yanzhang.w...@intel.com>;
kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Add test for FP llround auto vectorization
From: Pan Li <pan2...@intel.com>

The FP API below is already supported, since it shares the same standard
name as well as the machine mode.

long long llround (double);

This patch would like to add the test cases for ensuring the correctness.
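
For context on the fsrmi 4 that the compile test in this patch scans for: llround rounds halfway cases away from zero, which corresponds to the RISC-V rounding mode RMM (encoded as 4), whereas llrint follows the current rounding mode. A minimal scalar model of the llround behaviour is sketched below; it is an illustration only, valid for |x| < 2^52, and the name llround_model is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of llround for |x| < 2^52: nearest integer,
   with ties rounded away from zero.  */
int64_t
llround_model (double x)
{
  assert (x > -4503599627370496.0 && x < 4503599627370496.0);
  int64_t t = (int64_t) x;              /* truncates toward zero */
  double frac = x - (double) t;         /* exact in this range */
  double mag = frac < 0.0 ? -frac : frac;
  int64_t step = x < 0.0 ? -1 : 1;
  return mag >= 0.5 ? t + step : t;     /* a tie moves away from zero */
}
```

Under this model 0.5 maps to 1 and -0.5 maps to -1, while round-to-nearest-even would give 0 for both, so the rounding mode has to be switched around the vector convert and restored afterwards.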

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llround-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llround-0.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
.../riscv/rvv/autovec/unop/math-llround-0.c   | 20 ++
.../rvv/autovec/unop/math-llround-run-0.c | 64 +++
.../riscv/rvv/autovec/vls/math-llround-0.c| 30 +
3 files changed, 114 insertions(+)
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c
create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llround-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c
new file mode 100644
index 000..4f8b4553a91
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-0.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <stdint-gcc.h>
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llround:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+4
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llround)
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c
new file mode 100644
index 000..c5b60847cc7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llround-run-0.c
@@ -0,0 +1,64 @@
+/* { dg-do run { target { riscv_v && rv64 } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model 
-ffast-math" } */
+
+#include <stdint-gcc.h>
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+int64_t out[ARRAY_SIZE];
+int64_t ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llround)
+TEST_ASSERT (int64_t)
+
+TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llround (1.2), 1)
+TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llround (-1.2), 2)
+TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llround (0.5), 3)
+TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llround (-0.5), 4)
+TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llround (0.1), 5)
+TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llround (-0.1), 6)
+TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llround (3.0), 7)
+TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llround (-3.0), 8)
+TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llround (4503599627370495.5), 9)
+TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llround (4503599627370497.0), 10)
+TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llround (-4503599627370495.5), 11)
+TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llround (-4503599627370496.0), 12)
+TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llround (-0.0), 13)
+TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llround (-0.0), 14)
+TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llround (9223372036854774784.0), 15)
+TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, 0x7fff, 16)
+TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llround (-9223372036854775808.0), 17)
+TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, 0x8000, 18)
+TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llround (__builtin_inf ()), 19)
+TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_l

RE: [PATCH v1] RISC-V: Add test for FP llceil auto vectorization

2023-10-13 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Friday, October 13, 2023 3:33 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Add test for FP llceil auto vectorization

OK


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-13 15:20
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>; pan2.li <pan2...@intel.com>;
yanzhang.wang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Add test for FP llceil auto vectorization
From: Pan Li <pan2...@intel.com>

The FP API below is already supported because it shares the same standard
name and machine mode.

long long llceil (double);

This patch adds test cases to ensure correctness.
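
For readers who want the scalar behaviour the run test compares against, here
is a minimal sketch in plain C. It is not part of the patch: ref_llceil and
check_ref_llceil are hypothetical names, and the sketch assumes the input is
finite and within the range of long long.

```c
#include <assert.h>

/* Hypothetical scalar reference for llceil: the smallest integer >= x.
   Assumes x is finite and fits in long long.  */
long long
ref_llceil (double x)
{
  long long t = (long long) x;  /* C conversion truncates towards zero.  */
  if ((double) t < x)           /* Truncation rounded a positive value down.  */
    t++;
  return t;
}

/* Checks a few of the small inputs used by the run test.  */
void
check_ref_llceil (void)
{
  assert (ref_llceil (1.2) == 2);
  assert (ref_llceil (-1.2) == -1);
  assert (ref_llceil (0.5) == 1);
  assert (ref_llceil (-0.5) == 0);
  assert (ref_llceil (3.0) == 3);
}
```

Calling check_ref_llceil () from a main function exercises the sketch. The
vectorized code is expected to produce the same values for these inputs.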

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llceil-0.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
.../riscv/rvv/autovec/unop/math-llceil-0.c| 20 ++
.../rvv/autovec/unop/math-llceil-run-0.c  | 64 +++
.../riscv/rvv/autovec/vls/math-llceil-0.c | 30 +
3 files changed, 114 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llceil-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c
new file mode 100644
index 000..3480c3ea91d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-0.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include 
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llceil:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llceil)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c
new file mode 100644
index 000..5ccbe64ffb5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llceil-run-0.c
@@ -0,0 +1,64 @@
+/* { dg-do run { target { riscv_v && rv64 } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include 
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+int64_t out[ARRAY_SIZE];
+int64_t ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llceil)
+TEST_ASSERT (int64_t)
+
+TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llceil (1.2), 1)
+TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llceil (-1.2), 2)
+TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llceil (0.5), 3)
+TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llceil (-0.5), 4)
+TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llceil (0.1), 5)
+TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llceil (-0.1), 6)
+TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llceil (3.0), 7)
+TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llceil (-3.0), 8)
+TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llceil (4503599627370495.5), 9)
+TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llceil (4503599627370497.0), 10)
+TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llceil (-4503599627370495.5), 11)
+TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llceil (-4503599627370496.0), 12)
+TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llceil (-0.0), 13)
+TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llceil (-0.0), 14)
+TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llceil (9223372036854774784.0), 15)
+TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, 0x7fff, 16)
+TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llceil (-9223372036854775808.0), 17)
+TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, 0x8000, 18)
+TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llceil (__builtin_inf ()), 19)
+TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_llceil (-__builtin_inf ()), 20)
+TEST_INI

RE: [PATCH v1] RISC-V: Add test for FP iceil auto vectorization

2023-10-13 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Friday, October 13, 2023 4:08 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Add test for FP iceil auto vectorization

Ok


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-13 16:06
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>; pan2.li <pan2...@intel.com>;
yanzhang.wang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Add test for FP iceil auto vectorization
From: Pan Li <pan2...@intel.com>

The FP API below is already supported because it shares the same standard
name and machine mode.

int iceil (float);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iceil-0.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
.../riscv/rvv/autovec/unop/math-iceil-0.c | 19 ++
.../riscv/rvv/autovec/unop/math-iceil-run-0.c | 63 +++
.../riscv/rvv/autovec/vls/math-iceil-0.c  | 30 +
3 files changed, 112 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-iceil-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c
new file mode 100644
index 000..2d4a1d163d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-0.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_int___builtin_iceilf:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (float, int, __builtin_iceilf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c
new file mode 100644
index 000..714173a7f8b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-iceil-run-0.c
@@ -0,0 +1,63 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+int out[ARRAY_SIZE];
+int ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (float, int, __builtin_iceilf)
+TEST_ASSERT (int)
+
+TEST_INIT_CVT (float, 1.2, int, __builtin_iceilf (1.2), 1)
+TEST_INIT_CVT (float, -1.2, int, __builtin_iceilf (-1.2), 2)
+TEST_INIT_CVT (float, 0.5, int, __builtin_iceilf (0.5), 3)
+TEST_INIT_CVT (float, -0.5, int, __builtin_iceilf (-0.5), 4)
+TEST_INIT_CVT (float, 0.1, int, __builtin_iceilf (0.1), 5)
+TEST_INIT_CVT (float, -0.1, int, __builtin_iceilf (-0.1), 6)
+TEST_INIT_CVT (float, 3.0, int, __builtin_iceilf (3.0), 7)
+TEST_INIT_CVT (float, -3.0, int, __builtin_iceilf (-3.0), 8)
+TEST_INIT_CVT (float, 8388607.5, int, __builtin_iceilf (8388607.5), 9)
+TEST_INIT_CVT (float, 8388609.0, int, __builtin_iceilf (8388609.0), 10)
+TEST_INIT_CVT (float, -8388607.5, int, __builtin_iceilf (-8388607.5), 11)
+TEST_INIT_CVT (float, -8388609.0, int, __builtin_iceilf (-8388609.0), 12)
+TEST_INIT_CVT (float, 0.0, int, __builtin_iceilf (-0.0), 13)
+TEST_INIT_CVT (float, -0.0, int, __builtin_iceilf (-0.0), 14)
+TEST_INIT_CVT (float, 2147483520.0, int, __builtin_iceilf (2147483520.0), 15)
+TEST_INIT_CVT (float, 2147483648.0, int, 0x7fff, 16)
+TEST_INIT_CVT (float, -2147483648.0, int, __builtin_iceilf (-2147483648.0), 17)
+TEST_INIT_CVT (float, -2147483904.0, int, 0x8000, 18)
+TEST_INIT_CVT (float, __builtin_inf (), int, __builtin_iceilf (__builtin_inff ()), 19)
+TEST_INIT_CVT (float, -__builtin_inf (), int, __builtin_iceilf (-__builtin_inff ()), 20)
+TEST_INIT_CVT (float, __builtin_nanf (""), int, 0x7fff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (float, int, 1, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 2, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 3, __builtin_iceilf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT

RE: [PATCH v1] RISC-V: Add test for FP ifloor auto vectorization

2023-10-13 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Friday, October 13, 2023 4:42 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Add test for FP ifloor auto vectorization

OK


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-13 16:23
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>; pan2.li <pan2...@intel.com>;
yanzhang.wang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Add test for FP ifloor auto vectorization
From: Pan Li <pan2...@intel.com>

The FP API below is already supported because it shares the same standard
name and machine mode.

int ifloor (float);

This patch adds test cases to ensure correctness.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-ifloor-0.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
.../riscv/rvv/autovec/unop/math-ifloor-0.c| 19 ++
.../rvv/autovec/unop/math-ifloor-run-0.c  | 63 +++
.../riscv/rvv/autovec/vls/math-ifloor-0.c | 30 +
3 files changed, 112 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-ifloor-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c
new file mode 100644
index 000..b9ec415d690
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-0.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_int___builtin_ifloorf:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+2
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (float, int, __builtin_ifloorf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c
new file mode 100644
index 000..8ef4da0ea88
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ifloor-run-0.c
@@ -0,0 +1,63 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+int out[ARRAY_SIZE];
+int ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (float, int, __builtin_ifloorf)
+TEST_ASSERT (int)
+
+TEST_INIT_CVT (float, 1.2, int, __builtin_ifloorf (1.2), 1)
+TEST_INIT_CVT (float, -1.2, int, __builtin_ifloorf (-1.2), 2)
+TEST_INIT_CVT (float, 0.5, int, __builtin_ifloorf (0.5), 3)
+TEST_INIT_CVT (float, -0.5, int, __builtin_ifloorf (-0.5), 4)
+TEST_INIT_CVT (float, 0.1, int, __builtin_ifloorf (0.1), 5)
+TEST_INIT_CVT (float, -0.1, int, __builtin_ifloorf (-0.1), 6)
+TEST_INIT_CVT (float, 3.0, int, __builtin_ifloorf (3.0), 7)
+TEST_INIT_CVT (float, -3.0, int, __builtin_ifloorf (-3.0), 8)
+TEST_INIT_CVT (float, 8388607.5, int, __builtin_ifloorf (8388607.5), 9)
+TEST_INIT_CVT (float, 8388609.0, int, __builtin_ifloorf (8388609.0), 10)
+TEST_INIT_CVT (float, -8388607.5, int, __builtin_ifloorf (-8388607.5), 11)
+TEST_INIT_CVT (float, -8388609.0, int, __builtin_ifloorf (-8388609.0), 12)
+TEST_INIT_CVT (float, 0.0, int, __builtin_ifloorf (-0.0), 13)
+TEST_INIT_CVT (float, -0.0, int, __builtin_ifloorf (-0.0), 14)
+TEST_INIT_CVT (float, 2147483520.0, int, __builtin_ifloorf (2147483520.0), 15)
+TEST_INIT_CVT (float, 2147483648.0, int, 0x7fff, 16)
+TEST_INIT_CVT (float, -2147483648.0, int, __builtin_ifloorf (-2147483648.0), 17)
+TEST_INIT_CVT (float, -2147483904.0, int, 0x8000, 18)
+TEST_INIT_CVT (float, __builtin_inf (), int, __builtin_ifloorf (__builtin_inff ()), 19)
+TEST_INIT_CVT (float, -__builtin_inf (), int, __builtin_ifloorf (-__builtin_inff ()), 20)
+TEST_INIT_CVT (float, __builtin_nanf (""), int, 0x7fff, 21)
+
+int
+main ()
+{
+  RUN_TEST_CVT (float, int, 1, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 2, __builtin_ifloorf, in, out, ref, ARRAY_SIZE);
+  RUN_TEST_CVT (float, int, 3, __builtin_ifloorf, in, out,

RE: [PATCH v1] RISC-V: Add test for FP llfloor auto vectorization

2023-10-13 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Friday, October 13, 2023 6:31 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Add test for FP llfloor auto vectorization

OK


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-13 17:49
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>; pan2.li <pan2...@intel.com>;
yanzhang.wang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Add test for FP llfloor auto vectorization
From: Pan Li <pan2...@intel.com>

The FP API below is already supported because it shares the same standard
name and machine mode.

long long llfloor (double);

This patch adds test cases to ensure correctness.
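
The compile tests in this series match a different fsrmi immediate per
builtin: 4 for llround, 3 for llceil and iceil, 2 for ifloor and llfloor.
These are the RISC-V frm rounding-mode encodings. The sketch below only
illustrates that mapping. The enum values are the architectural encodings,
while toy_frm and expected_fsrmi_imm are hypothetical names, not GCC
identifiers.

```c
#include <assert.h>
#include <string.h>

/* RISC-V frm field encodings for the static rounding modes.  */
enum toy_frm
{
  TOY_FRM_RNE = 0, /* Round to nearest, ties to even.  */
  TOY_FRM_RTZ = 1, /* Round towards zero.  */
  TOY_FRM_RDN = 2, /* Round down, towards -inf (floor).  */
  TOY_FRM_RUP = 3, /* Round up, towards +inf (ceil).  */
  TOY_FRM_RMM = 4  /* Round to nearest, ties to max magnitude (round).  */
};

/* Hypothetical helper: the fsrmi immediate the compile tests expect for a
   given API name, or -1 for names this sketch does not cover.  */
int
expected_fsrmi_imm (const char *name)
{
  if (strcmp (name, "ifloor") == 0 || strcmp (name, "llfloor") == 0)
    return TOY_FRM_RDN;
  if (strcmp (name, "iceil") == 0 || strcmp (name, "llceil") == 0)
    return TOY_FRM_RUP;
  if (strcmp (name, "llround") == 0)
    return TOY_FRM_RMM;
  return -1;
}

/* Cross-checks the mapping against the immediates in the test bodies.  */
void
check_expected_fsrmi_imm (void)
{
  assert (expected_fsrmi_imm ("llfloor") == 2);
  assert (expected_fsrmi_imm ("llceil") == 3);
  assert (expected_fsrmi_imm ("llround") == 4);
}
```

Each test body reads the current mode with frrm, sets the static mode with
fsrmi, performs the vfcvt.x.f.v, and restores the saved mode with fsrm.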

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llfloor-0.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
.../riscv/rvv/autovec/unop/math-llfloor-0.c   | 20 ++
.../rvv/autovec/unop/math-llfloor-run-0.c | 64 +++
.../riscv/rvv/autovec/vls/math-llfloor-0.c| 30 +
3 files changed, 114 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/math-llfloor-0.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c
new file mode 100644
index 000..4b10f966015
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-0.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include 
+#include "test-math.h"
+
+/*
+** test_double_int64_t___builtin_llfloor:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+2
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ret
+*/
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llfloor)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c
new file mode 100644
index 000..22829132e96
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llfloor-run-0.c
@@ -0,0 +1,64 @@
+/* { dg-do run { target { riscv_v && rv64 } } } */
+/* { dg-additional-options "-std=c99 -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math" } */
+
+#include 
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+int64_t out[ARRAY_SIZE];
+int64_t ref[ARRAY_SIZE];
+
+TEST_UNARY_CALL_CVT (double, int64_t, __builtin_llfloor)
+TEST_ASSERT (int64_t)
+
+TEST_INIT_CVT (double, 1.2, int64_t, __builtin_llfloor (1.2), 1)
+TEST_INIT_CVT (double, -1.2, int64_t, __builtin_llfloor (-1.2), 2)
+TEST_INIT_CVT (double, 0.5, int64_t, __builtin_llfloor (0.5), 3)
+TEST_INIT_CVT (double, -0.5, int64_t, __builtin_llfloor (-0.5), 4)
+TEST_INIT_CVT (double, 0.1, int64_t, __builtin_llfloor (0.1), 5)
+TEST_INIT_CVT (double, -0.1, int64_t, __builtin_llfloor (-0.1), 6)
+TEST_INIT_CVT (double, 3.0, int64_t, __builtin_llfloor (3.0), 7)
+TEST_INIT_CVT (double, -3.0, int64_t, __builtin_llfloor (-3.0), 8)
+TEST_INIT_CVT (double, 4503599627370495.5, int64_t, __builtin_llfloor (4503599627370495.5), 9)
+TEST_INIT_CVT (double, 4503599627370497.0, int64_t, __builtin_llfloor (4503599627370497.0), 10)
+TEST_INIT_CVT (double, -4503599627370495.5, int64_t, __builtin_llfloor (-4503599627370495.5), 11)
+TEST_INIT_CVT (double, -4503599627370496.0, int64_t, __builtin_llfloor (-4503599627370496.0), 12)
+TEST_INIT_CVT (double, 0.0, int64_t, __builtin_llfloor (-0.0), 13)
+TEST_INIT_CVT (double, -0.0, int64_t, __builtin_llfloor (-0.0), 14)
+TEST_INIT_CVT (double, 9223372036854774784.0, int64_t, __builtin_llfloor (9223372036854774784.0), 15)
+TEST_INIT_CVT (double, 9223372036854775808.0, int64_t, 0x7fff, 16)
+TEST_INIT_CVT (double, -9223372036854775808.0, int64_t, __builtin_llfloor (-9223372036854775808.0), 17)
+TEST_INIT_CVT (double, -9223372036854777856.0, int64_t, 0x8000, 18)
+TEST_INIT_CVT (double, __builtin_inf (), int64_t, __builtin_llfloor (__builtin_inf ()), 19)
+TEST_INIT_CVT (double, -__builtin_inf (), int64_t, __builtin_l

RE: [PATCH] RISC-V Regression: Fix FAIL of bb-slp-68.c for RVV

2023-10-13 Thread Li, Pan2

Committed, thanks Richard.

Pan

-Original Message-
From: Richard Biener  
Sent: Friday, October 13, 2023 8:00 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V Regression: Fix FAIL of bb-slp-68.c for RVV

On Fri, 13 Oct 2023, Juzhe-Zhong wrote:

> As the comment says, this test fails with 64-byte vectors.
> Both RVV and GCN have 64-byte vectors.
> 
> So it is more reasonable to use vect512.

OK

> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/bb-slp-68.c: Use vect512.
> 
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-68.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-68.c b/gcc/testsuite/gcc.dg/vect/bb-slp-68.c
> index e7573a14933..2dd3d8ee90c 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-68.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-68.c
> @@ -20,4 +20,4 @@ void foo ()
>  
>  /* We want to have the store group split into 4, 2, 4 when using 32byte vectors.
>     Unfortunately it does not work when 64-byte vectors are available.  */
> -/* { dg-final { scan-tree-dump-not "from scalars" "slp2" { xfail amdgcn-*-* } } } */
> +/* { dg-final { scan-tree-dump-not "from scalars" "slp2" { xfail vect512 } } } */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH v1] RISC-V: Remove the type size restriction of vectorizer

2023-10-18 Thread Li, Pan2

Thanks Richard. Let's wait for a while in case there are comments from others,
as I am not familiar with these parts.

Pan

-Original Message-
From: Richard Biener  
Sent: Wednesday, October 18, 2023 2:34 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; Liu, Hongtao 

Subject: Re: [PATCH v1] RISC-V: Remove the type size restriction of vectorizer

On Wed, Oct 18, 2023 at 3:20 AM  wrote:
>
> From: Pan Li 
>
> vectorizable_call has a restriction on the size of the data type:
> DF to DI is allowed but SF to DI isn't. You may see the message below
> when trying to vectorize a function call like lrintf.
>
> void
> test_lrintf (long *out, float *in, unsigned count)
> {
>   for (unsigned i = 0; i < count; i++)
>     out[i] = __builtin_lrintf (in[i]);
> }
>
> lrintf.c:5:26: missed: couldn't vectorize loop
> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>
> As a result, a standard name pattern like lrintmn2 cannot work when the
> data type sizes differ, such as SF => DI. This patch removes this data
> type size check and unblocks standard names like lrintmn2.
>
> Passed the x86 bootstrap and regression test already.

OK.

On x86 we seem to have lrintsfdi2 but not lrintv4sfv4di2. With SLP
vectorization we could expect to see the following vectorized after
the patch (with loop vectorization you'll see us pre-select same-sized
vector types):

long int x[4];
float y[4];

void foo ()
{
  x[0] = __builtin_lrintf (y[0]);
  x[1] = __builtin_lrintf (y[1]);
  x[2] = __builtin_lrintf (y[2]);
  x[3] = __builtin_lrintf (y[3]);
}
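
To make the removed restriction concrete, here is a toy model of the
comparison the patch deletes, TYPE_SIZE (vectype_in) != TYPE_SIZE
(vectype_out). It is only a sketch, not GCC code: toy_vectype and every other
toy_ name are hypothetical, and a vector type is reduced to a lane count and
an element width.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a vector type: lane count and element width in bits.  */
struct toy_vectype
{
  unsigned nunits;
  unsigned elem_bits;
};

/* Example types, named after the corresponding machine modes.  */
const struct toy_vectype TOY_V4SF = { 4, 32 };
const struct toy_vectype TOY_V4DI = { 4, 64 };
const struct toy_vectype TOY_V2DF = { 2, 64 };
const struct toy_vectype TOY_V2DI = { 2, 64 };

/* Total vector size in bits.  */
unsigned
toy_vector_bits (struct toy_vectype t)
{
  return t.nunits * t.elem_bits;
}

/* Models the deleted check: reject a call whose input and output vector
   types differ in total size.  */
bool
toy_old_check_rejects (struct toy_vectype in, struct toy_vectype out)
{
  return toy_vector_bits (in) != toy_vector_bits (out);
}

/* SF => DI with equal lane counts was rejected; DF => DI was accepted.  */
void
check_toy_old_check (void)
{
  assert (toy_old_check_rejects (TOY_V4SF, TOY_V4DI));
  assert (!toy_old_check_rejects (TOY_V2DF, TOY_V2DI));
}
```

In this model the four-lane lrintf case above maps V4SF (128 bits) to V4DI
(256 bits), which the old check rejected.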


> gcc/ChangeLog:
>
> * tree-vect-stmts.cc (vectorizable_call): Remove data size
> check.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/tree-vect-stmts.cc | 13 -
>  1 file changed, 13 deletions(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index b3a56498595..326e000a71d 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -3529,19 +3529,6 @@ vectorizable_call (vec_info *vinfo,
>
>return false;
>  }
> -  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
> - just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
> - are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
> - by a pack of the two vectors into an SI vector.  We would need
> - separate code to handle direct VnDI->VnSI IFN_CTZs.  */
> -  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
> -{
> -  if (dump_enabled_p ())
> -   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> -"mismatched vector sizes %T and %T\n",
> -vectype_in, vectype_out);
> -  return false;
> -}
>
>if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
>!= VECTOR_BOOLEAN_TYPE_P (vectype_in))
> --
> 2.34.1
>

RE: [PATCH v1] RISC-V: Remove the type size restriction of vectorizer

2023-10-20 Thread Li, Pan2

Hi Richard Biener,

The CI of linaro-toolch...@lists.linaro.org reports some aarch64 regressions
from this change; I will double-check them soon.

FAIL: 12 regressions

regressions.sum:
=== gcc tests ===

Running gcc:gcc.target/aarch64/sve/aarch64-sve.exp ...
FAIL: gcc.target/aarch64/sve/clrsb_1.c (internal compiler error: in expand_fn_using_insn, at internal-fn.cc:284)
FAIL: gcc.target/aarch64/sve/clrsb_1.c (test for excess errors)
FAIL: gcc.target/aarch64/sve/clrsb_1.c scan-assembler-times \\tcls\\tz[0-9]+\\.d, p[0-7]/m, z[0-9]+\\.d\\n 2
FAIL: gcc.target/aarch64/sve/clrsb_1.c scan-assembler-times \\tuzp1\\tz[0-9]+\\.s, z[0-9]+\\.s, z[0-9]+\\.s\\n 1
FAIL: gcc.target/aarch64/sve/clz_1.c (internal compiler error: in expand_fn_using_insn, at internal-fn.cc:284)
FAIL: gcc.target/aarch64/sve/clz_1.c (test for excess errors)
FAIL: gcc.target/aarch64/sve/clz_1.c scan-assembler-times \\tclz\\tz[0-9]+\\.d, p[0-7]/m, z[0-9]+\\.d\\n 2
... and 7 more entries

Pan


RE: [PATCH v2] RISC-V: Support partial VLS mode when preference fixed-vlmax [PR111857]

2023-10-20 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

RE: [PATCH v1] RISC-V: Bugfix for merging undefined tmp register in math

2023-10-22 Thread Li, Pan2

Yes, it is required by the second cvt. The unmasked elements keep the original 
values.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Monday, October 23, 2023 9:35 AM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Bugfix for merging undefined tmp register in 
math

UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,

Are they still necessary ?

juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2023-10-23 09:26
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>; pan2.li <pan2...@intel.com>;
yanzhang.wang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Bugfix for merging undefined tmp register in math
From: Pan Li <pan2...@intel.com>

For math function autovec, there is one step like

rtx tmp = gen_reg_rtx (vec_int_mode);
emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);

The MU (mask-undisturbed) policy leaves the masked-off elements of tmp (the
destination register) unchanged, but tmp is undefined at this point. This
patch changes MU to MA (mask-agnostic) for tmp.
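
The difference between the two mask policies can be sketched with a small
scalar model. This is a simulation, not the GCC expander: every toy_ name is
hypothetical, the conversion simply truncates, and MA is modelled as writing
all 1s, which is only one of the outcomes the RVV specification permits for
mask-agnostic elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_mask_policy
{
  TOY_MU, /* Mask undisturbed: inactive elements keep the old dest value.  */
  TOY_MA  /* Mask agnostic: inactive elements may be overwritten with 1s.  */
};

/* Toy model of a masked float-to-int vector conversion into DEST.  */
void
toy_masked_cvt (int32_t *dest, const float *src, const unsigned char *mask,
                size_t n, enum toy_mask_policy policy)
{
  for (size_t i = 0; i < n; i++)
    {
      if (mask[i])
        dest[i] = (int32_t) src[i]; /* Active element: converted.  */
      else if (policy == TOY_MA)
        dest[i] = -1;               /* Inactive, agnostic: all 1s here.  */
      /* Inactive under TOY_MU: dest[i] keeps whatever it held before.  */
    }
}

/* With MU the result depends on the previous content of dest, which is the
   problem when dest is a freshly created, undefined temporary.  */
void
check_toy_masked_cvt (void)
{
  const float src[3] = { 1.0f, 2.0f, 3.0f };
  const unsigned char mask[3] = { 1, 0, 1 };
  int32_t mu_dest[3] = { 111, 222, 333 }; /* Stands in for stale contents.  */
  int32_t ma_dest[3] = { 111, 222, 333 };

  toy_masked_cvt (mu_dest, src, mask, 3, TOY_MU);
  toy_masked_cvt (ma_dest, src, mask, 3, TOY_MA);

  assert (mu_dest[1] == 222); /* Stale value survives under MU.  */
  assert (ma_dest[1] == -1);  /* No dependency on the old value under MA.  */
  assert (mu_dest[0] == 1 && ma_dest[2] == 3);
}
```

Under MU the instruction must read the old destination value to merge it,
which is why an undefined tmp is a problem; under MA no such dependency on
the old contents exists.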

gcc/ChangeLog:

* config/riscv/riscv-protos.h (enum insn_type): Add new type
values.
* config/riscv/riscv-v.cc (emit_vec_cvt_x_f): Add undef merge
operand handling.
(expand_vec_ceil): Take MA instead of MU for tmp register.
(expand_vec_floor): Ditto.
(expand_vec_nearbyint): Ditto.
(expand_vec_rint): Ditto.
(expand_vec_round): Ditto.
(expand_vec_roundeven): Ditto.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv-protos.h |  5 +
gcc/config/riscv/riscv-v.cc | 24 
2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index f7a9a02f1f9..5dc97c2adc0 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -306,6 +306,11 @@ enum insn_type : unsigned int
   UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P,
   UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
   UNARY_OP_FRM_RDN = UNARY_OP | FRM_RDN_P,
+  UNARY_OP_TAMA_FRM_DYN = UNARY_OP_TAMA | FRM_DYN_P,
+  UNARY_OP_TAMA_FRM_RUP = UNARY_OP_TAMA | FRM_RUP_P,
+  UNARY_OP_TAMA_FRM_RDN = UNARY_OP_TAMA | FRM_RDN_P,
+  UNARY_OP_TAMA_FRM_RMM = UNARY_OP_TAMA | FRM_RMM_P,
+  UNARY_OP_TAMA_FRM_RNE = UNARY_OP_TAMA | FRM_RNE_P,
   UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 383af55fe3a..91ad6a61fa8 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4108,10 +4108,18 @@ static void
emit_vec_cvt_x_f (rtx op_dest, rtx op_src, rtx mask,
  insn_type type, machine_mode vec_mode)
{
-  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
   insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, vec_mode);
-  emit_vlmax_insn (icode, type, cvt_x_ops);
+  if (type & USE_VUNDEF_MERGE_P)
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
+  else
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
}
static void
@@ -4157,7 +4165,7 @@ expand_vec_ceil (rtx op_0, rtx op_1, machine_mode 
vec_fp_mode,
   /* Step-3: Convert to integer on mask, with rounding up (aka ceil).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RUP, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_RUP, vec_fp_mode);
   /* Step-4: Convert to floating-point on mask for the final result.
  To avoid unnecessary frm register access, we use RUP here and it will
@@ -4182,7 +4190,7 @@ expand_vec_floor (rtx op_0, rtx op_1, machine_mode 
vec_fp_mode,
   /* Step-3: Convert to integer on mask, with rounding down (aka floor).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RDN, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_RDN, vec_fp_mode);
   /* Step-4: Convert to floating-point on mask for the floor result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_RDN, vec_fp_mode);
@@ -4208,7 +4216,7 @@ expand_vec_nearbyint (rtx op_0, rtx op_1, machine_mode 
vec_fp_mode,
   /* Step-4: Convert to integer on mask, with rounding down (aka nearbyint).  
*/
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_DYN, vec_fp_mode);
   /* Step-5: Convert to floating-point on mask for the nearbyint result.  */
   emit_vec_cvt_f_x (op_0,

RE: RE: [PATCH v1] RISC-V: Bugfix for merging undefined tmp register in math

2023-10-22 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Monday, October 23, 2023 9:44 AM
To: Li, Pan2 ; gcc-patches 
Cc: Wang, Yanzhang ; kito.cheng 
Subject: Re: RE: [PATCH v1] RISC-V: Bugfix for merging undefined tmp register 
in math

OK. LGTM.


juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>

From: Li, Pan2<mailto:pan2...@intel.com>
Date: 2023-10-23 09:42
To: juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>; 
gcc-patches<mailto:gcc-patches@gcc.gnu.org>
CC: Wang, Yanzhang<mailto:yanzhang.w...@intel.com>; 
kito.cheng<mailto:kito.ch...@gmail.com>
Subject: RE: [PATCH v1] RISC-V: Bugfix for merging undefined tmp register in 
math
Yes, it is required by the second cvt. The unmasked elements keep the original 
values.
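As a rough illustration of why the second convert still wants MU, here is a
scalar C++ model of the ceil expansion. It is only a sketch: the name
ceil_model, the 2^52 threshold and the copysign step are assumptions made for
this example and are not taken from the patch. The point it shows is that the
result of the first convert only matters on active lanes, while the final
result must keep the original input on inactive lanes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar model (hypothetical, not GCC code) of a masked two-step ceil.
std::vector<double> ceil_model(const std::vector<double>& x) {
  // Assumed threshold: doubles with magnitude >= 2^52 are already integral.
  const double limit = 4503599627370496.0;
  // The destination starts out holding the input values.
  std::vector<double> out = x;
  for (std::size_t i = 0; i < x.size(); ++i) {
    // The comparison is false for NaN, infinities and large values.
    const bool active = std::fabs(x[i]) < limit;
    if (!active)
      continue;  // second convert under MU: the lane keeps its original value
    // First convert (float to int, rounding up). Only active lanes of tmp
    // are ever read, so an agnostic mask policy is enough for it.
    const long long tmp = static_cast<long long>(std::ceil(x[i]));
    // Second convert (int to float); copysign keeps the sign of -0.0.
    out[i] = std::copysign(static_cast<double>(tmp), x[i]);
  }
  return out;
}
```

Calling ceil_model on {1.2, -0.5, 1e300} rounds the first two lanes and leaves
the third untouched, which is the "keep the original values" behaviour
described above.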

Pan

From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
Sent: Monday, October 23, 2023 9:35 AM
To: Li, Pan2 <pan2...@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Li, Pan2 <pan2...@intel.com>; Wang, Yanzhang <yanzhang.w...@intel.com>;
kito.cheng <kito.ch...@gmail.com>
Subject: Re: [PATCH v1] RISC-V: Bugfix for merging undefined tmp register in 
math

UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,

Are they still necessary ?

juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>

From: pan2.li<mailto:pan2...@intel.com>
Date: 2023-10-23 09:26
To: gcc-patches<mailto:gcc-patches@gcc.gnu.org>
CC: juzhe.zhong<mailto:juzhe.zh...@rivai.ai>; 
pan2.li<mailto:pan2...@intel.com>; 
yanzhang.wang<mailto:yanzhang.w...@intel.com>; 
kito.cheng<mailto:kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Bugfix for merging undefined tmp register in math
From: Pan Li <pan2...@intel.com>

For math function autovec, there will be one step like

rtx tmp = gen_reg_rtx (vec_int_mode);
emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);

With MU, the masked-off elements of tmp (aka the dest register) are left
unchanged, but tmp is a freshly allocated register, so those elements are
undefined here. This patch adjusts the MU to MA.
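A minimal scalar sketch of the difference between the two mask policies,
assuming the usual RVV rule that under MA an inactive element may either keep
its old value or be overwritten with all 1s. The names MaskPolicy and
masked_cvt_rup are hypothetical and exist only for this example; the model
picks the all-1s outcome for MA.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names: this is a scalar model, not GCC or RVV intrinsic code.
enum class MaskPolicy { Undisturbed, Agnostic };

// Models a masked float-to-integer convert that rounds up.
// old_dest stands for the previous destination contents. It is only read
// under the mask-undisturbed policy, so it may be null under MA.
std::vector<long long> masked_cvt_rup(const std::vector<double>& src,
                                      const std::vector<bool>& mask,
                                      const std::vector<long long>* old_dest,
                                      MaskPolicy policy) {
  assert(src.size() == mask.size());
  assert(policy == MaskPolicy::Agnostic ||
         (old_dest != nullptr && old_dest->size() == src.size()));
  std::vector<long long> dest(src.size());
  for (std::size_t i = 0; i < src.size(); ++i) {
    if (mask[i])
      dest[i] = static_cast<long long>(std::ceil(src[i]));
    else if (policy == MaskPolicy::Undisturbed)
      dest[i] = (*old_dest)[i];  // MU: reads the old value, which must be defined
    else
      dest[i] = -1;  // MA: all 1s is one of the permitted outcomes
  }
  return dest;
}
```

Under MU the caller has to supply defined old contents, which a fresh tmp does
not have. Under MA no old contents are needed at all.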

gcc/ChangeLog:

* config/riscv/riscv-protos.h (enum insn_type): Add new type
values.
* config/riscv/riscv-v.cc (emit_vec_cvt_x_f): Add undef merge
operand handling.
(expand_vec_ceil): Take MA instead of MU for tmp register.
(expand_vec_floor): Ditto.
(expand_vec_nearbyint): Ditto.
(expand_vec_rint): Ditto.
(expand_vec_round): Ditto.
(expand_vec_roundeven): Ditto.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv-protos.h |  5 +
gcc/config/riscv/riscv-v.cc | 24 
2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index f7a9a02f1f9..5dc97c2adc0 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -306,6 +306,11 @@ enum insn_type : unsigned int
   UNARY_OP_FRM_RMM = UNARY_OP | FRM_RMM_P,
   UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
   UNARY_OP_FRM_RDN = UNARY_OP | FRM_RDN_P,
+  UNARY_OP_TAMA_FRM_DYN = UNARY_OP_TAMA | FRM_DYN_P,
+  UNARY_OP_TAMA_FRM_RUP = UNARY_OP_TAMA | FRM_RUP_P,
+  UNARY_OP_TAMA_FRM_RDN = UNARY_OP_TAMA | FRM_RDN_P,
+  UNARY_OP_TAMA_FRM_RMM = UNARY_OP_TAMA | FRM_RMM_P,
+  UNARY_OP_TAMA_FRM_RNE = UNARY_OP_TAMA | FRM_RNE_P,
   UNARY_OP_TAMU_FRM_DYN = UNARY_OP_TAMU | FRM_DYN_P,
   UNARY_OP_TAMU_FRM_RUP = UNARY_OP_TAMU | FRM_RUP_P,
   UNARY_OP_TAMU_FRM_RDN = UNARY_OP_TAMU | FRM_RDN_P,
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 383af55fe3a..91ad6a61fa8 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4108,10 +4108,18 @@ static void
emit_vec_cvt_x_f (rtx op_dest, rtx op_src, rtx mask,
  insn_type type, machine_mode vec_mode)
{
-  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
   insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, vec_mode);
-  emit_vlmax_insn (icode, type, cvt_x_ops);
+  if (type & USE_VUNDEF_MERGE_P)
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
+  else
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
}
static void
@@ -4157,7 +4165,7 @@ expand_vec_ceil (rtx op_0, rtx op_1, machine_mode 
vec_fp_mode,
   /* Step-3: Convert to integer on mask, with rounding up (aka ceil).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMU_FRM_RUP, vec_fp_mode);
+  emit_vec_cvt_x_f (tmp, op_1, mask, UNARY_OP_TAMA_FRM_RUP, vec_fp_mode);
   /* Step-4: Convert to floating-point on mask for the final result.
  To avoid unnecessary frm register access, we use RUP here and it will
@@ -41

RE: [PATCH] RISC-V: Fix AVL_TYPE attribute of tuple mode mov

2023-10-22 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Monday, October 23, 2023 10:24 AM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; kito.ch...@sifive.com; rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Fix AVL_TYPE attribute of tuple mode mov



On 10/22/23 16:46, Juzhe-Zhong wrote:
> The tuple mode mov pattern doesn't have avl_type so it is invalid 
> attribute.
> 
> gcc/ChangeLog:
> 
>   * config/riscv/vector.md: Fix avl_type attribute of tuple mov.
Presumably you got a fault or something similar trying to compute the 
avl_type attr when trying to access operands[7]? from this code:

> (eq_attr "type" 
> "vlde,vldff,vste,vimov,vimov,vimov,vfmov,vext,vimerge,\
>   
> vfsqrt,vfrecp,vfmerge,vfcvtitof,vfcvtftoi,vfwcvtitof,\
>   
> vfwcvtftoi,vfwcvtftof,vfncvtitof,vfncvtftoi,vfncvtftof,\
>   vfclass,vired,viwred,vfredu,vfredo,vfwredu,vfwredo,\
>   vimovxv,vfmovfv,vlsegde,vlsegdff")
>(symbol_ref "INTVAL (operands[7])")
>  (eq_attr "type" "vldm,vstm,vimov,vmalu,vmalu")


OK for the trunk.

Jeff

RE: [PATCH v1] RISC-V: Bugfix for merging undef tmp register for trunc

2023-10-23 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Monday, October 23, 2023 3:56 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Bugfix for merging undef tmp register for trunc

LGTM.


juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>

From: pan2.li<mailto:pan2...@intel.com>
Date: 2023-10-23 15:53
To: gcc-patches<mailto:gcc-patches@gcc.gnu.org>
CC: juzhe.zhong<mailto:juzhe.zh...@rivai.ai>; 
pan2.li<mailto:pan2...@intel.com>; 
yanzhang.wang<mailto:yanzhang.w...@intel.com>; 
kito.cheng<mailto:kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Bugfix for merging undef tmp register for trunc
From: Pan Li <pan2...@intel.com>

For trunc function autovec, there is one step, shown below, that takes MU
for the merge operand.

rtx tmp = gen_reg_rtx (vec_int_mode);
emit_vec_cvt_x_f_rtz (tmp, op_1, mask, vec_fp_mode);

With MU, the masked-off elements of tmp (aka the dest register) are left
unchanged, but tmp is a freshly allocated register, so those elements are
undefined here. This patch adjusts the MU to MA.
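The operand selection this implies can be sketched as follows. This is a
hedged model only: the flag constants below use made-up bit values (the real
flags live in riscv-protos.h), build_cvt_operands is a hypothetical helper,
and the sketch assumes that MA-style insn types carry the undefined-merge flag
while MU-style types do not.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical bit values for illustration only.
constexpr unsigned kUseVundefMerge = 1u << 0;
constexpr unsigned kTamuLikeType = 0u;               // models an MU-style type
constexpr unsigned kTamaLikeType = kUseVundefMerge;  // models an MA-style type

// Returns the operand names that would be handed to the insn emitter.
std::vector<std::string> build_cvt_operands(unsigned type) {
  if (type & kUseVundefMerge)
    // No merge operand is passed: the old destination is treated as undefined.
    return {"op_dest", "mask", "op_src"};
  // The old destination is passed again as the merge operand.
  return {"op_dest", "mask", "op_dest", "op_src"};
}
```

The only difference between the two paths is whether the old destination is
passed as a merge operand, which is why a fresh tmp should take the first one.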

gcc/ChangeLog:

* config/riscv/riscv-v.cc (emit_vec_cvt_x_f_rtz): Add insn type
arg.
(expand_vec_trunc): Take MA instead of MU for cvt_x_f_rtz.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv-v.cc | 16 
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 91ad6a61fa8..fb6a4e561db 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4144,12 +4144,20 @@ emit_vec_cvt_f_x (rtx op_dest, rtx op_src, rtx mask,
static void
emit_vec_cvt_x_f_rtz (rtx op_dest, rtx op_src, rtx mask,
-   machine_mode vec_mode)
+   insn_type type, machine_mode vec_mode)
{
-  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
   insn_code icode = code_for_pred (FIX, vec_mode);
-  emit_vlmax_insn (icode, UNARY_OP_TAMU, cvt_x_ops);
+  if (type & USE_VUNDEF_MERGE_P)
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
+  else
+{
+  rtx cvt_x_ops[] = {op_dest, mask, op_dest, op_src};
+  emit_vlmax_insn (icode, type, cvt_x_ops);
+}
}
void
@@ -4285,7 +4293,7 @@ expand_vec_trunc (rtx op_0, rtx op_1, machine_mode 
vec_fp_mode,
   /* Step-3: Convert to integer on mask, rounding to zero (aka truncate).  */
   rtx tmp = gen_reg_rtx (vec_int_mode);
-  emit_vec_cvt_x_f_rtz (tmp, op_1, mask, vec_fp_mode);
+  emit_vec_cvt_x_f_rtz (tmp, op_1, mask, UNARY_OP_TAMA, vec_fp_mode);
   /* Step-4: Convert to floating-point on mask for the rint result.  */
   emit_vec_cvt_f_x (op_0, tmp, mask, UNARY_OP_TAMU_FRM_DYN, vec_fp_mode);
--
2.34.1

RE: [PATCH v1] RISC-V: Remove unnecessary asm check for vec cvt

2023-10-23 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Monday, October 23, 2023 5:57 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Remove unnecessary asm check for vec cvt

LGTM.


juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>

From: pan2.li<mailto:pan2...@intel.com>
Date: 2023-10-23 17:54
To: gcc-patches<mailto:gcc-patches@gcc.gnu.org>
CC: juzhe.zhong<mailto:juzhe.zh...@rivai.ai>; 
pan2.li<mailto:pan2...@intel.com>; 
yanzhang.wang<mailto:yanzhang.w...@intel.com>; 
kito.cheng<mailto:kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Remove unnecessary asm check for vec cvt
From: Pan Li <pan2...@intel.com>

The vsetvl asm check is unnecessary for the vector convert. We
should focus on the constraint and leave the vsetvl test to the
vsetvl pass.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/cvt-0.c: Remove the vsetvl
asm check from func body.
* gcc.target/riscv/rvv/autovec/unop/cvt-1.c: Ditto.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c | 3 +--
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
index 762b1408994..7d66ed3e943 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-0.c
@@ -7,9 +7,8 @@
/*
** test_int65_to_fp16:
**   ...
-**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*mf2,\s*ta,\s*ma
**   vfncvt\.f\.x\.w\s+v[0-9]+,\s*v[0-9]+
-**   vsetvli\s+zero,\s*zero,\s*e16,\s*mf4,\s*ta,\s*ma
+**   ...
**   vfncvt\.f\.f\.w\s+v[0-9]+,\s*v[0-9]+
**   ...
*/
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
index 3180ba3612c..af08c51ef8b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/cvt-1.c
@@ -7,9 +7,8 @@
/*
** test_uint65_to_fp16:
**   ...
-**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*mf2,\s*ta,\s*ma
**   vfncvt\.f\.xu\.w\s+v[0-9]+,\s*v[0-9]+
-**   vsetvli\s+zero,\s*zero,\s*e16,\s*mf4,\s*ta,\s*ma
+**   ...
**   vfncvt\.f\.f\.w\s+v[0-9]+,\s*v[0-9]+
**   ...
*/
--
2.34.1

RE: [PATCH V2] RISC-V: Fix ICE for the fusion case from vsetvl to scalar move[PR111927]

2023-10-23 Thread Li, Pan2

Committed, thanks Kito.

Pan

From: Kito Cheng 
Sent: Monday, October 23, 2023 5:50 PM
To: Juzhe-Zhong 
Cc: GCC Patches ; Kito Cheng ; 
Jeff Law ; Robin Dapp 
Subject: Re: [PATCH V2] RISC-V: Fix ICE for the fusion case from vsetvl to 
scalar move[PR111927]

LGTM

On Mon, Oct 23, 2023 at 17:41, Juzhe-Zhong <juzhe.zh...@rivai.ai> wrote:
ICE:

during RTL pass: vsetvl
: In function 'riscv_lms_f32':
:240:1: internal compiler error: in merge, at 
config/riscv/riscv-vsetvl.cc:1997
  240 | }

In general compatible_p (avl_equal_p) has:

if (next.has_vl () && next.vl_used_by_non_rvv_insn_p ())
  return false;

Don't fuse AVL of vsetvl if the VL operand is used by non-RVV instructions.

It is reasonable to add it into 'can_use_next_avl_p' since we don't want to
fuse AVL of vsetvl into a scalar move instruction which doesn't demand AVL.
And after the fusion, we will always use compatible_p to check whether the
demand is correct or not.
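The early-exit order described above can be sketched with a small model. The
struct VsetvlInfoModel, the enum Decision and the function name are
hypothetical stand-ins for the real vsetvl_info queries, and the sketch only
covers the two checks visible in this patch. Anything later in the real
function is reported as NeedsFurtherChecks rather than guessed.

```cpp
#include <cassert>

// Hypothetical stand-in for the queries made on the real vsetvl_info.
struct VsetvlInfoModel {
  bool has_vl;
  bool vl_used_by_non_rvv_insn;
  bool has_nonvlmax_reg_avl;
};

enum class Decision { Forbid, Allow, NeedsFurtherChecks };

// Models only the first two checks on NEXT.
Decision can_use_next_avl_model(const VsetvlInfoModel& next) {
  // New check: VL feeds a non-RVV instruction, so do not fuse.
  if (next.has_vl && next.vl_used_by_non_rvv_insn)
    return Decision::Forbid;
  // Existing check: nothing register-based to propagate, fusion is fine.
  if (!next.has_nonvlmax_reg_avl && !next.has_vl)
    return Decision::Allow;
  // The remaining checks are not modeled here.
  return Decision::NeedsFurtherChecks;
}
```

The new check has to come first so that a VL consumed by, for example, a
branch is rejected before any of the permissive paths can accept it.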

PR target/111927

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc: Fix bug.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr111927.c: New test.

---
 gcc/config/riscv/riscv-vsetvl.cc  |  23 +++
 .../gcc.target/riscv/rvv/vsetvl/pr111927.c| 170 ++
 2 files changed, 193 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111927.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 47b459fddd4..f3922a051c5 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1541,6 +1541,29 @@ private:
   inline bool can_use_next_avl_p (const vsetvl_info &prev,
  const vsetvl_info &next)
   {
+/* Forbid the AVL/VL propagation if VL of NEXT is used
+   by non-RVV instructions.  This is because:
+
+bb 2:
+  PREV: scalar move (no AVL)
+bb 3:
+  NEXT: vsetvl a5(VL), a4(AVL) ...
+  branch a5,zero
+
+   Since user vsetvl instruction is no side effect instruction
+   which should be placed in the correct and optimal location
+   of the program by the previous PASS, it is unreasonable that
+   VSETVL PASS tries to move it to another places if it used by
+   non-RVV instructions.
+
+   Note: We only forbid the cases that VL is used by the following
+   non-RVV instructions which will cause issues.  We don't forbid
+   other cases since it won't cause correctness issues and we still
+   more demand info are fused backward.  The later LCM algorithm
+   should know the optimal location of the vsetvl.  */
+if (next.has_vl () && next.vl_used_by_non_rvv_insn_p ())
+  return false;
+
 if (!next.has_nonvlmax_reg_avl () && !next.has_vl ())
   return true;

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111927.c 
b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111927.c
new file mode 100644
index 000..ab599add57f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111927.c
@@ -0,0 +1,170 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+#define RISCV_MATH_LOOPUNROLL
+#define RISCV_MATH_VECTOR
+typedef  float float32_t;
+
+  typedef struct
+  {
+  uint16_t numTaps;/**< number of coefficients in the filter. */
+  float32_t *pState;   /**< points to the state variable array. The 
array is of length numTaps+blockSize-1. */
+  float32_t *pCoeffs;  /**< points to the coefficient array. The array 
is of length numTaps. */
+  float32_t mu;/**< step size that controls filter coefficient 
updates. */
+  } riscv_lms_instance_f32;
+
+
+void riscv_lms_f32(
+  const riscv_lms_instance_f32 * S,
+  const float32_t * pSrc,
+float32_t * pRef,
+float32_t * pOut,
+float32_t * pErr,
+uint32_t blockSize)
+{
+float32_t *pState = S->pState; /* State pointer */
+float32_t *pCoeffs = S->pCoeffs;   /* Coefficient pointer 
*/
+float32_t *pStateCurnt;/* Points to the 
current sample of the state */
+float32_t *px, *pb;/* Temporary pointers 
for state and coefficient buffers */
+float32_t mu = S->mu;  /* Adaptive factor */
+float32_t acc, e;  /* Accumulator, error */
+float32_t w;   /* Weight factor */
+uint32_t numTaps = S->numTaps; /* Number of filter 
coefficients in the filter */
+uint32_t tapCnt, blkCnt;   /* Loop counters */
+
+  /* Initializations of error,  difference, Coefficient update */
+  e = 0.0f;
+  w = 0.0f;
+
+  /* S->pState points to state array which contains previous frame (numTaps - 
1) samples */
+  /* pStateCurnt points to the location where the new input data should be 
written */
+  pStateCurnt = &(S

RE: Re: [PATCH] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

2023-10-26 Thread Li, Pan2

I just applied the v2 version for RV32 and ran it with spike riscv-sim for
confirmation.

This patch only adds 2 popcount run failures and 2 dump failures, and
mask_gather_load_run-11.c passes within spike.

Pan

-----Original Message-----
From: juzhe.zh...@rivai.ai  
Sent: Thursday, October 26, 2023 9:27 AM
To: Patrick O'Neill ; gcc-patches 

Cc: kito.cheng ; Kito.cheng ; 
jeffreyalaw ; Robin Dapp 
Subject: Re: Re: [PATCH] RISC-V: Add AVL propagation PASS for RVV 
auto-vectorization

I think it's a QEMU issue:

line 15: 1520161 Aborted                 (core dumped) 
QEMU_CPU="$(march-to-cpu-opt --get-riscv-tag $1)" qemu-riscv$xlen -r 5.10 
"${qemu_args[@]}" -L ${RISC_V_SYSROOT} "$@"
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c 
execution test

It works fine when I use SPIKE. This is my SPIKE configuration:

spike \
    --isa=rv64gcv_zvfh_zfh \
    --misaligned \
    ${PK_PATH}/pk${xlen} "$@"



juzhe.zh...@rivai.ai
 
From: Patrick O'Neill
Date: 2023-10-26 09:22
To: juzhe.zh...@rivai.ai; gcc-patches
CC: kito.cheng; Kito.cheng; jeffreyalaw; Robin Dapp
Subject: Re: [PATCH] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

On 10/25/23 17:49, juzhe.zh...@rivai.ai wrote:
FAIL: gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul4-5.c -O3 -ftree-vectorize 
--param riscv-autovec-lmul=dynamic  scan-assembler e32,m4
FAIL: gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-2.c -O3 -ftree-vectorize 
--param riscv-autovec-lmul=dynamic  scan-assembler e32,m8

These 2 FAILs are bogus. The testcases need to be adapted; I notice I didn't
include that in this patch.

FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c 
execution test
FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/unop/popcount-run-1.c execution test

These 2 already exist on the trunk for RV32.

FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c 
execution test 
This FAIL for RV64 is odd. I don't have it.  Could you share the debug log with me?
rv64gcv debug log:

Executing on host: 
/scratch/tc-testing/tc-avl/build-rv64gcv/build-gcc-linux-stage2/gcc/xgcc 
-B/scratch/tc-testing/tc-avl/build-rv64gcv/build-gcc-linux-stage2/gcc/  
/scratch/tc-testing/tc-avl/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
  -march=rv64gcv -mabi=lp64d -mcmodel=medlow   -fdiagnostics-plain-output   
-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax --param 
riscv-autovec-lmul=m8 -fno-vect-cost-model -ffast-math -mcmodel=medany  -lm 
 -o ./mask_gather_load_run-11.exe    (timeout = 600)
spawn -ignore SIGHUP 
/scratch/tc-testing/tc-avl/build-rv64gcv/build-gcc-linux-stage2/gcc/xgcc 
-B/scratch/tc-testing/tc-avl/build-rv64gcv/build-gcc-linux-stage2/gcc/ 
/scratch/tc-testing/tc-avl/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
 -march=rv64gcv -mabi=lp64d -mcmodel=medlow -fdiagnostics-plain-output 
-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax --param 
riscv-autovec-lmul=m8 -fno-vect-cost-model -ffast-math -mcmodel=medany -lm -o 
./mask_gather_load_run-11.exe
PASS: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c 
(test for excess errors)
spawn riscv64-unknown-linux-gnu-run ./mask_gather_load_run-11.exe
mask_gather_load_run-11.exe: 
/scratch/tc-testing/tc-avl/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c:98:
 main: Assertion `dest_uint16_t_uint8_t[i * 2] == dest2_uint16_t_uint8_t[i * 
2]' failed.
/scratch/tc-testing/tc-avl/build-rv64gcv/../scripts/wrapper/qemu/riscv64-unknown-linux-gnu-run:
 line 15: 1520161 Aborted (core dumped) 
QEMU_CPU="$(march-to-cpu-opt --get-riscv-tag $1)" qemu-riscv$xlen -r 5.10 
"${qemu_args[@]}" -L ${RISC_V_SYSROOT} "$@"
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c 
execution test

rv32gcv debug log:

Executing on host: 
/scratch/tc-testing/tc-avl/build-rv32gcv/build-gcc-linux-stage2/gcc/xgcc 
-B/scratch/tc-testing/tc-avl/build-rv32gcv/build-gcc-linux-stage2/gcc/  
/scratch/tc-testing/tc-avl/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
  -march=rv32gcv -mabi=ilp32d -mcmodel=medlow   -fdiagnostics-plain-output   
-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax --param 
riscv-autovec-lmul=m8 -fno-vect-cost-model -ffast-math -mcmodel=medany  -lm 
 -o ./mask_gather_load_run-11.exe    (timeout = 600)
spawn -ignore SIGHUP 
/scratch/tc-testing/tc-avl/build-rv32gcv/build-gcc-linux-stage2/gcc/xgcc 
-B/scratch/tc-testing/tc-avl/build-rv32gcv/build-gcc-linux-stage2/gcc/ 
/scratch/tc-testing/tc-avl/gcc/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
 -march=rv32gcv -mabi=ilp32d -mcmodel=medlow -fdiagnostics-plain-output 
-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax --param 
riscv-autovec-lmul=m8 -fno-vect-c

RE: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Li, Pan2

Thanks Richard for comments.

> Can you explain why this is necessary?  In particular what is lhs_rtx
> mode vs ops[0].value mode?

For testcase gcc.target/aarch64/sve/popcount_1.c, the RTL is listed below.

The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
The ops[0].value is (reg:VNx2DI 104).

Removing the restriction lets the vector RTL enter expand_fn_using_insn, where
it of course hits the INTEGRAL_TYPE_P assertion.
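For context, a scalar model of the loop shape behind this mismatch. It assumes
the popcount test has the same signature style as the clz_64 and clrsb_64
functions shown in the quoted patch (32-bit destination, 64-bit source);
popcount_64_model is a hypothetical name and std::bitset stands in for the
builtin. The count is produced at the 64-bit source width and must then be
narrowed to the 32-bit destination, which is the VNx2DI to VNx2SI step that
ends up at convert_move.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model: 64-bit inputs, 32-bit outputs.
void popcount_64_model(std::vector<std::uint32_t>& dst,
                       const std::vector<std::uint64_t>& src) {
  assert(dst.size() == src.size());
  for (std::size_t i = 0; i < src.size(); ++i) {
    // The count is computed at the source width (64-bit lanes).
    const std::uint64_t wide = std::bitset<64>(src[i]).count();
    // The store needs a narrowing conversion to the 32-bit lane type.
    dst[i] = static_cast<std::uint32_t>(wide);
  }
}
```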

Pan

-----Original Message-----
From: Richard Biener  
Sent: Thursday, October 26, 2023 4:38 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; Liu, Hongtao 
; Richard Sandiford 
Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer

On Thu, Oct 26, 2023 at 4:18 AM  wrote:
>
> From: Pan Li 
>
> Update in v2:
>
> * Fix one ICE of type assertion.
> * Adjust some test cases for aarch64 sve and riscv vector.
>
> Original log:
>
> vectorizable_call has one restriction on the size of the data type:
> DF to DI is allowed but SF to DI isn't. You may see the message below
> when trying to vectorize a function call like lrintf.
>
> void
> test_lrintf (long *out, float *in, unsigned count)
> {
>   for (unsigned i = 0; i < count; i++)
> out[i] = __builtin_lrintf (in[i]);
> }
>
> lrintf.c:5:26: missed: couldn't vectorize loop
> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>
> Then the standard name pattern like lrintmn2 cannot work for different
> data type size like SF => DI. This patch would like to remove this data
> type size check and unblock the standard name like lrintmn2.
>
> The below test are passed for this patch.
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression tests.
>
> gcc/ChangeLog:
>
> * internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
> * tree-vect-stmts.cc (vectorizable_call): Remove size check.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
> * gcc.target/aarch64/sve/clz_1.c: Ditto.
> * gcc.target/aarch64/sve/popcount_1.c: Ditto.
> * gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/internal-fn.cc  |  3 ++-
>  gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
>  gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
>  gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
>  .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
>  gcc/tree-vect-stmts.cc  | 13 -
>  6 files changed, 6 insertions(+), 21 deletions(-)
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 61d5a9e4772..17c0f4c3805 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, 
> unsigned int noutputs,
> emit_move_insn (lhs_rtx, ops[0].value);
>else
> {
> - gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
> + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
> +  || VECTOR_INTEGER_TYPE_P (TREE_TYPE (lhs)));

Can you explain why this is necessary?  In particular what is lhs_rtx
mode vs ops[0].value mode?

>   convert_move (lhs_rtx, ops[0].value, 0);

I'm not sure convert_move handles vector modes correctly.  Richard
probably added this code, CCed.

Richard.

> }
>  }
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
> index bdc9856faaf..940d08bbc7b 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c
> @@ -18,5 +18,4 @@ clrsb_64 (unsigned int *restrict dst, uint64_t *restrict 
> src, int size)
>  }
>
>  /* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.s, p[0-7]/m, 
> z[0-9]+\.s\n} 1 } } */
> -/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, 
> z[0-9]+\.d\n} 2 } } */
> -/* { dg-final { scan-assembler-times {\tuzp1\tz[0-9]+\.s, z[0-9]+\.s, 
> z[0-9]+\.s\n} 1 } } */
> +/* { dg-final { scan-assembler-times {\tcls\tz[0-9]+\.d, p[0-7]/m, 
> z[0-9]+\.d\n} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
> index 0c7a4e6d768..58b8ff406d2 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/clz_1.c
> @@ -18,5 +18,4 @@ clz_64 (unsigned int *restrict dst, uint64_t *restrict src, 
> int size)
>  }
>
>  /* { dg-final { scan-assembler-times {\tclz\tz[0-9]+\.s, p[0-7]/m, 
> z[0-

RE: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Li, Pan2

> But I think this shows we mis-selected the optab, a convert_move is certainly 
> not correct unconditionally here (the target might not support that)

Makes sense, we can wait a while for the confirmation from Richard S.

If convert_move is not designed for vector modes (which seems to be mostly the
case), I am not sure whether we can fix the assertion as below

...
else if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (lhs)))
  return;
else
  {
    gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
    convert_move (lhs_rtx, ops[0].value, 0);
  }

That is, bypass the vector case here, but I am afraid this change may stop
llrintf (SF => DI) from working via the standard name.
Let me have a try and keep you posted.

Pan
  

-----Original Message-----
From: Richard Biener  
Sent: Thursday, October 26, 2023 10:00 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; Liu, Hongtao 
; Richard Sandiford 
Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer



> On 26.10.2023 at 13:59, Li, Pan2 wrote:
> 
> Thanks Richard for comments.
> 
>> Can you explain why this is necessary?  In particular what is lhs_rtx
>> mode vs ops[0].value mode?
> 
> For testcase gcc.target/aarch64/sve/popcount_1.c, the rtl are list as below.
> 
> The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
> The ops[0].value is (reg:VNx2DI 104).
> 
> The restriction removing make the vector rtl enter expand_fn_using_insn and 
> of course hit the INTEGER_P assertion.

But I think this shows we mis-selected the optab, a convert_move is certainly 
not correct unconditionally here (the target might not support that)

> Pan
> 
> -Original Message-
> From: Richard Biener  
> Sent: Thursday, October 26, 2023 4:38 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
> ; kito.ch...@gmail.com; Liu, Hongtao 
> ; Richard Sandiford 
> Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer
> 
>> On Thu, Oct 26, 2023 at 4:18 AM  wrote:
>> 
>> From: Pan Li 
>> 
>> Update in v2:
>> 
>> * Fix one ICE of type assertion.
>> * Adjust some test cases for aarch64 sve and riscv vector.
>> 
>> Original log:
>> 
>> vectorizable_call has one restriction on the size of the data type:
>> DF to DI is allowed but SF to DI isn't. You may see the message below
>> when trying to vectorize a function call like lrintf.
>> 
>> void
>> test_lrintf (long *out, float *in, unsigned count)
>> {
>>  for (unsigned i = 0; i < count; i++)
>>out[i] = __builtin_lrintf (in[i]);
>> }
>> 
>> lrintf.c:5:26: missed: couldn't vectorize loop
>> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>> 
>> Then the standard name pattern like lrintmn2 cannot work for different
>> data type size like SF => DI. This patch would like to remove this data
>> type size check and unblock the standard name like lrintmn2.
>> 
>> The below test are passed for this patch.
>> 
>> * The x86 bootstrap and regression test.
>> * The aarch64 regression test.
>> * The risc-v regression tests.
>> 
>> gcc/ChangeLog:
>> 
>>* internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
>>* tree-vect-stmts.cc (vectorizable_call): Remove size check.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>* gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
>>* gcc.target/aarch64/sve/clz_1.c: Ditto.
>>* gcc.target/aarch64/sve/popcount_1.c: Ditto.
>>* gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>> 
>> Signed-off-by: Pan Li 
>> ---
>> gcc/internal-fn.cc  |  3 ++-
>> gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
>> gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
>> gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
>> .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
>> gcc/tree-vect-stmts.cc  | 13 -
>> 6 files changed, 6 insertions(+), 21 deletions(-)
>> 
>> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
>> index 61d5a9e4772..17c0f4c3805 100644
>> --- a/gcc/internal-fn.cc
>> +++ b/gcc/internal-fn.cc
>> @@ -281,7 +281,8 @@ expand_fn_using_insn (gcall *stmt, insn_code icode, 
>> unsigned int noutputs,
>>emit_move_insn (lhs_rtx, ops[0].value);
>>   else
>>{
>> - gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs)));
>> + gcc_checking_assert (INTEGRAL_TYPE_P (TREE_TYPE (lhs))
>> +

RE: [PATCH v2] VECT: Remove the type size restriction of vectorizer

2023-10-26 Thread Li, Pan2

Thanks Richard S for comments.

> In other words, I don't think simply removing the test from the vectoriser
> is correct.  It needs to be replaced by something more selective.

Does it mean we need to check whether the internal function allows different
modes/sizes here?

For example, the standard name lrintmn2 (m and n modes) would be allowed here,
while rintm2 (only m mode) would not.
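A minimal sketch of such a selective check, assuming the rule is "a size
mismatch is acceptable only when the function maps to a two-mode optab". The
enum OptabKind and the function vector_size_mismatch_ok are hypothetical and
only model the decision; they are not existing GCC interfaces.

```cpp
#include <cassert>

// Hypothetical classification of the called internal function.
enum class OptabKind { SingleMode, TwoMode };

// Returns true if the vectorizer could accept this input/output size pair.
bool vector_size_mismatch_ok(OptabKind kind, unsigned in_bits,
                             unsigned out_bits) {
  assert(in_bits > 0 && out_bits > 0);
  if (in_bits == out_bits)
    return true;  // same size: the existing restriction is already satisfied
  // Different sizes: only fine if input and output modes may vary.
  return kind == OptabKind::TwoMode;
}
```

With this shape, an lrintmn2-style call with SF input and DI output would pass,
while a rintm2-style call with mismatched sizes would still be rejected.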

Pan

-----Original Message-----
From: Richard Sandiford  
Sent: Friday, October 27, 2023 1:47 AM
To: Richard Biener 
Cc: Li, Pan2 ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com; Liu, Hongtao 
Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer

Richard Biener  writes:
>> On 26.10.2023 at 13:59, Li, Pan2 wrote:
>> 
>> Thanks Richard for comments.
>> 
>>> Can you explain why this is necessary?  In particular what is lhs_rtx
>>> mode vs ops[0].value mode?
>> 
>> For the test case gcc.target/aarch64/sve/popcount_1.c, the RTL is listed below.
>> 
>> The lhs_rtx is (reg:VNx2SI 98 [ vect__5.36 ]).
>> The ops[0].value is (reg:VNx2DI 104).
>> 
>> Removing the restriction lets the vector RTL reach expand_fn_using_insn,
>> where it of course hits the INTEGRAL_TYPE_P assertion.
>
> But I think this shows we mis-selected the optab; a convert_move is certainly
> not correct unconditionally here (the target might not support that).

Agreed.  Allowing TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out)
makes sense if the called function allows the input and output modes
to vary.  That's true for internal functions that eventually map to
two-mode optabs.  But we can't remove the condition for calls to
other functions, at least not without some fix-ups.

ISTM that the problem being hit is the one described by the removed
comment.

In other words, I don't think simply removing the test from the vectoriser
is correct.  It needs to be replaced by something more selective.

Thanks,
Richard

>> Pan
>> 
>> -Original Message-
>> From: Richard Biener  
>> Sent: Thursday, October 26, 2023 4:38 PM
>> To: Li, Pan2 
>> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
>> ; kito.ch...@gmail.com; Liu, Hongtao 
>> ; Richard Sandiford 
>> Subject: Re: [PATCH v2] VECT: Remove the type size restriction of vectorizer
>> 
>>> On Thu, Oct 26, 2023 at 4:18 AM  wrote:
>>> 
>>> From: Pan Li 
>>> 
>>> Update in v2:
>>> 
>>> * Fix one ICE of type assertion.
>>> * Adjust some test cases for aarch64 sve and riscv vector.
>>> 
>>> Original log:
>>> 
>>> vectorizable_call has one restriction on the size of the data type:
>>> DF to DI is allowed but SF to DI isn't. You may see the message below
>>> when trying to vectorize a function call like lrintf.
>>> 
>>> void
>>> test_lrintf (long *out, float *in, unsigned count)
>>> {
>>>  for (unsigned i = 0; i < count; i++)
>>>out[i] = __builtin_lrintf (in[i]);
>>> }
>>> 
>>> lrintf.c:5:26: missed: couldn't vectorize loop
>>> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>>> 
>>> So a standard name pattern like lrintmn2 cannot work for different
>>> data type sizes such as SF => DI. This patch removes this data
>>> type size check and unblocks standard names like lrintmn2.
>>> 
>>> The tests below passed for this patch.
>>> 
>>> * The x86 bootstrap and regression test.
>>> * The aarch64 regression test.
>>> * The risc-v regression tests.
>>> 
>>> gcc/ChangeLog:
>>> 
>>>* internal-fn.cc (expand_fn_using_insn): Add vector int assertion.
>>>* tree-vect-stmts.cc (vectorizable_call): Remove size check.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>>* gcc.target/aarch64/sve/clrsb_1.c: Adjust checker.
>>>* gcc.target/aarch64/sve/clz_1.c: Ditto.
>>>* gcc.target/aarch64/sve/popcount_1.c: Ditto.
>>>* gcc.target/riscv/rvv/autovec/unop/popcount.c: Ditto.
>>> 
>>> Signed-off-by: Pan Li 
>>> ---
>>> gcc/internal-fn.cc  |  3 ++-
>>> gcc/testsuite/gcc.target/aarch64/sve/clrsb_1.c  |  3 +--
>>> gcc/testsuite/gcc.target/aarch64/sve/clz_1.c|  3 +--
>>> gcc/testsuite/gcc.target/aarch64/sve/popcount_1.c   |  3 +--
>>> .../gcc.target/riscv/rvv/autovec/unop/popcount.c|  2 +-
>>> gcc/tree-vect-stmts.cc  | 13 -
>>> 6 fi

RE: [PATCH v1] RISC-V: Fix one range-loop-construct warning of avlprop

2023-10-28 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-Original Message-
From: Jeff Law  
Sent: Saturday, October 28, 2023 11:00 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Fix one range-loop-construct warning of avlprop



On 10/28/23 08:51, pan2...@intel.com wrote:
> From: Pan Li 
> 
> This patch fixes the avlprop warning shown below.
> 
> ../../gcc/config/riscv/riscv-avlprop.cc: In member function 'virtual
> unsigned int pass_avlprop::execute(function*)':
> ../../gcc/config/riscv/riscv-avlprop.cc:346:23: error: loop variable
> 'candidate' creates a copy from type 'const std::pair rtl_ssa::insn_info*>' [-Werror=range-loop-construct]
>346 |   for (const auto candidate : m_candidates)
>|   ^
> ../../gcc/config/riscv/riscv-avlprop.cc:346:23: note: use reference type
> to prevent copying
>346 |   for (const auto candidate : m_candidates)
>|   ^
>|   &
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv-avlprop.cc (pass_avlprop::execute): Use
>   reference type to prevent copying.
OK
jeff
>
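
For illustration, the effect behind the range-loop-construct diagnostic can be
reproduced outside GCC. The sketch below is not the avlprop code: the type
counted and both visit functions are hypothetical, and the by-value loop is
written in desugared form so that the example itself does not trigger the
warning.

```cpp
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct counted
{
  static int copies;
  counted () = default;
  counted (const counted &) { copies++; }
  counted &operator= (const counted &) = default;
};
int counted::copies = 0;

using candidate_list = std::vector<std::pair<int, counted>>;

// Equivalent to `for (const auto candidate : candidates)`:
// every iteration copy-constructs one pair.
int visit_by_value (const candidate_list &candidates)
{
  int sum = 0;
  for (auto it = candidates.begin (); it != candidates.end (); ++it)
    {
      const auto candidate = *it;
      sum += candidate.first;
    }
  return sum;
}

// The form the fix uses: a const reference binds without copying.
int visit_by_reference (const candidate_list &candidates)
{
  int sum = 0;
  for (const auto &candidate : candidates)
    sum += candidate.first;
  return sum;
}
```

Both functions compute the same sum; only the by-value variant increments the
copy counter, once per element.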

RE: [Ready to commit V3] RISC-V: Add AVL propagation PASS for RVV auto-vectorization

2023-10-29 Thread Li, Pan2

Should be fixed by the patch below; feel free to ping me if there are any issues.

https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634616.html

Pan

-Original Message-
From: Andreas Schwab  
Sent: Saturday, October 28, 2023 4:16 PM
To: 钟居哲 
Cc: patrick ; gcc-patches ; 
kito.cheng ; rdapp.gcc 
Subject: Re: [Ready to commit V3] RISC-V: Add AVL propagation PASS for RVV 
auto-vectorization

../../gcc/config/riscv/riscv-avlprop.cc: In member function 'virtual unsigned 
int pass_avlprop::execute(function*)':
../../gcc/config/riscv/riscv-avlprop.cc:346:23: error: loop variable 
'candidate' creates a copy from type 'const std::pair' [-Werror=range-loop-construct]
  346 |   for (const auto candidate : m_candidates)
  |   ^
../../gcc/config/riscv/riscv-avlprop.cc:346:23: note: use reference type to 
prevent copying
  346 |   for (const auto candidate : m_candidates)
  |   ^
  |   &

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

RE: [PATCH] RISC-V: Fix bugs of handling scalar of SEW64 vx instruction in RV32

2023-10-30 Thread Li, Pan2

Committed, thanks Robin.

Pan

-Original Message-
From: Robin Dapp  
Sent: Monday, October 30, 2023 3:42 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; kito.ch...@sifive.com; kito.ch...@gmail.com; 
jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V: Fix bugs of handling scalar of SEW64 vx 
instruction in RV32

Thanks, LGTM.

Regards
 Robin

RE: Re: [PATCH v6] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic

2023-10-31 Thread Li, Pan2

Thanks xuli for enabling this feature. We can update the CI of
rvv-intrinsic-doc for the overloaded API(s) after this is committed.

Pan

-Original Message-
From: Li Xu  
Sent: Tuesday, October 31, 2023 7:37 PM
To: juzhe.zh...@rivai.ai
Cc: gcc-patches ; kito.cheng ; 
palmer 
Subject: Re: Re: [PATCH v6] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for 
RVV intrinsic

All overloaded and non-overloaded intrinsics have been tested successfully on
gcc and g++.

Thanks.


> -Original Message-
> From: "juzhe.zh...@rivai.ai"
> Sent: 2023-10-31 17:07:11 (Tuesday)
> To: "Li Xu", gcc-patches
> Cc: "kito.cheng", palmer, "Li Xu"
> Subject: Re: [PATCH v6] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV intrinsic
> 
> LGTM from my side.
> 
> Give kito one more day to review it.
> 
> Thanks for supporting this feature!
> 
> juzhe.zh...@rivai.ai
>  
> From: Li Xu
> Date: 2023-10-31 17:03
> To: gcc-patches
> CC: kito.cheng; palmer; juzhe.zhong; xuli
> Subject: [PATCH v6] RISC-V: Implement RESOLVE_OVERLOADED_BUILTIN for RVV 
> intrinsic
> From: xuli 
>  
> Update in v6:
> * Rename maybe_require_frm_p to may_require_frm_p.
> * Rename maybe_require_vxrm_p to may_require_vxrm_p.
> * Move may_require_frm_p and may_require_vxrm_p to function_base.
>  
> Update in v5:
> * Split has_vxrm_or_frm_p into maybe_require_frm_p and
>   maybe_require_vxrm_p.
> * Adjust comments.
>  
> Update in v4:
> * Remove class function_resolver.
> * Remove function get_non_overloaded_instance.
> * Add overloaded hash traits for non-overloaded intrinsic.
> * All overloaded intrinsics are implemented, and the tests pass.
>  
> Update in v3:
>  
> * Rewrite comment for overloaded function add.
> * Move get_non_overloaded_instance to function_base.
>  
> Update in v2:
>  
> * Add get_non_overloaded_instance for function instance.
> * Fix overload check for policy function.
> * Enrich the test cases check.
>  
> Original log:
>  
> This patch adds the framework to support the RVV overloaded
> intrinsic API in riscv-xxx-xxx-gcc, as riscv-xxx-xxx-g++ already does.
>  
> It mostly leverages the hook TARGET_RESOLVE_OVERLOADED_BUILTIN,
> with the steps below.
>  
> * Register overloaded functions.
> * Add function_resolver for overloaded function resolving.
> * Add resolve API for function shape with default implementation.
> * Implement the hook that maps the overloaded API to the non-overloaded API.
>  
> gcc/ChangeLog:
>  
>     * config/riscv/riscv-c.cc (riscv_resolve_overloaded_builtin): New 
> function for the hook.
>     (riscv_register_pragmas): Register the hook.
>     * config/riscv/riscv-protos.h (resolve_overloaded_builtin): New decl.
>     * config/riscv/riscv-vector-builtins-bases.cc: New function impl.
>     * config/riscv/riscv-vector-builtins-shapes.cc (build_one): Register 
> overloaded function.
>     * config/riscv/riscv-vector-builtins.cc (struct 
> non_overloaded_registered_function_hasher): New hash table.
>     (function_builder::add_function): Add overloaded arg.
>     (function_builder::add_unique_function): Map overloaded function to 
> non-overloaded function.
>     (function_builder::add_overloaded_function): New API impl.
>     (registered_function::overloaded_hash): Calculate hash value.
>     (has_vxrm_or_frm_p): New function impl.
>     (non_overloaded_registered_function_hasher::hash): Ditto.
>     (non_overloaded_registered_function_hasher::equal): Ditto.
>     (handle_pragma_vector): Allocate space for hash table.
>     (resolve_overloaded_builtin): New function impl.
>     * config/riscv/riscv-vector-builtins.h 
> (function_base::may_require_frm_p): Ditto.
>     (function_base::may_require_vxrm_p): Ditto.
>  
> gcc/testsuite/ChangeLog:
>  
>     * gcc.target/riscv/rvv/base/overloaded_rv32_vadd.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv32_vfadd.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv32_vget_vset.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv32_vloxseg2ei16.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv32_vmv.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv32_vreinterpret.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv64_vadd.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv64_vfadd.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv64_vget_vset.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv64_vloxseg2ei16.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv64_vmv.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_rv64_vreinterpret.c: New test.
>     * gcc.target/riscv/rvv/base/overloaded_vadd.h: New test.
>     * gcc.target/riscv/rvv/base/overloaded_vfadd.h: New test.
>     * gcc.target/riscv/rvv/base/overloaded_vget_vset.h: New test.
>     * gcc.target/riscv/rvv/base/overloaded_vloxseg2ei16.h: New test.
>     * gcc.target/riscv/rvv/base/overloaded_vmv.h: New test.
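
For illustration, the resolution idea described in the original log (register
overloaded names, then map a call to its non-overloaded counterpart) can be
modelled as a lookup keyed on the overloaded name plus the argument types.
This is a toy sketch only: overload_table, arg_types and the use of std::map
are assumptions made for the example, whereas the patch itself uses a hash
table inside riscv-vector-builtins.cc.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model only; these types are hypothetical, not GCC code.
using arg_types = std::vector<std::string>;
using overload_key = std::pair<std::string, arg_types>;

class overload_table
{
public:
  // Register an overloaded name together with the argument types that
  // select one particular non-overloaded intrinsic.
  void add (const std::string &overloaded, const arg_types &args,
            const std::string &non_overloaded)
  {
    m_table[overload_key (overloaded, args)] = non_overloaded;
  }

  // Resolve a call by looking up (name, argument types).
  // Returns an empty string when nothing matches.
  std::string resolve (const std::string &overloaded,
                       const arg_types &args) const
  {
    auto it = m_table.find (overload_key (overloaded, args));
    return it == m_table.end () ? std::string () : it->second;
  }

private:
  std::map<overload_key, std::string> m_table;
};
```

In this model a call such as __riscv_vadd with two vint32m1_t operands and a
size_t would resolve to __riscv_vadd_vv_i32m1, while a vector-scalar call
would resolve to __riscv_vadd_vx_i32m1.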

RE: [PATCH V2] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

2023-10-31 Thread Li, Pan2

Passed the x86 bootstrap and regression tests.

Pan

-Original Message-
From: Juzhe-Zhong  
Sent: Tuesday, October 31, 2023 5:59 PM
To: gcc-patches@gcc.gnu.org
Cc: rguent...@suse.de; jeffreya...@gmail.com; richard.sandif...@arm.com; 
rdapp@gmail.com; Juzhe-Zhong 
Subject: [PATCH V2] OPTABS/IFN: Add 
mask_len_strided_load/mask_len_strided_store OPTABS/IFN

As Richard suggested previously, we should support strided load/store in the
loop vectorizer instead of hacking the RISC-V backend.

This patch adds the MASK_LEN_STRIDED LOAD/STORE OPTABS/IFN.

The GIMPLE IR is the same as for mask_len_gather_load/mask_len_scatter_store,
except that the vector offset is replaced by a scalar stride.

We don't add strided_load/strided_store or
mask_strided_load/mask_strided_store, since it's unlikely that RVV will have
such optabs and we can't add patterns that we can't test.
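
For illustration, the intended behaviour of a masked, length-limited strided
load can be written as a scalar reference model. This is a sketch under
assumptions: the helper name and parameters are made up, the stride is taken
to be in bytes, the active length is taken to be len + bias (assumed
non-negative), and inactive elements simply keep their old value here even
though the pattern leaves them undefined.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Scalar model of a masked, length-limited strided load (hypothetical
// helper, not a GCC function).  Element i is read from
// base + i * stride_bytes when i < len + bias and mask[i] is set.
template <typename T>
void model_mask_len_strided_load (std::vector<T> &dest,
                                  const unsigned char *base,
                                  std::ptrdiff_t stride_bytes,
                                  const std::vector<bool> &mask,
                                  std::size_t len, std::ptrdiff_t bias)
{
  std::size_t active
    = static_cast<std::size_t> (static_cast<std::ptrdiff_t> (len) + bias);
  for (std::size_t i = 0; i < dest.size () && i < active; i++)
    if (mask[i])
      std::memcpy (&dest[i],
                   base + static_cast<std::ptrdiff_t> (i) * stride_bytes,
                   sizeof (T));
}
```

Compared with a gather, the only per-element input is the index i; the vector
of offsets is implied by the scalar stride.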


gcc/ChangeLog:

* doc/md.texi: Add mask_len_strided_load/mask_len_strided_store.
* internal-fn.cc (internal_load_fn_p): Ditto.
(internal_strided_fn_p): Ditto.
(internal_fn_len_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_fn_stored_value_index): Ditto.
(internal_strided_fn_supported_p): Ditto.
* internal-fn.def (MASK_LEN_STRIDED_LOAD): Ditto.
(MASK_LEN_STRIDED_STORE): Ditto.
* internal-fn.h (internal_strided_fn_p): Ditto.
(internal_strided_fn_supported_p): Ditto.
* optabs.def (OPTAB_CD): Ditto.

---
 gcc/doc/md.texi | 51 +
 gcc/internal-fn.cc  | 44 ++
 gcc/internal-fn.def |  4 
 gcc/internal-fn.h   |  2 ++
 gcc/optabs.def  |  2 ++
 5 files changed, 103 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index fab2513105a..5bac713a0dd 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5094,6 +5094,32 @@ Bit @var{i} of the mask is set if element @var{i} of the 
result should
 be loaded from memory and clear if element @var{i} of the result should be 
undefined.
 Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
 
+@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction pattern
+@item @samp{mask_len_strided_load@var{m}@var{n}}
+Load several separate memory locations into a destination vector of mode 
@var{m}.
+Operand 0 is a destination vector of mode @var{m}.
+Operand 1 is a scalar base address and operand 2 is a scalar stride of mode 
@var{n}.
+The instruction can be seen as a special case of 
@code{mask_len_gather_load@var{m}@var{n}}
+with an offset vector that is a @code{vec_series} with operand 1 as base and 
operand 2 as step.
+For each element index i:
+
+@itemize @bullet
+@item
+extend the stride to address width, using zero
+extension if operand 3 is 1 and sign extension if operand 3 is zero;
+@item
+multiply the extended stride by operand 4;
+@item
+add the result to the base; and
+@item
+load the value at that address (operand 1 + @var{i} * multiplied and extended 
stride) into element @var{i} of operand 0.
+@end itemize
+
+Similar to mask_len_load, the instruction loads at most (operand 6 + operand 
7) elements from memory.
+Bit @var{i} of the mask is set if element @var{i} of the result should
+be loaded from memory and clear if element @var{i} of the result should be 
undefined.
+Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
+
 @cindex @code{scatter_store@var{m}@var{n}} instruction pattern
 @item @samp{scatter_store@var{m}@var{n}}
 Store a vector of mode @var{m} into several distinct memory locations.
@@ -5131,6 +5157,31 @@ at most (operand 6 + operand 7) elements of (operand 4) 
to memory.
 Bit @var{i} of the mask is set if element @var{i} of (operand 4) should be 
stored.
 Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
 
+@cindex @code{mask_len_strided_store@var{m}@var{n}} instruction pattern
+@item @samp{mask_len_strided_store@var{m}@var{n}}
+Store a vector of mode m into several distinct memory locations.
+Operand 0 is a scalar base address and operand 1 is scalar stride of mode 
@var{n}.
+Operand 2 is the vector of values that should be stored, which is of mode 
@var{m}.
+The instruction can be seen as a special case of 
@code{mask_len_scatter_store@var{m}@var{n}}
+with an offset vector that is a @code{vec_series} with operand 1 as base and 
operand 2 as step.
+For each element index i:
+
+@itemize @bullet
+@item
+extend the stride to address width, using zero
+extension if operand 2 is 1 and sign extension if operand 2 is zero;
+@item
+multiply the extended stride by operand 3;
+@item
+add the result to the base; and
+@item
+store element @var{i} of operand 4 to that address (operand 1 + @var{i} * 
multiplied and extended stride).
+@end itemize
+
+Similar to mask_len_store, the instruction stores at most (operand 6 + operand 
7) elements of (operand 4) to memory.
+Bit @var{i} of the mask is set if element @var{i} of (operand 4) should be 
stored.
+Mask e

RE: [PATCH] VECT: Support mask_len_strided_load/mask_len_strided_store in loop vectorize

2023-10-31 Thread Li, Pan2

Passed the x86 bootstrap and regression tests.

Pan

-Original Message-
From: Juzhe-Zhong  
Sent: Tuesday, October 31, 2023 6:08 PM
To: gcc-patches@gcc.gnu.org
Cc: richard.sandif...@arm.com; rguent...@suse.de; jeffreya...@gmail.com; 
Juzhe-Zhong 
Subject: [PATCH] VECT: Support mask_len_strided_load/mask_len_strided_store in 
loop vectorize

This patch lets the loop vectorizer generate direct strided load/store IFNs
if the target enables them.

Note that this allows a target that enables strided load/store, but not
gather/scatter, to vectorize strided memory accesses.
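
For illustration, a source loop with the kind of strided access in question
might look as follows. The function name copy_strided is made up, and whether
a given target actually vectorizes this loop depends on the optabs it provides.

```cpp
#include <cstddef>

// Reads every stride-th element of IN.  The address of each access advances
// by a loop-invariant scalar stride, so no vector of offsets is required.
void copy_strided (int *out, const int *in, std::ptrdiff_t stride,
                   std::size_t count)
{
  for (std::size_t i = 0; i < count; i++)
    out[i] = in[static_cast<std::ptrdiff_t> (i) * stride];
}
```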

gcc/ChangeLog:

* optabs-query.cc (supports_vec_gather_load_p): Support strided 
load/store.
(supports_vec_scatter_store_p): Ditto.
* optabs-query.h (supports_vec_gather_load_p): Ditto.
(supports_vec_scatter_store_p): Ditto.
* tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Ditto.
(vect_check_gather_scatter): Ditto.
* tree-vect-stmts.cc (check_load_store_for_partial_vectors): Ditto.
(vect_truncate_gather_scatter_offset): Ditto.
(vect_use_strided_gather_scatters_p): Ditto.
(vect_get_strided_load_store_ops): Ditto.
(vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (vect_gather_scatter_fn_p): Ditto.

---
 gcc/optabs-query.cc| 27 ++-
 gcc/optabs-query.h |  4 +--
 gcc/tree-vect-data-refs.cc | 71 --
 gcc/tree-vect-stmts.cc | 46 +---
 gcc/tree-vectorizer.h  |  3 +-
 5 files changed, 109 insertions(+), 42 deletions(-)

diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc
index 947ccef218c..ea594baf15d 100644
--- a/gcc/optabs-query.cc
+++ b/gcc/optabs-query.cc
@@ -670,14 +670,19 @@ supports_vec_convert_optab_p (optab op, machine_mode mode)
for at least one vector mode.  */
 
 bool
-supports_vec_gather_load_p (machine_mode mode)
+supports_vec_gather_load_p (machine_mode mode, bool strided_p)
 {
   if (!this_fn_optabs->supports_vec_gather_load[mode])
 this_fn_optabs->supports_vec_gather_load[mode]
   = (supports_vec_convert_optab_p (gather_load_optab, mode)
-|| supports_vec_convert_optab_p (mask_gather_load_optab, mode)
-|| supports_vec_convert_optab_p (mask_len_gather_load_optab, mode)
-? 1 : -1);
+|| supports_vec_convert_optab_p (mask_gather_load_optab, mode)
+|| supports_vec_convert_optab_p (mask_len_gather_load_optab, mode)
+|| (strided_p
+&& convert_optab_handler (mask_len_strided_load_optab, mode,
+  Pmode)
+ != CODE_FOR_nothing)
+  ? 1
+  : -1);
 
   return this_fn_optabs->supports_vec_gather_load[mode] > 0;
 }
@@ -687,14 +692,20 @@ supports_vec_gather_load_p (machine_mode mode)
for at least one vector mode.  */
 
 bool
-supports_vec_scatter_store_p (machine_mode mode)
+supports_vec_scatter_store_p (machine_mode mode, bool strided_p)
 {
   if (!this_fn_optabs->supports_vec_scatter_store[mode])
 this_fn_optabs->supports_vec_scatter_store[mode]
   = (supports_vec_convert_optab_p (scatter_store_optab, mode)
-|| supports_vec_convert_optab_p (mask_scatter_store_optab, mode)
-|| supports_vec_convert_optab_p (mask_len_scatter_store_optab, mode)
-? 1 : -1);
+|| supports_vec_convert_optab_p (mask_scatter_store_optab, mode)
+|| supports_vec_convert_optab_p (mask_len_scatter_store_optab,
+ mode)
+|| (strided_p
+&& convert_optab_handler (mask_len_strided_store_optab, mode,
+  Pmode)
+ != CODE_FOR_nothing)
+  ? 1
+  : -1);
 
   return this_fn_optabs->supports_vec_scatter_store[mode] > 0;
 }
diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
index 920eb6a1b67..7c22edc5a78 100644
--- a/gcc/optabs-query.h
+++ b/gcc/optabs-query.h
@@ -191,8 +191,8 @@ bool can_compare_and_swap_p (machine_mode, bool);
 bool can_atomic_exchange_p (machine_mode, bool);
 bool can_atomic_load_p (machine_mode);
 bool lshift_cheap_p (bool);
-bool supports_vec_gather_load_p (machine_mode = E_VOIDmode);
-bool supports_vec_scatter_store_p (machine_mode = E_VOIDmode);
+bool supports_vec_gather_load_p (machine_mode = E_VOIDmode, bool = false);
+bool supports_vec_scatter_store_p (machine_mode = E_VOIDmode, bool = false);
 bool can_vec_extract (machine_mode, machine_mode);
 
 /* Version of find_widening_optab_handler_and_mode that operates on
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index d5c9c4a11c2..d374849b0a7 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -3913,9 +3913,9 @@ vect_prune_runtime_alias_test_list (loop_vec_info 
loop_vinfo)
*IFN_OUT and the vector type for the offset in *OFFSET_VECTYPE_OUT.  */
 
 bool
-vect_g

RE: [PATCH v3] VECT: Refine the type size restriction of call vectorizer

2023-10-31 Thread Li, Pan2

> can you instead amend vectorizable_internal_function to contain the check,
> returning IFN_LAST if it doesn't hold?

Sure, will send v4 for this.

Pan

-Original Message-
From: Richard Biener  
Sent: Tuesday, October 31, 2023 8:58 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; Liu, Hongtao 

Subject: Re: [PATCH v3] VECT: Refine the type size restriction of call 
vectorizer

On Mon, Oct 30, 2023 at 1:23 PM  wrote:
>
> From: Pan Li 
>
> Update in v3:
>
> * Add func to predicate type size is legal or not for vectorizer call.
>
> Update in v2:
>
> * Fix one ICE of type assertion.
> * Adjust some test cases for aarch64 sve and riscv vector.
>
> Original log:
>
> vectorizable_call has one restriction on the size of the data type:
> DF to DI is allowed but SF to DI isn't. You may see the message below
> when trying to vectorize a function call like lrintf.
>
> void
> test_lrintf (long *out, float *in, unsigned count)
> {
>   for (unsigned i = 0; i < count; i++)
> out[i] = __builtin_lrintf (in[i]);
> }
>
> lrintf.c:5:26: missed: couldn't vectorize loop
> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>
> So a standard name pattern like lrintmn2 cannot work for different
> data type sizes such as SF => DI. This patch refines this data
> type size check and conditionally unblocks standard names like lrintmn2.
>
> The type size of vectype_out needs to be exactly the same as the type
> size of vectype_in when vectype_out does not participate in
> the optab selection. There is no such restriction when
> vectype_out is part of the optab query.
>
> The tests below passed for this patch.
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression tests.
> * Ensure the lrintf standard name in risc-v.
>
> gcc/ChangeLog:
>
> * tree-vect-stmts.cc (vectorizable_type_size_legal_p): New
> func impl to predicate the type size is legal or not.
> (vectorizable_call): Leverage vectorizable_type_size_legal_p.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/tree-vect-stmts.cc | 51 +++---
>  1 file changed, 38 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index a9200767f67..24b3448d961 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -1430,6 +1430,35 @@ vectorizable_internal_function (combined_fn cfn, tree 
> fndecl,
>return IFN_LAST;
>  }
>
> +/* Return TRUE when the type size is legal for the call vectorizer,
> +   or FALSE.
> +   The type size of both the vectype_in and vectype_out should be
> +   exactly the same when vectype_out isn't participating the optab.
> +   While there is no restriction for type size when vectype_out
> +   is part of the optab query.
> + */
> +static bool
> +vectorizable_type_size_legal_p (internal_fn ifn, tree vectype_out,
> +   tree vectype_in)
> +{
> +  bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
> +
> +  if (ifn == IFN_LAST || !direct_internal_fn_p (ifn))
> +return same_size_p;
> +
> +  const direct_internal_fn_info &difn_info = direct_internal_fn (ifn);
> +
> +  if (!difn_info.vectorizable)
> +return same_size_p;
> +
> +  /* According to vectorizable_internal_function, the type0/1 < 0 indicates
> + the vectype_out participating the optable selection.  Aka the type size
> + check can be skipped here.  */
> +  if (difn_info.type0 < 0 || difn_info.type1 < 0)
> +return true;

can you instead amend vectorizable_internal_function to contain the check,
returning IFN_LAST if it doesn't hold?

> +
> +  return same_size_p;
> +}
>
>  static tree permute_vec_elements (vec_info *, tree, tree, tree, 
> stmt_vec_info,
>   gimple_stmt_iterator *);
> @@ -3361,19 +3390,6 @@ vectorizable_call (vec_info *vinfo,
>
>return false;
>  }
> -  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
> - just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
> - are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
> - by a pack of the two vectors into an SI vector.  We would need
> - separate code to handle direct VnDI->VnSI IFN_CTZs.  */
> -  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
> -{
> -  if (dump_enabled_p ())
> -   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> -"mismatche

RE: [PATCH v4] VECT: Refine the type size restriction of call vectorizer

2023-10-31 Thread Li, Pan2

The tests below passed for this patch.

* The x86 bootstrap and regression test.
* The aarch64 regression test.
* The risc-v regression tests.
* Ensure the lrintf standard name in RVV.

Pan

-Original Message-
From: Li, Pan2  
Sent: Tuesday, October 31, 2023 11:10 PM
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Li, Pan2 ; Wang, Yanzhang 
; kito.ch...@gmail.com; Liu, Hongtao 
; richard.guent...@gmail.com
Subject: [PATCH v4] VECT: Refine the type size restriction of call vectorizer

From: Pan Li 

Update in v4:

* Append the check to vectorizable_internal_function.

Update in v3:

* Add func to predicate type size is legal or not for vectorizer call.

Update in v2:

* Fix one ICE of type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.

Original log:

vectorizable_call has one restriction on the size of the data type:
DF to DI is allowed but SF to DI isn't. You may see the message below
when trying to vectorize a function call like lrintf.

void
test_lrintf (long *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
out[i] = __builtin_lrintf (in[i]);
}

lrintf.c:5:26: missed: couldn't vectorize loop
lrintf.c:5:26: missed: not vectorized: unsupported data-type

So a standard name pattern like lrintmn2 cannot work for different
data type sizes such as SF => DI. This patch refines this data
type size check and conditionally unblocks standard names like lrintmn2.

The type size of vectype_out needs to be exactly the same as the type
size of vectype_in when vectype_out does not participate in
the optab selection. There is no such restriction when
vectype_out is part of the optab query.
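
For illustration, the rule can be modelled in isolation. The sketch below is
not GCC code: fn_type_info and type_size_ok are hypothetical stand-ins, with
a negative type index meaning "use the output vector type", following the
convention used by vectorizable_internal_function.

```cpp
// Toy model of the size rule; the struct and field names are hypothetical
// stand-ins, not GCC data structures.
struct fn_type_info
{
  int type0;   // < 0 means the output vector type is used for optab mode 0
  int type1;   // < 0 means the output vector type is used for optab mode 1
};

// Returns true when a call with the given input/output vector sizes passes
// the rule: either the output type takes part in the optab query, or the
// input and output vector sizes are equal.
bool type_size_ok (const fn_type_info &info, unsigned in_bits,
                   unsigned out_bits)
{
  bool out_participates = info.type0 < 0 || info.type1 < 0;
  return out_participates || in_bits == out_bits;
}
```

With this model, a function whose optab query involves the output type
accepts mismatched sizes, while a function that only looks at the input type
still requires equal sizes.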

The tests below passed for this patch.

* The risc-v regression tests.
* Ensure the lrintf standard name in risc-v.

The tests below are ongoing.

* The x86 bootstrap and regression test.
* The aarch64 regression test.

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_internal_function): Add type
size check for vectype_out doesn't participating for optab query.
(vectorizable_call): Remove the type size check.

Signed-off-by: Pan Li 
---
 gcc/tree-vect-stmts.cc | 22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a9200767f67..799b4ab10c7 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1420,8 +1420,17 @@ vectorizable_internal_function (combined_fn cfn, tree 
fndecl,
   const direct_internal_fn_info &info = direct_internal_fn (ifn);
   if (info.vectorizable)
{
+ bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
  tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
  tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
+
+ /* The type size of both the vectype_in and vectype_out should be
+exactly the same when vectype_out isn't participating the optab.
+While there is no restriction for type size when vectype_out
+is part of the optab query.  */
+ if (type0 != vectype_out && type1 != vectype_out && !same_size_p)
+   return IFN_LAST;
+
  if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
  OPTIMIZE_FOR_SPEED))
return ifn;
@@ -3361,19 +3370,6 @@ vectorizable_call (vec_info *vinfo,
 
   return false;
 }
-  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
- just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
- are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
- by a pack of the two vectors into an SI vector.  We would need
- separate code to handle direct VnDI->VnSI IFN_CTZs.  */
-  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
-{
-  if (dump_enabled_p ())
-   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-"mismatched vector sizes %T and %T\n",
-vectype_in, vectype_out);
-  return false;
-}
 
   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
   != VECTOR_BOOLEAN_TYPE_P (vectype_in))
-- 
2.34.1

RE: [PATCH] RISC-V: Allow dest operand and accumulator operand overlap of widen reduction instruction[PR112327]

2023-11-01 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-Original Message-
From: Jeff Law  
Sent: Thursday, November 2, 2023 3:02 AM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; kito.ch...@sifive.com; rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Allow dest operand and accumulator operand overlap 
of widen reduction instruction[PR112327]



On 11/1/23 00:56, Juzhe-Zhong wrote:
> 
> Consider the following intrinsic code:
> 
> void rvv_dot_prod(int16_t *pSrcA, int16_t *pSrcB, uint32_t n, int64_t *result)
> {
>  size_t vl;
>  vint16m4_t vSrcA, vSrcB;
>  vint64m1_t vSum = __riscv_vmv_s_x_i64m1(0, 1);
>  while (n > 0) {
>  vl = __riscv_vsetvl_e16m4(n);
>  vSrcA = __riscv_vle16_v_i16m4(pSrcA, vl);
>  vSrcB = __riscv_vle16_v_i16m4(pSrcB, vl);
>  vSum = __riscv_vwredsum_vs_i32m8_i64m1(__riscv_vwmul_vv_i32m8(vSrcA, 
> vSrcB, vl), vSum, vl);
>  pSrcA += vl;
>  pSrcB += vl;
>  n -= vl;
>  }
>  *result = __riscv_vmv_x_s_i64m1_i64(vSum);
> }
> 
> https://godbolt.org/z/vWd35W7G6
> 
> Before this patch:
> 
> ...
> Loop:
> ...
> vmv1r.v v2,v1
> ...
> vwredsum.vs v1,v8,v2
> ...
> 
> After this patch:
> 
> ...
> Loop:
> ...
> vwredsum.vs   v1,v8,v1
> ...
> 
>   PR target/112327
> 
> gcc/ChangeLog:
> 
>   * config/riscv/vector.md: Add '0'.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/base/pr112327-1.c: New test.
>   * gcc.target/riscv/rvv/base/pr112327-2.c: New test.
OK
jeff

RE: [PATCH v4] VECT: Refine the type size restriction of call vectorizer

2023-11-01 Thread Li, Pan2

Committed, thanks Richard.

Pan

-Original Message-
From: Richard Biener  
Sent: Thursday, November 2, 2023 12:43 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; Liu, Hongtao 

Subject: Re: [PATCH v4] VECT: Refine the type size restriction of call 
vectorizer



> On 31.10.2023 at 16:10, pan2...@intel.com wrote:
> 
> From: Pan Li 
> 
> Update in v4:
> 
> * Append the check to vectorizable_internal_function.
> 
> Update in v3:
> 
> * Add func to predicate type size is legal or not for vectorizer call.
> 
> Update in v2:
> 
> * Fix one ICE of type assertion.
> * Adjust some test cases for aarch64 sve and riscv vector.
> 
> Original log:
> 
> vectorizable_call has one restriction on the size of the data type:
> DF to DI is allowed but SF to DI isn't. You may see the message below
> when trying to vectorize a function call like lrintf.
> 
> void
> test_lrintf (long *out, float *in, unsigned count)
> {
>  for (unsigned i = 0; i < count; i++)
>out[i] = __builtin_lrintf (in[i]);
> }
> 
> lrintf.c:5:26: missed: couldn't vectorize loop
> lrintf.c:5:26: missed: not vectorized: unsupported data-type
> 
> As a result, a standard name pattern like lrintmn2 cannot work for data
> types of different sizes, such as SF => DI. This patch refines the data
> type size check and unblocks standard names like lrintmn2 under certain
> conditions.
> 
> The type size of vectype_out needs to be exactly the same as the type
> size of vectype_in when vectype_out does not participate in the optab
> selection. There is no such restriction when vectype_out is part of
> the optab query.
> 
> The tests below passed for this patch.
> 
> * The risc-v regression tests.
> * Ensure the lrintf standard name in risc-v.
> 
> The tests below are ongoing.
> 
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> 

Ok

Thanks,
Richard 

> gcc/ChangeLog:
> 
>   * tree-vect-stmts.cc (vectorizable_internal_function): Add type
>   size check for when vectype_out doesn't participate in the optab query.
>   (vectorizable_call): Remove the type size check.
> 
> Signed-off-by: Pan Li 
> ---
> gcc/tree-vect-stmts.cc | 22 +-
> 1 file changed, 9 insertions(+), 13 deletions(-)
> 
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index a9200767f67..799b4ab10c7 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -1420,8 +1420,17 @@ vectorizable_internal_function (combined_fn cfn, tree 
> fndecl,
>   const direct_internal_fn_info &info = direct_internal_fn (ifn);
>   if (info.vectorizable)
>{
> +  bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
>  tree type0 = (info.type0 < 0 ? vectype_out : vectype_in);
>  tree type1 = (info.type1 < 0 ? vectype_out : vectype_in);
> +
> +  /* The type size of both the vectype_in and vectype_out should be
> + exactly the same when vectype_out isn't participating the optab.
> + While there is no restriction for type size when vectype_out
> + is part of the optab query.  */
> +  if (type0 != vectype_out && type1 != vectype_out && !same_size_p)
> +return IFN_LAST;
> +
>  if (direct_internal_fn_supported_p (ifn, tree_pair (type0, type1),
>  OPTIMIZE_FOR_SPEED))
>return ifn;
> @@ -3361,19 +3370,6 @@ vectorizable_call (vec_info *vinfo,
> 
>   return false;
> }
> -  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
> - just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
> - are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
> - by a pack of the two vectors into an SI vector.  We would need
> - separate code to handle direct VnDI->VnSI IFN_CTZs.  */
> -  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
> -{
> -  if (dump_enabled_p ())
> -dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> - "mismatched vector sizes %T and %T\n",
> - vectype_in, vectype_out);
> -  return false;
> -}
> 
>   if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
>   != VECTOR_BOOLEAN_TYPE_P (vectype_in))
> -- 
> 2.34.1
>
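
The rule the patch describes can be modeled outside of GCC as a small predicate. The sketch below is not GCC code: `struct vec_type`, `struct ifn_info` and `call_size_ok` are hypothetical names, and a negative `type0`/`type1` is assumed to mean "use the output type for the optab query", mirroring the diff above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a vector type: only its size in bits matters.  */
struct vec_type
{
  unsigned size_bits;
};

/* Hypothetical stand-in for direct_internal_fn_info: a negative index is
   assumed to select the output type for the optab query, a non-negative
   one the input type.  */
struct ifn_info
{
  int type0;
  int type1;
};

/* Return true if the call passes the size check sketched in the patch:
   input and output sizes may differ only when the output type takes part
   in the optab query.  */
bool
call_size_ok (const struct ifn_info *info,
              const struct vec_type *in, const struct vec_type *out)
{
  bool same_size_p = in->size_bits == out->size_bits;
  bool out_in_query = info->type0 < 0 || info->type1 < 0;

  return out_in_query || same_size_p;
}
```

Under these assumptions an SF => DI conversion is accepted only when the output type is one of the two types handed to the optab query; otherwise the sizes must match.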

RE: [PATCH] RISC-V: Fix bug of AVL propagation PASS

2023-11-02 Thread Li, Pan2

Committed, thanks Robin.

Pan

-Original Message-
From: Robin Dapp  
Sent: Thursday, November 2, 2023 7:34 PM
To: Juzhe-Zhong ; gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; kito.ch...@gmail.com; kito.ch...@sifive.com; 
jeffreya...@gmail.com
Subject: Re: [PATCH] RISC-V: Fix bug of AVL propagation PASS

LGTM.

Regards
 Robin

RE: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

2023-11-02 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zhong 
Sent: Thursday, November 2, 2023 8:04 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; Li, Pan2 ; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec 
iterator

lgtm
 Replied Message 
From
pan2...@intel.com<mailto:pan2...@intel.com>
Date
11/02/2023 19:48
To
gcc-patches@gcc.gnu.org<mailto:gcc-patches@gcc.gnu.org>
Cc
juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>,
pan2...@intel.com<mailto:pan2...@intel.com>,
yanzhang.w...@intel.com<mailto:yanzhang.w...@intel.com>,
kito.ch...@gmail.com<mailto:kito.ch...@gmail.com>
Subject
[PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

RE: [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits [PR111720]

2023-11-02 Thread Li, Pan2

Thanks Richard B for comments.

> when there are integer modes for the vector modes you now go a different path,
> a little less "regressing" would be to write it as
> 
>   if (int_mode_for_mode (src_mode).exists (&src_int_mode)
>       && int_mode_for_mode (mode).exists (&int_mode))
>     {
>       ... old code ...
>     }
>   else if (VECTOR_MODE_P (mode) && VECTOR_MODE_P (src_mode))
>     {
>       ... new code ...
>     }
>   else
>     return NULL_RTX;

That makes sense to me, will update it in V2.

> so you're really expecting to generate a subreg here?  Given "vector
> register layout"
> isn't something that's very well defined I fear it's going to be
> difficult to guarantee
> the desired semantics of this function.  IIRC powerpc64le has big-endian lane
> order for example.

This could indeed be a problem here; I need to give more consideration to the 
different backends.

Pan


-Original Message-
From: Richard Biener  
Sent: Thursday, November 2, 2023 4:20 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; jeffreya...@gmail.com; 
richard.sandif...@arm.com
Subject: Re: [PATCH v1] EXPMED: Allow vector mode for DSE extract_low_bits 
[PR111720]

On Thu, Nov 2, 2023 at 4:15 AM  wrote:
>
> From: Pan Li 
>
> extract_low_bits only tries the scalar mode if the bitsizes of
> mode and src_mode are not equal. When a vector mode is given
> from get_stored_val in DSE, it will always fail and return NULL_RTX.
>
> This patch allows vector modes in extract_low_bits
> if and only if the size of mode is less than or equal to the size of
> src_mode.
>
> Given the example code below with --param=riscv-autovec-preference=fixed-vlmax.
>
> vuint8m1_t test () {
>   uint8_t arr[32] = {
> 1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
> 1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>   };
>
>   return __riscv_vle8_v_u8m1(arr, 32);
> }
>
> Before this patch:
>
> test:
>   lui     a5,%hi(.LANCHOR0)
>   addi    sp,sp,-32
>   addi    a5,a5,%lo(.LANCHOR0)
>   li      a3,32
>   vl2re64.v   v2,0(a5)
>   vsetvli zero,a3,e8,m1,ta,ma
>   vs2r.v  v2,0(sp) <== Unnecessary store to stack
>   vle8.v  v1,0(sp) <== Ditto
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr      ra
>
> After this patch:
>
> test:
>   lui     a5,%hi(.LANCHOR0)
>   addi    a5,a5,%lo(.LANCHOR0)
>   li      a4,32
>   addi    sp,sp,-32
>   vsetvli zero,a4,e8,m1,ta,ma
>   vle8.v  v1,0(a5)
>   vs1r.v  v1,0(a0)
>   addi    sp,sp,32
>   jr      ra
>
> The tests below passed with this patch:
>
> * The x86 bootstrap and regression test.
> * The aarch64 regression test.
> * The risc-v regression test.
>
> PR target/111720
>
> gcc/ChangeLog:
>
> * expmed.cc (extract_low_bits): Allow vector mode if the
> mode size is less than or equal to src_mode.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-8.c: New test.
> * gcc.target/riscv/rvv/base/pr111720-9.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/expmed.cc | 44 ---
>  .../gcc.target/riscv/rvv/base/pr111720-0.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-1.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-10.c   | 18 
>  .../gcc.target/riscv/rvv/base/pr111720-2.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-3.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-4.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-5.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-6.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-7.c| 21 +
>  .../gcc.target/riscv/rvv/base/pr111720-8.c| 18 
>  .../gcc.target/riscv/rvv/base/pr111720-9.c| 15 +++
>  12 files changed, 227 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr111720-0.c
>  create mode 100644 gcc/testsuite/gcc.target/r
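
The size condition in the patch description, together with the lane-order concern raised in the review, can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions: `forward_low_bytes` is a hypothetical helper, not anything in expmed.cc or dse.cc, and it assumes the low part of the stored value sits at byte offset 0, which is exactly the assumption that may not hold on targets with a different lane order.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model of forwarding a stored value to a later, narrower read at the
   same address.  The forward is allowed only when the read is no wider
   than the store; otherwise it returns false, the analogue of NULL_RTX.  */
bool
forward_low_bytes (const unsigned char *stored, size_t store_size,
                   unsigned char *read_out, size_t read_size)
{
  if (read_size > store_size)
    return false;

  /* Assumption: the low part lives at offset 0 of the stored bytes.  */
  memcpy (read_out, stored, read_size);
  return true;
}
```

A 16-byte read after a 32-byte store is forwarded in this model, while a 64-byte read after the same store is rejected.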

RE: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

2023-11-02 Thread Li, Pan2

Thanks Patrick.

It is caused by the expand modes being enabled while the underlying codegen is 
not implemented yet. I will revert it first to unblock others and fix it ASAP.

Pan

From: Patrick O'Neill 
Sent: Friday, November 3, 2023 6:57 AM
To: Li, Pan2 ; juzhe.zhong 
Cc: gcc-patches@gcc.gnu.org; Wang, Yanzhang ; 
kito.ch...@gmail.com; gnu-toolchain 
Subject: Re: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec 
iterator


Hi Pan,

This patch is causing new failures (ICEs) on trunk:
https://github.com/patrick-rivos/gcc-postcommit-ci/issues/110

Pre-commit CI run:
https://github.com/ewlu/gcc-precommit-ci/issues/553#issuecomment-1790688172

New rv32gcv failures:

FAIL: gcc.dg/vect/fast-math-bb-slp-call-2.c (internal compiler error: in 
expand_vec_lrint, at config/riscv/riscv-v.cc:4134)

FAIL: gcc.dg/vect/fast-math-bb-slp-call-2.c (test for excess errors)

FAIL: gcc.dg/vect/fast-math-vect-call-2.c (internal compiler error: in 
expand_vec_lrint, at config/riscv/riscv-v.cc:4134)

FAIL: gcc.dg/vect/fast-math-vect-call-2.c (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -O0  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O0  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -O1  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O1  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -O2  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O2  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -O3 -g  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O3 -g  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -Os  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -Os  (test for excess errors)

New rv64gcv failures:

FAIL: gfortran.dg/pr32533.f90   -O0  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O0  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -O1  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O1  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -O2  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O2  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -O3 -g  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -O3 -g  (test for excess errors)

FAIL: gfortran.dg/pr32533.f90   -Os  (internal compiler error: in 
expand_vec_lround, at config/riscv/riscv-v.cc:4144)

FAIL: gfortran.dg/pr32533.f90   -Os  (test for excess errors)

Please let me know if you need any additional information.

Thanks,
Patrick
On 11/2/23 05:13, Li, Pan2 wrote:
Committed, thanks Juzhe.

Pan

From: juzhe.zhong <mailto:juzhe.zh...@rivai.ai>
Sent: Thursday, November 2, 2023 8:04 PM
To: Li, Pan2 <mailto:pan2...@intel.com>
Cc: gcc-patches@gcc.gnu.org<mailto:gcc-patches@gcc.gnu.org>; Li, Pan2 
<mailto:pan2...@intel.com>; Wang, Yanzhang 
<mailto:yanzhang.w...@intel.com>; 
kito.ch...@gmail.com<mailto:kito.ch...@gmail.com>
Subject: Re: [PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec 
iterator

lgtm
 Replied Message 
From
pan2...@intel.com<mailto:pan2...@intel.com>
Date
11/02/2023 19:48
To
gcc-patches@gcc.gnu.org<mailto:gcc-patches@gcc.gnu.org>
Cc
juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>,
pan2...@intel.com<mailto:pan2...@intel.com>,
yanzhang.w...@intel.com<mailto:yanzhang.w...@intel.com>,
kito.ch...@gmail.com<mailto:kito.ch...@gmail.com>
Subject
[PATCH v1] RISC-V: Refactor prefix [I/L/LL] rounding API autovec iterator

RE: [tree-optimization/111721 V2] VECT: Support SLP for MASK_LEN_GATHER_LOAD with dummy mask

2023-11-03 Thread Li, Pan2

Committed as it passed the aarch64 regression test, thanks Richard.

Pan

-Original Message-
From: Richard Biener  
Sent: Friday, November 3, 2023 3:36 PM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; richard.sandif...@arm.com
Subject: Re: [tree-optimization/111721 V2] VECT: Support SLP for 
MASK_LEN_GATHER_LOAD with dummy mask

On Fri, 3 Nov 2023, Juzhe-Zhong wrote:

> This patch fixes following FAILs for RVV:
> FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects  scan-tree-dump 
> vect "Loop contains only SLP stmts"
> FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP 
> stmts"
> 
> Bootstrap on X86 and regtest passed.
> 
> Ok for trunk ?

OK.  We can walk back if problems with SVE appear.

Thanks,
Richard.

> PR tree-optimization/111721
> 
> gcc/ChangeLog:
> 
> * tree-vect-slp.cc (vect_get_and_check_slp_defs): Support SLP for 
> dummy mask -1.
> * tree-vect-stmts.cc (vectorizable_load): Ditto.
> 
> ---
>  gcc/tree-vect-slp.cc   | 5 ++---
>  gcc/tree-vect-stmts.cc | 5 +++--
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 43d742e3c92..6b8a7b628b6 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -759,9 +759,8 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned 
> char swap,
> if ((dt == vect_constant_def
>  || dt == vect_external_def)
> && !GET_MODE_SIZE (vinfo->vector_mode).is_constant ()
> -   && (TREE_CODE (type) == BOOLEAN_TYPE
> -   || !can_duplicate_and_interleave_p (vinfo, stmts.length (),
> -   type)))
> +   && TREE_CODE (type) != BOOLEAN_TYPE
> +   && !can_duplicate_and_interleave_p (vinfo, stmts.length (), type))
>   {
> if (dump_enabled_p ())
>   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 6ce4868d3e1..8c92bd5d931 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -9825,6 +9825,7 @@ vectorizable_load (vec_info *vinfo,
>  
>tree mask = NULL_TREE, mask_vectype = NULL_TREE;
>int mask_index = -1;
> +  slp_tree slp_op = NULL;
>if (gassign *assign = dyn_cast <gassign *> (stmt_info->stmt))
>  {
>scalar_dest = gimple_assign_lhs (assign);
> @@ -9861,7 +9862,7 @@ vectorizable_load (vec_info *vinfo,
>   mask_index = vect_slp_child_index_for_operand (call, mask_index);
>if (mask_index >= 0
> && !vect_check_scalar_mask (vinfo, stmt_info, slp_node, mask_index,
> -   &mask, NULL, &mask_dt, &mask_vectype))
> +   &mask, &slp_op, &mask_dt, &mask_vectype))
>   return false;
>  }
>  
> @@ -10046,7 +10047,7 @@ vectorizable_load (vec_info *vinfo,
>  {
>if (slp_node
> && mask
> -   && !vect_maybe_update_slp_op_vectype (SLP_TREE_CHILDREN (slp_node)[0],
> +   && !vect_maybe_update_slp_op_vectype (slp_op,
>   mask_vectype))
>   {
> if (dump_enabled_p ())
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

RE: [PATCH v1] RISC-V: Remove HF modes of FP to INT rounding autovec

2023-11-03 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: 钟居哲 
Sent: Saturday, November 4, 2023 9:43 AM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Remove HF modes of FP to INT rounding autovec

LGTM.


juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>

From: pan2.li<mailto:pan2...@intel.com>
Date: 2023-11-04 09:41
To: gcc-patches<mailto:gcc-patches@gcc.gnu.org>
CC: juzhe.zhong<mailto:juzhe.zh...@rivai.ai>; 
pan2.li<mailto:pan2...@intel.com>; 
yanzhang.wang<mailto:yanzhang.w...@intel.com>; 
kito.cheng<mailto:kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Remove HF modes of FP to INT rounding autovec
From: Pan Li <pan2...@intel.com>

The [i|l|ll][rint|round|ceil|floor] internal functions are
defined with DEF_INTERNAL_FLT_FN instead of DEF_INTERNAL_FLT_FLOATN_FN.
As a result, the *f16 (N=16 of FLOATN) variants of these functions are not
available when trying to get the ifn from the given cfn in
vectorizable_call. That is:

BUILT_IN_LRINTF16 => IFN_LAST (should be IFN_LRINT here)
BUILT_IN_RINTF16 => IFN_RINT

It is better to remove the FP16-related modes until the additional
middle-end support is ready. This patch removes the FP16
modes and adds some comments.

gcc/ChangeLog:

* config/riscv/vector-iterators.md: Remove HF modes.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/vector-iterators.md | 59 +---
1 file changed, 2 insertions(+), 57 deletions(-)

diff --git a/gcc/config/riscv/vector-iterators.md 
b/gcc/config/riscv/vector-iterators.md
index f2d9f60b631..e80eaedc4b3 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -3221,20 +3221,15 @@ (define_mode_attr vnnconvert [
;; V_F2SI_CONVERT: (HF, SF, DF) => SI
;; V_F2DI_CONVERT: (HF, SF, DF) => DI
;;
+;; HF requires additional support from internal function, aka
+;; gcc/internal-fn.def, remove HF shortly until the middle-end is ready.
(define_mode_attr V_F2SI_CONVERT [
-  (RVVM4HF "RVVM8SI") (RVVM2HF "RVVM4SI") (RVVM1HF "RVVM2SI")
-  (RVVMF2HF "RVVM1SI") (RVVMF4HF "RVVMF2SI")
-
   (RVVM8SF "RVVM8SI") (RVVM4SF "RVVM4SI") (RVVM2SF "RVVM2SI")
   (RVVM1SF "RVVM1SI") (RVVMF2SF "RVVMF2SI")
   (RVVM8DF "RVVM4SI") (RVVM4DF "RVVM2SI") (RVVM2DF "RVVM1SI")
   (RVVM1DF "RVVMF2SI")
-  (V1HF "V1SI") (V2HF "V2SI") (V4HF "V4SI") (V8HF "V8SI") (V16HF "V16SI")
-  (V32HF "V32SI") (V64HF "V64SI") (V128HF "V128SI") (V256HF "V256SI")
-  (V512HF "V512SI") (V1024HF "V1024SI")
-
   (V1SF "V1SI") (V2SF "V2SI") (V4SF "V4SI") (V8SF "V8SI") (V16SF "V16SI")
   (V32SF "V32SI") (V64SF "V64SI") (V128SF "V128SI") (V256SF "V256SI")
   (V512SF "V512SI") (V1024SF "V1024SI")
@@ -3245,19 +3240,12 @@ (define_mode_attr V_F2SI_CONVERT [
])
(define_mode_attr v_f2si_convert [
-  (RVVM4HF "rvvm8si") (RVVM2HF "rvvm4si") (RVVM1HF "rvvm2si")
-  (RVVMF2HF "rvvm1si") (RVVMF4HF "rvvmf2si")
-
   (RVVM8SF "rvvm8si") (RVVM4SF "rvvm4si") (RVVM2SF "rvvm2si")
   (RVVM1SF "rvvm1si") (RVVMF2SF "rvvmf2si")
   (RVVM8DF "rvvm4si") (RVVM4DF "rvvm2si") (RVVM2DF "rvvm1si")
   (RVVM1DF "rvvmf2si")
-  (V1HF "v1si") (V2HF "v2si") (V4HF "v4si") (V8HF "v8si") (V16HF "v16si")
-  (V32HF "v32si") (V64HF "v64si") (V128HF "v128si") (V256HF "v256si")
-  (V512HF "v512si") (V1024HF "v1024si")
-
   (V1SF "v1si") (V2SF "v2si") (V4SF "v4si") (V8SF "v8si") (V16SF "v16si")
   (V32SF "v32si") (V64SF "v64si") (V128SF "v128si") (V256SF "v256si")
   (V512SF "v512si") (V1024SF "v1024si")
@@ -3268,9 +3256,6 @@ (define_mode_attr v_f2si_convert [
])
(define_mode_iterator V_VLS_F_CONVERT_SI [
-  (RVVM4HF "TARGET_ZVFH") (RVVM2HF "TARGET_ZVFH") (RVVM1HF "TARGET_ZVFH")
-  (RVVMF2HF "TARGET_ZVFH") (RVVMF4HF "TARGET_ZVFH && TARGET_MIN_VLEN > 32")
-
   (RVVM8SF "TARGET_VECTOR_ELEN_FP_32") (RVVM4SF "TARGET_VECTOR_ELEN_FP_32")
   (RVVM2SF "TARGET_VECTOR_ELEN_FP_32") (RVVM1SF "TARGET_VECTOR_ELEN_FP_32")
   (RVVMF2SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
@@ -3280,18 +3265,6 @@ (define_mode_iterator V_VLS_F_CONVERT_SI [
   (RVVM2DF "TARGET_VECTOR_ELEN_FP_64")
   (RVVM1DF "TAR
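
The lookup gap described in the commit message can be mimicked with a small table. This is a sketch, not the contents of gcc/internal-fn.def: the enum values, `struct ifn_family`, `families` and `lookup_ifn` are hypothetical, and `has_floatn` is assumed to stand for "defined with DEF_INTERNAL_FLT_FLOATN_FN".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of the internal function codes.  */
enum ifn { IFN_RINT, IFN_LRINT, IFN_LAST };

struct ifn_family
{
  const char *base;   /* builtin base name, e.g. "rint" */
  enum ifn ifn;       /* internal function it maps to */
  bool has_floatn;    /* assumed: true models DEF_INTERNAL_FLT_FLOATN_FN */
};

static const struct ifn_family families[] = {
  { "rint", IFN_RINT, true },
  { "lrint", IFN_LRINT, false },
};

/* The plain float/double/long double suffixes every family covers.  */
static bool
plain_float_suffix_p (const char *suffix)
{
  return strcmp (suffix, "") == 0
         || strcmp (suffix, "f") == 0
         || strcmp (suffix, "l") == 0;
}

/* Map a builtin (base name plus suffix) to an internal function.  Any
   other suffix, such as "f16", is treated as a FLOATN variant and only
   resolves for families that have FLOATN support.  */
enum ifn
lookup_ifn (const char *base, const char *suffix)
{
  for (size_t i = 0; i < sizeof families / sizeof families[0]; i++)
    if (strcmp (families[i].base, base) == 0)
      {
        if (plain_float_suffix_p (suffix) || families[i].has_floatn)
          return families[i].ifn;
        return IFN_LAST;
      }
  return IFN_LAST;
}
```

In this model `lookup_ifn ("lrint", "f16")` falls through to `IFN_LAST` while `lookup_ifn ("rint", "f16")` resolves, which matches the two mappings listed in the commit message.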

RE: [PATCH v1] RISC-V: Support FP rint to i/l/ll diff size autovec

2023-11-05 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zhong 
Sent: Sunday, November 5, 2023 5:40 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; Li, Pan2 ; Wang, Yanzhang 
; kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Support FP rint to i/l/ll diff size autovec

lgtm
 Replied Message 
From
pan2...@intel.com<mailto:pan2...@intel.com>
Date
11/05/2023 17:30
To
gcc-patches@gcc.gnu.org<mailto:gcc-patches@gcc.gnu.org>
Cc
juzhe.zh...@rivai.ai<mailto:juzhe.zh...@rivai.ai>,
pan2...@intel.com<mailto:pan2...@intel.com>,
yanzhang.w...@intel.com<mailto:yanzhang.w...@intel.com>,
kito.ch...@gmail.com<mailto:kito.ch...@gmail.com>
Subject
[PATCH v1] RISC-V: Support FP rint to i/l/ll diff size autovec

RE: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

2024-03-01 Thread Li, Pan2

Sure thing.

Pan

-Original Message-
From: Vineet Gupta  
Sent: Saturday, March 2, 2024 3:00 AM
To: Li, Pan2 ; Kito Cheng ; 钟居哲 

Cc: gcc-patches ; Wang, Yanzhang 
; rdapp.gcc ; Jeff Law 

Subject: Re: [PATCH v3] RISC-V: Introduce gcc option mrvv-vector-bits for RVV

Hi Pan,

On 2/28/24 17:23, Li, Pan2 wrote:
>
> Personally I prefer to remove --param=riscv-autovec-preference=none
> and only allow mrvv-vector-bits, to avoid the (possibly) tricky
> semantics of the none preference. However, let's wait for a while in
> case there are comments from others.
>

We are very interested in this topic. Could you please CC me and Palmer
for future versions of the patchset.

Thx,
-Vineet

>  
>
> Pan
>
>  
>
> *From:*Kito Cheng 
> *Sent:* Wednesday, February 28, 2024 10:55 PM
> *To:* 钟居哲
> *Cc:* Li, Pan2 ; gcc-patches
> ; Wang, Yanzhang ;
> rdapp.gcc ; Jeff Law 
> *Subject:* Re: Re: [PATCH v3] RISC-V: Introduce gcc option
> mrvv-vector-bits for RVV
>
>  
>
> Hmm, maybe only keep --param=riscv-autovec-preference=none and remove
> other two if we think that might still useful? But anyway I have no
> strong opinion to keep that, I mean I am ok to remove whole
> --param=riscv-autovec-preference.
>
>  
>
> 钟居哲 wrote on Wed, Feb 28, 2024 at 21:59:
>
> I think it makes more sense to remove
> --param=riscv-autovec-preference and add -mrvv-vector-bits
>
>  
>
> 
>
> juzhe.zh...@rivai.ai
>
>  
>
> *From:* Kito Cheng <mailto:kito.ch...@gmail.com>
>
> *Date:* 2024-02-28 20:56
>
> *To:* pan2.li <mailto:pan2...@intel.com>
>
> *CC:* gcc-patches <mailto:gcc-patches@gcc.gnu.org>;
> juzhe.zhong <mailto:juzhe.zh...@rivai.ai>; yanzhang.wang
> <mailto:yanzhang.w...@intel.com>; rdapp.gcc
> <mailto:rdapp@gmail.com>; jeffreyalaw
> <mailto:jeffreya...@gmail.com>
>
> *Subject:* Re: [PATCH v3] RISC-V: Introduce gcc option
> mrvv-vector-bits for RVV
>
> Take one more look, I think this option should work and integrate with
> --param=riscv-autovec-preference= since they have similar jobs but
> slightly different.
>
> We have 3 values for --param=riscv-autovec-preference=: none, scalable
> and fixed-vlmax
>
> -mrvv-vector-bits=scalable works like
> --param=riscv-autovec-preference=scalable and
> -mrvv-vector-bits=zvl works like
> --param=riscv-autovec-preference=fixed-vlmax.
>
> So I think...we need to do some conflict check, like:
>
> -mrvv-vector-bits=zvl can't work with
> --param=riscv-autovec-preference=scalable
> -mrvv-vector-bits=scalable can't work with
> --param=riscv-autovec-preference=fixed-vlmax
>
> but it may not just alias since there are some useful combinations like:
>
> -mrvv-vector-bits=zvl with --param=riscv-autovec-preference=none:
> NO auto vectorization but intrinsic code still could benefit from the
> -mrvv-vector-bits=zvl option.
>
> -mrvv-vector-bits=scalable with --param=riscv-autovec-preference=none
> Should still work for VLS code gen, but just disable auto
> vectorization per the option semantic.
>
> However here is something we need to fix, since
> --param=riscv-autovec-preference=none still disables VLS code gen for
> now, you can see some example here:
> https://godbolt.org/z/fMTr3eW7K
>
> But I think it's really the right behavior here, this part might need
> to be fixed in vls_mode_valid_p and some other places.
>
> Anyway I think we need to check all use sites with RVV_FIXED_VLMAX and
> RVV_SCALABLE, and need to make sure all use sites of RVV_FIXED_VLMAX
> also checked with RVV_VECTOR_BITS_ZVL.
>
>  
>
>  
>
>  
>
> > -/* Return the VLEN value associated with -march.
>
> > +static int
>
> > +riscv_convert_vector_bits (int min_vlen)
>
>  
>
> Not sure if we really need this function, it seems it al
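
The conflict check Kito describes in the quoted mail could look roughly like the following. This is a sketch only: the enum and function names are hypothetical and are not the identifiers used in the RISC-V backend, and the two rejected combinations are taken directly from the mail.

```c
#include <stdbool.h>

/* Hypothetical mirror of --param=riscv-autovec-preference= values.  */
enum autovec_pref { PREF_NONE, PREF_SCALABLE, PREF_FIXED_VLMAX };

/* Hypothetical mirror of -mrvv-vector-bits= values.  */
enum vector_bits { BITS_SCALABLE, BITS_ZVL };

/* Return true when the two options contradict each other.  Every
   combination with PREF_NONE is treated as valid, since the mail lists
   those as useful.  */
bool
options_conflict_p (enum vector_bits bits, enum autovec_pref pref)
{
  if (bits == BITS_ZVL && pref == PREF_SCALABLE)
    return true;
  if (bits == BITS_SCALABLE && pref == PREF_FIXED_VLMAX)
    return true;
  return false;
}
```

A driver would presumably report an error when this returns true; whether the real option handling works this way is not confirmed by the thread.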

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-03-01 Thread Li, Pan2

Yeah, I discussed this fix with Robin offline.

> Yes, but what we set tieable is e.g. V4QI and V2SF.

That comes from a different code path.
Jeff would like to learn more about extract_low_bits: it first converts to 
int_mode and then calls tieable_p.
And I bet V4QI and V2SF come from the if condition for gen_lowpart.

--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode 
read_mode,
 copy_rtx (store_info->const_rhs));
   else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
 && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
-&& targetm.modes_tieable_p (read_mode, store_mode))  // <= V4QI and V2SF 
here.
+&& targetm.modes_tieable_p (read_mode, store_mode)
+&& validate_subreg (read_mode, store_mode, copy_rtx (store_info->rhs),
+   subreg_lowpart_offset (read_mode, store_mode)))
 read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
 read_reg = extract_low_bits (read_mode, store_mode,

Pan

-Original Message-
From: Robin Dapp  
Sent: Thursday, February 29, 2024 9:29 PM
To: Li, Pan2 ; Jeff Law ; 
gcc-patches@gcc.gnu.org
Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
richard.guent...@gmail.com; Wang, Yanzhang ; Liu, 
Hongtao 
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

On 2/29/24 02:38, Li, Pan2 wrote:
>> So it's going to check if V2SF can be tied to DI and V4QI with SI.  I 
>> suspect those are going to fail for RISC-V as those aren't tieable.
> 
> Yes, you are right. Different REG_CLASS are not allowed to be tieable in 
> RISC-V.
> 
> static bool
> riscv_modes_tieable_p (machine_mode mode1, machine_mode mode2)
> {
>   /* We don't allow different REG_CLASS modes tieable since it
>  will cause ICE in register allocation (RA).
>  E.g. V2SI and DI are not tieable.  */
>   if (riscv_v_ext_mode_p (mode1) != riscv_v_ext_mode_p (mode2))
> return false;
>   return (mode1 == mode2
>   || !(GET_MODE_CLASS (mode1) == MODE_FLOAT
>&& GET_MODE_CLASS (mode2) == MODE_FLOAT));
> }

Yes, but what we set tieable is e.g. V4QI and V2SF.

I suggested a target band-aid before:

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 799d7919a4a..982ca1a4250 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8208,6 +8208,11 @@ riscv_modes_tieable_p (machine_mode mode1, machine_mode 
mode2)
  E.g. V2SI and DI are not tieable.  */
   if (riscv_v_ext_mode_p (mode1) != riscv_v_ext_mode_p (mode2))
 return false;
+  if (GET_MODE_CLASS (GET_MODE_INNER (mode1)) == MODE_INT
+  && GET_MODE_CLASS (GET_MODE_INNER (mode2)) == MODE_FLOAT
+  && GET_MODE_SIZE (GET_MODE_INNER (mode1))
+   != GET_MODE_SIZE (GET_MODE_INNER (mode2)))
+return false;
   return (mode1 == mode2
  || !(GET_MODE_CLASS (mode1) == MODE_FLOAT
   && GET_MODE_CLASS (mode2) == MODE_FLOAT));

but I don't like that as it just works around something
that I didn't even understand fully...

Regards
 Robin
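
To make the V4QI/V2SF point concrete, here is a standalone model of the predicate quoted above plus the band-aid from the diff. It is a sketch under stated assumptions: `struct mode` and its fields are hypothetical stand-ins for machine_mode queries, and `tieable_p` only imitates the logic of riscv_modes_tieable_p without using any GCC internals.

```c
#include <stdbool.h>

enum mode_class { CLASS_INT, CLASS_FLOAT, CLASS_VECTOR_INT, CLASS_VECTOR_FLOAT };

/* Hypothetical description of a machine mode.  */
struct mode
{
  int id;                  /* unique per mode, models mode1 == mode2 */
  enum mode_class cls;     /* models GET_MODE_CLASS (mode) */
  bool is_rvv;             /* models riscv_v_ext_mode_p (mode) */
  enum mode_class inner;   /* element class: CLASS_INT or CLASS_FLOAT */
  unsigned inner_size;     /* element size in bytes */
};

/* Model of the tieable check.  With WITH_BAND_AID set, it also rejects an
   integer-element mode paired with a float-element mode whose element
   sizes differ, as in the diff above.  */
bool
tieable_p (const struct mode *m1, const struct mode *m2, bool with_band_aid)
{
  if (m1->is_rvv != m2->is_rvv)
    return false;

  if (with_band_aid
      && m1->inner == CLASS_INT
      && m2->inner == CLASS_FLOAT
      && m1->inner_size != m2->inner_size)
    return false;

  return (m1->id == m2->id
          || !(m1->cls == CLASS_FLOAT && m2->cls == CLASS_FLOAT));
}
```

In this model V2SI and DI are never tieable, while V4QI and V2SF are tieable unless the band-aid is enabled. Like the diff, the extra check is asymmetric: it only fires when the first mode has integer elements and the second has float elements.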

RE: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-03-01 Thread Li, Pan2

/match.pd
+++ b/gcc/match.pd
@@ -10276,3 +10276,32 @@ and,
   }
   (if (full_perm_p)
(vec_perm (op@3 @0 @1) @3 @2))
+
+#if GIMPLE
+
+/* Unsigned saturation add, aka:
+   SAT_ADDU = (X + Y) | - ((X + Y) < X) or
+   SAT_ADDU = (X + Y) | - ((X + Y) < Y).  */
+(simplify
+ (bit_ior:c (plus:c@2 @0 @1) (negate (convert (lt @2 @0))))
+   (if (optimize
+   && INTEGRAL_TYPE_P (type)
+   && types_match (type, TREE_TYPE (@0))
+   && types_match (type, TREE_TYPE (@1))
+   && TYPE_UNSIGNED (TREE_TYPE (@0))
+   && direct_internal_fn_supported_p (IFN_SAT_ADD, type, 
OPTIMIZE_FOR_BOTH))
+   (IFN_SAT_ADD @0 @1)))
+
+/* Unsigned saturation sub , aka
+   SAT_SUBU = x >= y ? x - y : 0.  */
+(simplify
+  (cond (ge @0 @1) (minus @0 @1) integer_zerop)
+(if (optimize
+   && INTEGRAL_TYPE_P (type)
+   && TYPE_UNSIGNED (TREE_TYPE (@0))
+   && types_match (type, TREE_TYPE (@0))
+   && types_match (type, TREE_TYPE (@1))
+   && direct_internal_fn_supported_p (IFN_SAT_SUB, type, 
OPTIMIZE_FOR_BOTH))
+(IFN_SAT_SUB @0 @1)))
+
+#endif
diff --git a/gcc/optabs.def b/gcc/optabs.def
index ad14f9328b9..bebe38c888b 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -111,15 +111,15 @@ OPTAB_NX(add_optab, "add$F$a3")
 OPTAB_NX(add_optab, "add$Q$a3")
 OPTAB_VL(addv_optab, "addv$I$a3", PLUS, "add", '3', gen_intv_fp_libfunc)
 OPTAB_VX(addv_optab, "add$F$a3")
-OPTAB_NL(ssadd_optab, "ssadd$Q$a3", SS_PLUS, "ssadd", '3', 
gen_signed_fixed_libfunc)
-OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, "usadd", '3', 
gen_unsigned_fixed_libfunc)
+OPTAB_NL(ssadd_optab, "ssadd$a3", SS_PLUS, "ssadd", '3', gen_int_libfunc)
+OPTAB_NL(usadd_optab, "usadd$a3", US_PLUS, "usadd", '3', gen_int_libfunc)
 OPTAB_NL(sub_optab, "sub$P$a3", MINUS, "sub", '3', gen_int_fp_fixed_libfunc)
 OPTAB_NX(sub_optab, "sub$F$a3")
 OPTAB_NX(sub_optab, "sub$Q$a3")
 OPTAB_VL(subv_optab, "subv$I$a3", MINUS, "sub", '3', gen_intv_fp_libfunc)
 OPTAB_VX(subv_optab, "sub$F$a3")
-OPTAB_NL(sssub_optab, "sssub$Q$a3", SS_MINUS, "sssub", '3', 
gen_signed_fixed_libfunc)
-OPTAB_NL(ussub_optab, "ussub$Q$a3", US_MINUS, "ussub", '3', 
gen_unsigned_fixed_libfunc)
+OPTAB_NL(sssub_optab, "sssub$a3", SS_MINUS, "sssub", '3', gen_int_libfunc)
+OPTAB_NL(ussub_optab, "ussub$a3", US_MINUS, "ussub", '3', gen_int_libfunc)
 OPTAB_NL(smul_optab, "mul$Q$a3", MULT, "mul", '3', gen_int_fp_fixed_libfunc)
 OPTAB_NX(smul_optab, "mul$P$a3")
 OPTAB_NX(smul_optab, "mul$F$a3")
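
For reference, the two scalar source forms that the match.pd comments describe can be written in plain C as below. This is a minimal sketch with hypothetical function names (`sat_addu32`, `sat_subu32`), fixed to 32-bit unsigned; it only shows the idioms, not how GCC lowers them.

```c
#include <stdint.h>

/* Unsigned saturating add: SAT_ADDU = (X + Y) | -((X + Y) < X).
   On wrap-around the comparison yields 1, its negation is all ones, and
   the OR forces the result to UINT32_MAX.  */
uint32_t
sat_addu32 (uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;
  return sum | -(uint32_t) (sum < x);
}

/* Unsigned saturating sub: SAT_SUBU = x >= y ? x - y : 0.  */
uint32_t
sat_subu32 (uint32_t x, uint32_t y)
{
  return x >= y ? x - y : 0;
}
```

The add form is branchless, while the sub form uses a conditional, which corresponds to the `cond` pattern in the second simplify rule.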

Pan

-Original Message-
From: Li, Pan2  
Sent: Tuesday, February 27, 2024 10:36 PM
To: Richard Biener ; Tamar Christina 

Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; kito.ch...@gmail.com; richard.sandiford@arm.com2; 
jeffreya...@gmail.com
Subject: RE: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation 
US_PLUS

Thanks Richard and Tamar for moving this forward.

> That said, I would like to see the bigger picture to be kept in mind
> before altering the GIMPLE IL.

> Adding an internal function for an already present optab is a
> no-brainer.  Adding a vectorizer
> and/or if-conversion pattern to make use of this during vectorization
> is existing practice.
> Adding pattern recognition to ISEL or widening-mul passes for
> instructions the CPU can do
> is existing practice and OK.

Thanks for explaining, got the point here.

> So I'd suggest writing some example of both signed and unsigned saturating 
> add and multiply

> Because signed addition, will likely require a branch and signed 
> multiplication would require a
> larger type.

Ack, will prepare one prototype validation patch for add, sub and mul (both 
unsigned and signed) soon.

Pan

-Original Message-
From: Richard Biener  
Sent: Tuesday, February 27, 2024 9:42 PM
To: Tamar Christina 
Cc: Li, Pan2 ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com; richard.sandiford@arm.com2; jeffreya...@gmail.com
Subject: Re: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation 
US_PLUS

On Tue, Feb 27, 2024 at 1:57 PM Tamar Christina  wrote:
>
> > Thanks Tamar.
> >
> > > Those two cases also *completely* stop vectorization because of either the
> > > control flow or the fact the vectorizer can't handle complex types.
> >
> > Yes, we eventually would like to vectorize the SAT ALU but we start with 
> > scalar part
> > first.

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-03-04 Thread Li, Pan2

Thanks Jeff for comments.

> But in the case of a vector modes, we can usually reinterpret the 
> underlying bits in whatever mode we want and do any of the usual 
> operations on those bits.

Yes, I think that is why we can allow vector modes in get_stored_val, if my 
understanding is correct.
The value in the different mode is then returned by gen_low_part. Unfortunately, 
some modes that are smaller than a full vector (like V2SF or V2QI for 
vlen=128) are considered invalid by validate_subreg, and the returned 
NULL_RTX results in the final ICE.

Thus, considering we are in stage 4, I wonder if it is an acceptable fix to 
find somewhere to filter out the invalid modes before going to gen_low_part.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Monday, March 4, 2024 6:47 AM
To: Robin Dapp ; Li, Pan2 ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

On 2/29/24 06:28, Robin Dapp wrote:
> On 2/29/24 02:38, Li, Pan2 wrote:
>>> So it's going to check if V2SF can be tied to DI and V4QI with SI.  I
>>> suspect those are going to fail for RISC-V as those aren't tieable.
>>
>> Yes, you are right. Different REG_CLASS are not allowed to be tieable in 
>> RISC-V.
>>
>> static bool
>> riscv_modes_tieable_p (machine_mode mode1, machine_mode mode2)
>> {
>>/* We don't allow different REG_CLASS modes tieable since it
>>   will cause ICE in register allocation (RA).
>>   E.g. V2SI and DI are not tieable.  */
>>if (riscv_v_ext_mode_p (mode1) != riscv_v_ext_mode_p (mode2))
>>  return false;
>>return (mode1 == mode2
>>|| !(GET_MODE_CLASS (mode1) == MODE_FLOAT
>> && GET_MODE_CLASS (mode2) == MODE_FLOAT));
>> }
> 
> Yes, but what we set tieable is e.g. V4QI and V2SF.
But in the case of a vector modes, we can usually reinterpret the 
underlying bits in whatever mode we want and do any of the usual 
operations on those bits.

In my mind that's fundamentally different than the int vs fp case.  If 
we have an integer value in an FP register, we can't really operate on 
the value in any sensible way without first copying it over to the 
integer register file and vice-versa.

Jeff
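
As a C-level analogy for the point above (this is not RTL and not GCC code; the
type v2sf_t and the helper names are made up for illustration), two packed
floats can be viewed as one 64-bit integer and operated on with ordinary
integer instructions. The sketch assumes a 4-byte IEEE float:

```c
#include <stdint.h>
#include <string.h>

/* Two packed floats, standing in for a V2SF-like value (hypothetical type).  */
typedef struct { float lane[2]; } v2sf_t;

/* View the 64 underlying bits as an integer (a DI-like view).  */
uint64_t
v2sf_bits (v2sf_t v)
{
  uint64_t bits;
  memcpy (&bits, &v, sizeof bits);
  return bits;
}

/* Rebuild the two-float value from its 64-bit integer view.  */
v2sf_t
v2sf_from_bits (uint64_t bits)
{
  v2sf_t v;
  memcpy (&v, &bits, sizeof v);
  return v;
}

/* Negate both lanes with one 64-bit XOR on the sign bits.  */
v2sf_t
v2sf_neg (v2sf_t v)
{
  return v2sf_from_bits (v2sf_bits (v) ^ UINT64_C (0x8000000080000000));
}
```

The same bits are simply reinterpreted; no value conversion between the float
and integer views takes place.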

RE: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-03-04 Thread Li, Pan2

00) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cgraphunit.cc:2587
#24 0x01a048a3 in compile_file () at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/toplev.cc:476
#25 0x01a079e9 in do_compile () at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/toplev.cc:2154
#26 0x01a07dee in toplev::main (this=0x7fffdcf2, argc=19, 
argv=0x7fffde28) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/toplev.cc:2310
#27 0x03ebcc5d in main (argc=19, argv=0x7fffde28) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/main.cc:39

BTW, does match.pd support a nested cond like below? I am debugging 
gimple_simplify_COND_EXPR to see why the pattern is not hit...
+(simplify
+  (cond
+(lt @0 integer_zerop)
+(plus:c @0 @1)
+(cond (lt @1 integer_zerop) @1 @0))
+  (IFN_SAT_ADD @0 @1))

Pan

-----Original Message-----
From: Richard Biener  
Sent: Monday, March 4, 2024 6:31 PM
To: Li, Pan2 
Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com; jeffreya...@gmail.com
Subject: Re: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation 
US_PLUS

On Sat, Mar 2, 2024 at 8:46 AM Li, Pan2  wrote:
>
> Hi Richard and Tamar,
>
> I gave DEF_INTERNAL_SIGNED_OPTAB_FN a try for SAT_ADD/SUB/MUL but ran into 
> some problems with match.pd.
>
> For unsigned SAT_ADD = (x + y) | - ((x + y) < x), the match.pd pattern can be 
> (bit_ior:c (plus:c@2 @0 @1) (negate (convert (lt @2 @0)))).
> For unsigned SAT_SUB = x >= y ? x - y : 0, the match.pd pattern can be (cond (ge 
> @0 @1) (minus @0 @1) integer_zerop).
>
> For signed SAT_ADD/SAT_SUB as below, it seems not easy to make the simplify 
> pattern work as expected.
> sint64_t sat_add (sint64_t x, sint64_t y)
> {
>   sint64_t a = x ^ y;
>   sint64_t add = x + y;
>   sint64_t b = add ^ x;
>
>   return (a < 0 || (a >= 0 && b >= 0)) ? add : (MAX_INT64 + (x < 0));
> }
>
> sint64_t sat_sub (sint64_t x, sint64_t y)
> {
>   sint64_t a = x ^ y;
>   sint64_t sub = x - y;
>   sint64_t b = sub ^ x;
>
>   return (a >= 0 || (a < 0 && b >= 0)) ? sub : (MAX_INT64 + (x < 0));
> }
>
> For SAT_MUL as below, looks we may need widen type. I am not sure if we can 
> leverage MUL_OVERFLOW or not in match.pd.
>
> uint32_t sat_mul (uint32_t x, uint32_t y)
> {
>   uint64_t mul = (uint64_t)x * (uint64_t)y;
>   return mul > UINT32_MAX ? UINT32_MAX : (uint32_t)mul;
> }
>
> sint32_t sat_mul (sint32_t x, sint32_t y)
> {
>   sint64_t mul = (sint64_t)x * (sint64_t)y;
>
>   return mul <= MAX_INT32 && mul >= MIN_INT32 ? mul : MAX_INT32 + ((x ^ y) < 0);
> }
>
> Below diff only contains unsigned SAT_ADD and SAT_SUB for prototype 
> validation.
> I will continue to try the rest part in match.pd and keep you posted.
>
> -
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 80efdf2b7e5..d9ad6fe2b58 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -132,6 +132,9 @@ extern void riscv_asm_output_external (FILE *, const 
> tree, const char *);
>  extern bool
>  riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
>  extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
> +extern void riscv_expand_usadd (rtx, rtx, rtx);
> +extern void riscv_expand_ussub (rtx, rtx, rtx);
>
>  #ifdef RTX_CODE
>  extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool 
> *invert_ptr = 0);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 5e984ee2a55..795462526df 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -10655,6 +10655,28 @@ riscv_vector_mode_supported_any_target_p 
> (machine_mode)
>return true;
>  }
>
> +/* Emit insn for the saturation addu, aka (x + y) | - ((x + y) < x).  */
> +void
> +riscv_expand_usadd (rtx dest, rtx x, rtx y)
> +{
> +  fprintf (stdout, "Hit riscv_expand_usadd.\n");
> +  // ToDo
> +}
> +
> +void
> +riscv_expand_ussub (rtx dest, rtx x, rtx y)
> +{
> +  fprintf (stdout, "Hit riscv_expand_ussub.\n");
> +  // ToDo
> +}
> +
>  /* Initialize the GCC target structure.  */
>  #undef TARGET_ASM_ALIGNED_HI_OP
>  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 1fec13092e2..e2dbadb3ead 100644
> --- a/gcc/config/riscv/riscv.m

RE: [PATCH v1] RISC-V: Cleanup unused code in riscv_v_adjust_bytesize [NFC]

2024-03-05 Thread Li, Pan2

Committed, thanks Juzhe.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Tuesday, March 5, 2024 5:15 PM
To: Li, Pan2 ; gcc-patches 
Cc: kito.cheng ; Wang, Yanzhang 
; Li, Pan2 
Subject: Re: [PATCH v1] RISC-V: Cleanup unused code in riscv_v_adjust_bytesize 
[NFC]

LGTM. Thanks for clean up.


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2024-03-05 16:59
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>; kito.cheng <kito.ch...@gmail.com>; 
yanzhang.wang <yanzhang.w...@intel.com>; Pan Li <pan2...@intel.com>
Subject: [PATCH v1] RISC-V: Cleanup unused code in riscv_v_adjust_bytesize [NFC]
From: Pan Li <pan2...@intel.com>

Clean up mode_size related code which is no longer used. The tests below
passed for this patch.

* The RVV full regression test.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_v_adjust_bytesize): Cleanup unused
mode_size related code.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv.cc | 4 
1 file changed, 4 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 56cd8d2c23f..691d967de29 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1527,10 +1527,6 @@ riscv_v_adjust_bytesize (machine_mode mode, int scale)
return BYTES_PER_RISCV_VECTOR;
   poly_int64 nunits = GET_MODE_NUNITS (mode);
-  poly_int64 mode_size = GET_MODE_SIZE (mode);
-
-  if (maybe_eq (mode_size, (uint16_t) -1))
- mode_size = riscv_vector_chunks * scale;
   if (nunits.coeffs[0] > 8)
return exact_div (nunits, 8);
--
2.34.1

RE: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-03-06 Thread Li, Pan2

Thanks Richard for comments.

> gen_int_libfunc will no longer make it emit libcalls for fixed point
> modes, so this can't be correct
> and there's no libgcc implementation for integer mode saturating ops,
> so it's pointless to emit calls
> to them.

Got the point here: OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, "usadd", 
'3', gen_unsigned_fixed_libfunc)
is designed for fixed-point modes and cannot cover integer modes right now.

Given we have saturating integer ALU operations like the ones below, could you 
help coach me on the most reasonable way to represent
them in the scalar as well as the vectorized part? Sorry, I am not familiar 
with this part and am still digging into how it works...

uint32_t sat_uadd (uint32_t a, uint32_t b)
{
  uint32_t add = a + b;
  return add | -(add < a);
}

sint32_t sat_sadd (sint32_t a, sint32_t b)
{
  sint32_t add = a + b;
  sint32_t x = a ^ b;
  sint32_t y = add ^ a;
  return x < 0 ? add : (y >= 0 ? add : INT32_MAX + (a < 0));
}

uint32_t sat_usub (uint32_t a, uint32_t b)
{
  return a >= b ? a - b : 0;
}

sint32_t sat_ssub (sint32_t a, sint32_t b)
{
  sint32_t sub = a - b;
  sint32_t x = a ^ b;
  sint32_t y = sub ^ a;
  return x >= 0 ? sub : (y >= 0 ? sub : INT32_MAX + (a < 0));
}

uint32_t sat_umul (uint32_t a, uint32_t b)
{
  uint64_t mul = (uint64_t)a * (uint64_t)b;

  return mul <= (uint64_t)UINT32_MAX ? (uint32_t)mul : UINT32_MAX;
}

sint32_t sat_smul (sint32_t a, sint32_t b)
{
  sint64_t mul = (sint64_t)a * (sint64_t)b;

  return mul >= (sint64_t)INT32_MIN && mul <= (sint64_t)INT32_MAX ? 
(sint32_t)mul : INT32_MAX + ((a ^ b) < 0);
}

uint32_t sat_udiv (uint32_t a, uint32_t b)
{
  return a / b; // never overflow
}

sint32_t sat_sdiv (sint32_t a, sint32_t b)
{
  return a == INT32_MIN && b == -1 ? INT32_MAX : a / b;
}

sint32_t sat_abs (sint32_t a)
{
  return a >= 0 ? a : (a == INT32_MIN ? INT32_MAX : -a);
}

Pan
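
For reference, here is a minimal, self-contained C sketch of the same
saturating semantics that avoids signed-overflow undefined behavior by
computing the signed cases in a 64-bit type. It is only an illustration of
the intended results, not the form match.pd would recognize; it uses the
standard int32_t/int64_t types instead of sint32_t/sint64_t, and the bodies
of the signed functions are an assumption about the intended behavior.

```c
#include <stdint.h>

/* Unsigned saturating add: wrap, then force all-ones when a carry occurred.  */
uint32_t
sat_uadd (uint32_t a, uint32_t b)
{
  uint32_t add = a + b;
  return add | -(uint32_t) (add < a);
}

/* Unsigned saturating subtract: clamp at zero.  */
uint32_t
sat_usub (uint32_t a, uint32_t b)
{
  return a >= b ? a - b : 0;
}

/* Clamp a 64-bit value into the int32_t range.  */
static int32_t
clamp_s32 (int64_t v)
{
  if (v > INT32_MAX)
    return INT32_MAX;
  if (v < INT32_MIN)
    return INT32_MIN;
  return (int32_t) v;
}

/* Signed saturating add, computed in a wider type.  */
int32_t
sat_sadd (int32_t a, int32_t b)
{
  return clamp_s32 ((int64_t) a + (int64_t) b);
}

/* Signed saturating subtract, computed in a wider type.  */
int32_t
sat_ssub (int32_t a, int32_t b)
{
  return clamp_s32 ((int64_t) a - (int64_t) b);
}

/* Unsigned saturating multiply: widen both operands before multiplying.  */
uint32_t
sat_umul (uint32_t a, uint32_t b)
{
  uint64_t mul = (uint64_t) a * (uint64_t) b;
  return mul <= UINT32_MAX ? (uint32_t) mul : UINT32_MAX;
}

/* Signed saturating multiply: widen both operands before multiplying.  */
int32_t
sat_smul (int32_t a, int32_t b)
{
  return clamp_s32 ((int64_t) a * (int64_t) b);
}
```

The widening in sat_umul and sat_smul has to happen before the multiply;
otherwise the product is already truncated to 32 bits when it is stored.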

-----Original Message-----
From: Richard Biener  
Sent: Tuesday, March 5, 2024 4:41 PM
To: Li, Pan2 
Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com; jeffreya...@gmail.com
Subject: Re: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation 
US_PLUS

On Tue, Mar 5, 2024 at 8:09 AM Li, Pan2  wrote:
>
> Thanks Richard for comments.
>
> > I do wonder what the existing usadd patterns with integer vector modes
> > in various targets do?
> > Those define_insn will at least not end up in the optab set I guess,
> > so they must end up
> > being either unused or used by explicit gen_* (via intrinsic
> > functions?) or by combine?
>
> For usadd with vector modes, I think the backend like RISC-V try to leverage 
> instructions
> like Vector Single-Width Saturating Add(aka vsaddu.vv/x/i).
>
> > I think simply changing gen_*_fixed_libfunc to gen_int_libfunc won't
> > work.  Since there's
> > no libgcc support I'd leave it as gen_*_fixed_libfunc thus no library
> > fallback for integers?
>
> Change to gen_int_libfunc follows other int optabs. I am not sure if it will 
> hit the standard name usaddm3 for vector mode.
> But the happy path for scalar modes works up to a point, please help to 
> correct me if any misunderstanding.

gen_int_libfunc will no longer make it emit libcalls for fixed point
modes, so this can't be correct
and there's no libgcc implementation for integer mode saturating ops,
so it's pointless to emit calls
to them.

> #0  riscv_expand_usadd (dest=0x76a8c7c8, x=0x76a8c798, 
> y=0x76a8c7b0) at 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:10662
> #1  0x029f142a in gen_usaddsi3 (operand0=0x76a8c7c8, 
> operand1=0x76a8c798, operand2=0x76a8c7b0) at 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/config/riscv/riscv.md:3848
> #2  0x01751e60 in insn_gen_fn::operator() rtx_def*> (this=0x4910e70 ) at 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/recog.h:441
> #3  0x0180f553 in maybe_gen_insn (icode=CODE_FOR_usaddsi3, nops=3, 
> ops=0x7fffd2c0) at 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs.cc:8232
> #4  0x0180fa42 in maybe_expand_insn (icode=CODE_FOR_usaddsi3, nops=3, 
> ops=0x7fffd2c0) at 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs.cc:8275
> #5  0x0180fade in expand_insn (icode=CODE_FOR_usaddsi3, nops=3, 
> ops=0x7fffd2c0) at 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs.cc:8306
> #6  0x015cebdc in expand_fn_using_insn (stmt=0x76a36480, 
> icode=CODE_FOR_usaddsi3, noutputs=1, ninputs=2) at 
> /home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/internal-fn.cc:254
> #7  0x015

RE: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-03-07 Thread Li, Pan2

Thanks a lot for the coaching, it really saved my day. I will give 
usadd/ssadd a try, covering both the scalar and vector (ISEL/widen_mult) modes, in v3.

Pan

-----Original Message-----
From: Richard Biener  
Sent: Thursday, March 7, 2024 4:41 PM
To: Li, Pan2 
Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; 
juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com; jeffreya...@gmail.com
Subject: Re: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation 
US_PLUS

On Thu, Mar 7, 2024 at 2:54 AM Li, Pan2  wrote:
>
> Thanks Richard for comments.
>
> > gen_int_libfunc will no longer make it emit libcalls for fixed point
> > modes, so this can't be correct
> > and there's no libgcc implementation for integer mode saturating ops,
> > so it's pointless to emit calls
> > to them.
>
> Got the point here: OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, 
> "usadd", '3', gen_unsigned_fixed_libfunc)
> is designed for fixed-point modes and cannot cover integer modes right now.

I think

OPTAB_NL(usadd_optab, "usadd$a3", US_PLUS, "usadd", '3',
gen_unsigned_fixed_libfunc)

would work fine (just dropping the $Q).

> Given we have saturating integer ALU operations like the ones below, could you 
> help coach me on the most reasonable way to represent
> them in the scalar as well as the vectorized part? Sorry, I am not familiar 
> with this part and am still digging into how it works...

As in your v2, .SAT_ADD for both sat_uadd and sat_sadd, similar for
the other cases.

As I said, use vectorizer patterns and possibly do instruction
selection at ISEL/widen_mult time.

Richard.
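
To make the vectorizer side of this suggestion concrete, below is a small
sketch of the kind of source loop such a vectorizer pattern would be aimed
at. Whether a given compiler version actually turns it into a saturating
vector instruction is not claimed here; the function names are made up for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchless unsigned saturating add, the same shape as sat_uadd.  */
static inline uint32_t
sat_uadd_u32 (uint32_t a, uint32_t b)
{
  uint32_t add = a + b;
  return add | -(uint32_t) (add < a);
}

/* Element-wise saturating add over arrays: each iteration is independent,
   so the loop body is a candidate for pattern recognition.  */
void
vec_sat_uadd (uint32_t *restrict out, const uint32_t *restrict x,
	      const uint32_t *restrict y, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = sat_uadd_u32 (x[i], y[i]);
}
```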

> uint32_t sat_uadd (uint32_t a, uint32_t b)
> {
>   uint32_t add = a + b;
>   return add | -(add < a);
> }
>
> sint32_t sat_sadd (sint32_t a, sint32_t b)
> {
>   sint32_t add = a + b;
>   sint32_t x = a ^ b;
>   sint32_t y = add ^ a;
>   return x < 0 ? add : (y >= 0 ? add : INT32_MAX + (a < 0));
> }
>
> uint32_t sat_usub (uint32_t a, uint32_t b)
> {
>   return a >= b ? a - b : 0;
> }
>
> sint32_t sat_ssub (sint32_t a, sint32_t b)
> {
>   sint32_t sub = a - b;
>   sint32_t x = a ^ b;
>   sint32_t y = sub ^ a;
>   return x >= 0 ? sub : (y >= 0 ? sub : INT32_MAX + (a < 0));
> }
>
> uint32_t sat_umul (uint32_t a, uint32_t b)
> {
>   uint64_t mul = (uint64_t)a * (uint64_t)b;
>
>   return mul <= (uint64_t)UINT32_MAX ? (uint32_t)mul : UINT32_MAX;
> }
>
> sint32_t sat_smul (sint32_t a, sint32_t b)
> {
>   sint64_t mul = (sint64_t)a * (sint64_t)b;
>
>   return mul >= (sint64_t)INT32_MIN && mul <= (sint64_t)INT32_MAX ? 
> (sint32_t)mul : INT32_MAX + ((a ^ b) < 0);
> }
>
> uint32_t sat_udiv (uint32_t a, uint32_t b)
> {
>   return a / b; // never overflow
> }
>
> sint32_t sat_sdiv (sint32_t a, sint32_t b)
> {
>   return a == INT32_MIN && b == -1 ? INT32_MAX : a / b;
> }
>
> sint32_t sat_abs (sint32_t a)
> {
>   return a >= 0 ? a : (a == INT32_MIN ? INT32_MAX : -a);
> }
>
> Pan
>
> -----Original Message-----
> From: Richard Biener 
> Sent: Tuesday, March 5, 2024 4:41 PM
> To: Li, Pan2 
> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; 
> juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
> kito.ch...@gmail.com; jeffreya...@gmail.com
> Subject: Re: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation 
> US_PLUS
>
> On Tue, Mar 5, 2024 at 8:09 AM Li, Pan2  wrote:
> >
> > Thanks Richard for comments.
> >
> > > I do wonder what the existing usadd patterns with integer vector modes
> > > in various targets do?
> > > Those define_insn will at least not end up in the optab set I guess,
> > > so they must end up
> > > being either unused or used by explicit gen_* (via intrinsic
> > > functions?) or by combine?
> >
> > For usadd with vector modes, I think the backend like RISC-V try to 
> > leverage instructions
> > like Vector Single-Width Saturating Add(aka vsaddu.vv/x/i).
> >
> > > I think simply changing gen_*_fixed_libfunc to gen_int_libfunc won't
> > > work.  Since there's
> > > no libgcc support I'd leave it as gen_*_fixed_libfunc thus no library
> > > fallback for integers?
> >
> > Change to gen_int_libfunc follows other int optabs. I am not sure if it 
> > will hit the standard name usaddm3 for vector mode.
> > But the happy path for scalar modes works up to a point, please help to 
> > correct me if any misunderstanding.
>
> gen_int_libfunc will no longer make it emit libcalls for fixed point
> modes, so this can't be correct
> and there's no libgcc implementation for integer mode saturating ops,
> so it's pointless to e

RE: [PATCH v1] VECT: Bugfix ICE for vectorizable_store when both len and mask

2024-03-09 Thread Li, Pan2

Thanks Richard for comments.

> That said, the assert you run into should be only asserted during transform,
> not during analysis.
Good to learn that the assertion is only valid during transform. I guess we may 
have almost
the same case in vectorizable_load. I will try allowing the assertion only 
during transform,
check whether there are any regressions, and send the v2.

> It possibly was before Robin's costing reorg?
Sorry, I am not sure which commit from Robin that is.

Pan

-----Original Message-----
From: Richard Biener  
Sent: Friday, March 8, 2024 10:03 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Wang, 
Yanzhang ; rdapp@gmail.com; jeffreya...@gmail.com
Subject: Re: [PATCH v1] VECT: Bugfix ICE for vectorizable_store when both len 
and mask

On Fri, Mar 8, 2024 at 2:59 PM Richard Biener
 wrote:
>
> On Fri, Mar 8, 2024 at 1:04 AM  wrote:
> >
> > From: Pan Li 
> >
> > This patch would like to fix one ICE in vectorizable_store for both the
> > loop_masks and loop_lens.  The ICE looks like below with "-march=rv64gcv 
> > -O3".
> >
> > during GIMPLE pass: vect
> > test.c: In function ‘d’:
> > test.c:6:6: internal compiler error: in vectorizable_store, at
> > tree-vect-stmts.cc:8691
> > 6 | void d() {
> >   |  ^
> > 0x37a6f2f vectorizable_store
> > .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
> > 0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
> > _slp_tree*, _slp_instance*, vec*)
> > .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
> > 0x1db5dca vect_analyze_loop_operations
> > .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
> > 0x1db885b vect_analyze_loop_2
> > .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
> > 0x1dba029 vect_analyze_loop_1
> > .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
> > 0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
> > .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
> > 0x1e389d1 try_vectorize_loop_1
> > .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
> > 0x1e38f3d try_vectorize_loop
> > .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
> > 0x1e39230 execute
> > .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298
> >
> > Given that the masks and the lens cannot be enabled simultaneously when the
> > loop is using partial vectors, we need to ensure that one is disabled when we
> > would like to record the other in check_load_store_for_partial_vectors.  For
> > example, when we try to record the loop len, we need to check whether the
> > loop mask is disabled or not.
>
> I don't think you can rely on LOOP_VINFO_FULLY_WITH_LENGTH_P during
> analysis.  Instead how we tried to set up things is that we never even try
> both and there is (was?) code to reject partial vector usage when we end
> up recording both lens and masks.

That is, a fix along what you do would have been to split
LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P into
_WITH_LENGTH and _WITH_MASKS, make sure to record both when
a stmt can handle both so in the end we'll have a choice.  Currently
if we end up with both mask and len we don't know whether all stmts
support lens or masks or just some.

But we simply assumed on RISCV you'd never end up with unsupported len
but supported mask I guess.

Richard.
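
The bookkeeping suggested above can be modeled roughly as follows. This is a
toy sketch only: struct partial_state and the function names are hypothetical
and are not GCC's loop_vec_info or any existing GCC API. The idea is that each
statement records which partial-vector styles it supports, and the loop as a
whole can use a style only if every statement supports it.

```c
#include <stdbool.h>

/* Hypothetical loop-level state, split into one flag per style.  */
struct partial_state
{
  bool can_use_len;
  bool can_use_mask;
};

enum partial_kind { PARTIAL_NONE, PARTIAL_LEN, PARTIAL_MASK };

/* Start optimistic: both styles are still possible.  */
void
partial_state_init (struct partial_state *s)
{
  s->can_use_len = true;
  s->can_use_mask = true;
}

/* Record what one statement supports.  A style stays usable only while
   every statement seen so far supports it.  */
void
record_stmt (struct partial_state *s, bool supports_len, bool supports_mask)
{
  s->can_use_len = s->can_use_len && supports_len;
  s->can_use_mask = s->can_use_mask && supports_mask;
}

/* Pick a style at the end.  Preferring lengths over masks is an arbitrary
   choice made for this sketch.  */
enum partial_kind
choose_partial (const struct partial_state *s)
{
  if (s->can_use_len)
    return PARTIAL_LEN;
  if (s->can_use_mask)
    return PARTIAL_MASK;
  return PARTIAL_NONE;
}
```

With a single combined flag, a loop in which one statement supports only
lengths and another only masks cannot be told apart from one where every
statement supports both; the split flags make that case end in PARTIAL_NONE.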

> That said, the assert you run into should be only asserted during transform,
> not during analysis.  It possibly was before Robin's costing reorg?
>
> Richard.
>
> > The test suites below passed for this patch:
> > * The x86 bootstrap tests.
> > * The x86 full regression tests.
> > * The aarch64 full regression tests.
> > * The riscv full regression tests.
> >
> > PR target/114195
> >
> > gcc/ChangeLog:
> >
> > * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Add
> > loop mask/len check before recording as they are mutually exclusive.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/rvv/base/pr114195-1.c: New test.
> >
> > Signed-off-by: Pan Li 
> > ---
> >  .../gcc.target/riscv/rvv/base/pr114195-1.c| 15 +++
> >  gcc/tree-vect-stmts.cc| 26 ++-
> >  2 files changed, 35 insertions(+), 6 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
> >
> > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c 
> > b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
> > new file mode 100644
> > index 000..b0c9d5b81b8
> > --- /dev/null
> >

RE: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and store are enabled

2024-03-10 Thread Li, Pan2

Committed, thanks Richard.

Pan

-----Original Message-----
From: Richard Biener  
Sent: Sunday, March 10, 2024 2:53 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Wang, 
Yanzhang ; rdapp@gmail.com; jeffreya...@gmail.com
Subject: Re: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and 
store are enabled



> Am 10.03.2024 um 04:14 schrieb pan2...@intel.com:
> 
> From: Pan Li 
> 
> This patch would like to fix one ICE in vectorizable_store when both the
> loop_masks and loop_lens are enabled.  The ICE looks like below when building
> with "-march=rv64gcv -O3".
> 
> during GIMPLE pass: vect
> test.c: In function ‘d’:
> test.c:6:6: internal compiler error: in vectorizable_store, at
> tree-vect-stmts.cc:8691
>6 | void d() {
>  |  ^
> 0x37a6f2f vectorizable_store
>.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
> 0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
> _slp_tree*, _slp_instance*, vec*)
>.../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
> 0x1db5dca vect_analyze_loop_operations
>.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
> 0x1db885b vect_analyze_loop_2
>.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
> 0x1dba029 vect_analyze_loop_1
>.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
> 0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
>.../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
> 0x1e389d1 try_vectorize_loop_1
>.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
> 0x1e38f3d try_vectorize_loop
>.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
> 0x1e39230 execute
>.../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298
> 
> There are two ways to reach the vectorizable LD/ST code: one is analysis and
> the other is transform.  We cannot have both the lens and the masks
> enabled during transform, but it is valid during analysis.  Given that
> transform doesn't require cost_vec, we can enable the assert only
> when cost_vec is NULL.
> 
> The test suites below passed for this patch:
> * The x86 bootstrap tests.
> * The x86 full regression tests.
> * The aarch64 full regression tests.
> * The riscv full regression tests.

Ok

Thanks,
Richard 
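
The gating described in the commit log can be illustrated with a toy model
outside of GCC. Everything below is hypothetical (struct cost_vec and
check_store are stand-ins, not the real vectorizer types or functions); the
only point is that the same routine is entered once with a cost vector
(analysis) and once without (transform), and the invariant is asserted only
in the latter case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cost vector passed during analysis.  */
struct cost_vec
{
  int n_recorded;
};

/* Shared entry point for analysis and transform.  A NULL cost_vec means
   we are in the transform phase.  */
bool
check_store (bool loop_lens, bool loop_masks, struct cost_vec *cost_vec)
{
  if (cost_vec == NULL)
    /* Transform: lens and masks must not both be in use.  */
    assert (!loop_lens || !loop_masks);
  else
    /* Analysis: both may still be recorded; only record the cost.  */
    cost_vec->n_recorded++;

  return true;
}
```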

> gcc/ChangeLog:
> 
>* tree-vect-stmts.cc (vectorizable_store): Enable the assert
>during transform process.
>(vectorizable_load): Ditto.
> 
> gcc/testsuite/ChangeLog:
> 
>* gcc.target/riscv/rvv/base/pr114195-1.c: New test.
> 
> Signed-off-by: Pan Li 
> ---
> .../gcc.target/riscv/rvv/base/pr114195-1.c | 15 +++
> gcc/tree-vect-stmts.cc | 18 ++
> 2 files changed, 29 insertions(+), 4 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
> 
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
> new file mode 100644
> index 000..a67b847112b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
> @@ -0,0 +1,15 @@
> +/* Test that we do not have ice when compile */
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
> +
> +long a, b;
> +extern short c[];
> +
> +void d() {
> +  for (int e = 0; e < 35; e = 2) {
> +a = ({ a < 0 ? a : 0; });
> +b = ({ b < 0 ? b : 0; });
> +
> +c[e] = 0;
> +  }
> +}
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 14a3ffb5f02..e8617439a48 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -8697,8 +8697,13 @@ vectorizable_store (vec_info *vinfo,
>? &LOOP_VINFO_LENS (loop_vinfo)
>: NULL);
> 
> -  /* Shouldn't go with length-based approach if fully masked.  */
> -  gcc_assert (!loop_lens || !loop_masks);
> +  /* The vect_transform_stmt and vect_analyze_stmt will go here but there
> + are some difference here.  We cannot enable both the lens and masks
> + during transform but it is allowed during analysis.
> + Shouldn't go with length-based approach if fully masked.  */
> +  if (cost_vec == NULL)
> +/* The cost_vec is NULL during transfrom.  */
> +gcc_assert ((!loop_lens || !loop_masks));
> 
>   /* Targets with store-lane instructions must not require explicit
>  realignment.  vect_supportable_dr_alignment always returns either
> @@ -10577,8 +10582,13 @@ vectorizable_load (vec_info *vinfo,
>? &LOOP_VINFO_LENS (loop_vinfo)
>: NULL);
> 
> -  /* Shouldn't go with length-based approach if fully masked.  */
> -  gcc_asser

RE: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and store are enabled

2024-03-10 Thread Li, Pan2

> You might want to investigate why you get mask and not len for a particular 
> stmt.  Mixing will cause variable-length vectorization to fail.

Yes, the newly added gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c cannot 
be vectorized; I will try to investigate why.

Pan

-----Original Message-----
From: Richard Biener  
Sent: Monday, March 11, 2024 1:05 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Wang, 
Yanzhang ; rdapp@gmail.com; jeffreya...@gmail.com
Subject: Re: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len and 
store are enabled



> Am 10.03.2024 um 11:02 schrieb Li, Pan2 :
> 
> Committed, thanks Richard.

You might want to investigate why you get mask and not len for a particular 
stmt.  Mixing will cause variable-length vectorization to fail.

> Pan
> 
> -----Original Message-----
> From: Richard Biener 
> Sent: Sunday, March 10, 2024 2:53 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
> Wang, Yanzhang ; rdapp@gmail.com; 
> jeffreya...@gmail.com
> Subject: Re: [PATCH v2] VECT: Fix ICE for vectorizable LD/ST when both len 
> and store are enabled
> 
> 
> 
>> Am 10.03.2024 um 04:14 schrieb pan2...@intel.com:
>> 
>> From: Pan Li 
>> 
>> This patch would like to fix one ICE in vectorizable_store when both the
>> loop_masks and loop_lens are enabled.  The ICE looks like below when building
>> with "-march=rv64gcv -O3".
>> 
>> during GIMPLE pass: vect
>> test.c: In function ‘d’:
>> test.c:6:6: internal compiler error: in vectorizable_store, at
>> tree-vect-stmts.cc:8691
>>   6 | void d() {
>> |  ^
>> 0x37a6f2f vectorizable_store
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:8691
>> 0x37b861c vect_analyze_stmt(vec_info*, _stmt_vec_info*, bool*,
>> _slp_tree*, _slp_instance*, vec*)
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-stmts.cc:13242
>> 0x1db5dca vect_analyze_loop_operations
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:2208
>> 0x1db885b vect_analyze_loop_2
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3041
>> 0x1dba029 vect_analyze_loop_1
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3481
>> 0x1dbabad vect_analyze_loop(loop*, vec_info_shared*)
>>   .../__RISC-V_BUILD__/../gcc/tree-vect-loop.cc:3639
>> 0x1e389d1 try_vectorize_loop_1
>>   .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1066
>> 0x1e38f3d try_vectorize_loop
>>   .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1182
>> 0x1e39230 execute
>>   .../__RISC-V_BUILD__/../gcc/tree-vectorizer.cc:1298
>> 
>> There are two ways to reach the vectorizable LD/ST code: one is analysis and
>> the other is transform.  We cannot have both the lens and the masks
>> enabled during transform, but it is valid during analysis.  Given that
>> transform doesn't require cost_vec, we can enable the assert only
>> when cost_vec is NULL.
>> 
>> The test suites below passed for this patch:
>> * The x86 bootstrap tests.
>> * The x86 full regression tests.
>> * The aarch64 full regression tests.
>> * The riscv full regression tests.
> 
> Ok
> 
> Thanks,
> Richard
> 
>> gcc/ChangeLog:
>> 
>>   * tree-vect-stmts.cc (vectorizable_store): Enable the assert
>>   during transform process.
>>   (vectorizable_load): Ditto.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>   * gcc.target/riscv/rvv/base/pr114195-1.c: New test.
>> 
>> Signed-off-by: Pan Li 
>> ---
>> .../gcc.target/riscv/rvv/base/pr114195-1.c | 15 +++
>> gcc/tree-vect-stmts.cc | 18 ++
>> 2 files changed, 29 insertions(+), 4 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
>> 
>> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c 
>> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
>> new file mode 100644
>> index 000..a67b847112b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr114195-1.c
>> @@ -0,0 +1,15 @@
>> +/* Test that we do not have ice when compile */
>> +/* { dg-do compile } */
>> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize" } */
>> +
>> +long a, b;
>> +extern short c[];
>> +
>> +void d() {
>> +  for (int e = 0; e < 35; e = 2) {
>> +a = ({ a < 0 ? a : 0; });
>> +b = ({ b < 0 ? b : 0; });
>> +
>> +c[e] = 0;
>> +  }
>> +}
>> diff --git a/gcc/tree-vect

RE: [PATCH v2] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-11 Thread Li, Pan2

Thanks Vineet for the reminder.

> While at it, can you also add the support for feature detection macro
> __riscv_v_fixed_vlen

Kito told me that Greg will help to add that part. Let's wait for the comments 
from Kito.
Personally, I prefer a separate PATCH to cover that instead of appending it here.

Pan
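
As a rough illustration of the size rule stated in the original commit log
(the attribute argument must equal LMUL times the vector register bits from
zvl*b), here is a small C sketch. The function names are made up for this
example and are not part of the patch, and only integer LMUL values are
modeled.

```c
#include <stdbool.h>

/* Expected fixed size in bits: LMUL times the register width from zvl*b.
   Fractional LMUL is not modeled in this sketch.  */
unsigned
expected_vector_bits (unsigned lmul, unsigned zvl_bits)
{
  return lmul * zvl_bits;
}

/* True if the riscv_rvv_vector_bits argument matches the expected size.  */
bool
vector_bits_arg_valid_p (unsigned requested, unsigned lmul, unsigned zvl_bits)
{
  return requested == expected_vector_bits (lmul, zvl_bits);
}
```

For vint32m2_t (LMUL 2), an argument of 128 matches zvl64b (2 * 64 = 128)
but not zvl128b, where the expected size is 256, in line with the error
message quoted in the commit log.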

-----Original Message-----
From: Vineet Gupta  
Sent: Thursday, March 7, 2024 3:19 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Wang, Yanzhang 
; rdapp@gmail.com; pal...@rivosinc.com
Subject: Re: [PATCH v2] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits 
for RVV



On 3/5/24 23:27, pan2...@intel.com wrote:
> From: Pan Li 
>
> Update in v2:
> * Cleanup some unused code.
> * Fix some typo of commit log.
>
> Original log:
>
> This patch would like to introduce one new gcc attribute for RVV.
> This attribute is used to define fixed-length variants of the
> existing sizeless RVV types.
>
> This attribute is valid if and only if -mrvv-vector-bits=zvl is given.
> Its only argument should be an integer constant whose value is determined
> by the LMUL and the vector register bits in zvl*b.  For example:
>
> typedef vint32m2_t fixed_vint32m2_t
> __attribute__((riscv_rvv_vector_bits(128)));
>
> The above type definition is valid when -march=rv64gc_zve64d_zvl64b
> (aka 2(m2) * 64 = 128 for vint32m2_t), and will report an error when
> -march=rv64gcv_zvl128b, similar to the one below.
>
> "error: invalid RVV vector size '128', expected size is '256' based on
> LMUL of type and '-mrvv-vector-bits=zvl'"
>
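
The size rule quoted above can be sketched in a few lines of plain C. This is only an illustration under the assumption that the expected value is the integer LMUL multiplied by the zvl*b register width; the helper names are hypothetical and are not taken from the patch, and fractional LMUL is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: the riscv_rvv_vector_bits argument expected for an
   integer LMUL (1, 2, 4, 8) and the zvl*b minimum VLEN in bits.  */
static unsigned
expected_vector_bits (unsigned lmul, unsigned zvl_bits)
{
  assert (lmul != 0 && zvl_bits != 0);
  return lmul * zvl_bits;
}

/* Hypothetical check mirroring the quoted diagnostic: the attribute
   argument must equal the expected size.  */
static bool
vector_bits_arg_valid_p (unsigned arg, unsigned lmul, unsigned zvl_bits)
{
  return arg == expected_vector_bits (lmul, zvl_bits);
}
```

With these helpers, an argument of 128 is accepted for m2 with zvl64b and rejected for m2 with zvl128b, where 256 is expected.
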
> For the vint*m*_t below operations are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=
> * ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -
>
> For the vfloat*m*_t below operations are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=
> * ALU: +, -, *, /, -
>
> For the vbool*_t types, only the operations below are allowed.  CMP and
> ALU are excluded because those operations on vbool*_t are not
> well defined currently.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
>
> The vint*x*m*_t tuple types are not supported in this patch,
> which is compatible with clang.
>
> This patch passed the below testsuites.
> * The full riscv regression tests.

While at it, can you also add support for the feature detection macro
__riscv_v_fixed_vlen

Thx,
-Vineet

>
> gcc/ChangeLog:
>
>   * config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
>   New static func to take care of the RVV types decorated by
>   the attributes.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/config/riscv/riscv.cc |  87 +-
>  .../riscv/rvv/base/riscv_rvv_vector_bits-1.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-10.c |  53 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-11.c |  76 
>  .../riscv/rvv/base/riscv_rvv_vector_bits-12.c |  14 +++
>  .../riscv/rvv/base/riscv_rvv_vector_bits-2.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-3.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-4.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-5.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-6.c  |   6 +
>  .../riscv/rvv/base/riscv_rvv_vector_bits-7.c  |  76 
>  .../riscv/rvv/base/riscv_rvv_vector_bits-8.c  |  75 
>  .../riscv/rvv/base/riscv_rvv_vector_bits-9.c  |  76 
>  .../riscv/rvv/base/riscv_rvv_vec

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-03-11 Thread Li, Pan2

Hi Jeff,

Are there any suggestions for how to fix this ICE in a reasonable way? 
Thanks a lot.

Pan

-Original Message-
From: Li, Pan2 
Sent: Tuesday, March 5, 2024 2:23 PM
To: Jeff Law ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

Thanks Jeff for comments.

> But in the case of vector modes, we can usually reinterpret the 
> underlying bits in whatever mode we want and do any of the usual 
> operations on those bits.

Yes, I think that is why we can allow vector modes in get_stored_val, if my 
understanding is correct.
The different modes are then returned by gen_lowpart. Unfortunately, some modes 
smaller than the vector bit size (like V2SF and V2QI for vlen=128) are considered 
invalid by validate_subreg, so NULL_RTX is returned, which results in the final ICE.

Thus, considering stage 4, I wonder if it would be an acceptable fix to find 
somewhere to filter out the invalid modes before going to gen_lowpart.
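
The filter idea above can be sketched as a simple size comparison. This is a toy model only: the function name is hypothetical, mode sizes are passed as plain bit counts, and the real validate_subreg logic has more conditions than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical filter: a fixed-size vector mode is only handed on to
   gen_lowpart when it is at least as wide as one vector register.  */
static bool
toy_lowpart_candidate_p (unsigned mode_bits, unsigned vlen_bits)
{
  assert (vlen_bits != 0);
  return mode_bits >= vlen_bits;
}
```

For vlen=128, V2SF (64 bits) and V2QI (16 bits) would be filtered out, while a 128-bit mode would still go through.
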

Pan

-Original Message-
From: Jeff Law  
Sent: Monday, March 4, 2024 6:47 AM
To: Robin Dapp ; Li, Pan2 ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

On 2/29/24 06:28, Robin Dapp wrote:
> On 2/29/24 02:38, Li, Pan2 wrote:
>>> So it's going to check if V2SF can be tied to DI and V4QI with SI.  I
>>> suspect those are going to fail for RISC-V as those aren't tieable.
>>
>> Yes, you are right. Different REG_CLASS are not allowed to be tieable in 
>> RISC-V.
>>
>> static bool
>> riscv_modes_tieable_p (machine_mode mode1, machine_mode mode2)
>> {
>>   /* We don't allow different REG_CLASS modes tieable since it
>>      will cause ICE in register allocation (RA).
>>      E.g. V2SI and DI are not tieable.  */
>>   if (riscv_v_ext_mode_p (mode1) != riscv_v_ext_mode_p (mode2))
>>     return false;
>>   return (mode1 == mode2
>>           || !(GET_MODE_CLASS (mode1) == MODE_FLOAT
>>                && GET_MODE_CLASS (mode2) == MODE_FLOAT));
>> }
> 
> Yes, but what we set tieable is e.g. V4QI and V2SF.
But in the case of vector modes, we can usually reinterpret the 
underlying bits in whatever mode we want and do any of the usual 
operations on those bits.

In my mind that's fundamentally different than the int vs fp case.  If 
we have an integer value in an FP register, we can't really operate on 
the value in any sensible way without first copying it over to the 
integer register file and vice-versa.

Jeff
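
To make the tieable discussion above concrete, here is a small host-side model of the quoted riscv_modes_tieable_p logic. It is a sketch under stated assumptions: the toy_* enums and tables are invented stand-ins for GCC's machine modes and mode classes, and the vector check is a fixed table rather than the real riscv_v_ext_mode_p.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for GCC's mode classes; not the real machmode.h types.  */
enum toy_class { TOY_INT, TOY_FLOAT, TOY_VECTOR_INT, TOY_VECTOR_FLOAT };

/* Toy stand-ins for a handful of machine modes.  */
enum toy_mode
{
  TOY_SI, TOY_DI, TOY_SF, TOY_DF, TOY_V2SI, TOY_V4QI, TOY_V2SF,
  TOY_NUM_MODES
};

static const enum toy_class toy_mode_class[TOY_NUM_MODES] = {
  [TOY_SI] = TOY_INT,
  [TOY_DI] = TOY_INT,
  [TOY_SF] = TOY_FLOAT,
  [TOY_DF] = TOY_FLOAT,
  [TOY_V2SI] = TOY_VECTOR_INT,
  [TOY_V4QI] = TOY_VECTOR_INT,
  [TOY_V2SF] = TOY_VECTOR_FLOAT,
};

/* Toy replacement for riscv_v_ext_mode_p.  */
static bool
toy_vector_mode_p (enum toy_mode m)
{
  assert (m < TOY_NUM_MODES);
  return (toy_mode_class[m] == TOY_VECTOR_INT
          || toy_mode_class[m] == TOY_VECTOR_FLOAT);
}

/* Same shape as the quoted hook: vector and non-vector modes never tie,
   and two different scalar float modes do not tie either.  */
static bool
toy_modes_tieable_p (enum toy_mode m1, enum toy_mode m2)
{
  if (toy_vector_mode_p (m1) != toy_vector_mode_p (m2))
    return false;
  return (m1 == m2
          || !(toy_mode_class[m1] == TOY_FLOAT
               && toy_mode_class[m2] == TOY_FLOAT));
}
```

In this model V2SI and DI are not tieable, while V4QI and V2SF are, which matches the two cases mentioned in the thread.
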

RE: [PATCH v1] RISC-V: Fix some code style issue(s) in riscv-c.cc [NFC]

2024-03-12 Thread Li, Pan2

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Tuesday, March 12, 2024 3:11 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Fix some code style issue(s) in riscv-c.cc [NFC]

LGTM :)

On Tue, Mar 12, 2024 at 3:07 PM  wrote:
>
> From: Pan Li 
>
> Noticed some code style issue(s) when adding __riscv_v_fixed_vlen, including:
>
> * Meaningless empty line.
> * Line greater than 80 chars.
> * Indent with 3 space(s).
> * Argument misalignment.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-c.cc (riscv_ext_version_value): Fix
> code style greater than 80 chars.
> (riscv_cpu_cpp_builtins): Fix useless empty line, indent
> with 3 space(s) and argument unalignment.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/config/riscv/riscv-c.cc | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
> index 3755ec0b8ef..7029ba88186 100644
> --- a/gcc/config/riscv/riscv-c.cc
> +++ b/gcc/config/riscv/riscv-c.cc
> @@ -37,7 +37,8 @@ along with GCC; see the file COPYING3.  If not see
>  static int
>  riscv_ext_version_value (unsigned major, unsigned minor)
>  {
> -  return (major * RISCV_MAJOR_VERSION_BASE) + (minor * RISCV_MINOR_VERSION_BASE);
> +  return (major * RISCV_MAJOR_VERSION_BASE)
> +    + (minor * RISCV_MINOR_VERSION_BASE);
>  }
>
>  /* Implement TARGET_CPU_CPP_BUILTINS.  */
> @@ -110,7 +111,6 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
>  case CM_MEDANY:
>builtin_define ("__riscv_cmodel_medany");
>break;
> -
>  }
>
>if (riscv_user_wants_strict_align)
> @@ -142,9 +142,9 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
>  riscv_ext_version_value (0, 12));
>  }
>
> -   if (TARGET_XTHEADVECTOR)
> - builtin_define_with_int_value ("__riscv_th_v_intrinsic",
> -riscv_ext_version_value (0, 11));
> +  if (TARGET_XTHEADVECTOR)
> +builtin_define_with_int_value ("__riscv_th_v_intrinsic",
> +  riscv_ext_version_value (0, 11));
>
>/* Define architecture extension test macros.  */
>builtin_define_with_int_value ("__riscv_arch_test", 1);
> --
> 2.34.1
>

RE: [PATCH v3] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-14 Thread Li, Pan2

> Shouldn't a major user-facing change like this be discussed in a PR against
> https://github.com/riscv-non-isa/riscv-c-api-doc/ or
> https://github.com/riscv-non-isa/rvv-intrinsic-doc before or concurrent with
> compiler implementation?

I think Kito is working on the spec doc already.

Hi Kito
Could you please help correct my understanding of the behavior of the 
riscv_rvv_vector_bits attribute?
There are quite a few details, and I suspect something is missing or behaves 
differently compared with the clang side.

Pan

-Original Message-
From: Stefan O'Rear  
Sent: Tuesday, March 12, 2024 9:25 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Kito Cheng ; Wang, Yanzhang 
; rdapp@gmail.com; Vineet Gupta 
; Palmer Dabbelt 
Subject: Re: [PATCH v3] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits 
for RVV

On Tue, Mar 12, 2024, at 2:15 AM, pan2...@intel.com wrote:
> From: Pan Li 
>
> Update in v3:
> * Add pre-defined __riscv_v_fixed_vlen when zvl.
>
> Update in v2:
> * Cleanup some unused code.
> * Fix some typo of commit log.
>
> Original log:
>
> This patch would like to introduce one new gcc attribute for RVV.
> This attribute is used to define fixed-length variants of the
> existing sizeless RVV types.
>
> This attribute is valid if and only if -mrvv-vector-bits=zvl is given.
> Its only argument should be an integer constant whose value is determined
> by the LMUL and the vector register bits in zvl*b.  For example:
>
> typedef vint32m2_t fixed_vint32m2_t
> __attribute__((riscv_rvv_vector_bits(128)));
>
> The above type definition is valid when -march=rv64gc_zve64d_zvl64b
> (aka 2(m2) * 64 = 128 for vint32m2_t), and will report an error when
> -march=rv64gcv_zvl128b, similar to the one below.
>
> "error: invalid RVV vector size '128', expected size is '256' based on
> LMUL of type and '-mrvv-vector-bits=zvl'"
>
> Meanwhile, a pre-defined macro __riscv_v_fixed_vlen is introduced to
> represent the fixed vlen of an RVV vector register.

Shouldn't a major user-facing change like this be discussed in a PR against
https://github.com/riscv-non-isa/riscv-c-api-doc/ or
https://github.com/riscv-non-isa/rvv-intrinsic-doc before or concurrent with
compiler implementation?

-s

> For the vint*m*_t below operations are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=
> * ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -
>
> For the vfloat*m*_t below operations are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=
> * ALU: +, -, *, /, -
>
> For the vbool*_t types, only the operations below are allowed.  CMP and
> ALU are excluded because those operations on vbool*_t are not
> well defined currently.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
>
> The vint*x*m*_t tuple types are not supported in this patch,
> which is compatible with clang.
>
> This patch passed the below testsuites.
> * The full riscv regression tests.
>
> gcc/ChangeLog:
>
>   * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add pre-define
>   macro __riscv_v_fixed_vlen when zvl.
>   * config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
>   New static func to take care of the RVV types decorated by
>   the attributes.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-13.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-14.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-15.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-16.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-17.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test.
>   * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-

RE: [PATCH v1] RISC-V: Bugfix function target attribute pollution

2024-03-21 Thread Li, Pan2

Thanks Kito, will commit it after the ICE fix.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, March 21, 2024 8:33 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Bugfix function target attribute pollution

LGTM, thanks :)

On Wed, Mar 20, 2024 at 2:07 PM  wrote:
>
> From: Pan Li 
>
> This patch depends on the ICE fix below.
>
> https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647915.html
>
> The function target attribute should be on a per-function basis.
> For example, we have 4 functions as below:
>
> void test_1 () {}
>
> void __attribute__((target("arch=+v"))) test_2 () {}
>
> void __attribute__((target("arch=+zfh"))) test_3 () {}
>
> void test_4 () {}
>
> The scope of the target attribute should not extend beyond the function
> it decorates.  That is, test_3 cannot have the 'v' extension, and test_4
> can have neither the 'v' nor the 'zfh' extension.
>
> Unfortunately, for now test_4 is able to leverage both the 'v' and
> the 'zfh' extensions, which is incorrect.  This patch fixes the
> sticky attribute by introducing the command-line subset_list.
> In parse_arch, we always clone from the cmdline_subset_list instead
> of the current_subset_list.
>
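
The difference between cloning from the current list and cloning from the command-line list can be shown with a toy bitmask model. This is only a sketch: the toy_* names are hypothetical, the extension sets are plain bit flags instead of a riscv_subset_list, and a function without a target attribute is modeled as adding no extensions.

```c
#include <assert.h>

/* Toy extension bits; the real code keeps a riscv_subset_list.  */
enum { TOY_EXT_V = 1 << 0, TOY_EXT_ZFH = 1 << 1 };

/* Extensions given by -march; this sketch assumes neither 'v' nor 'zfh'.  */
static const unsigned toy_cmdline_subset = 0;

/* Extensions in effect for the function currently being processed.  */
static unsigned toy_current_subset = 0;

/* Old behavior: start from whatever the previous function left behind,
   so earlier target attributes stick to later functions.  */
static unsigned
toy_parse_arch_sticky (unsigned added)
{
  assert ((added & ~(unsigned) (TOY_EXT_V | TOY_EXT_ZFH)) == 0);
  toy_current_subset = toy_current_subset | added;
  return toy_current_subset;
}

/* Fixed behavior: always start from the command-line set.  */
static unsigned
toy_parse_arch_fixed (unsigned added)
{
  assert ((added & ~(unsigned) (TOY_EXT_V | TOY_EXT_ZFH)) == 0);
  toy_current_subset = toy_cmdline_subset | added;
  return toy_current_subset;
}
```

Processing test_2 (+v), test_3 (+zfh) and test_4 (nothing) in order, the sticky variant leaves test_4 with both extensions, while the fixed variant gives each function only what it asked for.
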
> Meanwhile, we correct the printed arch information, like below.
>
> .option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zbb1p0
>
> The riscv_declare_function_name hook always runs after the
> riscv_process_target_attr hook.  Thus, we introduce one hash_map to record
> the 1:1 mapping from fndecl to its subset_list in advance, so that
> riscv_declare_function_name can later get the right information
> about the arch.
>
> The tests below passed for this patch.
> * The full riscv regression test.
>
> PR target/114352
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc (struct riscv_func_target_info):
> New struct for func decl and target name.
> (struct riscv_func_target_hasher): New hasher for hash table mapping
> from the fn_decl to fn_target_name.
> (riscv_func_decl_hash): New func to compute the hash for fn_decl.
> (riscv_func_target_hasher::hash): New func to impl hash interface.
> (riscv_func_target_hasher::equal): New func to impl equal interface.
> (riscv_cmdline_subset_list): New static var for cmdline subset list.
> (riscv_func_target_table_lazy_init): New func to lazy init the func
> target hash table.
> (riscv_func_target_get): New func to get target name from hash table.
> (riscv_func_target_put): New func to put target name into hash table.
> (riscv_func_target_remove_and_destory): New func to remove target
> info from the hash table and destory it.
> (riscv_parse_arch_string): Set the static var cmdline_subset_list.
> * config/riscv/riscv-subset.h (riscv_cmdline_subset_list): New static
> var for cmdline subset list.
> (riscv_func_target_get): New func decl.
> (riscv_func_target_put): Ditto.
> (riscv_func_target_remove_and_destory): Ditto.
> * config/riscv/riscv-target-attr.cc 
> (riscv_target_attr_parser::parse_arch):
> Take cmdline_subset_list instead of current_subset_list when clone.
> (riscv_process_target_attr): Record the func target info to hash 
> table.
> (riscv_option_valid_attribute_p): Add new arg tree fndel.
> * config/riscv/riscv.cc (riscv_declare_function_name): Consume the
> func target info and print the arch message.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr114352-3.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/common/config/riscv/riscv-common.cc   | 105 +++-
>  gcc/config/riscv/riscv-subset.h   |   4 +
>  gcc/config/riscv/riscv-target-attr.cc |  18 ++-
>  gcc/config/riscv/riscv.cc |   7 +-
>  .../gcc.target/riscv/rvv/base/pr114352-3.c| 113 ++
>  5 files changed, 240 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-3.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
> index d32bf147eca..76ec9bf846c 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -425,11 +425,108 @@ bool riscv_subset_list::parse_failed = false;
>
>  static riscv_subset_list *current_subset_list = NULL;
>
> +static riscv_subset_list *cmdline_subset_list = NULL;
> +
> +struct riscv_func

RE: [PATCH v1] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

2024-03-21 Thread Li, Pan2

Thanks Kito, will send v2 for this change.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, March 21, 2024 8:39 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

> +
> +  /* Make sure the implied or combined extension is included after add
> + a new std extension to subset list.  For exmaple as below,
> +
> + void __attribute__((target("arch=+v"))) func () with -march=rv64gc.
> +
> + The implied zvl128b and zve64d of the std v should be included.  */
> +  handle_implied_ext (p);
> +  handle_combine_ext ();
> +  check_conflict_ext ();

Extract those 3 function calls to a public function
riscv_subset_list::finalize(),
and then call that at riscv_target_attr_parser::parse_arch rather than here.

> +
> +  return end_of_ext;
>  }

RE: [PATCH v3] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-21 Thread Li, Pan2

> The result of comparison should be vbool* rather than v[u]int*.
> The result of comparison should be vbool* rather than vfloat*,
> otherwise all-ones is not really meaningful for a floating-point value.

> But I know clang generates the same strange/wrong code here...

I see, will update the test cases and double check about it in v4.

> &, ^, | are supported on clang, so I think we should support that as well

It looks like gcc lacks such operations right now, so mark the TYPE_INDIVISIBLE_P 
(type) = 0 as aarch64 did.
I had a try, but I am afraid we need a separate patch to take care of it for 
risk control.

Pan

-Original Message-
From: Kito Cheng  
Sent: Thursday, March 21, 2024 9:25 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; rdapp@gmail.com; vine...@rivosinc.com; 
pal...@rivosinc.com
Subject: Re: [PATCH v3] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits 
for RVV

> For the vint*m*_t below operations are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=

The result of comparison should be vbool* rather than v[u]int*.

> * ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -
>
> For the vfloat*m*_t below operations are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=

The result of comparison should be vbool* rather than vfloat*,
otherwise all-ones is not really meaningful for a floating-point value.

But I know clang generates the same strange/wrong code here...

> * ALU: +, -, *, /, -
>
> For the vbool*_t types, only the operations below are allowed.  CMP and
> ALU are excluded because those operations on vbool*_t are not
> well defined currently.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.

&, ^, | are supported on clang, so I think we should support that as well

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-03-21 Thread Li, Pan2

Sorry for disturbing, kindly ping for this ICE.

Pan

-Original Message-
From: Li, Pan2  
Sent: Tuesday, March 12, 2024 10:09 AM
To: Jeff Law ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

Hi Jeff,

Are there any suggestions for how to fix this ICE in a reasonable way? 
Thanks a lot.

Pan

-Original Message-
From: Li, Pan2 
Sent: Tuesday, March 5, 2024 2:23 PM
To: Jeff Law ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

Thanks Jeff for comments.

> But in the case of vector modes, we can usually reinterpret the 
> underlying bits in whatever mode we want and do any of the usual 
> operations on those bits.

Yes, I think that is why we can allow vector modes in get_stored_val, if my 
understanding is correct.
The different modes are then returned by gen_lowpart. Unfortunately, some modes 
smaller than the vector bit size (like V2SF and V2QI for vlen=128) are considered 
invalid by validate_subreg, so NULL_RTX is returned, which results in the final ICE.

Thus, considering stage 4, I wonder if it would be an acceptable fix to find 
somewhere to filter out the invalid modes before going to gen_lowpart.

Pan

-Original Message-
From: Jeff Law  
Sent: Monday, March 4, 2024 6:47 AM
To: Robin Dapp ; Li, Pan2 ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

On 2/29/24 06:28, Robin Dapp wrote:
> On 2/29/24 02:38, Li, Pan2 wrote:
>>> So it's going to check if V2SF can be tied to DI and V4QI with SI.  I
>>> suspect those are going to fail for RISC-V as those aren't tieable.
>>
>> Yes, you are right. Different REG_CLASS are not allowed to be tieable in 
>> RISC-V.
>>
>> static bool
>> riscv_modes_tieable_p (machine_mode mode1, machine_mode mode2)
>> {
>>   /* We don't allow different REG_CLASS modes tieable since it
>>      will cause ICE in register allocation (RA).
>>      E.g. V2SI and DI are not tieable.  */
>>   if (riscv_v_ext_mode_p (mode1) != riscv_v_ext_mode_p (mode2))
>>     return false;
>>   return (mode1 == mode2
>>           || !(GET_MODE_CLASS (mode1) == MODE_FLOAT
>>                && GET_MODE_CLASS (mode2) == MODE_FLOAT));
>> }
> 
> Yes, but what we set tieable is e.g. V4QI and V2SF.
But in the case of vector modes, we can usually reinterpret the 
underlying bits in whatever mode we want and do any of the usual 
operations on those bits.

In my mind that's fundamentally different than the int vs fp case.  If 
we have an integer value in an FP register, we can't really operate on 
the value in any sensible way without first copying it over to the 
integer register file and vice-versa.

Jeff

RE: [PATCH v2] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

2024-03-21 Thread Li, Pan2

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Friday, March 22, 2024 10:24 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v2] RISC-V: Bugfix ICE for __attribute__((target("arch=+v"))

LGTM, thanks :)

On Fri, Mar 22, 2024 at 9:13 AM  wrote:
>
> From: Pan Li 
>
> This patch fixes an ICE for __attribute__((target("arch=+v")))
> and similar extension(s). Given the sample code below:
>
> void __attribute__((target("arch=+v")))
> test_2 (int *a, int *b, int *out, unsigned count)
> {
>   unsigned i;
>   for (i = 0; i < count; i++)
>     out[i] = a[i] + b[i];
> }
>
> It will ICE when built with -march=rv64gc -O3.
>
> test.c: In function ‘test_2’:
> test.c:4:1: internal compiler error: Floating point exception
> 4 | {
>   | ^
> 0x1a5891b crash_signal
> .../__RISC-V_BUILD__/../gcc/toplev.cc:319
> 0x7f0a7884251f ???
> ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
> 0x1f51ba4 riscv_hard_regno_nregs
> .../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:8143
> 0x1967bb9 init_reg_modes_target()
> .../__RISC-V_BUILD__/../gcc/reginfo.cc:471
> 0x13fc029 init_emit_regs()
> .../__RISC-V_BUILD__/../gcc/emit-rtl.cc:6237
> 0x1a5b83d target_reinit()
> .../__RISC-V_BUILD__/../gcc/toplev.cc:1936
> 0x35e374d save_target_globals()
> .../__RISC-V_BUILD__/../gcc/target-globals.cc:92
> 0x35e381f save_target_globals_default_opts()
> .../__RISC-V_BUILD__/../gcc/target-globals.cc:122
> 0x1f544cc riscv_save_restore_target_globals(tree_node*)
> .../__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:9138
> 0x1f55c36 riscv_set_current_function
> ...
>
> There are two reasons for this ICE.
> 1. The implied extension(s) of v are not well handled and
>    TARGET_MIN_VLEN stays 0 because it is not reinitialized.  Then
>    size / TARGET_MIN_VLEN divides by zero.
> 2. The machine modes of the vector types vary after
>    the v extension is introduced.
>
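
Reason 1 above can be modeled on the host with a few lines of C. This is a hedged sketch rather than the GCC code: the toy_* names are hypothetical, the extension set is a bitmask, and the only implication modeled is that 'v' brings in zve64d and zvl128b, as described earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy extension bits; the real code keeps a riscv_subset_list.  */
enum { TOY_V = 1 << 0, TOY_ZVE64D = 1 << 1, TOY_ZVL128B = 1 << 2 };

/* Toy finalize step: add the extensions implied by 'v'.  */
static unsigned
toy_finalize (unsigned exts)
{
  if (exts & TOY_V)
    exts |= TOY_ZVE64D | TOY_ZVL128B;
  return exts;
}

/* Toy TARGET_MIN_VLEN: 0 unless a zvl*b extension is present.  */
static unsigned
toy_min_vlen (unsigned exts)
{
  return (exts & TOY_ZVL128B) ? 128 : 0;
}

/* Toy register count: size / min_vlen, refusing to divide when min_vlen
   is 0, which is the situation behind the crash in the backtrace.  */
static bool
toy_nregs (unsigned size_bits, unsigned exts, unsigned *nregs)
{
  unsigned min_vlen = toy_min_vlen (exts);

  assert (nregs != NULL);
  if (min_vlen == 0)
    return false;
  *nregs = size_bits / min_vlen;
  return true;
}
```

Without the finalize step a bare 'v' leaves the minimum VLEN at 0; after it, a 256-bit value needs two 128-bit registers in this model.
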
> This patch passed the testsuite below:
> 1. The full riscv regression test.
>
> PR target/114352
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc (riscv_subset_list::parse):
> Replace implied, combine and check to func finalize.
> (riscv_subset_list::finalize): New func impl to take care of
> implied, combine ext and related checks.
> * config/riscv/riscv-subset.h: Add func decl for finalize.
> * config/riscv/riscv-target-attr.cc 
> (riscv_target_attr_parser::parse_arch):
> Finalize the ext before return succeed.
> * config/riscv/riscv.cc (riscv_set_current_function): Reinit the
> machine mode before when set cur function.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr114352-1.c: New test.
> * gcc.target/riscv/rvv/base/pr114352-2.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/common/config/riscv/riscv-common.cc   | 31 ++
>  gcc/config/riscv/riscv-subset.h   |  2 +
>  gcc/config/riscv/riscv-target-attr.cc |  2 +
>  gcc/config/riscv/riscv.cc |  4 ++
>  .../gcc.target/riscv/rvv/base/pr114352-1.c| 58 +++
>  .../gcc.target/riscv/rvv/base/pr114352-2.c| 27 +
>  6 files changed, 114 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114352-2.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
> index 440127a2af0..15d44245b3c 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -1428,16 +1428,7 @@ riscv_subset_list::parse (const char *arch, location_t loc)
>if (p == NULL)
>  goto fail;
>
> -  for (itr = subset_list->m_head; itr != NULL; itr = itr->next)
> -{
> -  subset_list->handle_implied_ext (itr->name.c_str ());
> -}
> -
> -  /* Make sure all implied extensions are included. */
> -  gcc_assert (subset_list->check_implied_ext ());
> -
> -  subset_list->handle_combine_ext ();
> -  subset_list->check_conflict_ext ();
> +  subset_list->finalize ();
>
>return subset_list;
>
> @@ -1467,6 +1458,26 @@ riscv_subset_list::set_loc (location_t loc)
>m_loc = loc;
>  }
>
> +/* Make sure the implied or combined extension is included after add
> +   a new std extension to subset list or likewise.  For exmaple as below,
> +
> +   void __attri

RE: [PATCH v4] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits for RVV

2024-03-22 Thread Li, Pan2

Committed, thanks Kito.

Pan

-Original Message-
From: Kito Cheng  
Sent: Friday, March 22, 2024 6:06 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 
; rdapp@gmail.com; vine...@rivosinc.com; 
pal...@rivosinc.com
Subject: Re: [PATCH v4] RISC-V: Introduce gcc attribute riscv_rvv_vector_bits 
for RVV

LGTM, thanks :)

On Fri, Mar 22, 2024 at 2:55 PM  wrote:
>
> From: Pan Li 
>
> This patch would like to introduce one new gcc attribute for RVV.
> This attribute is used to define fixed-length variants of the
> existing sizeless RVV types.
>
> This attribute is valid if and only if -mrvv-vector-bits=zvl is given.
> Its only argument should be an integer constant whose value is determined
> by the LMUL and the vector register bits in zvl*b.  For example:
>
> typedef vint32m2_t fixed_vint32m2_t
> __attribute__((riscv_rvv_vector_bits(128)));
>
> The above type definition is valid when -march=rv64gc_zve64d_zvl64b
> (aka 2(m2) * 64 = 128 for vint32m2_t), and will report an error when
> -march=rv64gcv_zvl128b, similar to the one below.
>
> "error: invalid RVV vector size '128', expected size is '256' based on
> LMUL of type and '-mrvv-vector-bits=zvl'"
>
> Meanwhile, a pre-defined macro __riscv_v_fixed_vlen is introduced to
> represent the fixed vlen of an RVV vector register.
>
> For the vint*m*_t below operations are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=
> * ALU: +, -, *, /, %, &, |, ^, >>, <<, ~, -
>
> The CMP will return vint*m*_t the same as aarch64 sve. For example:
> typedef vint32m1_t fixed_vint32m1_t 
> __attribute__((riscv_rvv_vector_bits(128)));
> fixed_vint32m1_t less_than (fixed_vint32m1_t a, fixed_vint32m1_t b)
> {
>   return a < b;
> }
>
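
For readers without an RVV toolchain, the shape of such a comparison result can be tried on the host with the generic GNU vector extension, where a lane-wise comparison yields all-ones (-1) for true and 0 for false. This is only an analogy: vector_size(16) and the names v4si, less_than_v4si and lane_v4si are stand-ins chosen for this sketch, not the fixed-length RVV types from the patch, and a GCC or Clang compiler is assumed.

```c
#include <assert.h>

/* Four 32-bit ints in one 128-bit vector (GNU vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Lane-wise comparison; each result lane is -1 (true) or 0 (false).  */
static v4si
less_than_v4si (v4si a, v4si b)
{
  return a < b;
}

/* Read one lane of the result.  */
static int
lane_v4si (v4si v, int i)
{
  assert (i >= 0 && i < 4);
  return v[i];
}
```

The result keeps the operand width and lane count, which is the property the quoted less_than example relies on.
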
> For the vfloat*m*_t below operations are allowed.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
> * CMP: >, <, ==, !=, <=, >=
> * ALU: +, -, *, /, -
>
> The CMP will return vfloat*m*_t the same as aarch64 sve. For example:
> typedef vfloat32m1_t fixed_vfloat32m1_t 
> __attribute__((riscv_rvv_vector_bits(128)));
> fixed_vfloat32m1_t less_than (fixed_vfloat32m1_t a, fixed_vfloat32m1_t b)
> {
>   return a < b;
> }
>
> For the vbool*_t types, only the operations below are allowed.  CMP and
> ALU are excluded because those operations on vbool*_t are not
> well defined currently.
> * The sizeof.
> * The global variable(s).
> * The element of union and struct.
> * The cast to other equalities.
>
> The vint*x*m*_t tuple types are not supported in this patch,
> which is compatible with clang.
>
> This patch passed the below testsuites.
> * The full riscv regression tests.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Add pre-define
> macro __riscv_v_fixed_vlen when zvl.
> * config/riscv/riscv.cc (riscv_handle_rvv_vector_bits_attribute):
> New static func to take care of the RVV types decorated by
> the attributes.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-1.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-10.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-11.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-12.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-13.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-14.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-15.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-16.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-17.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-18.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-2.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-3.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-4.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-5.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-6.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-7.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-8.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits-9.c: New test.
> * gcc.target/riscv/rvv/base/riscv_rvv_vector_bits.h: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/config/riscv/riscv-c.cc
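
The lane-wise comparison behaviour described in the patch above can be tried on any target with the generic GNU vector extension. That extension is a stand-in here, not the riscv_rvv_vector_bits attribute itself. A minimal sketch, assuming a 128-bit vector of four int32_t lanes; the names v4si, holder, global_v and less_than are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a fixed-length vector type: four int32_t lanes in 16 bytes.
   This uses the generic GNU vector_size attribute, not
   riscv_rvv_vector_bits, so it builds with GCC or clang on any target.  */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* A fixed-size vector type can be a global variable ...  */
v4si global_v;

/* ... and a member of a struct or union, and sizeof works on it.  */
struct holder
{
  v4si lo;
  v4si hi;
};

/* Lane-wise comparison: the result is itself a vector of the same shape,
   holding -1 in each lane where the comparison is true and 0 elsewhere.  */
v4si
less_than (v4si a, v4si b)
{
  return a < b;
}
```

Calling less_than on {1, 5, 3, 7} and {2, 4, 3, 8} gives a vector whose first and last lanes are -1 and whose middle lanes are 0.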

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-03-22 Thread Li, Pan2

Thanks Jeff for comments.

> As Richi noted using validate_subreg here isn't great.  Does it work to 
> factor out this code from extract_low_bits
>
>>   if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
>>   || !int_mode_for_mode (mode).exists (&int_mode))
>> return NULL_RTX;
>> 
>>   if (!targetm.modes_tieable_p (src_int_mode, src_mode))
>> return NULL_RTX;
>>   if (!targetm.modes_tieable_p (int_mode, mode))
>> return NULL_RTX;

> And use that in the condition (and in extract_low_bits rather than 
> duplicating the code)?

That solves the ICE, but it would also forbid every vector mode from going through gen_lowpart.
Only vector modes whose size is smaller than the natural register size trigger 
the ICE.
So how about adding just one more condition before the gen_lowpart call, as below?

Feel free to correct me if I have misunderstood anything. 😉

diff --git a/gcc/dse.cc b/gcc/dse.cc
index edc7a1dfecf..258d2ccc299 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode 
read_mode,
 copy_rtx (store_info->const_rhs));
   else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
 && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
-&& targetm.modes_tieable_p (read_mode, store_mode))
+&& targetm.modes_tieable_p (read_mode, store_mode)
+/* It's invalid in validate_subreg if read_mode size is < reg natural.  */
+&& known_ge (GET_MODE_SIZE (read_mode), REGMODE_NATURAL_SIZE (read_mode)))
 read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
 read_reg = extract_low_bits (read_mode, store_mode,
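
The extra condition in the diff above can be modelled outside GCC as a plain predicate. This is only a sketch: struct mode_info and can_use_lowpart are hypothetical names, sizes are plain byte counts rather than poly_int values, and the 8-byte and 16-byte figures used with it are illustrative assumptions, not values taken from the RISC-V port.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mode properties the condition inspects.  */
struct mode_info
{
  bool is_vector;             /* models VECTOR_MODE_P  */
  unsigned size_bytes;        /* models GET_MODE_SIZE  */
  unsigned natural_reg_bytes; /* models REGMODE_NATURAL_SIZE  */
};

/* Returns true when the gen_lowpart path would be taken under the proposed
   condition; otherwise the caller would fall back to extract_low_bits.  */
bool
can_use_lowpart (struct mode_info read, struct mode_info store, bool tieable)
{
  return read.is_vector
	 && store.is_vector
	 && read.size_bytes <= store.size_bytes
	 && tieable
	 /* The new check: reject reads narrower than a natural register.  */
	 && read.size_bytes >= read.natural_reg_bytes;
}
```

With this model, an 8-byte vector read against a 16-byte natural register size is rejected, while a 16-byte read is still accepted.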

Pan

-----Original Message-----
From: Jeff Law  
Sent: Saturday, March 23, 2024 2:54 AM
To: Li, Pan2 ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val



On 3/4/24 11:22 PM, Li, Pan2 wrote:
> Thanks Jeff for comments.
> 
>> But in the case of a vector modes, we can usually reinterpret the
>> underlying bits in whatever mode we want and do any of the usual
>> operations on those bits.
> 
> Yes, I think that is why we can allow vector mode in get_stored_val if my 
> understanding is correct.
> The different modes are then returned via gen_lowpart. Unfortunately, 
> some modes
>   (smaller than a vector register, like V2SF or V2QI for vlen=128) are considered 
> invalid by validate_subreg,
> which returns NULL_RTX and results in the final ICE.
That doesn't make a lot of sense to me.  Even for vlen=128 I would have 
expected that we can still use a subreg to access low bits.  After all 
we might have had a V16QI vector and done a reduction of some sort 
storing the result in the first element and we have to be able to 
extract that result and move it around.

I'm not real keen on a target workaround.  While extremely safe, I 
wouldn't be surprised if other ports could trigger the ICE and we'd end 
up patching up multiple targets for what is, IMHO, a more generic issue.

As Richi noted using validate_subreg here isn't great.  Does it work to 
factor out this code from extract_low_bits:


>   if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
>   || !int_mode_for_mode (mode).exists (&int_mode))
> return NULL_RTX;
> 
>   if (!targetm.modes_tieable_p (src_int_mode, src_mode))
> return NULL_RTX;
>   if (!targetm.modes_tieable_p (int_mode, mode))
> return NULL_RTX;

And use that in the condition (and in extract_low_bits rather than 
duplicating the code)?

jeff

ps.  No need to apologize for the pings.  This completely fell off my radar.

RE: [PATCH v1] RISC-V: Allow RVV intrinsic when function target("arch=+v")

2024-03-25 Thread Li, Pan2

Committed, thanks kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Monday, March 25, 2024 8:04 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Allow RVV intrinsic when function 
target("arch=+v")

LGTM, thanks :)

On Mon, Mar 25, 2024 at 3:42 PM  wrote:
>
> From: Pan Li 
>
> This patch allows the RVV intrinsics when a function carries the
> target("arch=+v") attribute and is built with rv64gc.  For example:
>
> vint32m1_t
> __attribute__((target("arch=+v")))
> test_1 (vint32m1_t a, vint32m1_t b, size_t vl)
> {
>   return __riscv_vadd_vv_i32m1 (a, b, vl);
> }
>
> Built with -march=rv64gc -mabi=lp64d -O3, we get asm like the following:
> test_1:
>   .option push
>   .option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_\
> zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0
>   vsetvli zero,a0,e32,m1,ta,ma
>   vadd.vv v8,v8,v9
>   ret
>
> The riscv_vector.h must be included when leveraging intrinsic type(s) and
> API(s).  And the scope of this attribute should not exceed the function
> body.  Meanwhile, to make the RVV types and API(s) available for this attribute,
> including riscv_vector.h no longer reports an error if v is not present
> in march.
>
> The tests below passed for this patch:
> * The full riscv regression test.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-c.cc (riscv_pragma_intrinsic): Remove error
> when V is disabled and init the RVV types and intrinsic APIs.
> * config/riscv/riscv-vector-builtins.cc (expand_builtin): Report
> error if V ext is disabled.
> * config/riscv/riscv.cc (riscv_return_value_is_vector_type_p):
> Ditto.
> (riscv_arguments_is_vector_type_p): Ditto.
> (riscv_vector_cc_function_p): Ditto.
> * config/riscv/riscv_vector.h: Remove error if V is disabled.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pragma-1.c: Remove.
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-1.c: 
> New test.
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-2.c: 
> New test.
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c: 
> New test.
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-4.c: 
> New test.
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-5.c: 
> New test.
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-6.c: 
> New test.
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: 
> New test.
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: 
> New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/config/riscv/riscv-c.cc   | 18 +++
>  gcc/config/riscv/riscv-vector-builtins.cc |  5 
>  gcc/config/riscv/riscv.cc | 30 ---
>  gcc/config/riscv/riscv_vector.h   |  4 ---
>  .../gcc.target/riscv/rvv/base/pragma-1.c  |  4 ---
>  .../target_attribute_v_with_intrinsic-1.c |  5 
>  .../target_attribute_v_with_intrinsic-2.c | 18 +++
>  .../target_attribute_v_with_intrinsic-3.c | 13 
>  .../target_attribute_v_with_intrinsic-4.c | 10 +++
>  .../target_attribute_v_with_intrinsic-5.c | 12 
>  .../target_attribute_v_with_intrinsic-6.c | 12 
>  .../target_attribute_v_with_intrinsic-7.c |  9 ++
>  .../target_attribute_v_with_intrinsic-8.c | 23 ++
>  13 files changed, 145 insertions(+), 18 deletions(-)
>  delete mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pragma-1.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-1.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-2.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-4.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-5.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-6.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c
>  create mode 100644 
> gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c
>
> diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
> index edb866d51e4..01314037461 100644
> --- a/gcc/config

RE: [PATCH v1] RISC-V: Allow RVV intrinsic for more function target

2024-03-28 Thread Li, Pan2

Thanks Kito, it looks like I missed this case in testing; let me check it out.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Thursday, March 28, 2024 2:44 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Allow RVV intrinsic for more function target

Just tried something interesting:

$ riscv64-unknown-linux-gnu-gcc -march=rv64gc -O
target_attribute_v_with_intrinsic-9.c -S # Work
$ riscv64-unknown-linux-gnu-gcc -march=rv64gc_zve32x -O
target_attribute_v_with_intrinsic-9.c -S # Not work

Also I guess all zvk* and zvbb may also need to be added as well,
but...I suspect it's not scalable way?

RE: [PATCH v1] RISC-V: Allow RVV intrinsic for more function target

2024-03-28 Thread Li, Pan2

I see. The failure arises because zve32x is on the command line (so TARGET_VECTOR 
is true), and v1 then skips the reinit in riscv_pragma_intrinsic.

As I understand it, we need something like the below regardless of whether 
TARGET_VECTOR is true or false.

int flags_backup = flags;
int new_flags = flags | ...;

flags = new_flags;
reinit ();

flags = flags_backup;
reinit ();
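
The save, override, restore sequence above can be modelled as a small standalone program. This is a sketch under stated assumptions: flags_model, reinit_model, reinit_count, last_reinit_flags and register_with_flags are hypothetical names, and nothing here calls into GCC's real reinit machinery.

```c
#include <assert.h>

/* Hypothetical stand-ins for the target flags and the reinit step.  */
int flags_model;
int reinit_count;
int last_reinit_flags;

/* Models a reinit: it only records which flags were live when it ran.  */
void
reinit_model (void)
{
  reinit_count++;
  last_reinit_flags = flags_model;
}

/* Temporarily ORs EXTRA into the flags, reinitialises, then restores the
   original flags and reinitialises again.  Returns the flags that were in
   effect during the temporary window.  */
int
register_with_flags (int extra)
{
  int flags_backup = flags_model;

  flags_model = flags_backup | extra;
  reinit_model ();
  int seen = last_reinit_flags; /* registration work would happen here */

  flags_model = flags_backup;
  reinit_model ();
  return seen;
}
```

The point of the pattern is that the caller's flags are unchanged afterwards, whether or not the extra bits were already set on entry.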

> Also I guess all zvk* and zvbb may also need to be added as well,
> but...I suspect it's not scalable way?

If zvk* and zvbb don't introduce new modes, I suspect we don't need to add them 
here; let me double-check and update in v2.

Pan

-----Original Message-----
From: Li, Pan2  
Sent: Thursday, March 28, 2024 3:32 PM
To: Kito Cheng 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: RE: [PATCH v1] RISC-V: Allow RVV intrinsic for more function target

Thanks Kito, it looks like I missed this case in testing; let me check it out.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Thursday, March 28, 2024 2:44 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH v1] RISC-V: Allow RVV intrinsic for more function target

Just tried something interesting:

$ riscv64-unknown-linux-gnu-gcc -march=rv64gc -O
target_attribute_v_with_intrinsic-9.c -S # Work
$ riscv64-unknown-linux-gnu-gcc -march=rv64gc_zve32x -O
target_attribute_v_with_intrinsic-9.c -S # Not work

Also I guess all zvk* and zvbb may also need to be added as well,
but...I suspect it's not scalable way?

RE: [PATCH] RISC-V: Fix misspelled term builtin in error message

2024-03-31 Thread Li, Pan2

Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Sunday, March 31, 2024 9:05 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH] RISC-V: Fix misspelled term builtin in error message

lgtm

On Sat, Mar 30, 2024 at 8:07 PM  wrote:
>
> From: Pan Li 
>
> This patch fixes the misspelled term below in an error message.
>
> ../../gcc/config/riscv/riscv-vector-builtins.cc:4592:16: error:
> misspelled term 'builtin function' in format; use 'built-in function' instead 
> [-Werror=format-diag]
>  4592 |   "builtin function %qE requires the V ISA extension", 
> exp);
>
> The tests below passed for this patch.
> * The riscv regression tests in rvv.exp and riscv.exp.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vector-builtins.cc (expand_builtin): Take
> the term built-in over builtin.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c:
> Adjust test dg-error.
> * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c:
> Ditto.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/config/riscv/riscv-vector-builtins.cc   | 2 +-
>  .../riscv/rvv/base/target_attribute_v_with_intrinsic-7.c| 2 +-
>  .../riscv/rvv/base/target_attribute_v_with_intrinsic-8.c| 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
> b/gcc/config/riscv/riscv-vector-builtins.cc
> index e07373d8b57..db9246eed2d 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins.cc
> @@ -4589,7 +4589,7 @@ expand_builtin (unsigned int code, tree exp, rtx target)
>
>if (!TARGET_VECTOR)
>  error_at (EXPR_LOCATION (exp),
> - "builtin function %qE requires the V ISA extension", exp);
> + "built-in function %qE requires the V ISA extension", exp);
>
>return function_expander (rfn.instance, rfn.decl, exp, target).expand ();
>  }
> diff --git 
> a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c
>  
> b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c
> index 520b2e59fae..a4cd67f4f95 100644
> --- 
> a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c
> +++ 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c
> @@ -5,5 +5,5 @@
>
>  size_t test_1 (size_t vl)
>  {
> -  return __riscv_vsetvl_e8m4 (vl); /* { dg-error {builtin function 
> '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */
> +  return __riscv_vsetvl_e8m4 (vl); /* { dg-error {built-in function 
> '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */
>  }
> diff --git 
> a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c
>  
> b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c
> index 9032d9d0b43..06ed9a9eddc 100644
> --- 
> a/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c
> +++ 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c
> @@ -19,5 +19,5 @@ test_2 ()
>  size_t
>  test_3 (size_t vl)
>  {
> -  return __riscv_vsetvl_e8m4 (vl); /* { dg-error {builtin function 
> '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */
> +  return __riscv_vsetvl_e8m4 (vl); /* { dg-error {built-in function 
> '__riscv_vsetvl_e8m4\(vl\)' requires the V ISA extension} } */
>  }
> --
> 2.34.1
>

RE: [PATCH] RISC-V: Fix one unused varable in riscv_subset_list::parse

2024-03-31 Thread Li, Pan2

Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Sunday, March 31, 2024 8:54 AM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; Wang, Yanzhang 

Subject: Re: [PATCH] RISC-V: Fix one unused varable in riscv_subset_list::parse

LGTM

On Sat, Mar 30, 2024 at 9:35 PM  wrote:
>
> From: Pan Li 
>
> This patch would like to fix one unused variable as below:
>
> ../../gcc/common/config/riscv/riscv-common.cc: In static member function
> 'static riscv_subset_list* riscv_subset_list::parse(const char*, location_t)':
> ../../gcc/common/config/riscv/riscv-common.cc:1501:19: error: unused variable 
> 'itr'
>   [-Werror=unused-variable]
>  1501 |   riscv_subset_t *itr;
>
> The code that consumed the variable was removed earlier, but the
> declaration itself was left behind.  Thus, we have an unused variable here.
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc (riscv_subset_list::parse):
> Remove unused var decl.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/common/config/riscv/riscv-common.cc | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> b/gcc/common/config/riscv/riscv-common.cc
> index 7095f303cbb..43b7549e3ec 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -1498,7 +1498,6 @@ riscv_subset_list::parse (const char *arch, location_t 
> loc)
>  return NULL;
>
>riscv_subset_list *subset_list = new riscv_subset_list (arch, loc);
> -  riscv_subset_t *itr;
>const char *p = arch;
>p = subset_list->parse_base_ext (p);
>if (p == NULL)
> --
> 2.34.1
>

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-04-06 Thread Li, Pan2

Kindly ping for this ICE.

Pan

-----Original Message-----
From: Li, Pan2  
Sent: Saturday, March 23, 2024 1:45 PM
To: Jeff Law ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

Thanks Jeff for comments.

> As Richi noted using validate_subreg here isn't great.  Does it work to 
> factor out this code from extract_low_bits
>
>>   if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
>>   || !int_mode_for_mode (mode).exists (&int_mode))
>> return NULL_RTX;
>> 
>>   if (!targetm.modes_tieable_p (src_int_mode, src_mode))
>> return NULL_RTX;
>>   if (!targetm.modes_tieable_p (int_mode, mode))
>> return NULL_RTX;

> And use that in the condition (and in extract_low_bits rather than 
> duplicating the code)?

That solves the ICE, but it would also forbid every vector mode from going through gen_lowpart.
Only vector modes whose size is smaller than the natural register size trigger 
the ICE.
So how about adding just one more condition before the gen_lowpart call, as below?

Feel free to correct me if I have misunderstood anything. 😉

diff --git a/gcc/dse.cc b/gcc/dse.cc
index edc7a1dfecf..258d2ccc299 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode 
read_mode,
 copy_rtx (store_info->const_rhs));
   else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
 && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
-&& targetm.modes_tieable_p (read_mode, store_mode))
+&& targetm.modes_tieable_p (read_mode, store_mode)
+/* It's invalid in validate_subreg if read_mode size is < reg natural.  */
+&& known_ge (GET_MODE_SIZE (read_mode), REGMODE_NATURAL_SIZE (read_mode)))
 read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
 read_reg = extract_low_bits (read_mode, store_mode,

Pan

-----Original Message-----
From: Jeff Law  
Sent: Saturday, March 23, 2024 2:54 AM
To: Li, Pan2 ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val



On 3/4/24 11:22 PM, Li, Pan2 wrote:
> Thanks Jeff for comments.
> 
>> But in the case of a vector modes, we can usually reinterpret the
>> underlying bits in whatever mode we want and do any of the usual
>> operations on those bits.
> 
> Yes, I think that is why we can allow vector mode in get_stored_val if my 
> understanding is correct.
> The different modes are then returned via gen_lowpart. Unfortunately, 
> some modes
>   (smaller than a vector register, like V2SF or V2QI for vlen=128) are considered 
> invalid by validate_subreg,
> which returns NULL_RTX and results in the final ICE.
That doesn't make a lot of sense to me.  Even for vlen=128 I would have 
expected that we can still use a subreg to access low bits.  After all 
we might have had a V16QI vector and done a reduction of some sort 
storing the result in the first element and we have to be able to 
extract that result and move it around.

I'm not real keen on a target workaround.  While extremely safe, I 
wouldn't be surprised if other ports could trigger the ICE and we'd end 
up patching up multiple targets for what is, IMHO, a more generic issue.

As Richi noted using validate_subreg here isn't great.  Does it work to 
factor out this code from extract_low_bits:


>   if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
>   || !int_mode_for_mode (mode).exists (&int_mode))
> return NULL_RTX;
> 
>   if (!targetm.modes_tieable_p (src_int_mode, src_mode))
> return NULL_RTX;
>   if (!targetm.modes_tieable_p (int_mode, mode))
> return NULL_RTX;

And use that in the condition (and in extract_low_bits rather than 
duplicating the code)?

jeff

ps.  No need to apologize for the pings.  This completely fell off my radar.

RE: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch

2024-04-10 Thread Li, Pan2

Committed, thanks Juzhe and Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Thursday, April 11, 2024 10:50 AM
To: juzhe.zh...@rivai.ai
Cc: Li, Pan2 ; gcc-patches 
Subject: Re: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode 
switch

I was thinking we might guard this with TARGET_VECTOR and TARGET_HARD_FLOAT
or check the ABI in riscv_function_value_regno_p; however, I think
the current implementation (no checking) is fine after checking all
use sites of `targetm.calls.function_value_regno_p`, so LGTM :)

Thanks Pan for fixing this issue!
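
For illustration, the guarded variant considered above (and not adopted) might look roughly like the sketch below. It is not code from the patch: the function name value_regno_p_guarded and the two boolean parameters are hypothetical stand-ins for TARGET_HARD_FLOAT and TARGET_VECTOR, and the register numbers are assumptions (a0-a1, fa0-fa1, v8) that should be checked against config/riscv/riscv.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed hard register numbers, for illustration only.  The real values
   are defined in config/riscv/riscv.h.  */
enum
{
  GP_RETURN_FIRST = 10,
  GP_RETURN_LAST = 11,
  FP_RETURN_FIRST = 42,
  FP_RETURN_LAST = 43,
  V_RETURN = 104
};

/* Like the unguarded predicate in the patch, but FP and vector return
   registers only count when the matching extension is enabled.  */
bool
value_regno_p_guarded (unsigned regno, bool hard_float, bool vector)
{
  if (GP_RETURN_FIRST <= regno && regno <= GP_RETURN_LAST)
    return true;

  if (hard_float && FP_RETURN_FIRST <= regno && regno <= FP_RETURN_LAST)
    return true;

  if (vector && regno == V_RETURN)
    return true;

  return false;
}
```

Under this variant, a configuration without F or V would report only the integer return registers.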

On Thu, Apr 11, 2024 at 10:23 AM juzhe.zh...@rivai.ai
 wrote:
>
> Thanks for fixing it. LGTM from my side.
>
> I'd prefer to wait for another ACK from Kito.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: pan2.li
> Date: 2024-04-11 10:16
> To: gcc-patches
> CC: juzhe.zhong; kito.cheng; Pan Li
> Subject: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode 
> switch
> From: Pan Li 
>
> This patch fixes an ICE in the mode_sw pass for the example code below.
>
> during RTL pass: mode_sw
> test.c: In function ‘vbool16_t j(vuint64m4_t)’:
> test.c:15:1: internal compiler error: in create_pre_exit, at
> mode-switching.cc:451
>15 | }
>   | ^
> 0x3978f12 create_pre_exit
> __RISCV_BUILD__/../gcc/mode-switching.cc:451
> 0x3979e9e optimize_mode_switching
> __RISCV_BUILD__/../gcc/mode-switching.cc:849
> 0x397b9bc execute
> __RISCV_BUILD__/../gcc/mode-switching.cc:1324
>
> extern size_t get_vl ();
>
> vbool16_t
> test (vuint64m4_t a)
> {
>   unsigned long b;
>   return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ());
> }
>
> create_pre_exit tries to find a return value copy.  If it finds
> none, an assert expects one of a few known reasons, and none of them
> applies to the sample code above when the vector calling convention is
> enabled by default.  This patch overrides TARGET_FUNCTION_VALUE_REGNO_P
> for the vector register so that we get hard_regno_nregs for copy_num,
> i.e. there is a return value copy.
>
> As a side effect of allowing vector registers in TARGET_FUNCTION_VALUE_REGNO_P,
> TARGET_GET_RAW_RESULT_MODE would see a vector mode, which is sizeless and
> cannot be converted to fixed_size_mode.  Thus override the hook
> TARGET_GET_RAW_RESULT_MODE and return VOIDmode when the mode of regno is not a
> fixed_size_mode.
>
> The tests below passed for this patch.
> * The full riscv regression tests.
> * The reproducing test in bugzilla PR114639.
>
> PR target/114639
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_function_value_regno_p): New func
> impl for hook TARGET_FUNCTION_VALUE_REGNO_P.
> (riscv_get_raw_result_mode): New func impl for hook
> TARGET_GET_RAW_RESULT_MODE.
> (TARGET_FUNCTION_VALUE_REGNO_P): Impl the hook.
> (TARGET_GET_RAW_RESULT_MODE): Ditto.
> * config/riscv/riscv.h (V_RETURN): New macro for vector return.
> (GP_RETURN_FIRST): New macro for the first GPR in return.
> (GP_RETURN_LAST): New macro for the last GPR in return.
> (FP_RETURN_FIRST): Ditto but for FPR.
> (FP_RETURN_LAST): Ditto.
> (FUNCTION_VALUE_REGNO_P): Remove as deprecated and replace by
> TARGET_FUNCTION_VALUE_REGNO_P.
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/riscv/rvv/base/pr114639-1.C: New test.
> * gcc.target/riscv/rvv/base/pr114639-1.c: New test.
>
> Signed-off-by: Pan Li 
> ---
> gcc/config/riscv/riscv.cc | 34 +++
> gcc/config/riscv/riscv.h  |  8 +++--
> .../g++.target/riscv/rvv/base/pr114639-1.C| 25 ++
> .../gcc.target/riscv/rvv/base/pr114639-1.c| 14 
> 4 files changed, 79 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/pr114639-1.C
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr114639-1.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 00defa69fd8..91f017dd52a 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -10997,6 +10997,34 @@ riscv_vector_mode_supported_any_target_p 
> (machine_mode)
>return true;
> }
> +/* Implements hook TARGET_FUNCTION_VALUE_REGNO_P.  */
> +
> +static bool
> +riscv_function_value_regno_p (const unsigned regno)
> +{
> +  if (GP_RETURN_FIRST <= regno && regno <= GP_RETURN_LAST)
> +return true;
> +
> +  if (FP_RETURN_FIRST <= regno && regno <= FP_RETURN_LAST)
> +return true;
> +
> +  if (regno == V_RETURN)
> +return true;
> +
> +  return false;
> +}
> +
> +/* Implements hook TARGET_GET_RAW_RESULT_MODE.  */
> +
> +static fixed_size_mode
> +riscv_get_raw_result_mode (int regno)
> +{
> +

RE: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode switch

2024-04-11 Thread Li, Pan2

Thanks for reporting this. I just checked my test log, and 930623-1.c passes in 
all configurations there.

So I bet the difference comes from the build option --with-arch=rv32imac, whereas 
my test script uses rv64gcv.

> I've built the git revision f3fdcf4a37a with 
> ../gcc-trunk/configure --target=riscv-unknown-elf 
> --prefix=/home/ed/gnu/riscv-unknown-elf --enable-languages=c,c++ 
> --disable-multilib --with-arch=rv32imac --with-abi=ilp32

> I am a bit surprised since the target is not supposed to support floating 
> point
> or vector instructions AFAIK.

Because you specified rv32imac, which doesn't include the f/d/v extensions, aka 
single/double floating point and vector, the related functionality 
is disabled.

> The issue does not happen with gcc-trunk from yesterday.

Ack, will look into it.

Pan

-----Original Message-----
From: Bernd Edlinger  
Sent: Thursday, April 11, 2024 7:52 PM
To: Li, Pan2 ; Kito Cheng ; 
juzhe.zh...@rivai.ai
Cc: gcc-patches 
Subject: Re: [PATCH v1] RISC-V: Bugfix ICE for the vector return arg in mode 
switch

On 4/11/24 05:03, Li, Pan2 wrote:
> Committed, thanks Juzhe and Kito.
> 
> Pan


Hi Pan,

this commit caused a regression:

FAIL: gcc.c-torture/compile/930623-1.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/930623-1.c   -O1  (internal compiler error: in 
emit_vec_extract, at config/riscv/riscv-v.cc:5059)
FAIL: gcc.c-torture/compile/930623-1.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/930623-1.c   -O2  (internal compiler error: in 
emit_vec_extract, at config/riscv/riscv-v.cc:5059)
FAIL: gcc.c-torture/compile/930623-1.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/930623-1.c   -O3 -g  (internal compiler error: in 
emit_vec_extract, at config/riscv/riscv-v.cc:5059)
FAIL: gcc.c-torture/compile/930623-1.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/930623-1.c   -Os  (internal compiler error: in 
emit_vec_extract, at config/riscv/riscv-v.cc:5059)
FAIL: gcc.c-torture/compile/930623-1.c   -Os  (test for excess errors)
FAIL: gcc.c-torture/compile/930623-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (internal compiler error: in emit_vec_extract, at 
config/riscv/riscv-v.cc:5059)
FAIL: gcc.c-torture/compile/930623-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)

gcc/testsuite/gcc.c-torture/compile/930623-1.c:10:3: internal compiler error: 
in emit_vec_extract, at config/riscv/riscv-v.cc:5059
0xbba2de riscv_vector::emit_vec_extract(rtx_def*, rtx_def*, rtx_def*)
../../gcc-trunk/gcc/config/riscv/riscv-v.cc:5059
0x186945f riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*)
../../gcc-trunk/gcc/config/riscv/riscv.cc:2895
0x1ef50b2 gen_movsi(rtx_def*, rtx_def*)
../../gcc-trunk/gcc/config/riscv/riscv.md:2225
0xffc91c rtx_insn* insn_gen_fn::operator()(rtx_def*, 
rtx_def*) const
../../gcc-trunk/gcc/recog.h:441
0xffc91c emit_move_insn_1(rtx_def*, rtx_def*)
../../gcc-trunk/gcc/expr.cc:4551
0xffcdf4 emit_move_insn(rtx_def*, rtx_def*)
../../gcc-trunk/gcc/expr.cc:4721
0x1002f17 emit_move_multi_word
../../gcc-trunk/gcc/expr.cc:4517
0xffcdf4 emit_move_insn(rtx_def*, rtx_def*)
../../gcc-trunk/gcc/expr.cc:4721
0x1efc6b7 gen_untyped_call(rtx_def*, rtx_def*, rtx_def*)
../../gcc-trunk/gcc/config/riscv/riscv.md:3478
0x185fc7c target_gen_untyped_call
../../gcc-trunk/gcc/config/riscv/riscv.md:3453
0xe8e81f expand_builtin_apply
../../gcc-trunk/gcc/builtins.cc:1761
0xea053c expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
../../gcc-trunk/gcc/builtins.cc:8001
0xff9e27 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc-trunk/gcc/expr.cc:12353
0xec4c3d expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
../../gcc-trunk/gcc/expr.h:316
0xec4c3d expand_call_stmt
../../gcc-trunk/gcc/cfgexpand.cc:2865
0xec4c3d expand_gimple_stmt_1
../../gcc-trunk/gcc/cfgexpand.cc:3932
0xec4c3d expand_gimple_stmt
../../gcc-trunk/gcc/cfgexpand.cc:4077
0xeca206 expand_gimple_basic_block
../../gcc-trunk/gcc/cfgexpand.cc:6133
0xecc287 execute
../../gcc-trunk/gcc/cfgexpand.cc:6872
Please submit a full bug report, with preprocessed source (by using 
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
compiler exited with status 1

I've built the git revision f3fdcf4a37a with 
../gcc-trunk/configure --target=riscv-unknown-elf 
--prefix=/home/ed/gnu/riscv-unknown-elf --enable-languages=c,c++ 
--disable-multilib --with-arch=rv32imac --with-abi=ilp32

I am a bit surprised since the target is not supposed to support floating point
or vector instructions AFAIK.

The issue does not happen with gcc-trunk from yesterday.

Regards
Bernd.

RE: [PATCH v1] RISC-V: Cleanup the comments for the psabi

2024-02-01 Thread Li, Pan2

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Thursday, February 1, 2024 9:39 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com
Subject: Re: [PATCH v1] RISC-V: Cleanup the comments for the psabi



On 1/30/24 18:54, pan2...@intel.com wrote:
> From: Pan Li 
> 
> This patch would like to cleanup some comments which are out of date or 
> incorrect.
> 
> gcc/ChangeLog:
> 
>   * config/riscv/riscv.cc (riscv_get_arg_info): Cleanup comments.
>   (riscv_pass_by_reference): Ditto.
>   (riscv_fntype_abi): Ditto.
OK
jeff

RE: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-02-01 Thread Li, Pan2

Sorry, it seems the log was removed by my cleanup script(s). Let me 
rerun a newlib build for commit id 23cd2961bd2ff63583f46e3499a07bd54491d45c.

Pan

-----Original Message-----
From: Edwin Lu  
Sent: Friday, February 2, 2024 1:43 AM
To: Li, Pan2 ; juzhe.zh...@rivai.ai; gcc-patches 

Cc: Robin Dapp ; kito.cheng ; 
jeffreyalaw ; palmer ; vineetg 
; Patrick O'Neill 
Subject: Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

On 1/31/2024 11:29 PM, Li, Pan2 wrote:
> I can somehow reproduce the failures on commit id 
> 23cd2961bd2ff63583f46e3499a07bd54491d45c, configurations as below.
> 
> ./configure --prefix=${install_dir} \
> 
> --with-arch=rv64imafdcv \
> 
> --with-abi=lp64d \
> 
> --with-isa-spec=20191213 \
> 
> --with-sim=qemu
> 
> make -j $(nproc) build-sim SIM=qemu
> 
> make report -j $(nproc) RUNTESTFLAGS=rvv.exp
> 
> = Summary of gcc testsuite =
> 
> | # of unexpected case / # of unique unexpected case
> 
> |gcc |g++ |gfortran |
> 
> rv64imafdcv/lp64d/ medlow |160 /47 |0 /0 |- |
> 
> make: *** [Makefile:1067: report-gcc-newlib] Error 1
> 
> Pan

Hi Pan,

I'm getting similar numbers as well using your steps but I also want to 
confirm whether you are also getting the ICEs or are just getting 
additional scan dump failures. The scan dump failures are a result of 
adding the new scheduling pipelines. I skimmed through them and didn't 
find anything unexpected.

Edwin

> 
> From: juzhe.zh...@rivai.ai
> Sent: Thursday, February 1, 2024 3:06 PM
> To: Edwin Lu; gcc-patches
> Cc: Robin Dapp; kito.cheng; jeffreyalaw; palmer; vineetg; Patrick O'Neill
> Subject: Re: Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to
> dfa pipelines
> 
> Sorry again. I just realized you have reverted your patches; that is why 
> the testing passes for me now.
> 
> I checked out your latest patch commit:
> 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=23cd2961bd2ff63583f46e3499a07bd54491d45c
> 
> Then I can reproduce the ICE now:
> 
> bug.c: In function 'popcount32_uint64_tuint64_t':
> bug.c:20:3: internal compiler error: in validate_change_or_fail, at config/riscv/riscv-v.cc:4972
>     20 |   }
>        |   ^
> bug.c:123:3: note: in expansion of macro 'DEF32'
>    123 |   DEF32 (uint64_t, uint64_t)                    \
>        |   ^
> bug.c:444:1: note: in expansion of macro 'DEF_ALL'
>    444 | DEF_ALL ()
>        | ^~~
> 0x1fbf06f riscv_vector::validate_change_or_fail(rtx_def*, rtx_def**, rtx_def*, bool)
>          ../../../../gcc/gcc/config/riscv/riscv-v.cc:4972
> 0x1fe2c60 simplify_replace_vlmax_avl
>          ../../../../gcc/gcc/config/riscv/riscv-avlprop.cc:200
> 0x1fe3b05 pass_avlprop::execute(function*)
>          ../../../../gcc/gcc/config/riscv/riscv-avlprop.cc:506
> 
> Would you mind taking a look at it ?
> 
> 
> 
> juzhe.zh...@rivai.ai
> 
> From: Edwin Lu <e...@rivosinc.com>
> Date: 2024-02-01 14:13
> To: juzhe.zh...@rivai.ai; gcc-patches <gcc-patches@gcc.gnu.org>
> CC: Robin Dapp <rdapp@gmail.com>; kito.cheng <kito.ch...@gmail.com>;
> jeffreyalaw <jeffreya...@gmail.com>; palmer <pal...@rivosinc.com>;
> vineetg <vine...@rivosinc.com>; Patrick O'Neill <patr...@rivosinc.com>
> Subject: Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to
> dfa pipelines
> 
> From what I know, if it was a problem with my dfa reservation assert,
> it would have ICEd in riscv.cc and not riscv-v.cc. For now I reverted
> the changes since I don't want to leave things possibly broken overnight
> and not knowing which patch is the root cause. I kicked off another set
> of test runs using our full gcc postcommit testing configurations and
> should have those results in tomorrow. Hopefully it was just a missed
> config target I didn't test and wasn't tested on the precommit ci.
> 
> Edwin
> 
> On 1/31/2024 9:42 PM, Edwin Lu wrote:
> 
> > Hi Juzhe,
> 
> >
> 
> > I didn't see any ICEs when I te

RE: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-02-01 Thread Li, Pan2

Hi Edwin,

I just reran the newlib tests: there is no ICE, but there are still 160 dump
failures, the same count as in the earlier summary.

Pan


RE: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-02-02 Thread Li, Pan2

Hi Edwin

> I believe the only problematic failures are the 5 vls calling convention 
> ones where only 24 ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) are found.

Does this "only 24" come from calling-convention-1.c?

> This is what I'm getting locally (first instance of wrong match):
> v32qi_RET1_ARG8:
> .LFB109:

A V32qi argument is passed by reference instead of in GPR(s), so this is 
expected. I think we need to diff the asm code before and after the patch for 
the whole test file.
The RE "ld\\s+a[0-1],\\s*[0-9]+\\(sp\\)" is meant to check that vls mode values 
are returned in a[0-1].
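
To make the counting concrete, here is a minimal sketch in C++ of what a
scan-assembler-times check amounts to. It is not the DejaGnu implementation:
std::regex stands in for the Tcl regexp the testsuite uses, and the names
count_matches and kReturnLoad are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Count the non-overlapping matches of `pattern` in `asm_text`. This is
// roughly the number a scan-assembler-times directive compares against
// its expected count (assumption: std::regex replaces the Tcl regexp).
std::size_t count_matches(const std::string &asm_text,
                          const std::string &pattern)
{
  const std::regex re(pattern);  // ECMAScript syntax, which understands \s
  return static_cast<std::size_t>(std::distance(
      std::sregex_iterator(asm_text.begin(), asm_text.end(), re),
      std::sregex_iterator()));
}

// The pattern from the failing tests, with the Tcl-level escaping removed.
const char *const kReturnLoad = R"(ld\s+a[0-1],\s*[0-9]+\(sp\))";
```

With this sketch, a line such as "ld a5,0(sp)" is not counted because a5 is
outside a[0-1], while "ld a0,0(sp)" and "ld a1,8(sp)" are each counted once.
That is why a change in register allocation alone can lower the match count
without the return value being wrong.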

Pan

-----Original Message-----
From: Edwin Lu  
Sent: Saturday, February 3, 2024 8:29 AM
To: Li, Pan2 ; juzhe.zh...@rivai.ai; gcc-patches 

Cc: Robin Dapp ; kito.cheng ; 
jeffreyalaw ; palmer ; vineetg 
; Patrick O'Neill 
Subject: Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

On 2/1/2024 8:28 PM, Li, Pan2 wrote:
> Hi Edwin,
> 
> Just rerun the newlib and there is no ICE but still 160 dump failures as 
> below.
> 
> Pan
> 

Hi Pan,

Thanks for confirming! Having dump failures is expected. There are 
around 7 more unique failures than I expected: postcommit found 46 
(https://github.com/patrick-rivos/gcc-postcommit-ci/issues/473) 
while I expected 39 
(https://inbox.sourceware.org/gcc-patches/12d205cd-3177-48ea-a54e-c2052fdde...@gmail.com/
and 
https://github.com/ewlu/gcc-precommit-ci/issues/1178#issuecomment-1889782987). 

I have included the 7 failed tests below, along with what was found instead.

I believe the only problematic failures are the 5 vls calling convention 
ones where only 24 ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) are found.

FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 35
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 33
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 31
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 29
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 29

This is what I'm getting locally (first instance of wrong match):
v32qi_RET1_ARG8:
.LFB109:
 .cfi_startproc
 li  t1,32
 vsetvli zero,t1,e8,mf8,ta,ma
 vle8.v  v1,0(a1)
 vle8.v  v4,0(a2)
 vle8.v  v3,0(a3)
 vle8.v  v2,0(a4)
 vadd.vv v1,v1,v4
 vadd.vv v1,v1,v3
 vle8.v  v3,0(a5)
 ld  a5,0(sp)  <-- used a5 instead of a1
 vadd.vv v1,v1,v2
 vle8.v  v2,0(a6)
 vadd.vv v1,v1,v3
 vle8.v  v3,0(a7)
 vadd.vv v1,v1,v2
 vle8.v  v2,0(a5)
 vadd.vv v1,v1,v3
 vadd.vv v1,v1,v2
 vse8.v  v1,0(a0)
 ret
 .cfi_endproc

If I understand correctly, this is wrong since we aren't returning 
anything (nothing gets stored in a[0-1])?

Edwin

RE: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty args

2024-02-06 Thread Li, Pan2

Not yet. It has been a long time since the last run; I will make sure there 
are no surprises from that.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Tuesday, February 6, 2024 4:11 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when 
empty args

Did you run the test where the C compiler compiles the C++ intrinsic tests?


juzhe.zh...@rivai.ai

From: pan2.li <pan2...@intel.com>
Date: 2024-02-06 16:09
To: gcc-patches <gcc-patches@gcc.gnu.org>
CC: juzhe.zhong <juzhe.zh...@rivai.ai>; pan2.li <pan2...@intel.com>; 
yanzhang.wang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com>
Subject: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty 
args
From: Pan Li <pan2...@intel.com>

There is one corner case, as in the example below:

void test (void)
{
  __riscv_vfredosum_tu ();
}

This triggers an ICE because of the implementation details of overloaded
functions in gcc.  According to the rvv intrinsic doc, there is no such
overloaded function with empty args.  Unfortunately, we register the
empty-args function as overloaded to avoid a naming conflict.  Thus, the
function registered that way is what remains after NULL_TREE is returned
to the middle-end, which finally results in an ICE when expanding.  For
example:

1. First we register void __riscv_vfredmax () as the overloaded function.
2. Then resolve_overloaded_builtin (this func) returns NULL_TREE.
3. The function registered in step 1 bypasses the args check because its args are empty.
4. Finally, we fall into expand_builtin with empty args and hit the ICE.

With this patch we report an error when an overloaded function is called with
empty args.  For example:

test.c: In function 'foo':
test.c:8:3: error: no matching function call to '__riscv_vfredosum_tu' with 
empty args
8 |   __riscv_vfredosum_tu();
  |   ^~~~
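
The four steps and the new check can be sketched as a small standalone
C++ model.  This is not GCC code: decl, arglist, expand, resolve_before,
resolve_after, error_mark and diagnostics are hypothetical stand-ins, and
returning an error marker from the resolver is an assumption about how the
error is signalled, since the corresponding hunk is cut off in this mail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for GCC's tree and argument vector; every name here is
// hypothetical and only mirrors the shape of the real code.
struct decl { std::string name; };
using arglist = std::vector<int>;

static std::vector<std::string> diagnostics;

// Stand-in for expand_builtin: the intrinsic needs at least one operand,
// so reaching this point with none corresponds to the ICE in step 4.
int expand(const decl &, const arglist &args)
{
  assert(!args.empty() && "ICE: expanding overloaded placeholder, no args");
  return static_cast<int>(args.size());
}

// Before the patch: nullptr plays the role of NULL_TREE ("keep the original
// decl"), which lets the placeholder from step 1 reach expand().
const decl *resolve_before(const decl &, const arglist &) { return nullptr; }

// After the patch (assumed shape): diagnose the empty argument list and
// hand back an error marker so the call is never expanded.
const decl error_mark{"<error>"};
const decl *resolve_after(const decl &fn, const arglist &args)
{
  if (args.empty())
    {
      diagnostics.push_back("no matching function call to '" + fn.name
                            + "' with empty args");
      return &error_mark;
    }
  return nullptr;
}
```

In this model, resolve_before with an empty list returns nullptr and the
caller would go on to expand(), which aborts; resolve_after records the
diagnostic and returns the marker instead, so expand() is only reached
for calls that carry arguments.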

The tests below pass with this patch.

* The riscv regression tests.

PR target/113766

gcc/ChangeLog:

* config/riscv/riscv-protos.h (resolve_overloaded_builtin): Adjust
the signature of func.
* config/riscv/riscv-c.cc (riscv_resolve_overloaded_builtin): Ditto.
* config/riscv/riscv-vector-builtins.cc (resolve_overloaded_builtin): Make
overloaded func with empty args error.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr113766-1.c: New test.
* gcc.target/riscv/rvv/base/pr113766-2.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv-c.cc   |  3 +-
gcc/config/riscv/riscv-protos.h   |  2 +-
gcc/config/riscv/riscv-vector-builtins.cc | 23 -
.../gcc.target/riscv/rvv/base/pr113766-1.c| 85 +++
.../gcc.target/riscv/rvv/base/pr113766-2.c| 48 +++
5 files changed, 155 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-2.c

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 2e306057347..94c3871c760 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -250,7 +250,8 @@ riscv_resolve_overloaded_builtin (unsigned int 
uncast_location, tree fndecl,
 case RISCV_BUILTIN_GENERAL:
   break;
 case RISCV_BUILTIN_VECTOR:
-  new_fndecl = riscv_vector::resolve_overloaded_builtin (subcode, arglist);
+  new_fndecl = riscv_vector::resolve_overloaded_builtin (loc, subcode,
+  fndecl, arglist);
   break;
 default:
   gcc_unreachable ();
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index b3f0bdb9924..ae1685850ac 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -560,7 +560,7 @@ gimple *gimple_fold_builtin (unsigned int, 
gimple_stmt_iterator *, gcall *);
rtx expand_builtin (unsigned int, tree, rtx);
bool check_builtin_call (location_t, vec<location_t>, unsigned int,
   tree, unsigned int, tree *);
-tree resolve_overloaded_builtin (unsigned int, vec<tree, va_gc> *);
+tree resolve_overloaded_builtin (location_t, unsigned int, tree, vec<tree, va_gc> *);
bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
bool legitimize_move (rtx, rtx *);
void emit_vlmax_vsetvl (machine_mode, rtx);
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 403e1021fd1..efcdc8f1767 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4606,7 +4606,8 @@ check_builtin_call (location_t location, vec<location_t>, 
unsigned int code,
}
tree
-resolve_overloaded_builtin (unsigned int code, vec<tree, va_gc> *arglist)
+resolve_overloaded_builtin (location_t loc, unsigned int code, tree fndecl,
+ vec<tree, va_gc> *arglist)
{
   if (code >= vec_safe_length (registered_functions))
 return NULL_TREE;
@@ -4616,12 +4617,26 @@ resolve_overloaded_builtin (unsig
