Re: [PATCH] RISC-V: Add RVV FMA auto-vectorization support

2023-05-26 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +;; We can't expand FMA for the following reasons: But we do :) We just haven't selected the proper alternative yet. > +;; 1. Before RA, we don't know which multiply-add instruction is the ideal > one. > +;;The vmacc is the ideal instruction when operands[3] overlaps > operand

Re: [PATCH] RISC-V: Add RVV FMA auto-vectorization support

2023-05-26 Thread Robin Dapp via Gcc-patches
Hi Juzhe, >>> Can you explain these two points (3 and 4, maybe 2) a bit in the comments? >>> I.e. what makes fma different from a normal insn? > You can take a look at vector.md. The ternary instruction pattern has  > operands[0] operands[1] operands[2] operands[3] operands[4] operands[5] : > >

Re: [PATCH] RISC-V: Basic VLS code gen for RISC-V

2023-05-30 Thread Robin Dapp via Gcc-patches
Hi Kito, > GNU vector extensions is widely used around this world, and this patch > enables that with RISC-V vector extensions, this can help people > leverage existing code base with RVV, and also can write vector programs in a > familiar way. > > The idea of VLS code gen support is emulate VLS op

Re: [PATCH] RISC-V: Basic VLS code gen for RISC-V

2023-05-30 Thread Robin Dapp via Gcc-patches
>>> but ideally the user would be able to specify -mrvv-size=32 for an >>> implementation with 32 byte vectors and then vector lowering would make use >>> of vectors up to 32 bytes? > > Actually, we don't want to specify -mrvv-size = 32 to enable vectorization on > GNU vectors. > You can take a l

[PATCH] RISC-V: Synthesize power-of-two constants.

2023-05-30 Thread Robin Dapp via Gcc-patches
Hi, I figured I'd send this patch that I quickly hacked together some days back. It's likely going to be controversial because we don't have vector costs in place at all yet and even with costs it's probably debatable as the emitted sequence is longer :) I'm willing to defer or ditch it altogethe

Re: [PATCH] RISC-V: Add vwadd/vwsub/vwmul/vwmulsu.vv lowering optimizaiton for RVV auto-vectorization

2023-05-31 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > The approach is quite simple and obvious, changing extension pattern > into define_insn_and_split will make combine PASS combine into widen > operations naturally. looks good to me. Tiny nit: I would add a comment above the patterns to clarify why insn_and_split instead of expand. S

Re: [PATCH] RISC-V: Support RVV permutation auto-vectorization

2023-05-31 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks looks pretty comprehensive already. > +(define_expand "vec_perm" > + [(match_operand:V 0 "register_operand") > + (match_operand:V 1 "register_operand") > + (match_operand:V 2 "register_operand") > + (match_operand: 3 "vector_perm_operand")] > + "TARGET_VECTOR && GET_MODE_

Re: [PATCH V2] RISC-V: Add pseudo vwmul.wv pattern to enhance vwmul.vv instruction optimizations

2023-06-02 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > ... >vsetvli zero,t1,e8,m1,ta,ma > vle8.v v1,0(a4) > vsetvli t3,zero,e16,m2,ta,ma > vsext.vf2 v6,v1 > vsetvli zero,t1,e8,m1,ta,ma > vle8.v v1,0(a5) > vsetvli t3,zero,e16,m2,ta,ma > add t0,a0,t4 > vzext.

Re: [PATCH V2] RISC-V: Add pseudo vwmul.wv pattern to enhance vwmul.vv instruction optimizations

2023-06-02 Thread Robin Dapp via Gcc-patches
>>> I like the code examples in general but find them hard to read >>> at lengths > 5-10 or so.  Could we condense this a bit? > Ok, do I need to send V2? Or condense the commit log when merging the patch? Sure, just condense a bit. No need for V2. Regards Robin

Re: [PATCH] RISC-V: Add RVV vwmacc/vwmaccu/vwmaccsu combine lowering optmization

2023-06-06 Thread Robin Dapp via Gcc-patches
Hi Juzhe, just one/two really minor nits. > +rtx ops[] = {operands[0], operands[1], operands[2], operands[3]}; > +riscv_vector::emit_vlmax_ternary_insn (code_for_pred_widen_mul_plus > (, mode), > +riscv_vector::RVV_WIDEN_TERNOP, ops); Here and in

Re: [PATCH] RISC-V: Add RVV vwmacc/vwmaccu/vwmaccsu combine lowering optmization

2023-06-06 Thread Robin Dapp via Gcc-patches
> These enhance patterns are generated in complicate combining situations. Yes, that's clear. One strategy is to look through combine's output and see which combination results make sense for a particular backend. I was wondering where the unspec-less patterns originate (when we expand everything

Re: [PATCH] RISC-V: Add RVV vwmacc/vwmaccu/vwmaccsu combine lowering optmization

2023-06-06 Thread Robin Dapp via Gcc-patches
> +rtx ops[] = {operands[0], operands[1], operands[2], operands[3]}; > +riscv_vector::emit_vlmax_ternary_insn (code_for_pred_widen_mul_plus > (, mode), > +riscv_vector::RVV_WIDEN_TERNOP, ops); ops is still there ;) No need for another revision thou

Re: [PATCH] RISC-V: Fix V_WHOLE && V_FRACT iterator requirement

2023-06-09 Thread Robin Dapp via Gcc-patches
On 6/9/23 16:32, juzhe.zh...@rivai.ai wrote: > From: Juzhe-Zhong > > This patch fixes the requirement of V_WHOLE and V_FRACT. > E.g. VNx8QI in V_WHOLE has no requirement which is incorrect. > Actually, VNx8QI should be whole(full) mode when TARGET_MIN_VLEN < 128 > since when TARGET_MIN_

Re: [PATCH] RISC-V: Fix V_WHOLE && V_FRACT iterator requirement

2023-06-09 Thread Robin Dapp via Gcc-patches
> I think it shouldn't be with vec_set patch. > Instead, it obviously should be the separate patch. Yes, I didn't mean in the actual same patch. Regards Robin

Re: [PATCH] RISC-V: Add ZVFHMIN autovec block testcase

2023-06-12 Thread Robin Dapp via Gcc-patches
Hi Juzhe, no complaints here. Just please make sure you add the commit message or something related as top comment to the test when committing. Somebody who reads the test is not going to want to lookup the commit message to know what's going on. Regards Robin

Re: [PATCH V2] RISC-V: Add ZVFHMIN block autovec testcase

2023-06-12 Thread Robin Dapp via Gcc-patches
> +/* We can't enable FP16 NEG/PLUS/MINUS/MULT/DIV auto-vectorization when > -march="*zvfhmin*". */ > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 0 > "vect" } } */ Thanks. OK from my side. Regards Robin

Re: [PATCH] RISC-V: Fix V_WHOLE && V_FRACT iterator requirement

2023-06-12 Thread Robin Dapp via Gcc-patches
> +  (VNx16QI "TARGET_MIN_VLEN <= 128") > +  (VNx32QI "TARGET_MIN_VLEN <= 256") > +  (VNx64QI "TARGET_MIN_VLEN >= 64 && TARGET_MIN_VLEN <= 512") > +  (VNx128QI "TARGET_MIN_VLEN >= 128 && TARGET_MIN_VLEN <= 1024") > > This not correct, we always use VNx16QI as LMUL = m1 for min_vlen >= 128. > Requi

Re: [PATCH] RISC-V: Enhance RVV VLA SLP auto-vectorization with decompress operation

2023-06-12 Thread Robin Dapp via Gcc-patches
Hi Juzhe, seems a nice improvement, looks good to me. While reading I was wondering if vzext could help synthesize some (zero-based) patterns as well (e.g. 0 3 0 3...). However the sequences I could come up with were not shorter than what we are already emitting, so probably not. Regards Robin

[PATCH] RISC-V: Implement vec_set and vec_extract.

2023-06-12 Thread Robin Dapp via Gcc-patches
Hi, this implements the vec_set and vec_extract patterns for integer and floating-point data types. For vec_set we broadcast the insert value to a vector register and then perform a vslideup with effective length 1 to the requested index. vec_extract is done by sliding down the requested element
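A minimal scalar sketch of the approach described above, not the patch's implementation. It assumes a fixed 8-lane vector, a slide-up that leaves lanes outside the written range untouched, and that the extracted element is read from lane 0 after the slide-down; the names `vec`, `broadcast`, `slideup`, `slidedown`, `vec_set_model` and `vec_extract_model` are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 8;            // assumed lane count for the model
using vec = std::array<int32_t, N>;

// Broadcast a scalar into every lane.
inline vec broadcast(int32_t x) {
  vec v;
  v.fill(x);
  return v;
}

// Slide-up model: dest[i] = src[i - offset] for offset <= i < vl.
// Lanes outside that range keep their old value (assumption).
inline vec slideup(vec dest, const vec &src, std::size_t offset, std::size_t vl) {
  for (std::size_t i = offset; i < vl && i < N; ++i)
    dest[i] = src[i - offset];
  return dest;
}

// vec_set: broadcast the value, then slide it up so that exactly one
// lane (the requested index) is written.
inline vec vec_set_model(const vec &v, int32_t x, std::size_t idx) {
  return slideup(v, broadcast(x), idx, idx + 1);
}

// Slide-down model: result[i] = src[i + offset]; lanes past the end are zero.
inline vec slidedown(const vec &src, std::size_t offset) {
  vec r{};
  for (std::size_t i = 0; i + offset < N; ++i)
    r[i] = src[i + offset];
  return r;
}

// vec_extract: slide the requested element down to lane 0 and read it.
inline int32_t vec_extract_model(const vec &v, std::size_t idx) {
  return slidedown(v, idx)[0];
}
```

With `vl = idx + 1` the slide-up touches only lane `idx`, which is how this model reads "effective length 1".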

[PATCH] RISC-V: Add sign-extending variants for vmv.x.s.

2023-06-12 Thread Robin Dapp via Gcc-patches
Hi, when the destination register of a vmv.x.s needs to be sign extended to XLEN we currently emit an sext insn. Since vmv.x.s performs this implicitly this patch adds two instruction patterns (intended for combine et al.) that include sign_extend for the destination operand. The tests extend th

Re: [PATCH] RISC-V: Add sign-extending variants for vmv.x.s.

2023-06-12 Thread Robin Dapp via Gcc-patches
> Change  > > +(define_insn "@pred_extract_first_sextdi" > > into  > > (define_insn "*pred_extract_first_sextdi" Yeah, I was thinking about this as well right after sending. We will probably never call this directly. Regards Robin

Re: [PATCH] RISC-V: Implement vec_set and vec_extract.

2023-06-12 Thread Robin Dapp via Gcc-patches
> +  /* If the slide offset fits into 5 bits we can > + use the immediate variant instead of the register variant. > + The expander's operand[2] is ops[3] here. */ > +  if (!satisfies_constraint_K (ops[3])) > +    ops[3] = force_reg (Pmode, ops[3]); > > I don't think we need this. maybe_ex

Re: [PATCH] RISC-V: Implement vec_set and vec_extract.

2023-06-12 Thread Robin Dapp via Gcc-patches
> I suggest we implement vector calling convention even though it is not > ratified yet. > We can allow calling convention to be enabled only when > --param=riscv-autovec-preference=fixed-vlmax. > We have such issue: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110119 >

Re: [PATCH v1] RISC-V: Fix one typo in full-vec-movel test

2023-06-13 Thread Robin Dapp via Gcc-patches
> This patch would like to fix one typo when checking assembly of > full-vec-movel. OK. (I actually intended to commit this myself adding some more comments to the iterator change as well as fix the tests, but well...) Regards Robin

Re: [PATCH v1] RISC-V: Fix one typo in full-vec-movel test

2023-06-13 Thread Robin Dapp via Gcc-patches
> Oh. Sorry. Since I want to commit my patch so I asked Pan to commit > your test as well. I think you can resend a fix of this testcase and > drop this patch. No problem, will fix it another time. Pan can just go ahead with this fix now, no need to wait for a maintainer, it's obvious enough. Th

Re: [PATCH] RISC-V: Add more SLP tests

2023-06-13 Thread Robin Dapp via Gcc-patches
Hi Juzhe, as the tests are mostly directly from aarch64's testsuite I would advise comments on where they were taken from as well as a TODO that they should become common tests for a specific target selector (vect_scalable_supported or something). How about some assembly checks for the non-run te

Re: [PATCH V3] RISC-V: Add more SLP tests

2023-06-13 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, works for me as is. I just hope somebody is going to take on the task of making different LMUL SLP variants "scannable" at some point because it would definitely increase our test coverage with these tests. (Or split the tests manually and not iterate over LMUL) Regards Robin

Re: [PATCH] RISC-V: Fix bug of VLA SLP auto-vectorization

2023-06-13 Thread Robin Dapp via Gcc-patches
Hi Juzhe, LGTM. You could also add the aarch64 test disclaimer here again, but no need for a V2. Regards Robin

Re: [PATCH v1] RISC-V: Align the predictor style for define_insn_and_split

2023-06-13 Thread Robin Dapp via Gcc-patches
Hi Pan, these failures were present before the patch I suppose? They don't look related. Is this what you meant by "the same as upstream"? > FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/full-vec-move1.c -std=c99 -O3 > -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax (test for > ex

Re: [PATCH v1] RISC-V: Align the predictor style for define_insn_and_split

2023-06-13 Thread Robin Dapp via Gcc-patches
> I don't have a proper sim environment setup yet.  How long does the > testsuite take > with spike?  Have you tried qemu as well? Any numbers on this Pan? How many cores do you use for running the testsuite? Regards Robin

Re: [PATCH v1] RISC-V: Align the predictor style for define_insn_and_split

2023-06-13 Thread Robin Dapp via Gcc-patches
Yes, I agree with the general assessment (and didn't mean to insinuate that the FAILs are the compiler's fault or a fault of the patch). > So these 2 failures in RV32 are not compiler bugs. I have seen: > /* { dg-do run { target { { {riscv_vector} && {rv64} } } } } */ in > these testcases which can not

Re: [PATCH v1] RISC-V: Align the predictor style for define_insn_and_split

2023-06-14 Thread Robin Dapp via Gcc-patches
> I am not sure. These testcases were added by kito long time ago. > Frankly, I am not familiar with GCC test framework. Ok, I'm going to have a look. Need to verify the zvfh things anyway. Regards Robin

[PATCH] RISC-V: Add (u)int8_t to binop tests.

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi, this patch adds the missing (u)int8_t types to the binop tests. I suggest in the future we have the testsuite pass -march=rv32gcv as well as -march=rv64gcv as options to each test case instead of essentially duplicate the files as we do now. Regards Robin gcc/testsuite/ChangeLog:

Re: [PATCH v2] RISC-V: Bugfix for vec_init repeating auto vectorization in RV32

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi Pan, > This patch would like to fix one bug exported by RV32 test case > multiple_rgroup_run-2.c. The mask should be restricted by elen in > vector, and the condition between the vmv.s.x and the vmv.v.x should > take inner_bits_size rather than constants. exported -> exposed. How about someth

Re: [PATCH] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi, > Thanks for fixing this. > > This patch lets RVV types (both vector and tuple) return in memory by > default when there is no vector ABI support. It makes sense to me. > > CC more RISC-V folks to comment. so this is intended to fix the PR as well as unblock while we continue with the prelimi

Re: [PATCH V2] RISC-V: Ensure vector args and return use function stack to pass [PR110119]

2023-06-14 Thread Robin Dapp via Gcc-patches
> Oh. I see Robin's email is also wrong. CC Robin too for you  It still arrived via the mailing list ;) > Good to see a Fix patch of the ICE before Vector ABI patch. > Let's wait for more comments. LGTM, this way I don't even need to rewrite my tests. Regards Robin

[PATCH] RISC-V: testsuite: Add vector_hw and zvfh_hw checks.

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi, this introduces new checks for run tests. Currently we have riscv_vector as well as rv32 and rv64 which all check if GCC (with the current configuration) can build (not execute) the respective tests. Many tests specify e.g. a different -march for vector, though. So the check fails even thou

Re: [PATCH] RISC-V: Use merge approach to optimize vector permutation

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi Juzhe, the general method seems sane and useful (it's not very complicated). I was just distracted by > Selector = { 0, 17, 2, 19, 4, 21, 6, 23, 8, 9, 10, 27, 12, 29, 14, 31 }, the > common expression: > { 0, nunits + 1, 1, nunits + 2, 2, nunits + 3, ... } > > For this selector, we can use
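A small sketch of the property the quoted selector has, under stated assumptions: lane i of the result comes either from lane i of the first input (selector value i) or from lane i of the second input (selector value nunits + i), so a lane-wise select under a mask is enough. The snippet is truncated before it names the instruction used, so the function names `is_merge_selector` and `merge_mask` and the mask convention are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// True if every selector element keeps its lane position, i.e. sel[i] is
// either i (first input) or nunits + i (second input).
inline bool is_merge_selector(const std::vector<unsigned> &sel) {
  const std::size_t nunits = sel.size();
  for (std::size_t i = 0; i < nunits; ++i)
    if (sel[i] != i && sel[i] != nunits + i)
      return false;
  return true;
}

// Mask with lane i set when that lane is taken from the second input.
// Only meaningful when is_merge_selector (sel) holds.
inline std::vector<bool> merge_mask(const std::vector<unsigned> &sel) {
  const std::size_t nunits = sel.size();
  std::vector<bool> mask(nunits);
  for (std::size_t i = 0; i < nunits; ++i)
    mask[i] = (sel[i] == nunits + i);
  return mask;
}
```

For the 16-element selector quoted above, the check holds, and the mask marks lanes 1, 3, 5, 7, 11, 13 and 15 as coming from the second input.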

[PATCH] RISC-V: Add autovec FP binary operations.

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi, this implements the floating-point autovec expanders for binary operations: vfadd, vfsub, vfdiv, vfmul, vfmax, vfmin and adds tests. The existing tests are amended and split up into non-_Float16 and _Float16 flavors as we cannot rely on the zvfh extension being present. As long as we do not

[PATCH] RISC-V: Add autovec FP unary operations.

2023-06-14 Thread Robin Dapp via Gcc-patches
Hi, this patch adds floating-point autovec expanders for vfneg, vfabs as well as vfsqrt and the accompanying tests. vfrsqrt7 will be added at a later time. Similarly to the binop tests, there are flavors for zvfh now. Prerequisites as before. Regards Robin gcc/ChangeLog: * config/ris

Re: [PATCH] RISC-V: Add autovec FP unary operations.

2023-06-15 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I like the iterator solution better, I added it to the binops V2 patch with a comment and will post it in a while. Also realized there is already a testcase and the "enabled" attribute is set properly now but I hadn't rebased to the current master branch in a while... Btw. I'm currentl

Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control

2023-06-15 Thread Robin Dapp via Gcc-patches
>>> Can you try using the same wording for length and mask operands >>> as for len_load and maskload? Also len_load has the "bias" >>> operand which you omit here - IIRC that was added for s390 which >>> for unknown reason behaves a little different than power. If >>> len support for s390 ever ex

Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control

2023-06-15 Thread Robin Dapp via Gcc-patches
> Meh, PoP is now behind a paywall, trying to get through ... I wonder > if there's a nice online html documenting the s390 len_load/store > instructions to better understand the need for the bias. https://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf Look for vector load with length (store). The

Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control

2023-06-15 Thread Robin Dapp via Gcc-patches
On 6/15/23 11:18, Robin Dapp wrote: >> Meh, PoP is now behind a paywall, trying to get through ... I wonder >> if there's a nice online html documenting the s390 len_load/store >> instructions to better understand the need for the bias. This is z16, but obviously no changes for vll/vstl: https://p

Re: [PATCH V2] VECT: Support LEN_MASK_ LOAD/STORE to support flow control for length loop control

2023-06-15 Thread Robin Dapp via Gcc-patches
> the minus in 'operand 2 - operand 3' should be a plus if the > bias is really zero or -1. I suppose Yes, that somehow got lost from when the bias was still +1. Maybe Juzhe can fix this in the course of his patch. > that's quite conservative. I think you can do better when the > loads are ali

Re: [PATCH] RISC-V: Add autovec FP unary operations.

2023-06-15 Thread Robin Dapp via Gcc-patches
> Btw. I'm currently running the testsuite with rv64gcv_zfhmin > default march and see some additional FAILs. Will report back. Reporting back - the FAILs are a combination of an older qemu version and not fully comprehensive target selectors. I'm going to send a V2 for the testsuite patch as we

[PATCH v2] RISC-V: testsuite: Add vector_hw and zvfh_hw checks.

2023-06-15 Thread Robin Dapp via Gcc-patches
Hi, Changes from v1: - Revamped the target selectors again. - Fixed some syntax as well as caching errors that were still present. - Adjusted some test cases I missed. The current situation with target selectors is improvable at best. We definitely need to discern between being able to build a

[PATCH v2] RISC-V: Add autovec FP binary operations.

2023-06-15 Thread Robin Dapp via Gcc-patches
Hi, changes from V1: - Add VF_AUTO iterator and use it. - Ensured we don't ICE with -march=rv64gcv_zfhmin. this implements the floating-point autovec expanders for binary operations: vfadd, vfsub, vfdiv, vfmul, vfmax, vfmin and adds tests. The existing tests are split up into non-_Float16 and

[PATCH v2] RISC-V: Add autovec FP unary operations.

2023-06-15 Thread Robin Dapp via Gcc-patches
Hi, changes from V1: - Use VF_AUTO iterator. - Don't mention vfsqrt7. This patch adds floating-point autovec expanders for vfneg, vfabs as well as vfsqrt and the accompanying tests. Similarly to the binop tests, there are flavors for zvfh now. gcc/ChangeLog: * config/riscv/autovec.m

Re: [PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-07-12 Thread Robin Dapp via Gcc-patches
> int32_t x = (int32_t)0x1.0p32; > int32_t y = (int32_t)(int64_t)0x1.0p32; > > sets x to 2147483647 and y to 0. >>> >>> Hmm, good question. GENERIC has a direct truncation to unsigned char >>> for example, the C standard generally says if the integral part cannot >>> be represented

Re: [PATCH] RISC-V: Support COND_LEN_* patterns

2023-07-12 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +/* Return true if the operation is the floating-point operation need FRM. */ > +static bool > +need_frm_p (rtx_code code, machine_mode mode) > +{ > + if (!FLOAT_MODE_P (mode)) > +return false; > + return code != SMIN && code != SMAX; > +} Return true if the operation requires

Re: [PATCH] Add VXRM enum

2023-07-12 Thread Robin Dapp via Gcc-patches
> +enum __RISCV_VXRM { > + __RISCV_VXRM_RNU = 0, > + __RISCV_VXRM_RNE = 1, > + __RISCV_VXRM_RDN = 2, > + __RISCV_VXRM_ROD = 3, > +}; > + > __extension__ extern __inline unsigned long > __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) > vread_csr(enum RVV_CSR csr) We have

[PATCH] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-13 Thread Robin Dapp via Gcc-patches
Hi, the recent changes that allowed multi-step conversions for "non-packing/unpacking", i.e. modifier == NONE targets included promoting to-float and demoting to-int variants. This patch adds demoting to-float and promoting to-int handling. Bootstrapped and regtested on x86 and aarch64. A quest

Re: [PATCH] RISC-V: Enable COND_LEN_FMA auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, no complaints from my side apart from one: > +/* { dg-additional-options "-mcmodel=medany" } */ Please add a comment why we need this. Regards Robin

Re: [PATCH] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-13 Thread Robin Dapp via Gcc-patches
> Can you add testcases? Also the current restriction is because > the variants you add are not always correct and I don't see any > checks that the intermediate type doesn't lose significant bits? The testcases I wanted to add with a follow-up RISC-V patch but I can also try an aarch64 one. So

Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
From my understanding, we don't have an RVV instruction for fmax/fmin? > > Unless I'm misunderstanding, we do. The ISA manual says > > === Vector Floating-Point MIN/MAX Instructions > > The vector floating-point `vfmin` and `vfmax` instructions have the > same behavior as the

Re: [PATCH] RISC-V: Enable COND_LEN_FMA auto-vectorization

2023-07-13 Thread Robin Dapp via Gcc-patches
> Is COND _LEN FMA ok for trunk? I can commit it without changing > scatter store testcase fix. > > It makes no sense block cond Len fma support. The middle end support > has already been merged. Then just add a TODO or so that says e.g. "For some reason we exceed the default code model's +-2

Re: [PATCH V2] RISC-V: Enable COND_LEN_FMA auto-vectorization

2023-07-14 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, looks good to me now - did before already actually ;). Regards Robin

[PATCH v2] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-14 Thread Robin Dapp via Gcc-patches
>>> Can you add testcases? Also the current restriction is because >>> the variants you add are not always correct and I don't see any >>> checks that the intermediate type doesn't lose significant bits? I didn't manage to create one for aarch64 nor for x86 because AVX512 has direct conversions e

Re: [PATCH] RISC-V: Enable SLP un-order reduction

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +;; - > +;; [INT,FP] Initialize from individual elements > +;; - > +;; Includes: > +;; - vslide1up.vx/vfslide1up.vf > +;; ---

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > This patch fixes a testcase that fails when I build RISC-V GCC with -mcmodel=medany > as default. If set to medany, stack_save_restore.c testcase will fail because > of > the reduced use of s3 registers in assembly (thus calling __riscv_save/store_3 > instead of __riscv_save/store_4). Explici

Re: [PATCH V2] RISC-V: Enable SLP un-order reduction

2023-07-18 Thread Robin Dapp via Gcc-patches
OK. Regards Robin

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > I think the purpose of this testcase is to check whether the modifications to > the stack frame are as expected, so it is necessary to specify exactly whether > three or four registers are saved. But I think its need to add another > testcase > which use another option -mcmodel=medany

Re: [PATCH] RISC-V: Fix testcase failed when default -mcmodel=medany

2023-07-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > I think you are right, I would like to remove the `-mcmodel=medany` option and > relax assert from `__riscv_save/restore_4` to `__riscv_save/restore_(3|4)` to > let > this testcase not brittle on any -mcmodel. Then I'm also going to add another > testcase (I don't know how to run -ma

Re: [PATCH] VECT: Support floating-point in-order reduction for length loop control

2023-07-19 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I just noticed that we recently started calling things MASK_LEN (instead of LEN_MASK before) with the reductions. Wouldn't we want to be consistent here? Especially as the length takes precedence. I realize the preparational work like optabs is already upstream but still wanted to brin

Re: [PATCH] RISC-V: Support in-order floating-point reduction

2023-07-20 Thread Robin Dapp via Gcc-patches
> +enum reduction_type > +{ > + UNORDERED_REDUDUCTION, > + FOLD_LEFT_REDUDUCTION, > + MASK_LEN_FOLD_LEFT_REDUDUCTION, > +}; There are redundant 'DU's here ;) Wouldn't it be sufficient to have an enum enum reduction_type { UNORDERED, FOLD_LEFT, MASK_LEN_FOLD_LEFT, }; ? Regards Robin

Re: [PATCH] RISC-V: Support in-order floating-point reduction

2023-07-20 Thread Robin Dapp via Gcc-patches
> The UNORDERED enum will cause ICE since we have UNORDERED in rtx_code. > > Could you give me another enum name? I would have expected it to work when it's namespaced. Regards Robin
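To illustrate the namespacing remark, here is a reduced sketch; it is not GCC code. `fake_rtx_code` stands in for a global enum that already owns the name UNORDERED, and the namespace `sketch` and both helper functions are hypothetical. A scoped enum (`enum class`) keeps its enumerators out of the enclosing scope, so the two names can coexist.

```cpp
// Stand-in for a global enumerator with the same name (hypothetical, reduced).
enum fake_rtx_code { UNORDERED = 1, ORDERED = 2 };

namespace sketch {

// Scoped enum: the enumerators are only reachable as reduction_type::X.
enum class reduction_type { UNORDERED, FOLD_LEFT, MASK_LEN_FOLD_LEFT };

// Uses the scoped enumerator; there is no clash with ::UNORDERED.
inline bool is_in_order(reduction_type t) {
  return t != reduction_type::UNORDERED;
}

// The global enumerator is still visible and unchanged inside the namespace.
inline int global_unordered() {
  return ::UNORDERED;
}

} // namespace sketch
```

An unscoped enum inside the namespace would also compile, but its UNORDERED would hide the global one for unqualified uses within that namespace, which is one possible source of surprises.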

Re: [PATCH V2] RISC-V: Support in-order floating-point reduction

2023-07-20 Thread Robin Dapp via Gcc-patches
> LGTM, but I would like make sure Robin is OK too Yes, LGTM as well. Regards Robin

Re: [PATCH v2] vect: Handle demoting FLOAT and promoting FIX_TRUNC.

2023-07-20 Thread Robin Dapp via Gcc-patches
>> cvt_type >> - = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode), >> + = build_nonstandard_integer_type (GET_MODE_BITSIZE >> + (intermediate_mode), >>

Re: [PATCH v6] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-24 Thread Robin Dapp via Gcc-patches
Hi Pan, > + for (insn = PREV_INSN (cur_insn); insn; insn = PREV_INSN (insn)) > +{ > + if (INSN_P (insn)) > + { > + if (CALL_P (insn)) > + mode = FRM_MODE_DYN; > + break; > + } > + > + if (insn == BB_HEAD (bb)) > + break; > +} > + > + return mode;

Re: [PATCH] RISC-V: Fixbug for fsflags instruction error using immediate.

2023-07-24 Thread Robin Dapp via Gcc-patches
Hi Jin, this looks reasonable. Would you mind adding (small) test cases still to make sure we don't accidentally reintroduce the problem? Regards Robin

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-25 Thread Robin Dapp via Gcc-patches
Hi Pan, > Given we have a call, we would like to restore before call and then > backup frm after call. Looks current mode switching cannot emit insn > like that, it can only either emit insn before (mostly) or after > (when NOTE_INSN_BASIC_BLOCK_P). Thus, we try to emit the one after > call when n

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-25 Thread Robin Dapp via Gcc-patches
> The call fesetround could be any function in practice, and we never > know if that function might use dynamic rounding mode floating point > operation or not, also we don't know if it will be called fesetround > or not. > > So that's why we want to restore before function call to make sure we >

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
So after thinking about it again - I'm still not really sure I like treating every function as essentially an fesetround. There is a reason why fesetround is special. Does LLVM behave the same way? But supposing we really, really want it and assuming there's consensus: + start_sequence (); + e

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
> current llvm didn't do any pre optimization. They always > backup+restore for each rounding mode intrinsic I see. There is still the option of lazily restoring the (entry) FRM before a function call but not read the FRM after every call. Do we have any data on how good or bad the mode-switchi

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
> CSR write could be expensive, it will flush whole pipeline in some > RISC-V core implementation… Hopefully not flush but just sequentialize but yes, it's usually a performance concern. However if we set the rounding mode to something else for an intrinsic and then call a function we want to re

Re: [PATCH] RISC-V: Enable basic VLS modes support

2023-07-26 Thread Robin Dapp via Gcc-patches
Hi Juzhe, just some small remarks, all in all no major concerns. > + vmv%m1r.v\t%0,%1" > + "&& (!register_operand (operands[0], mode) > + || !register_operand (operands[1], mode))" > + [(const_int 0)] > + { > +unsigned size = GET_MODE_BITSIZE (mode).to_constant (); > +if (size

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-26 Thread Robin Dapp via Gcc-patches
> I would like to propose that being focus and moving forward for this > patch itself, the underlying other RVV floating point API support and > the RVV intrinsic API fully tests depend on this. Sorry, I didn't mean to ditch LCM/mode switching. I believe it is doing a pretty good job and we shou

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-27 Thread Robin Dapp via Gcc-patches
>> Why do we appear to return a different mode here? We already request >> FRM_MODE_DYN_CALL in mode_needed. It looks like in the whole function >> we do not change the mode so we could just always return the incoming >> mode? > > Because we need to emit 2 insn when meet a call. One before the c

Re: [PATCH v7] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-27 Thread Robin Dapp via Gcc-patches
> I see, you mean at the beginning of frm_after, we can just return the > incoming mode as is? > > If (CALL_P (insn)) > return mode; // Given we aware the mode is DYN_CALL already. Yes, potentially similar for all the other ifs but I didn't check all of them. > Thank and will cleanup this i

Re: [PATCH v2] RISC-V: testsuite: Add vector_hw and zvfh_hw checks.

2023-07-27 Thread Robin Dapp via Gcc-patches
> LGTM, I just found this patch still on the list, I mostly tested with > qemu, so I don't think that is a problem before, but I realize it's a > problem when we run on a real board that does not support those > extensions. I think we can skip this one as I needed to introduce vector_hw and zvfh_h

Re: [PATCH v8] RISC-V: Support CALL for RVV floating-point dynamic rounding

2023-07-28 Thread Robin Dapp via Gcc-patches
Hi Pan, thanks for your patience and your work. Apart from my general doubt whether mode-changing intrinsics are a good idea, I don't have other remarks that need fixing. What I mentioned before: - Handling of asms wouldn't be a huge change. It can be done in a follow-up patch of course but

[PATCH] gcse: Extract reg pressure handling into separate file.

2023-07-28 Thread Robin Dapp via Gcc-patches
Hi, this patch extracts the hoist-pressure handling from gcse and puts it into a separate file so it can be used by other passes in the future. No functional change and I also abstained from c++ifying the code. The naming with the regpressure_ prefix might be a bit clunky for now and I'm open to a

Re: [PATCH v2] RISC-V: convert the mulh with 0 to mov 0 to the reg.

2023-07-28 Thread Robin Dapp via Gcc-patches
> This is a draft patch. I would like to explain it's hard to make the > simplify generic and ask for some help. > > There're 2 categories we need to optimize. > > - The op in optab such as div / 1. > - The unspec operation such as mulh * 0, (vadc+vmadc) + 0. > > Especially for the unspec operat

Re: [PATCH V2] RISC-V: Enable basic VLS auto-vectorization

2023-07-30 Thread Robin Dapp via Gcc-patches
> +;; - > +;; Duplicate Operations > +;; - > + > +(define_insn_and_split "@vec_duplicate" > + [(set (match_operand:VLS 0 "register_operand") > +(vec_duplicat

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +/* Expand Vector POPCOUNT by parallel popcnt: > + > + int parallel_popcnt(uint32_t n) { > + #define POW2(c) (1U << (c)) > + #define MASK(c) (static_cast(-1) / (POW2(POW2(c)) + 1U)) > + #define COUNT(x, c) ((x) & MASK(c)) + (((x)>>(POW2(c))) & MASK(c)) > + n = CO

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
> +/* FIXME: We don't allow vectorize "__builtin_popcountll" yet since it needs > "vec_pack_trunc" support > + and such pattern may cause inferior codegen. > + We will enable "vec_pack_trunc" when we support reasonable vector > cost model. */ Wait, why do we need vec_pack_trunc f

Re: [PATCH V2] RISC-V: Support POPCOUNT auto-vectorization

2023-07-31 Thread Robin Dapp via Gcc-patches
>>> I'm not against continuing with the more well-known approach for now >>> but we should keep in mind that might still be potential for improvement. > > No. I don't think it's faster. I did a quick check on my x86 laptop and it's roughly 25% faster there. That's consistent with the literature.

Re: RISCV test infrastructure for d / v / zfh extensions

2023-08-01 Thread Robin Dapp via Gcc-patches
Hi Joern, thanks, I believe this will help with testing. > +proc check_effective_target_riscv_v { } { > +return [check_no_compiler_messages riscv_ext_v assembly { > + #ifndef __riscv_v > + #error "Not __riscv_v" > + #endif > +}] > +} This can be replaced by riscv_vector

[PATCH] RISC-V: Implement vector "average" autovec pattern.

2023-08-01 Thread Robin Dapp via Gcc-patches
Hi, this patch adds vector average patterns

  op[0] = (narrow) ((wide) op[1] + (wide) op[2]) >> 1;
  op[0] = (narrow) ((wide) op[1] + (wide) op[2] + 1) >> 1;

If there is no direct support, the vectorizer can synthesize the patterns but, presumably due to lack of narrowing operation support, won't
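For readers unfamiliar with the two average variants, here is a minimal scalar sketch in C of what the patterns compute. The function names (avg_floor_u8, avg_ceil_u8, avg_floor_loop) and the choice of uint8_t as the narrow type are assumptions made for illustration; they do not come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Floor average: widen both inputs so the sum cannot overflow,
   shift right by one, then narrow back.  */
uint8_t avg_floor_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) (((uint16_t) a + (uint16_t) b) >> 1);
}

/* Ceil average: same as above, but add 1 before the shift so that
   odd sums round up.  */
uint8_t avg_ceil_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) (((uint16_t) a + (uint16_t) b + 1) >> 1);
}

/* A loop of this shape is the kind of source a vectorizer could
   recognize as an average pattern (hypothetical example).  */
void avg_floor_loop (uint8_t *restrict dst, const uint8_t *restrict a,
                     const uint8_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = avg_floor_u8 (a[i], b[i]);
}
```

The only difference between the two variants is the + 1 before the shift, which is why widening matters: for 255 and 255 the intermediate sum 510 (or 511) does not fit into eight bits.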

Re: [PATCH] RISC-V: Implement vector "average" autovec pattern.

2023-08-02 Thread Robin Dapp via Gcc-patches
> 1. How do you model round to +Inf (avg_floor) and round to -Inf (avg_ceil) ? That's just specified by the +1 or the lack of it in the original pattern. Actually the IFN is just a detour because we would create perfect code if not for the fallback. But as there is currently no way to check for

Re: [PATCH V2] RISC-V: Support CALL conditional autovec patterns

2023-08-03 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I would find it a bit clearer if the prepare_ternay part were a separate patch. As it's mostly mechanical replacements I don't mind too much, though, so it's LGTM from my side without that. As to the lmul = 8 ICE, is the problem that the register allocator would actually need 5 "registe

[PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-08-07 Thread Robin Dapp via Gcc-patches
Hi, originally inspired by the wish to transform

  vmv v3, a0 ; = vec_duplicate
  vadd.vv v1, v2, v3

into

  vadd.vx v1, v2, a0

via fwprop for riscv, this patch enables the forward propagation of UNARY_P sources. As this involves potentially replacing a vector register with a scalar register the
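As a rough illustration, a loop of the following shape is the kind of source where a scalar gets broadcast into a vector register and then added to a vector. The function name add_scalar is hypothetical, and whether a given compiler version emits exactly the instruction pair from the message for it is an assumption, not something the message states.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds the loop-invariant scalar x to every element of src.
   When vectorized, x would typically be broadcast (vec_duplicate)
   and then added with a vector-vector add; a vector-scalar add
   form makes the broadcast unnecessary.  */
void add_scalar (int32_t *restrict dst, const int32_t *restrict src,
                 int32_t x, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = src[i] + x;
}
```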

Re: [PATCH] RISC-V: Support VLS basic operation auto-vectorization

2023-08-07 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, looks good from my side. > +/* { dg-final { scan-assembler-times {vand\.vi\s+v[0-9]+,\s*v[0-9]+,\s*-16} > 42 } } */ > +/* { dg-final { scan-assembler-not {csrr} } } */ I was actually looking for a scan-assembler-not vsetvli... but the csrr will do as well. Regards Robin

[PATCH] vect: Add a popcount fallback.

2023-08-07 Thread Robin Dapp via Gcc-patches
Hi, This patch adds a fallback when the backend does not provide a popcount implementation. The algorithm is the same one libgcc uses, as well as match.pd for recognizing a popcount idiom. __builtin_ctz and __builtin_ffs can also rely on popcount so I used the fallback for them as well. Bootstr
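To make the idea concrete, here is a minimal scalar sketch in C of the well-known bit-parallel popcount, together with one way to express ctz and ffs in terms of it. The names popcount32, ctz32 and ffs32 are made up for this example, and the particular identity used for ctz is an assumption; the message excerpt does not show which one the patch uses.

```c
#include <stdint.h>

/* Bit-parallel population count for 32-bit values.  */
uint32_t popcount32 (uint32_t x)
{
  x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit partial sums  */
  x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit partial sums  */
  x = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit partial sums  */
  return (x * 0x01010101u) >> 24;                   /* add the four bytes  */
}

/* Count trailing zeros via popcount: ~x & (x - 1) sets exactly the
   bits below the lowest set bit of x.  Yields 32 for x == 0.  */
uint32_t ctz32 (uint32_t x)
{
  return popcount32 (~x & (x - 1));
}

/* Find first set: 1-based index of the lowest set bit, 0 for x == 0.  */
uint32_t ffs32 (uint32_t x)
{
  return x ? ctz32 (x) + 1 : 0;
}
```

Every step is plain shifts, masks, adds and one multiply, so it maps onto element-wise vector operations without needing a dedicated popcount instruction.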

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Looks reasonable to me - I couldn't read from above whether you did > testing on riscv and thus verified the runtime correctness of the fallback? > If not may I suggest to force matching the pattern on a target you can > test for this purpose? I tested on riscv (manually and verified the run tes

Re: [PATCH v2] Mode-Switching: Fix SET_SRC ICE when USE or CLOBBER

2023-08-08 Thread Robin Dapp via Gcc-patches
> Could you please help to share how to enable checks here? Build with --enable-checking or rather --enable-checking=extra. Regards Robin

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Well, not sure how VECT_COMPARE_COSTS can help here, we either > get the pattern or vectorize the original function. There's no special > handling > for popcount in vectorizable_call so all special cases are handled via > patterns. > I was thinking of popcounthi via popcountsi and zero-extend

Re: [PATCH] RISC-V: Allow CONST_VECTOR for VLS modes.

2023-08-08 Thread Robin Dapp via Gcc-patches
Hi Juzhe, just some nits. > - else if (rtx_equal_p (step, constm1_rtx) && poly_int_rtx_p (base, &value) > + else if (rtx_equal_p (step, constm1_rtx) > +&& poly_int_rtx_p (base, &value) Looks like just a line-break change and the line is not too long? > - rtx ops[] = {dest, vid, g

Re: [PATCH] vect: Add a popcount fallback.

2023-08-08 Thread Robin Dapp via Gcc-patches
> Hmm, the conversion should be a separate statement so I wonder > why it would go wrong? It is indeed. Yet, lhs_type is the lhs type of the conversion and not the call and consequently we compare the precision of the converted type with the popcount input. So we should probably rather do someth
