Re: [PATCH 1/2] c-family: Copy DECL_USER_ALIGN even if DECL_ALIGN is similar.

2021-05-17 Thread Robin Dapp via Gcc-patches
on s390 a warning test fails: inline int ATTR ((cold, aligned (8))) finline_hot_noret_align (int); inline int ATTR ((warn_unused_result)) finline_hot_noret_align (int); inline int ATTR ((aligned (4))) finline_hot_noret_align (int); /* { dg-warning "ignoring attribute .aligned \\(4\\). beca

Re: [PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2023-01-11 Thread Robin Dapp via Gcc-patches
Hi, > On optimizing for speed, default_noce_conversion_profitable_p() allows > plenty of headroom, so this patch has little impact. > > Also, if the target-specific cost estimate is accurate or allows for > margins, the impact should be similarly small. I believe this part of ifcvt does/did not

Re: [PATCH] RISC-V: Add conditional unary neg/abs/not autovec patterns

2023-08-22 Thread Robin Dapp via Gcc-patches
Hi Lehua, no concerns here, just tiny remarks but in general LGTM as is. > +(define_insn_and_split "*copysign_neg" > + [(set (match_operand:VF 0 "register_operand") > +(neg:VF > + (unspec:VF [ > +(match_operand:VF 1 "register_operand") > +(match_operand:V

Re: [PATCH V2] RISC-V: Add conditional unary neg/abs/not autovec patterns

2023-08-23 Thread Robin Dapp via Gcc-patches
OK, thanks. Regards Robin

[PATCH] RISC-V: Add initial pipeline description for an out-of-order core.

2023-08-23 Thread Robin Dapp via Gcc-patches
Hi, this adds a pipeline description for a generic out-of-order core. Latency and units are not based on any real processor but more or less educated guesses what such a processor could look like. For the lack of a better name, I called the -mtune parameter "generic-ooo". In order to account for

Re: [PATCH] RISC-V: Add initial pipeline description for an out-of-order core.

2023-08-23 Thread Robin Dapp via Gcc-patches
> Does this patch fix these 2 following PR: > 108271 – Missed RVV cost model (gnu.org) > > 108412 – RISC-V: Negative optimization of GCSE && LOOP INVARIANTS (gnu.org) > > > If yes, plz app

Re: [PATCH] RISC-V: Support LEN_FOLD_EXTRACT_LAST auto-vectorization

2023-08-24 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > vcpop.m a5,v0 > beq a5,zero,.L3 > addia5,a5,-1 > vsetvli a4,zero,e32,m1,ta,ma > vcompress.vmv2,v3,v0 > vslidedown.vx v2,v2,a5 > vmv.x.s a0,v2 > .L3: > sext.w a0,a0 Mhm, where is this sext coming from? Thought I had this c

Re: [PATCH] RISC-V: Add conditional sign/zero extension and truncation autovec patterns

2023-08-24 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks, just tiny non-functional nits. > - rtx ops[] = {operands[0], quarter}; > - icode = code_for_pred_trunc (mode); > - riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ops); > + rtx half = gen_reg_rtx (mode); Not really a half anymore now? :) > +#include > + > +#

Re: [PATCH] RISC-V: Add conditional sign/zero extension and truncation autovec patterns

2023-08-24 Thread Robin Dapp via Gcc-patches
> Yes, it's better to call it one_quad. I'd suggest to go with quarter as before or quarter_width_op or something. >> Is this necessary for recognizing a different pattern? > > Are you saying that the testcases xxx-1 and xxx-2 are duplicated? If > so, I have no problem removing it and just kee

Re: [PATCH V2] RISC-V: Support LEN_FOLD_EXTRACT_LAST auto-vectorization

2023-08-24 Thread Robin Dapp via Gcc-patches
LGTM. Regards Robin

Re: [PATCH] RISC-V: Add COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS testcases

2023-08-24 Thread Robin Dapp via Gcc-patches
OK. Regards Robin

Re: [PATCH] tree-optimization/111115 - SLP of masked stores

2023-08-24 Thread Robin Dapp via Gcc-patches
This causes an ICE in gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c (internal compiler error: in get_group_load_store_type, at tree-vect-stmts.cc:2121) #include #define TEST_LOOP(DATA_TYPE, INDEX_TYPE) \ void __attribute__ ((noinline,

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-08-24 Thread Robin Dapp via Gcc-patches
Ping. I refined the code and some comments a bit and added a test case. My question in general would still be: Is this something we want given that we potentially move some of combine's work a bit towards the front of the RTL pipeline? Regards Robin Subject: [PATCH] fwprop: Allow UNARY_P and

Re: [PATCH V2] RISC-V: Add conditional autovec convert(INT<->INT) patterns

2023-08-24 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks, LGTM. One thing maybe for the next patches: It seems to me that we lump all of the COND_... tests into the cond subdirectory when IMHO they would also fit into the respective directories of their operations (binop, unop etc). Right now we will have a lot of rather unrelated tes

Re: [PATCH V2] RISC-V: Enable vec_int testsuite for RVV VLA vectorization

2023-08-28 Thread Robin Dapp via Gcc-patches
Thanks, just giving my quick thoughts on some of the FAILs: > Test report: > FAIL: gcc.dg/vect/bb-slp-10.c -flto -ffat-lto-objects scan-tree-dump slp2 > "unsupported unaligned access" > FAIL: gcc.dg/vect/bb-slp-10.c scan-tree-dump slp2 "unsupported unaligned > access" For these we would need

Re: [PATCH] RISC-V: Disable user vsetvl fusion into EMPTY block

2023-08-28 Thread Robin Dapp via Gcc-patches
> || vsetvl_insn_p (expr.get_insn ()->rtl ())) > continue; > new_info = expr.global_merge (expr, eg->src->index); > @@ -3317,6 +3335,25 @@ pass_vsetvl::earliest_fusion (void) > prob = profile_probability::uninitialized (); >

Re: [PATCH V3] RISC-V: Enable vec_int testsuite for RVV VLA vectorization

2023-08-28 Thread Robin Dapp via Gcc-patches
On 8/28/23 12:16, Juzhe-Zhong wrote: > FAIL: gcc.dg/vect/bb-slp-10.c -flto -ffat-lto-objects scan-tree-dump slp2 > "unsupported unaligned access" > FAIL: gcc.dg/vect/bb-slp-10.c scan-tree-dump slp2 "unsupported unaligned > access" > XPASS: gcc.dg/vect/no-scevccp-outer-12.c scan-tree-dump-times v

Re: [PATCH] RISC-V: Refactor and clean expand_cond_len_{unop,binop,ternop}

2023-08-28 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks for starting with the refactoring. I have some minor comments. > +/* The value means the number of operands for insn_expander. */ > enum insn_type > { >RVV_MISC_OP = 1, >RVV_UNOP = 2, > - RVV_UNOP_M = RVV_UNOP + 2, > - RVV_UNOP_MU = RVV_UNOP + 2, > - RVV_UNOP_TU =

Re: [PATCH V4] RISC-V: Enable vec_int testsuite for RVV VLA vectorization

2023-08-28 Thread Robin Dapp via Gcc-patches
> LGTM from my side, but I would like to wait Robin is ok too In principle I'm OK with it as well, realizing we will still need to fine-tune a lot here anyway. For now, IMHO it's good to have some additional test coverage in the vector space but we should not expect every test to be correct/a go

Re: [PATCH V3] RISC-V: Refactor and clean expand_cond_len_{unop,binop,ternop}

2023-08-29 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks, LGTM now. Regards Robin

[PATCH] expmed: Allow extract_bit_field via mem for low-precision modes.

2023-08-30 Thread Robin Dapp via Gcc-patches
Hi, when looking at a riscv ICE in vect-live-6.c I noticed that we assume that the variable part (coeffs[1] * x1) of the to-be-extracted bit number in extract_bit_field_1 is a multiple of BITS_PER_UNIT. This means that bits_to_bytes_round_down and num_trailing_bits cannot handle e.g. extracting f

Re: [PATCH] expmed: Allow extract_bit_field via mem for low-precision modes.

2023-08-30 Thread Robin Dapp via Gcc-patches
> But in the VLA case, doesn't it instead have precision 4+4X? > The problem then is that we can't tell at compile time which > byte that corresponds to. So... Yes 4 + 4x. I keep getting confused with poly modes :) In this case we want to extract the bitnum [3 4] = 3 + 4x which would be in byte

Re: [PATCH] RISC-V: Refactor and clean emit_{vlmax,nonvlmax}_xxx functions

2023-08-31 Thread Robin Dapp via Gcc-patches
Hi Lehua, thanks, this definitely goes into the direction of what I had in mind and simplifies a lot of the reduntant emit_... so it's good to have it. I was too slow for a detailed response :) So just some high-level comments. One thing I noticed is the overloading of "MASK_OP", we use it as

[PATCH] testsuite/vect: Make match patterns more accurate.

2023-08-31 Thread Robin Dapp via Gcc-patches
Hi, on some targets we fail to vectorize with the first type the vectorizer tries but succeed with the second. This patch changes several regex patterns to reflect that behavior. Before we would look for a single occurrence of e.g. "vect_recog_dot_prod_pattern" but would possible find two (one f

Re: [PATCH] RISC-V: Add Vector cost model framework for RVV

2023-08-31 Thread Robin Dapp via Gcc-patches
OK. As it doesn't do anything and we'll be needing it anyway no harm in adding it. Regards Robin

[PATCH] RISC-V: Add vec_extract for BI -> QI.

2023-09-01 Thread Robin Dapp via Gcc-patches
Hi, this patch adds a vec_extract expander that extracts a QImode from a vector mask mode. In doing so, it helps recognize a "live operation"/extract last idiom for mask modes. It fixes the ICE in tree-vect-live-6.c by circumventing the fallback code in extract_bit_field_1. The problem there is

Re: [PATCH] RISC-V: Enable VECT_COMPARE_COSTS by default

2023-09-01 Thread Robin Dapp via Gcc-patches
Hi Juzhe, thanks, this is OK, we would have needed this sooner or later anyway. Regards Robin

Re: [PATCH] RISC-V: Add dynamic LMUL compile option

2023-09-01 Thread Robin Dapp via Gcc-patches
LGTM Regards Robin

Re: [PATCH 1/4] RISC-V: Adjust expand_cond_len_{unary,binop,op} api

2023-09-01 Thread Robin Dapp via Gcc-patches
Thanks, LGTM. Btw. I haven't forgotten to respond to your last refactor but just didn't find the time yet. I figured I should have some proper draft before suggesting more things :) Regards Robin

Re: [PATCH 2/4] RISC-V: Add conditional autovec convert(INT<->INT) patterns

2023-09-01 Thread Robin Dapp via Gcc-patches
Hi Lehua, this LGTM now, thanks. It's also easier to read after the refactor :) Regards Robin

Re: [PATCH 3/4] RISC-V: Add conditional autovec convert(FP<->FP) patterns

2023-09-01 Thread Robin Dapp via Gcc-patches
Hi Lehua, this is OK, thanks. Regards Robin

Re: [PATCH 4/4] RISC-V: Add conditional autovec convert(INT<->FP) patterns

2023-09-01 Thread Robin Dapp via Gcc-patches
This one is OK as well, thanks. Regards Robin

Re: [PATCH] expmed: Allow extract_bit_field via mem for low-precision modes.

2023-09-01 Thread Robin Dapp via Gcc-patches
> It's not just a question of which byte though. It's also a question > of which bit. > > One option would be to code-generate for even X and for odd X, and select > between them at runtime. But that doesn't scale well to 2+2X and 1+1X. > > Otherwise I think we need to treat the bit position as

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-09-04 Thread Robin Dapp via Gcc-patches
> So I don't think I have a good feel for the advantages and disadvantages > of doing this. Robin's analysis of the aarch64 changes was nice and > detailed though. I think the one that worries me most is the addressing > mode one. fwprop is probably the first chance we get to propagate adds > in

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-09-05 Thread Robin Dapp via Gcc-patches
> I imagine doing it in reverse postorder would still make sense. > > But my point was that, for the current fwprop limitation of substituting > into exactly one use of a register, we can check whether that use is > the *only* use of register. > > I.e. if we substitute: > > A: (set (reg R1) (f

Re: [PATCH V2] RISC-V: Support Dynamic LMUL Cost model

2023-09-05 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I think the general approach makes sense and it doesn't need to be perfect from the beginning as we can always iterate on it. Before continuing with a more detailed review (hopefully tomorrow) some high-level questions upfront. It would help to document some of these choices so it's eas

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-09-06 Thread Robin Dapp via Gcc-patches
Hi Richard, I did some testing with the attached v2 that does not restrict to UNARY anymore. As feared ;) there is some more fallout that I'm detailing below. On Power there is one guality fail (pr43051-1.c) that I would take the liberty of ignoring for now. On x86 there are four fails: - con

Re: [PATCH V2] RISC-V: Support Dynamic LMUL Cost model

2023-09-06 Thread Robin Dapp via Gcc-patches
Hi Juzhe, general remark upfront: Please add function-level comments for all functions. This makes reading and reviewing much easier. I had to sweep back and forth quite a bit. > + > +static int > +get_last_live_range (const vec &live_ranges, tree var) > +{ > + unsigned int ix; > + var_live_

Re: [PATCH v3] RISC-V: Add autovec FP binary operations.

2023-09-06 Thread Robin Dapp via Gcc-patches
Hi Lehua, > May I ask if the compiler options "-march=rv64gcv_zvfh -mabi=lp64d" > should be removed? Because we don't specify -march and -mabi when we > run testcase (so but why we need to specify the -march and -mabi in > this target check?), we run it with the default values. Assuming that > t

Re: [PATCH] RISC-V: Fix VSETVL PASS AVL/VL fetch bug[111295]

2023-09-06 Thread Robin Dapp via Gcc-patches
OK. Regards Robin

Re: [PATCH] RISC-V: Remove unreasonable TARGET_64BIT for VLS modes with size = 64bit

2023-09-06 Thread Robin Dapp via Gcc-patches
LGTM. Regards Robin

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-09-07 Thread Robin Dapp via Gcc-patches
in >From d3f87e4de7d7d05a2fcf8c948097b14eadf08c90 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Mon, 24 Jul 2023 16:25:38 +0200 Subject: [PATCH] gcse: Extract reg pressure handling into separate file. This patch extracts the hoist-pressure handling from gcse and puts it into a separate file

Re: [PATCH] RISC-V: Add VLS mask modes mov patterns[PR111311]

2023-09-07 Thread Robin Dapp via Gcc-patches
I have an almost identical patch locally that passed testing as well but didn't get around to posting it yet. Therefore LGTM. Regards Robin

Re: [PATCH] fwprop: Allow UNARY_P and check register pressure.

2023-09-07 Thread Robin Dapp via Gcc-patches
Thanks for looking at it in detail. > Yeah, I think this is potentially a blocker for propagating A into B > when A is used elsewhere. Combine is able to combine A and B while > keeping A in parallel with the result. I think either fwprop would > need to try that too, or it would need to be rest

[PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-09-08 Thread Robin Dapp via Gcc-patches
Hi, found in slp-reduc-7.c, this patch prevents optimizing e.g. COND_LEN_ADD ({-1, ... }, a, 0, c, len, bias) unconditionally into just "a". Currently, we assume that COND_LEN operations can be optimized similarly to COND operations. As the length is part of the mask (and usually not compile-ti

[PATCH] match: Don't sink comparisons into vec_cond operands.

2023-09-08 Thread Robin Dapp via Gcc-patches
Hi, on riscv gcc.dg/pr70252.c ICEs at gimple-isel.cc:283. This is because we created the gimple statement mask__7.36_170 = VEC_COND_EXPR ; during vrp2. What happens is that, starting with maskdest = (vec_cond mask1 1 0) >= (vec_cond mask2 1 0) we fold to maskdest = mask1 >= (vec_cond (ma

Re: [PATCH V3] RISC-V: Support Dynamic LMUL Cost model

2023-09-11 Thread Robin Dapp via Gcc-patches
Hi Juzhe, glad that we can use the dominator info directly. Could we move the calculation of the info to the beginning (if it's not available)? That makes it clearer that it's a prerequisite. Function comments look good now. Some general remarks kind of similar to v1: - I would prefer a hash

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-09-11 Thread Robin Dapp via Gcc-patches
Hi, as Juzhe noticed in gcc.dg/pr92301.c there was still something missing in the last patch. The attached v2 makes sure we always have a COND_LEN operation before returning true and initializes len and bias even if they are unused. Bootstrapped and regtested on aarch64 and x86. Regards Robin

Re: [PATCH V4] RISC-V: Support Dynamic LMUL Cost model

2023-09-12 Thread Robin Dapp via Gcc-patches
Hi Juzhe, > +max_number_of_live_regs (const basic_block bb, > + const hash_map &live_ranges, > + unsigned int max_point, machine_mode biggest_mode, > + int lmul) > +{ > + unsigned int max_nregs = 0; > + unsigned int i; > + unsigned

Re: [PATCH V4] RISC-V: Support Dynamic LMUL Cost model

2023-09-12 Thread Robin Dapp via Gcc-patches
I did some benchmarks and, at least for calculix the differences are miniscule. I'd say we can stick with the current approach and improve as needed. However, I noticed ICEs here: + gcc_assert (biggest_size >= mode_size); and here: + mode = TYPE_MODE (TREE_TYPE (lhs)); when compiling calcul

Re: [PATCH V4] RISC-V: Support Dynamic LMUL Cost model

2023-09-12 Thread Robin Dapp via Gcc-patches
> Is calculix big ? It's 7 nested for loops IIRC and, when unrolling, can get pretty nasty. I tested with -Ofast -funroll-loops. I think wrf is even larger, maybe I can run a full comparison test tonight to have good coverage. > Could you give me the testcase to reproduce it? OK, I will try to

Re: [PATCH V4] RISC-V: Support Dynamic LMUL Cost model

2023-09-12 Thread Robin Dapp via Gcc-patches
> This is first version of dynamic LMUL. > I didn't test it with full GCC testsuite. > > My plan is to first pass all GCC testsuite (including vect.exp) with default > LMUL = M1. > Then enable dynamic LMUL to test it. > > Maybe we could tolerate this ICE issue for now. Then we can test it > wi

Re: [PATCH V5] RISC-V: Support Dynamic LMUL Cost model

2023-09-12 Thread Robin Dapp via Gcc-patches
LGTM. We should just keep in mind the restrictions discussed in the other thread. Regards Robin

Re: [PATCH] RISC-V: Support VECTOR BOOL vcond_mask optab[PR111337]

2023-09-12 Thread Robin Dapp via Gcc-patches
Maybe you want to add PR target/111337 to the changelog? The rest LGTM. Regards Robin

Re: [PATCH V2] RISC-V: Support VECTOR BOOL vcond_mask optab[PR111337]

2023-09-12 Thread Robin Dapp via Gcc-patches
The PR thing needs to be moved but I can commit it. Regards Robin

Re: [PATCH V6] RISC-V: Enable vec_int testsuite for RVV VLA vectorization

2023-09-12 Thread Robin Dapp via Gcc-patches
The current status (for rv64gcv) is: === gcc tests === Running target unix/-march=rv64gcv XPASS: gcc.dg/vect/bb-slp-subgroups-3.c -flto -ffat-lto-objects scan-tree-dump-times slp2 "optimized: basic block" 2 XPASS: gcc.dg/vect/bb-slp-subgroups-3.c scan-tree-dump-times slp2 "optim

Re: [PATCH V6] RISC-V: Enable vec_int testsuite for RVV VLA vectorization

2023-09-12 Thread Robin Dapp via Gcc-patches
> Most (all?) of those are due to: > f951: Warning: command-line option '-Wno-psabi' is valid for > C/C++/D/LTO/ObjC/ObjC++ but not for Fortran > so no real bug. When pushing this, I'd take the liberty of enabling the recently merged vector ABI so we don't require -Wno-psabi anymore. All Fortran

Re: [PATCH] RISC-V: Support VLS modes VEC_EXTRACT auto-vectorization

2023-09-13 Thread Robin Dapp via Gcc-patches
> -(define_expand "vec_extract" > +(define_expand "@vec_extract" Do we need the additional helper function? If not let's rather not add them for build-time reasons. The rest is OK, no need for v2. Regards Robin

Re: [PATCH] RISC-V: Support VLS modes VEC_EXTRACT auto-vectorization

2023-09-13 Thread Robin Dapp via Gcc-patches
> Yes. We need the additional helper function since I will cal emit_insn > (gen_vec_extract (mode, mode) > in the following patch which fixes PR111391 ICE. OK. Regards Robin

Re: [PATCH V4] RISC-V: Expand VLS mode to scalar mode move[PR111391]

2023-09-14 Thread Robin Dapp via Gcc-patches
> I am thinking what we are doing is something like we are allowing > scalar mode within the vector register, so...not sure should we try to > implement that within the mov pattern? > > I guess we need some inputs from Jeff. Sorry for the late response. I have also been thinking about this and i

Re: Machine Mode ICE in RISC-V when LTO

2023-09-15 Thread Robin Dapp via Gcc-patches
Hi Thomas, Jakub, is there anything we can do to assist from the riscv side in order to help with this? I haven't really been involved with it but was wondering what's missing. If I understand correctly Thomas has a major cleanup operation in plan but might not get to it soon. The fix he propos

Re: [PATCH V4] RISC-V: Expand VLS mode to scalar mode move[PR111391]

2023-09-15 Thread Robin Dapp via Gcc-patches
> You mean this patch is ok? I thought about it a bit more. From my point of view the patch is OK for now in order to get the bug out of the way. In the longer term I would really prefer a more "regular" solution (i.e. via hard_regno_mode_ok) and related. I can take care of that once I have a b

Re: [PATCH v1] RISC-V: Bugfix for scalar move with merged operand

2023-09-18 Thread Robin Dapp via Gcc-patches
> I must be missing something. Doesn't insn 10 broadcast the immediate > 0x2 to both elements of r142?!? What am I missing? It is indeed a bit misleading. The difference is in the mask which is not displayed in the short form. So we actually use a vec_dup for a single-element move, essentially

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-09-18 Thread Robin Dapp via Gcc-patches
Ping. Regards Robin

Re: [PATCH] RISC-V: Remove redundant vec_duplicate pattern

2023-09-18 Thread Robin Dapp via Gcc-patches
LGTM. Regards Robin

Re: [PATCH] RISC-V: Support combine cond extend and reduce sum to cond widen reduce sum

2023-09-18 Thread Robin Dapp via Gcc-patches
Hi Lehua, > +(define_expand "vcond_mask_" > + [(set (match_operand:V_VLS 0 "register_operand") > +(if_then_else:V_VLS > + (match_operand: 3 "register_operand") > + (match_operand:V_VLS 1 "nonmemory_operand") > + (match_operand:V_VLS 2 "vector_register_or_const_0

Re: [PATCH 2/3 V2] RISC-V: Enable basic auto-vectorization for RVV

2023-04-20 Thread Robin Dapp via Gcc-patches
> $ riscv64-unknown-linux-gnu-gcc > --param=riscv-autovec-preference=fixed-vlmax > gcc/testsuite/gcc.target/riscv/rvv/base/spill-10.c -O2 -march=rv64gcv > -S > ../riscv-gnu-toolchain-trunk/riscv-gcc/gcc/testsuite/gcc.target/riscv/rvv/base/spill-10.c: > In function 'stach_check_alloca_1': > ../riscv

Re: [PATCH 2/3 V2] RISC-V: Enable basic auto-vectorization for RVV

2023-04-20 Thread Robin Dapp via Gcc-patches
> Can you give more comments about Robin's opinion that he want to change into > "fixed" vs "varying" or "fixed vector size" vs "dynamic vector size" ? It's not necessary to decide on this now as --params are not supposed to be stable and can be changed quickly. I was just curious if this had alr

Re: [RFA] [PR target/108248] [RISC-V] Break down some bitmanip insn types

2023-04-21 Thread Robin Dapp via Gcc-patches
> ../../gcc/config/riscv/generic.md:28:1: unknown value `smin' for attribute > `type' > make[3]: *** [Makefile:2528: s-attrtab] Error 1 > >From 582c428258ce17ffac8ef1b96b4072f3d510480f Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Fri, 21 Apr 2023 09:38:06 +0200 Su

Re: [PATCH v4 05/10] RISC-V:autovec: Add autovectorization patterns for binary integer operations

2023-04-26 Thread Robin Dapp via Gcc-patches
Hi Michael, I have the diff below for the binops in my tree locally. Maybe something like this works for you? Untested but compiles and the expander helpers would need to be fortified obviously. Regards Robin -- gcc/ChangeLog: * config/riscv/autovec.md (3): New binops expander.

[PATCH] riscv: Allow vector constants in riscv_const_insns.

2023-04-28 Thread Robin Dapp via Gcc-patches
Hi, I figured I'm going to start sending some patches that build on top of the upcoming RISC-V autovectorization. This one is obviously not supposed to be installed before the basic support lands but it's small enough that it shouldn't hurt to send it now. This patch allows vector constants in r

Re: [PATCH V3] RISC-V: Enable basic RVV auto-vectorization support.

2023-05-05 Thread Robin Dapp via Gcc-patches
Hi Juzhe, I wasn't yet able to check this locally so just some minor comment nits: > +/* Return the vectorization machine mode for RVV according to LMUL. */ > +machine_mode > +preferred_simd_mode (scalar_mode mode) > +{ > + /* We only enable auto-vectorization when TARGET_MIN_VLEN < 128 && > +

[PATCH] s390: Add LEN_LOAD/LEN_STORE support.

2023-02-02 Thread Robin Dapp via Gcc-patches
Hi, this patch adds LEN_LOAD/LEN_STORE support for z14 and newer. It defines a bias value of -1 and implements the LEN_LOAD and LEN_STORE optabs. It also includes various vll/vstl testcases adapted from Kewen Lin's patch for Power. Bootstrapped and regtested on z13-z16. Is it OK? Regards Robi

[PATCH] testsuite: Move gimplefe40.c and gimplefe41.c

2021-04-15 Thread Robin Dapp via Gcc-patches
directory. Is this OK? Regards Robin -- gcc/testsuite/ChangeLog: * gcc.dg/gimplefe-40.c: Moved to... * gcc.dg/vect/gimplefe-40.c: ...here. * gcc.dg/gimplefe-41.c: Moved to... * gcc.dg/vect/gimplefe-41.c: ...here. commit 1acd4d01c41aeea78ec3c188beb25b245dda

Re: [PATCH] testsuite: Move gimplefe40.c and gimplefe41.c

2021-04-16 Thread Robin Dapp via Gcc-patches
Do the testcases currently fail? How? In principle moving to vect/ is OK but then having the gimplefe testcases in one place is nice ... yes, they ICE on targets that do not have vector capabilities: gimplefe-40.c:7:1: internal compiler error: in emit_move_insn, at expr.c:3821 Normally we

[PATCH] s390/testsuite: Fix oscbreak-1.c.

2021-04-16 Thread Robin Dapp via Gcc-patches
Hi, checking for an osc break is somewhat brittle especially with many passes potentially introducing new insns and moving them around. Therefore, only run the test with -O1 -fschedule-insns in order to limit the influence of other passes. Is it OK? Regards Robin -- gcc/testsuite/ChangeLog:

[PATCH] combine: Check for paradoxical subreg

2021-06-23 Thread Robin Dapp via Gcc-patches
cal subreg. >From 2b74c66f8d97c30c45e99336c9871cccd8cf94e1 Mon Sep 17 00:00:00 2001 From: Robin Dapp Date: Tue, 22 Jun 2021 17:35:37 +0200 Subject: [PATCH 11/11] combine paradoxical subreg fix. --- gcc/combine.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/combine.c

[PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-06-25 Thread Robin Dapp via Gcc-patches
When if-converting multiple SETs and we encounter a swap-style idiom if (a > b) { tmp = c; // [1] c = d; d = tmp; } ifcvt should not generate a conditional move for the instruction at [1]. In order to achieve that, this patch goes through all relevant SETs and marks

[PATCH 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.

2021-06-25 Thread Robin Dapp via Gcc-patches
Currently we only ever call emit_conditional_move with the comparison (as well as its comparands) we got from the jump. Thus, backends are going to emit a CC comparison for every conditional move that is being generated instead of re-using the existing CC. This, combined with emitting temporaries

[PATCH 0/7] ifcvt: Convert multiple

2021-06-25 Thread Robin Dapp via Gcc-patches
he situation on s390 and, in general, allows more accurate costing. Regards Robin Robin Dapp (7): ifcvt: Check if cmovs are needed. ifcvt: Allow constants for noce_convert_multiple. ifcvt: Improve costs handling for noce_convert_multiple. ifcvt/optabs: Allow using a CC comparison for emit_c

[PATCH 5/7] ifcvt: Try re-using CC for conditional moves.

2021-06-25 Thread Robin Dapp via Gcc-patches
Following up on the previous patch, this patch makes noce_convert_multiple emit two cmov sequences: The same one as before and a second one that tries to re-use the existing CC. Then their costs are compared and the cheaper one is selected. --- gcc/ifcvt.c | 94 ++

[PATCH 6/7] testsuite/s390: Add tests for noce_convert_multiple.

2021-06-25 Thread Robin Dapp via Gcc-patches
Add new s390-specific tests that check if we convert two SETs into two loads on condition. Remove the s390-specific target-check in gcc.dg/ifcvt-4.c. --- gcc/testsuite/gcc.dg/ifcvt-4.c| 2 +- .../gcc.target/s390/ifcvt-two-insns-bool.c| 39 +++ .../gcc.target/s

[PATCH 2/7] ifcvt: Allow constants for noce_convert_multiple.

2021-06-25 Thread Robin Dapp via Gcc-patches
This lifts the restriction of not allowing constants for noce_convert_multiple. The code later checks if a valid sequence is produced anyway. --- gcc/ifcvt.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index eef6490626a..6006055f26a

[PATCH 3/7] ifcvt: Improve costs handling for noce_convert_multiple.

2021-06-25 Thread Robin Dapp via Gcc-patches
When noce_convert_multiple is called the original costs are not yet initialized. Therefore, up to now, costs were only ever unfairly compared against COSTS_N_INSNS (2). This would lead to default_noce_conversion_profitable_p () rejecting all but the most contrived of sequences. This patch tempor

[PATCH 7/7] s390: Increase costs for load on condition and change movqicc expander.

2021-06-25 Thread Robin Dapp via Gcc-patches
--- gcc/config/s390/s390.c | 2 +- gcc/config/s390/s390.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 6bbeb640e1f..e3b1f99669f 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -3636,7 +3636,7 @@

Re: [PATCH] vect: Add bias parameter for partial vectorization

2021-12-13 Thread Robin Dapp via Gcc-patches
Hi Richard, I incorporated all your remarks (sorry for the hunk from a different branch) except for this one: > Think it would be better to make it: > > if (use_bias_adjusted_len) > { > gcc_assert (i == 0); > > But do we need to do this? Code should only care abou

Re: [PATCH v3 1/7] ifcvt: Check if cmovs are needed.

2022-01-10 Thread Robin Dapp via Gcc-patches
Hi, I included the outstanding minor remarks and believe everything is OK'ed now. Still posting the ChangeLogs that I omitted before continuing. I'd expect some fallout on other targets (hopefully nothing major) since rtx costs are handled differently now for this code path. Regards Robin --

Re: [PATCH v3 2/7] ifcvt: Allow constants for noce_convert_multiple.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * ifcvt.c (noce_convert_multiple_sets): Allow constants. (bb_ok_for_noce_convert_multiple_sets): Likewise.

Re: [PATCH v3 3/7] ifcvt: Improve costs handling for noce_convert_multiple.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * ifcvt.c (bb_ok_for_noce_convert_multiple_sets): Estimate insns costs. (noce_process_if_block): Use potential costs.

Re: [PATCH v3 4/7] ifcvt/optabs: Allow using a CC comparison for emit_conditional_move.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * rtl.h (struct rtx_comparison): New struct that holds an rtx comparison. * config/rs6000/rs6000.c (rs6000_emit_minmax): Use struct instead of single parameters. (rs6000_emit_swsqrt): Likewise.

Re: [PATCH v3 5/7] ifcvt: Try re-using CC for conditional moves.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * ifcvt.c (cond_exec_get_condition): New parameter to allow getting the reversed comparison. (try_emit_cmove_seq): New function to facilitate creating a cmov sequence. (noce_convert_multiple_sets): Cr

Re: [PATCH v3 6/7] testsuite/s390: Add tests for noce_convert_multiple.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/testsuite/ChangeLog: * gcc.dg/ifcvt-4.c: Remove s390-specific check. * gcc.target/s390/ifcvt-two-insns-bool.c: New test. * gcc.target/s390/ifcvt-two-insns-int.c: New test. * gcc.target/s390/ifcvt-two-insns-long.c: New t

Re: [PATCH v3 7/7] ifcvt: Run second pass if it is possible to omit a temporary.

2022-01-10 Thread Robin Dapp via Gcc-patches
Posting the ChangeLog before pushing. -- gcc/ChangeLog: * ifcvt.c (noce_convert_multiple_sets_1): New function. (noce_convert_multiple_sets): Call function a second time if we can improve the first try.

Re: [PATCH] vect: Add bias parameter for partial vectorization

2022-01-10 Thread Robin Dapp via Gcc-patches
Hi Richard, > I think it would be better to fold this into the existing documentation > a bit more: [..] done. Fixed the remaining nits in the attached v5. Bootstrap and regtest are good on s390x, Power9 and i386. Regards Robin -- gcc/ChangeLog: * config/rs6000/vsx.md: Use const0 b

Re: [PATCH 5/6] ira: Consider modelling caller-save allocations as loop spills

2022-01-11 Thread Robin Dapp via Gcc-patches
Hi Richard, this causes a bootstrap error on s390 (where IRA_HARD_REGNO_ADD_COST_MULTIPLIER is defined). rclass is used in the #define-guarded area. I guess you also wanted to move this to the new function ira_caller_save_cost? Regards Robin -- ../../gcc/ira-costs.c: In function ‘void ira_tun

Re: [PATCH 5/6] ira: Consider modelling caller-save allocations as loop spills

2022-01-11 Thread Robin Dapp via Gcc-patches
Hi Richard, > Could you try the attached? build and bootstrap look OK with it. Testsuite shows lots of fallout but the proper bisect isn't finished yet. The commit before your series is still fine - the problem could also be after it, though. Will report back later. Thanks Robin

Re: [PATCH 5/6] ira: Consider modelling caller-save allocations as loop spills

2022-01-11 Thread Robin Dapp via Gcc-patches
> Could you try the attached? The series with the patch is OK from a testsuite point of view. The other problem appears later. Regards Robin

Re: [PATCH] VECT: Add WHILE_LEN pattern for decrement IV support for auto-vectorization

2023-04-12 Thread Robin Dapp via Gcc-patches
>> I think we can CC IBM folks to see whether we can make WHILE_LEN works >> for both IBM and RVV ? > > I've CCed them. Adding WHILE_LEN support to rs6000/s390x would be > mainly the "easy" way to get len-masked (epilog) loop support. I've > figured actually implementing WHILE_ULT for AVX512 in

<    8   9   10   11   12   13