Re: [PATCH] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold

2023-08-21 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 16 Aug 2023, Juzhe-Zhong wrote: > >> Hi, Richard and Richi. >> >> Currently, GCC supports COND_LEN_FMA for floating-point with **NO** -ffast-math. >> It's supported in tree-ssa-math-opts.cc. However, GCC fails to support >> COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS. >>

Re: [PATCH] Remove XFAIL from gcc/testsuite/gcc.dg/unroll-7.c

2023-08-21 Thread Richard Sandiford via Gcc-patches
Thiago Jung Bauermann via Gcc-patches writes: > This test passes since commit e41103081bfa "Fix undefined behaviour in > profile_count::differs_from_p", so remove the xfail annotation. > > Tested on aarch64-linux-gnu, armv8l-linux-gnueabihf and x86_64-linux-gnu. > > gcc/testsuite/ChangeLog: >

Re: [PATCH v2] mklog: handle Signed-Off-By, minor cleanup

2023-08-21 Thread Richard Sandiford via Gcc-patches
Marc Poulhiès via Gcc-patches writes: > Consider Signed-Off-By lines as part of the ending of the initial > commit to avoid having these in the middle of the log when the > changelog part is injected after. > > This is particularly useful with: > > $ git gcc-commit-mklog --amend -s > > that can

Re: [pushed] c/c++: use positive tone in missing header notes [PR84890]

2023-06-18 Thread Richard Sandiford via Gcc-patches
David Malcolm via Gcc-patches writes: > Quoting "How a computer should talk to people" (as quoted > in "Concepts Error Messages for Humans"): > > "Various negative tones or actions are unfriendly: being manipulative, > not giving a second chance, talking down, using fashionable slang, > blaming. W

[committed] vect: Restore aarch64 bootstrap

2023-06-19 Thread Richard Sandiford via Gcc-patches
Spot-tested on aarch64-linux-gnu, pushed as obvious. Richard gcc/ * tree-vect-loop-manip.cc (vect_set_loop_condition_partial_vectors): Handle null niters_skip. --- gcc/tree-vect-loop-manip.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/tree-vect-loo

Re: [PATCH] tree-optimization/110243 - kill off IVOPTs split_offset

2023-06-19 Thread Richard Sandiford via Gcc-patches
Jeff Law writes: > On 6/16/23 06:34, Richard Biener via Gcc-patches wrote: >> IVOPTs has strip_offset which suffers from the same issues regarding >> integer overflow that split_constant_offset did but the latter was >> fixed quite some time ago. The following implements strip_offset >> in terms

Re: [PATCH][gensupport] drop suppport for define_cond_exec from compact syntac

2023-06-20 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > define_cond_exec does not support the special @@ syntax > and so can't support {@. As such just remove support > for it. > > Bootstrapped and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > PR bootstrap/110324 > * gensuppo

Re: [PATCH] tree-optimization/110243 - kill off IVOPTs split_offset

2023-06-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Mon, 19 Jun 2023, Richard Sandiford wrote: > >> Jeff Law writes: >> > On 6/16/23 06:34, Richard Biener via Gcc-patches wrote: >> >> IVOPTs has strip_offset which suffers from the same issues regarding >> >> integer overflow that split_constant_offset did but the latter

[pushed] aarch64: Robustify stack tie handling

2023-06-20 Thread Richard Sandiford via Gcc-patches
The SVE handling of stack clash protection copied the stack pointer to X11 before the probe and set up X11 as the CFA for unwind purposes: /* This is done to provide unwinding information for the stack adjustments we're about to do, however to prevent the optimizers from removing

[pushed] aarch64: Fix gcc.target/aarch64/sve/pcs failures

2023-06-20 Thread Richard Sandiford via Gcc-patches
Several gcc.target/aarch64/sve/pcs tests started failing after 6a2e8dcbbd4, because the tests weren't robust against whether an indirect argument register or the stack pointer was used as the base for stores. The patch allows either base register when there is only one indirect argument. It disab

Re: [PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-06-21 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches > wrote: >> >> We already use an intermediate type in case WIDEN, but not for NONE; >> this patch extends that. >> >> I didn't do that in pattern recog since we need to know whether the >> stmt belo

Re: [PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-06-21 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Richard Biener via Gcc-patches writes: >> On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches >> wrote: >>> >>> We already use an intermediate type in case WIDEN, but not for NONE; >>> this patch extends that. >>> >>> I didn't do that in pattern recog since we n

Re: [PATCH] tree-optimization/110243 - kill off IVOPTs split_offset

2023-06-21 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The issue in the PR the change is fixing is that we end up with > an expression that overflows but uses signed arithmetic and so > we miscompile it later. IIRC the fixes to split_constant_offset > always were that the sum of the base + offset wasn't equal to > the origina

Re: [PATCH] [vect]Use intermiediate integer type for float_expr/fix_trunc_expr when direct optab is not existed.

2023-06-21 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jun 21, 2023 at 11:32 AM Richard Sandiford > wrote: >> >> Richard Sandiford writes: >> > Richard Biener via Gcc-patches writes: >> >> On Fri, Jun 2, 2023 at 3:01 AM liuhongt via Gcc-patches >> >> wrote: >> >>> >> >>> We already use an intermediate type in case

Re: [PATCH] Change fma_reassoc_width tuning for ampere1

2023-06-22 Thread Richard Sandiford via Gcc-patches
Di Zhao OS via Gcc-patches writes: > This patch enables reassociation of floating-point additions on ampere1. > This brings about 1% overall benefit on spec2017 fprate cases. (There > are minor regressions in 510.parest_r and 508.namd_r, analyzed here: > https://gcc.gnu.org/bugzilla/show_bug.cgi?i

Re: [PATCH V5] VECT: Apply LEN_MASK_{LOAD,STORE} into vectorizer

2023-06-22 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > gcc/ChangeLog: > > * internal-fn.cc (expand_partial_store_optab_fn): Adapt for > LEN_MASK_STORE. > (internal_load_fn_p): Add LEN_MASK_LOAD. > (internal_store_fn_p): Add LEN_MASK_STORE. > (internal_fn_mask_index)

Re: [PATCH V6] VECT: Apply LEN_MASK_{LOAD,STORE} into vectorizer

2023-06-23 Thread Richard Sandiford via Gcc-patches
Bernhard Reutner-Fischer writes: > On 23 June 2023 01:51:12 CEST, juzhe.zh...@rivai.ai wrote: >>From: Ju-Zhe Zhong > > I am sorry but I somehow overlooked a trivial spot in V5. > Nit which does not warrant an immediate next version, but please consider it > before pushing iff approved: > >>+

Re: [PATCH V6] VECT: Apply LEN_MASK_{LOAD,STORE} into vectorizer

2023-06-23 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Address comments from Richard and Bernhard from V5 patch. > V6 fixed all issues according to their comments. > > gcc/ChangeLog: > > * internal-fn.cc (expand_partial_store_optab_fn): Adapt for > LEN_MASK_STORE. > (internal_load_fn_

Re: [PATCH 3/3] [aarch64] Adjust testcase to match assembly output after r14-2007.

2023-06-26 Thread Richard Sandiford via Gcc-patches
liuhongt writes: > The new assembly looks better than original one, so I adjust those testcases. The new loops are shorter, but they process only half the amount of data per iteration. The problem is that the new vectoriser code generates multiple statements but only costs one. I'll post a fix

[PATCH] vect: Cost intermediate conversions

2023-06-26 Thread Richard Sandiford via Gcc-patches
g:6f19cf7526168f8 extended N-vector to N-vector conversions to handle cases where an intermediate integer extension or truncation is needed. This patch adjusts the cost to account for these intermediate conversions. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard gcc/

Re: [PATCH] tree-optimization/110381 - preserve SLP permutation with in-order reductions

2023-06-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following fixes a bug that manifests itself during fold-left > reduction transform in picking not the last scalar def to replace > and thus double-counting some elements. But the underlying issue > is that we merge a load permutation into the in-order reduction > whic

Re: [PATCH] Change fma_reassoc_width tuning for ampere1

2023-06-26 Thread Richard Sandiford via Gcc-patches
Philipp Tomsich writes: > Richard, > > OK for backport to GCC-13? Yeah, OK for GCC 13 too. Thanks, Richard > Thanks, > Philipp. > > On Thu, 22 Jun 2023 at 16:18, Richard Sandiford via Gcc-patches > wrote: >> >> Di Zhao OS via Gcc-patches writes: >

Re: [PATCH] New finish_compare_by_pieces target hook (for x86).

2023-06-26 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Sun, Jun 25, 2023 at 7:39 AM Roger Sayle > wrote: >> >> >> On Tue, 13 June 2023 12:02, Richard Biener wrote: >> > On Mon, Jun 12, 2023 at 4:04 PM Roger Sayle >> > wrote: >> > > The following simple test case, from PR 104610, shows that memcmp () >> >

Re: [PATCH 1/2] Mid engine setup [SU]ABDL

2023-06-26 Thread Richard Sandiford via Gcc-patches
Thanks for doing this. Generally looks good, but some comments below. Oluwatamilore Adebayo writes: > From: oluade01 > > This updates vect_recog_abd_pattern to recognize the widening > variant of absolute difference (ABDL, ABDL2). > > gcc/ChangeLog: > > * internal-fn.cc (widening_fn_p, de

Re: [PATCH 2/2] AArch64: New RTL for ABDL

2023-06-26 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This patch adds new RTL for ABDL (sabdl, sabdl2, uabdl, uabdl2). > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md > (vec_widen_abdl_lo_, vec_widen_abdl_hi_): > Expansions for abd vec widen optabs. > (aarch64_abdl_in

[PATCH] gengtype: Handle braced initialisers in structs

2023-06-26 Thread Richard Sandiford via Gcc-patches
I have a patch that adds braced initializers to a GTY structure. gengtype didn't accept that, because it parsed the "{ ... }" in " = { ... };" as the end of a statement (as "{ ... }" would be in a function definition) and so it didn't expect the following ";". This patch explicitly handles initial

Re: [PATCH] aarch64: Remove architecture dependencies from intrinsics

2023-06-26 Thread Richard Sandiford via Gcc-patches
Andrew Carlotti via Gcc-patches writes: > Many intrinsics currently depend on both an architecture version and a > feature, despite the corresponding instructions being available within > GCC at lower architecture versions. > > LLVM has already removed these explicit architecture version > depende

Re: [PATCH 1/2] Mid engine setup [SU]ABDL

2023-06-27 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: >> - VTYPE x, y, out; >> + VTYPE x, y; >> + WTYPE out; >> type diff; >> loop i in range: >> S1 diff = x[i] - y[i] >> S2 out[i] = ABS_EXPR ; >> >> - where 'type' is a integer and 'VTYPE' is a vector of integers >> - the same size as
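
For readers skimming the thread, the loop shape described in the pattern comment above (narrow inputs x and y, a wider out, and an absolute difference per element) can be written as a scalar C loop. This is only a sketch: the function name widening_abd and the choice of int8_t inputs with an int16_t result are assumptions for illustration, and whether vect_recog_abd_pattern matches this exact source form is not confirmed by the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar form of a widening absolute difference:
   8-bit inputs, 16-bit results. */
void widening_abd(const int8_t *x, const int8_t *y, int16_t *out, size_t n)
{
    assert(x != NULL && y != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        /* int8_t operands promote to int, so the subtraction cannot
           overflow; the result lies in [-255, 255]. */
        int diff = x[i] - y[i];
        /* The absolute value lies in [0, 255], which fits in int16_t
           but not in int8_t, hence the wider output type. */
        out[i] = (int16_t)(diff < 0 ? -diff : diff);
    }
}
```

The wider output is what distinguishes this from the non-widening form, where out would have the same element type as x and y.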

Re: [SVE] Fold svdupq to VEC_PERM_EXPR if elements are not constant

2023-06-27 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi Richard, > Sorry I forgot to commit this patch, which you had approved in: > https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615308.html > > Just for context for the following test: > svint32_t f_s32(int32x4_t x) > { > return svdupq_s32 (x[0], x[1], x[2], x[

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-28 Thread Richard Sandiford via Gcc-patches
Robin Dapp via Gcc-patches writes: > Hi Juzhe, > > I find the bug description rather confusing. What I can see is that > the constant in the literal pool is indeed wrong but how would DSE or > so play a role there? Particularly only for the smaller modes? > > My suspicion would be that the const

[PATCH] A couple of va_gc_atomic tweaks

2023-06-28 Thread Richard Sandiford via Gcc-patches
The only current user of va_gc_atomic is Ada's: vec It uses the generic gt_pch_nx routines (with gt_pch_nx being the “note pointers” hooks), such as: template void gt_pch_nx (vec *v) { extern void gt_pch_nx (T &); for (unsigned i = 0; i < v->length (); i++)

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-29 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Robin Dapp via Gcc-patches writes: >> Hi Juzhe, >> >> I find the bug description rather confusing. What I can see is that >> the constant in the literal pool is indeed wrong but how would DSE or >> so play a role there? Particularly only for the smaller modes? >> >>

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-29 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: >> Sorry, only realised later, but: if the precision can cover fewer >> bytes than the bitsize, I suppose there ought to be some zero-byte >> padding at the end as well. > It looks like this problem, and also the padding, has been discussed > before when the precision of VNx1BI

Re: [PATCH 1/2] Mid engine setup [SU]ABDL

2023-06-29 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This updates vect_recog_abd_pattern to recognize the widening > variant of absolute difference (ABDL, ABDL2). > > gcc/ChangeLog: > > * internal-fn.cc (widening_fn_p, decomposes_to_hilo_fn_p): > Add IFN_VEC_WIDEN_ABD to the switch stat

Re: [PATCH 2/2] AArch64: New RTL for ABDL

2023-06-29 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This patch adds new RTL for ABDL (sabdl, sabdl2, uabdl, uabdl2). > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md > (vec_widen_abdl_lo_, vec_widen_abdl_hi_): > Expansions for abd vec widen optabs. > (aarch64_abdl_in

Re: [PATCH][RFC] target/110456 - avoid loop masking with zero distance dependences

2023-06-29 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > With applying loop masking to epilogues on x86_64 AVX512 we see > some significant performance regressions when evaluating SPEC CPU 2017 > that are caused by store-to-load forwarding fails across outer > loop iterations when the inner loop does not iterate. Consider > >

Re: [PATCH V3] RISC-V: Fix bug of pre-calculated const vector mask for VNx1BI, VNx2BI and VNx4BI

2023-06-29 Thread Richard Sandiford via Gcc-patches
Kito Cheng writes: > Hi Robin: > >> diff --git a/gcc/lto/lto-lang.cc b/gcc/lto/lto-lang.cc >> index 52d7626e92e..14d419c2013 100644 >> --- a/gcc/lto/lto-lang.cc >> +++ b/gcc/lto/lto-lang.cc >> @@ -1050,7 +1050,7 @@ lto_type_for_mode (machine_mode mode, int unsigned_p) >>else if (GET_MODE_CLASS

Re: [PATCH 1/2] Mid engine setup [SU]ABDL

2023-06-30 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This updates vect_recog_abd_pattern to recognize the widening > variant of absolute difference (ABDL, ABDL2). > > gcc/ChangeLog: > > * internal-fn.cc (widening_fn_p, decomposes_to_hilo_fn_p): > Add IFN_VEC_WIDEN_ABD to the switch stat

Re: [PATCH 4/9] vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP

2023-07-02 Thread Richard Sandiford via Gcc-patches
Kewen Lin writes: > @@ -9743,11 +9739,23 @@ vectorizable_load (vec_info *vinfo, >unsigned int n_groups = 0; >for (j = 0; j < ncopies; j++) > { > - if (nloads > 1) > + if (nloads > 1 && !costing_p) > vec_alloc (v, nloads); > gimple *new_stmt = NUL

Re: [PATCH 0/9] vect: Move costing next to the transform for vect load

2023-07-02 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin wrote: >> >> This patch series follows Richi's suggestion at the link [1], >> which suggest structuring vectorizable_load to make costing >> next to the transform, in order to make it easier to keep >> costing and the transform in

Re: [PATCH V5] Machine Description: Add LEN_MASK_{GATHER_LOAD, SCATTER_STORE} pattern

2023-07-02 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richi and Richard. > > This patch adds LEN_MASK_{GATHER_LOAD,SCATTER_STORE} to allow targets > to handle flow control by mask and loop control by length on gather/scatter > memory > operations. Consider the following case: > > #include

Re: [PATCH] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments

2023-07-03 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > > According to Richard's review comments: > https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623405.html > > current len, bias and mask order is not reasonable. > > Change {len,mask,bias} into {len,bias,mask}. > > Th

[PATCH] aarch64: Fix vector-to-vector vec_extract

2023-07-03 Thread Richard Sandiford via Gcc-patches
The documentation says: - @cindex @code{vec_extract@var{m}@var{n}} instruction pattern @item @samp{vec_extract@var{m}@var{n}} Extract given field from the vector value. [...] The @var{n} mode is the mode of the field or vect

Re: [PATCH V2] Middle-end: Change order of LEN_MASK_LOAD/LEN_MASK_STORE arguments

2023-07-03 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard. I fixed the order as you suggested. > > Before this patch, the order was {len,mask,bias}. > > Now, after this patch, the order becomes {len,bias,mask}. > > Since you said we should not need 'internal_fn_bias_index', the bias index > s
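
As a rough illustration of what the three operands control, here is a scalar C sketch of a load governed by a length, a bias and a mask, with the arguments in the {len,bias,mask} order named above. The semantics are an assumption and are not stated in the message: an element is treated as active when its index is below len + bias and its mask entry is set, and the bias is taken to be zero or negative. The name len_mask_load and its signature are hypothetical and do not correspond to any GCC function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of a length-and-mask controlled load.
   Inactive elements of dst are left untouched in this sketch. */
void len_mask_load(int *dst, const int *src, size_t nunits,
                   size_t len, int bias, const bool *mask)
{
    /* Assumption: bias is zero or negative and never exceeds len. */
    assert(bias <= 0 && (size_t)(-bias) <= len);
    size_t active = len - (size_t)(-bias);   /* len + bias */

    for (size_t i = 0; i < nunits && i < active; i++)
        if (mask[i])
            dst[i] = src[i];
}
```

Keeping len and bias adjacent in the argument list reflects that, under this reading, the two are added together before the mask is consulted.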

Re: [PATCH] middle-end/110495 - avoid associating constants with (VL) vectors

2023-07-03 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > When trying to associate (v + INT_MAX) + INT_MAX we are using > the TREE_OVERFLOW bit to check for correctness. That isn't > working for VECTOR_CSTs and it can't in general when one considers > VL vectors. It looks like it should work for COMPLEX_CSTs but

Re: [PATCH V7] Machine Description: Add LEN_MASK_{GATHER_LOAD, SCATTER_STORE} pattern

2023-07-03 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richi and Richard. > > Based on the review comments from Richard: > https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623405.html > > I changed the len_mask_gather_load/len_mask_scatter_store order to: > {len,bias,mask} > > We adjust adding

Re: [PATCH] tree-optimization/110310 - move vector epilogue disabling to analysis phase

2023-07-03 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following moves the late decision to elide vectorized epilogues to > the analysis phase and also avoids altering the epilogues niter. > The costing part from vect_determine_partial_vectors_and_peeling is > moved to vect_analyze_loop_costing where we use the main loop > a

Re: [PATCH] tree-optimization/110310 - move vector epilogue disabling to analysis phase

2023-07-04 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 4 Jul 2023, Richard Biener wrote: > >> On Mon, 3 Jul 2023, Richard Sandiford wrote: >> >> > Richard Biener writes: >> > > The following moves the late decision to elide vectorized epilogues to >> > > the analysis phase and also avoids altering the epilogues niter.

Re: [PATCH][RFC] target/110456 - avoid loop masking with zero distance dependences

2023-07-04 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, 29 Jun 2023, Richard Biener wrote: > >> On Thu, 29 Jun 2023, Richard Sandiford wrote: >> >> > Richard Biener writes: >> > > With applying loop masking to epilogues on x86_64 AVX512 we see >> > > some significant performance regressions when evaluating SPEC CPU 20

Re: [PATCH] middle-end/110541 - VEC_PERM_EXPR documentation is off

2023-07-05 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > The following adjusts the tree.def documentation about VEC_PERM_EXPR > which wasn't adjusted when the restrictions of permutes with constant > mask were relaxed. I was going to complain about having two copies of the documentation, but then I realised that

Re: [PATCH][RFC] target/110456 - avoid loop masking with zero distance dependences

2023-07-05 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Tue, 4 Jul 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Thu, 29 Jun 2023, Richard Biener wrote: >> > >> >> On Thu, 29 Jun 2023, Richard Sandiford wrote: >> >> >> >> > Richard Biener writes: >> >> > > With applying loop masking to epilogues on x8

Re: [PATCH] Vect: select small VF for epilog of unrolled loop (PR tree-optimization/110474)

2023-07-05 Thread Richard Sandiford via Gcc-patches
Hao Liu OS via Gcc-patches writes: > Hi, > > If a loop is unrolled during vectorization (i.e. suggested_unroll_factor > 1), > the VFs of both main and epilog loop are enlarged. The epilog vect loop is > specific for a loop with small iteration counts, so a large VF may hurt > performance. > > Thi

Re: [PATCH] Vect: use a small step to calculate induction for the unrolled loop (PR tree-optimization/110449)

2023-07-06 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Wed, Jul 5, 2023 at 8:44 AM Hao Liu OS via Gcc-patches > wrote: >> >> Hi, >> >> If a loop is unrolled n times during vectorization, two steps are used to >> calculate the induction variable: >> - The small step for the unrolled ith-copy: vec_1 = vec

Re: [PATCH] Vect: use a small step to calculate induction for the unrolled loop (PR tree-optimization/110449)

2023-07-07 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> On 06.07.2023 at 19:50, Richard Sandiford wrote: >> >> Richard Biener via Gcc-patches writes: On Wed, Jul 5, 2023 at 8:44 AM Hao Liu OS via Gcc-patches wrote: Hi, If a loop is unrolled n times during vectorization, two steps are use

Re: [PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Richard Sandiford via Gcc-patches
Robin Dapp via Gcc-patches writes: > Hi, > > upcoming changes for RISC-V will have us exceed 256 modes or 8 bits. The > helper functions in gen* rely on the opcode as well as two modes fitting > into an unsigned int (a signed int even if we consider the qsort default > comparison function). This

Re: [PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Robin Dapp via Gcc-patches writes: >> Hi, >> >> upcoming changes for RISC-V will have us exceed 256 modes or 8 bits. The >> helper functions in gen* rely on the opcode as well as two modes fitting >> into an unsigned int (a signed int even if we consider the qsort defa

Re: [PATCH] genopinit: Allow more than 256 modes.

2023-07-11 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: > Ok so the consensus seems to rather stay with 32 bits and only > change the shift to 10/20? Yeah. The check would then be: if (NUM_OPTABS > 0xfff || NUM_MACHINE_MODES > 0x3ff) fatal ("genopinit range assumptions invalid"); > As MACHINE_MODE_BITSIZE is already > 16 we
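
The 10/20 shift and the 0xfff / 0x3ff limits quoted above fit together as follows: 12 bits for the optab plus 10 bits for each of two modes fills a 32-bit value exactly. The C sketch below packs and unpacks such a key. The field order (optab in the top bits, then the two modes) and all names here are assumptions for illustration, not taken from genopinit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits matching the range check quoted in the message. */
#define MAX_OPTAB 0xfffu   /* 12 bits */
#define MAX_MODE  0x3ffu   /* 10 bits */

/* Pack an optab number and two mode numbers into one 32-bit key.
   An unsigned type is used because the top bit can be set. */
uint32_t pack_key(uint32_t op, uint32_t mode1, uint32_t mode2)
{
    assert(op <= MAX_OPTAB && mode1 <= MAX_MODE && mode2 <= MAX_MODE);
    return (op << 20) | (mode1 << 10) | mode2;
}

uint32_t key_op(uint32_t key)    { return key >> 20; }
uint32_t key_mode1(uint32_t key) { return (key >> 10) & MAX_MODE; }
uint32_t key_mode2(uint32_t key) { return key & MAX_MODE; }
```

With these widths the largest key is 0xffffffff, so there are no spare bits left. That is why the range check has to fail hard rather than silently truncate.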

Re: [PATCH] RISC-V: Bugfix for mode tieable of the rvv bool types

2023-02-13 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Mon, 13 Feb 2023, juzhe.zh...@rivai.ai wrote: > >> >> But then GET_MODE_PRECISION (GET_MODE_INNER (..)) should always be 1? >> Yes, I think so. >> >> Let's explain RVV more clearly. >> Let's suppose we have vector-length = 64bits in RVV CPU. >> VNx1BI is exactly 1 cons

Re: [PATCH] RISC-V: Bugfix for mode tieable of the rvv bool types

2023-02-13 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: >>> What's the byte size of VNx1BI, expressed as a function of N? >>> If it's CEIL (N, 8) then we don't have a way of representing that yet. > N is a poly value. > RVV like SVE support scalable vector. > the N is poly (1,1). > > VNx1B mode nunits = poly(1,1) units. >

[Ping] ifcvt: Fix regression in aarch64/fcsel_1.c

2023-02-13 Thread Richard Sandiford via Gcc-patches
Ping for the patch below aarch64/fcsel_1.c contains: double f_2 (double a, double b, double c, double d) { if (a > b) return c; else return d; } which started failing in the GCC 12 timeframe. When it passed, the RTL had the form: [A] (set (reg ret) (reg c)) (set (pc) (if_

[Ping^3] gomp: Various fixes for SVE types [PR101018]

2023-02-13 Thread Richard Sandiford via Gcc-patches
Ping^3 [https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606741.html] Various parts of the omp code checked whether the size of a decl was an INTEGER_CST in order to determine whether the decl was variable-sized or not. If it was variable-sized, it was expected to have a DECL_VALUE_E

Re: [Ping] ifcvt: Fix regression in aarch64/fcsel_1.c

2023-02-13 Thread Richard Sandiford via Gcc-patches
Richard Sandiford via Gcc-patches writes: > Ping for the patch below Ugh, somehow missed Jeff's OK over the weekend. Sorry for the noise! Richard

Re: [PATCH 2/2] vect: Make partial trapping ops use predication [PR96373]

2023-02-13 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > Hi Richard, > > on 2023/1/27 19:08, Richard Sandiford via Gcc-patches wrote: >> PR96373 points out that a predicated SVE loop currently converts >> trapping unconditional ops into unpredicated vector ops. Doing >> the operation on inact

Re: [RFC PATCH v1 08/10] ifcvt: add if-conversion to conditional-zero instructions

2023-02-13 Thread Richard Sandiford via Gcc-patches
Andrew Pinski via Gcc-patches writes: > On Fri, Feb 10, 2023 at 2:47 PM Philipp Tomsich > wrote: >> >> Some architectures, as is the case on RISC-V with the proposed >> ZiCondOps and the vendor-defined XVentanaCondOps, define a >> conditional-zero instruction that is equivalent to: >> - the posi

Re: [PATCH] lra: Replace subregs in bare uses & clobbers [PR108681]

2023-02-13 Thread Richard Sandiford via Gcc-patches
Jeff Law writes: > On 2/7/23 03:29, Richard Sandiford via Gcc-patches wrote: >> In this PR we had a write to one vector of a 4-vector tuple. >> The vector had mode V1DI, and the target doesn't provide V1DI >> moves, so this was converted into: >> >>

Re: [PATCH 2/2] vect: Make partial trapping ops use predication [PR96373]

2023-02-14 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > on 2023/2/13 21:57, Richard Sandiford wrote: >> "Kewen.Lin" writes: >>> Hi Richard, >>> >>> on 2023/1/27 19:08, Richard Sandiford via Gcc-patches wrote: >>>> PR96373 points out that a predicated SVE loop

Re: [PATCH] debug: Support "phrs" for dumping a HARD_REG_SET

2023-02-14 Thread Richard Sandiford via Gcc-patches
Hans-Peter Nilsson via Gcc-patches writes: > Ok to commit? It survived both a cris-elf regtest and a > x86_64-linux-gnu native regtest. :) OK, thanks. Richard > 8< > The debug-function in sel-sched-dump.cc that would be > suitable for a hookup to a command in gdb is guarded by > #ifd

Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-02-27 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi, > >> > I avoided open coding it with add and shift because it creates a 4 >> > instructions (and shifts which are typically slow) dependency chain >> > instead of a load and multiply. This change, unless the target is >> > known to optimize it further is unlikely to

Re: [PATCH 1/2, GCC12] AArch64: Update transitive closures of aes, sha2 and sha3 extensions

2023-02-27 Thread Richard Sandiford via Gcc-patches
Tejas Belagod writes: > Transitive closures of architectural extensions have to be manually maintained > for the AARCH64_OPT_EXTENSION list. Currently aes, sha2 and sha3 extensions add > AARCH64_FL_SIMD as their dependency - this does not automatically pull in the > transitive dependence of AARCH64_

Re: [PATCH 2/2, GCC12] AArch64: Gate various crypto intrinsics availability based on features

2023-02-27 Thread Richard Sandiford via Gcc-patches
Tejas Belagod writes: > The 64-bit variant of PMULL{2} and AES instructions are available if FEAT_AES > is implemented according to the Arm ARM [1]. Similarly FEAT_SHA1 and > FEAT_SHA256 enable the use of SHA1 and SHA256 instruction variants. > This patch fixes arm_neon.h to correctly reflect the

Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-02-27 Thread Richard Sandiford via Gcc-patches
Sorry for the slow reply, been away for a couple of weeks. "incarnation.p.lee--- via Gcc-patches" writes: > From: Pan Li > > Fix the bug of the rvv bool mode precision with the adjustment. > The bits size of vbool*_t will be adjusted to > [1, 2, 4, 8, 16, 32, 64] according to t

Re: [PATCH] constraint: fix relaxed memory and repeated constraint handling

2023-02-27 Thread Richard Sandiford via Gcc-patches
"Victor L. Do Nascimento" writes: > The function `constrain_operands' lacked the logic to consider relaxed > memory constraints when "traditional" memory constraints were not > satisfied, creating potential issues as observed during the reload > compilation pass. > > In addition, it was observed t

Re: [PATCH] vect: Check that vector factor is a compile-time constant

2023-02-27 Thread Richard Sandiford via Gcc-patches
FWIW, this patch looks good to me. I'd argue it's a regression fix of kinds, in that the current code was correct before variable VF and became incorrect after variable VF. It might be possible to trigger the problem on SVE too, with a sufficiently convoluted test case. (Haven't tried though.) R

Re: [PATCH] optabs: Fix up expand_doubleword_shift_condmove for shift_mask == 0 [PR108803]

2023-02-27 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > Hi! > > The following testcase is miscompiled on aarch64. The problem is that > aarch64 with TARGET_SIMD is !SHIFT_COUNT_TRUNCATED target with > targetm.shift_truncation_mask (DImode) == 0 which has HAVE_conditional_move > true. If a doubleword shift (in this case TImode)
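
To make the problem concrete: a doubleword shift is built from word-sized shifts, and some of those word shifts would have an out-of-range count unless the expansion selects between cases. The portable C sketch below shows a 128-bit left shift assembled from 64-bit halves. It is not GCC's expansion; the type u128 and the function shl128 are hypothetical names. In C a 64-bit shift by 64 or more is undefined, which plays a similar role here to a target that does not truncate shift counts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical doubleword value made of two 64-bit words. */
typedef struct { uint64_t lo, hi; } u128;

/* Shift a 128-bit value left by n (0 <= n < 128) using only 64-bit
   shifts whose counts are always in the range [0, 63]. */
u128 shl128(u128 v, unsigned n)
{
    assert(n < 128);
    u128 r;
    if (n == 0) {
        /* Handled separately: 64 - n would be an out-of-range count. */
        r = v;
    } else if (n < 64) {
        /* Bits shifted out of the low word move into the high word. */
        r.hi = (v.hi << n) | (v.lo >> (64 - n));
        r.lo = v.lo << n;
    } else {
        /* The low word moves entirely into the high word. */
        r.hi = v.lo << (n - 64);
        r.lo = 0;
    }
    return r;
}
```

An expansion that computes both the "n < 64" and "n >= 64" results and picks one with a conditional move must make sure the result it discards was not computed with a shift count the target treats unpredictably.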

Re: [PATCH] simplify-rtx: Fix VOIDmode operand handling in simplify_subreg [PR108805]

2023-02-27 Thread Richard Sandiford via Gcc-patches
Uros Bizjak writes: > On Fri, Feb 17, 2023 at 8:38 AM Richard Biener wrote: >> >> On Thu, 16 Feb 2023, Uros Bizjak wrote: >> >> > simplify_subreg can return VOIDmode const_int operand and will >> > cause ICE in simplify_gen_subreg when this operand is passed to it. >> > >> > The patch prevents VO

Re: [Patch] gcc.dg/overflow-warn-9.c: exclude from LLP64

2023-02-27 Thread Richard Sandiford via Gcc-patches
Jonathan Yong via Gcc-patches writes: > This test is for LP64 only, exclude LLP64 too. > Patch OK? OK, thanks. Richard > From fbc83ae10df1a0e10c302fb0fee13092eb65818e Mon Sep 17 00:00:00 2001 > From: Jonathan Yong <10wa...@gmail.com> > Date: Mon, 27 Feb 2023 09:49:31 + > Subject: [PATCH] gc

Re: [Patch] gcc.dg/memchr-3.c: fix for LLP64

2023-02-27 Thread Richard Sandiford via Gcc-patches
Jonathan Yong via Gcc-patches writes: > Attached patch OK? > > gcc.dg/memchr-3.c: fix for LLP64 > > gcc/testsuite/ChangeLog: > > PR middle-end/97956 > * gcc.dg/memchr-3.c (memchr): fix long to size_t in > prototype. > > From 194eb3d43964276b

Re: [PATCH] optabs: Fix up expand_doubleword_shift_condmove for shift_mask == 0 [PR108803]

2023-02-27 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Mon, Feb 27, 2023 at 03:34:11PM +, Richard Sandiford wrote: >> > The following testcase is miscompiled on aarch64. The problem is that >> > aarch64 with TARGET_SIMD is !SHIFT_COUNT_TRUNCATED target with >> > targetm.shift_truncation_mask (DImode) == 0 which has HAVE

Re: [PATCH] optabs: Fix up expand_doubleword_shift_condmove for shift_mask == 0 [PR108803]

2023-02-27 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Mon, Feb 27, 2023 at 07:51:21PM +, Richard Sandiford wrote: >> I think RTL and gimple are different in that respect. >> SHIFT_COUNT_TRUNCATED's effect on shifts is IMO a bit like >> CTZ_DEFINED_VALUE_AT_ZERO's effect on CTZ: it enumerates common >> target-specific be

Re: [PATCH] optabs: Fix up expand_doubleword_shift_condmove for shift_mask == 0 [PR108803]

2023-02-27 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Mon, Feb 27, 2023 at 08:43:27PM +, Richard Sandiford wrote: >> My argument was that !SHIFT_COUNT_TRUNCATED and >> C?Z_DEFINED_VALUE_AT_ZERO==0 mean that the behaviour is undefined >> only in the sense that target-independent code doesn't know what >> the behaviour is

Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-02-27 Thread Richard Sandiford via Gcc-patches
Tamar Christina via Gcc-patches writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Monday, February 27, 2023 12:12 PM >> To: Tamar Christina >> Cc: Tamar Christina via Gcc-patches ; nd >> ; rguent...@suse.de; j...@ventanamicro.com >> Subject: Re: [PATCH 1/2]middle-end: Fix

Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-02-28 Thread Richard Sandiford via Gcc-patches
"Li, Pan2" writes: > Hi Richard Sandiford, > > After some investigation, I am not sure if it is possible to make it general > without any changes to exact_div. We can add one method like below to get the > unit poly for all possible N. > > template > inline POLY_CONST_RESULT (N, Ca, Ca) > normal

Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-02-28 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Monday, February 27, 2023 9:33 PM >> To: Tamar Christina via Gcc-patches >> Cc: Tamar Christina ; nd ; >> rguent...@suse.de; j...@ventanamicro.com >> Subject: Re: [PATCH 1/2]middle-end: Fix wrong overmatchi

Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-02-28 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Tuesday, February 28, 2023 11:09 AM >> To: Tamar Christina >> Cc: Tamar Christina via Gcc-patches ; nd >> ; rguent...@suse.de; j...@ventanamicro.com >> Subject: Re: [PATCH 1/2]middle-end: Fix wrong overmatc

Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-01 Thread Richard Sandiford via Gcc-patches
"Li, Pan2" writes: > Hi Richard Sandiford, > > Just tried the overloaded constant divisors with below print div, it works as > you mentioned, 😉! > > printf ("can_div_away_from_zero_p (mode_precision[E_%smode], " > "BITS_PER_UNIT, &mode_size[E_%smode]);\n", m->name, m->name); > > template

Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-01 Thread Richard Sandiford via Gcc-patches
盼 李 via Gcc-patches writes: > Thank you all for your quick response. > > As juzhe mentioned, the memory access of RISC-V will be always aligned to the > bytes boundary with the compact mode, aka ceil(vl / 8) bytes for vbool*. OK, thanks to both of you. This is what I'd have expected. In that c

Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-01 Thread Richard Sandiford via Gcc-patches
盼 李 via Gcc-patches writes: > Just have a test with the below code, the [0x4, 0x4] test comes from VNx4BI. > You can notice that the mode size is unchanged. > > printf ("can_div_away_from_zero_p (mode_precision[E_%smode], " > "BITS_PER_UNIT, &mode_size[E_%smode]);\n", m->name, m->name); > >

Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-01 Thread Richard Sandiford via Gcc-patches
Pan Li via Gcc-patches writes: > I am not very familiar with the memory pattern, maybe juzhe can provide more > information or correct me if anything is misleading. > > The different precision try to resolve the below bugs, the second vlm(with > different size of load bytes compared to first one

Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-01 Thread Richard Sandiford via Gcc-patches
"Li, Pan2" writes: > Thanks all for so much valuable and helpful materials. > > As I understand (Please help to correct me if any mistake.), for the VNx*BI > (aka, 1, 2, 4, 8, 16, 32, 64), > the precision and mode size need to be adjusted as below. > > Precision size [1, 2, 4, 8, 16, 32, 64] > Mo

Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-02 Thread Richard Sandiford via Gcc-patches
Thanks for the explanation about the sizes. "juzhe.zh...@rivai.ai" writes: > Fortunately, we won't have aggregates, arrays of vbool*_t in the future. > I think it's not an issue. But isn't it possible to allocate a char/byte array and construct vbool*_ts at addresses calculated by intrinsics? E

Re: [PATCH v2] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-03-02 Thread Richard Sandiford via Gcc-patches
pan2...@intel.com writes: > From: Pan Li > > Fix the bug of the rvv bool mode precision with the adjustment. > The bits size of vbool*_t will be adjusted to > [1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 isa. The > adjusted mode precision of vbool*_t will help unde

Re: [PATCH] simplify-rtx: Fix VOIDmode operand handling in simplify_subreg [PR108805]

2023-03-02 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Hey both, > > Sorry about that, don't know how I missed those. Just running a test on > that now and will commit when it's done. I assume the comment and 0 -> > byte change can be seen as obvious, especially since it was supposed to > be in my original patch...

[PATCH] vect: Fix voluntarily-masked negative conditionals [PR108430]

2023-03-02 Thread Richard Sandiford via Gcc-patches
vectorizable_condition checks whether a COND_EXPR condition is used elsewhere with a loop mask. If so, it applies the loop mask to the COND_EXPR too, to reduce the number of live masks and to increase the chance of combining the AND with the comparison. There is also code to do this for inverted

[PATCH] Avoid creating (const (reg ...)) [PR108603]

2023-03-02 Thread Richard Sandiford via Gcc-patches
convert_memory_address_addr_space_1 has two modes: one in which it tries to create a self-contained RTL expression (which might fail) and one in which it can emit new instructions where necessary. When handling a CONST, the function recurses into the CONST's operand and then constifies the result.

Re: [PATCH 0/8] aarch64: testsuite: Fix test failures with --enable-default-pie or --enable-default-ssp

2023-03-02 Thread Richard Sandiford via Gcc-patches
Xi Ruoyao writes: > Hi, > > This patch series fixes a lot of test failures with --enable-default-pie > or --enable-default-ssp for AArch64 target. Only test files are changed > to disable PIE or SSP to satisfy the expectation of the developer who > programmed the test. > > Bootstrapped and regte

Re: [PATCH v2] MIPS: Add buildtime option to set msa default

2023-03-02 Thread Richard Sandiford via Gcc-patches
"Junxian Zhu" writes: > From: Junxian Zhu > > Add buildtime option to decide whether will compiler build with `-mmsa` > option default. > > gcc/ChangeLog: > * config.gcc: add -with-{no-}msa build option. > * config/mips/mips.h: Likewise. > * doc/install.texi: Likewise. Thanks,

Re: [PATCH] MIPS: Bugfix for fix Dejagnu issues with RTL checking enabled.

2023-03-02 Thread Richard Sandiford via Gcc-patches
"Xin Liu" writes: > From: Robert Suchanek > > gcc/ChangeLog: > >* config/mips/mips.cc (mips_set_text_contents_type): Modified parameter >* config/mips/mips-protos.h (mips_set_text_contents_type): Likewise > > Signed-off-by: Xin Liu Thanks, pushed to trunk. I guess this is a regression

Re: [Patch] gcc.dg/overflow-warn-9.c: exclude from LLP64

2023-03-02 Thread Richard Sandiford via Gcc-patches
Jonathan Yong via Gcc-patches writes: > On 2/28/23 03:06, Hans-Peter Nilsson wrote: >> >> On Mon, 27 Feb 2023, Jonathan Yong via Gcc-patches wrote: >> >>> This test is for LP64 only, exclude LLP64 too. >>> Patch OK? >> >> I may be confused, but you're not making use of the "llp64" >> effective

Re: [Patch] gcc.dg/memchr-3.c: fix for LLP64

2023-03-02 Thread Richard Sandiford via Gcc-patches
Jonathan Yong <10wa...@gmail.com> writes: > On 2/27/23 16:55, Richard Sandiford wrote: >> Jonathan Yong via Gcc-patches writes: >>> Attached patch OK? >>> >>> gcc.dg/memchr-3.c: fix for LLP64 >>> >>> gcc/testsuite/ChangeLog: >>> >>> PR middle-end/97956 >>>

Re: [Patch] gcc.dg/overflow-warn-9.c: exclude from LLP64

2023-03-02 Thread Richard Sandiford via Gcc-patches
Jonathan Yong <10wa...@gmail.com> writes: > On 3/2/23 10:44, Richard Sandiford wrote: >>> diff --git a/gcc/testsuite/gcc.dg/overflow-warn-9.c >>> b/gcc/testsuite/gcc.dg/overflow-warn-9.c >>> index 57c0f17bc91..ae588bd8491 100644 >>> --- a/gcc/testsuite/gcc.dg/overflow-warn-9.c >>> +++ b/gcc/testsu
