Re: decremnt IV patch create fails on PowerPC

2023-05-30 Thread Richard Sandiford via Gcc-patches
My understanding was that we went into this knowing that the IVs would defeat SCEV analysis. Apparently that wasn't a problem for RVV, but it's not surprising that it is a problem in general. This isn't just about SELECT_VL though. We use the same type of IV for cases that aren't going to use SE

Re: decremnt IV patch create fails on PowerPC

2023-05-30 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> But how easy would it be to extend SCEV analysis, via a pattern match? >> The evolution of the IV phi wrt the inner loop is still a normal SCEV. > > No, the IV isn't a normal SCEV, the final value is different. Which part of the IV though? Won't all executions of the la

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Follow Richi's suggestion, I change current decrement IV flow from: > > do { >remain -= MIN (vf, remain); > } while (remain != 0); > > into: > > do { >old_remain = remain; >len = MIN (vf, remain); >remain -= vf; > } while (old_r
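
The two loop shapes quoted above can be compared with a small scalar model. This is only a sketch: the excerpt is cut off before the exit test of the new loop, so the condition `old_remain > vf` used below is an assumption chosen for illustration, and the helper names are hypothetical rather than anything from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* MIN as used in the quoted pseudocode. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Old flow: the IV is decremented by the clamped length, so the
   subtraction depends on the MIN result.  Returns the iteration count
   and stores the number of elements handled in *processed.  */
static size_t
run_old_flow (size_t remain, size_t vf, size_t *processed)
{
  size_t iters = 0;
  *processed = 0;
  do
    {
      size_t len = MIN (vf, remain);
      *processed += len;
      remain -= len;
      iters++;
    }
  while (remain != 0);
  return iters;
}

/* New flow: the IV is decremented by the constant vf, independently of
   the MIN.  The exit test below is an ASSUMPTION (the source excerpt is
   truncated); it compares the value from before the decrement, so the
   unsigned wrap-around of `remain` on the last iteration is harmless.  */
static size_t
run_new_flow (size_t remain, size_t vf, size_t *processed)
{
  size_t iters = 0;
  size_t old_remain;
  *processed = 0;
  do
    {
      old_remain = remain;
      size_t len = MIN (vf, remain);
      *processed += len;
      remain -= vf; /* May wrap on the final iteration.  */
      iters++;
    }
  while (old_remain > vf);
  return iters;
}
```

Under that assumed exit test, both flows run the same number of iterations and handle the same number of elements for any nonzero trip count; the difference is only that the decrement in the new flow no longer depends on the MIN.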

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Before this patch: > foo: > ble a2,zero,.L5 > csrr a3,vlenb > srli a4,a3,2 > .L3: > minu a5,a2,a4 > vsetvli zero,a5,e32,m1,ta,ma > vle32.v v2,0(a1) > vle32.v v1,0(a0) > vsetvli t1,zero,e32,m1,ta,ma > vadd.vv v1,v1,v2 > vsetvli zero,a5,e32,m1,ta,ma > vse32.v v1,0(a0

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-30 Thread Richard Sandiford via Gcc-patches
"juzhe.zhong" writes: > Maybe we can include rgroup number into select vl pattern?So that, I always > use select vl pattern. In my backend, if it is single rgroup,we gen vsetvl, > otherwise we gen min. That just seems to be a way of hiding an “is the target RVV?” test though. IMO targets shouldn

Re: [aarch64] Code-gen for vector initialization involving constants

2023-05-30 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi Richard, > The s32 case for single constant patch doesn't regress now after the > above commit. > Bootstrapped+tested on aarch64-linux-gnu, and verified that the new > tests pass for aarch64_be-linux-gnu. > Is it OK to commit ? > > Thanks, > Prathamesh > > [aarch64

Re: [PATCH] Refactor wi::bswap as a function (instead of a method).

2023-05-30 Thread Richard Sandiford via Gcc-patches
"Roger Sayle" writes: > This patch implements Richard Sandiford's suggestion from > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html > that wi::bswap (and a new wi::bitreverse) should be functions, > and ideally only accessors are member functions. This patch > implements the first

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 31 May 2023, juzhe.zh...@rivai.ai wrote: > >> Hi, all. I have posted my several investigations: >> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620101.html >> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620105.html >> https://gcc.gnu.org/pipermail/gcc

Re: [PATCH] VECT: Change flow of decrement IV

2023-05-31 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Wed, 31 May 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Wed, 31 May 2023, juzhe.zh...@rivai.ai wrote: >> > >> >> Hi, all. I have posted my several investigations: >> >> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620101.html

Re: [PATCH] aarch64: Add pattern for bswap + rotate [PR 110039]

2023-05-31 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a > pattern to match the new GIMPLE form. > > With this patch, gcc.target/aarch64/rev16_2.c passes again. > > 2023-05-31 Christophe Lyon > > PR target/110039 > gcc/ > * config/aarch64/
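
For readers unfamiliar with the idiom named in the subject line: a 32-bit byte swap followed by a rotate by 16 swaps the two bytes inside each halfword, which is what a rev16-style operation does. The sketch below illustrates that identity in portable C. The excerpt does not show the actual GIMPLE form or the aarch64 pattern, so the helper names are hypothetical and this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Portable 32-bit byte swap (illustrative helper).  */
static uint32_t
bswap32 (uint32_t x)
{
  return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8)
	 | ((x & 0x00ff0000u) >> 8) | ((x & 0xff000000u) >> 24);
}

/* Rotate right by n bits, with n reduced modulo 32.  */
static uint32_t
rotr32 (uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Reference: swap the two bytes within each 16-bit halfword.  */
static uint32_t
rev16_reference (uint32_t x)
{
  return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}

/* The same result expressed as bswap followed by a rotate by 16.  */
static uint32_t
rev16_via_bswap_rotate (uint32_t x)
{
  return rotr32 (bswap32 (x), 16);
}
```

For example, `0xAABBCCDD` byte-swaps to `0xDDCCBBAA`, and rotating that by 16 gives `0xBBAADDCC`, the halfword-wise byte swap of the input.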

Re: [PATCH] aarch64: Add pattern for bswap + rotate [PR 110039]

2023-05-31 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > On Wed, 31 May 2023 at 11:49, Richard Sandiford > wrote: > >> Christophe Lyon writes: >> > After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a >> > pattern to match the new GIMPLE form. >> > >> > With this patch, gcc.target/aarch64/rev16_2.c passes agai

Re: [PATCH] Add COMPLEX_VECTOR_INT modes

2023-05-31 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Fri, May 26, 2023 at 4:35 PM Andrew Stubbs wrote: >> >> Hi all, >> >> I want to implement a vector DIVMOD libfunc for amdgcn, but I can't just >> do it because the GCC middle-end models DIVMOD's return value as >> "complex int" type, and there are no ve

Re: [PATCH v2] aarch64: Add pattern for bswap + rotate [PR 110039]

2023-05-31 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > After commit g:d8545fb2c71683f407bfd96706103297d4d6e27b, we missed a > pattern to match the new GIMPLE form. > > With this patch, gcc.target/aarch64/rev16_2.c passes again. > > 2023-05-31 Christophe Lyon > > PR target/110039 > gcc/ > * config/aarch64/

Re: [PATCH V3] VECT: Change flow of decrement IV

2023-06-02 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks Richi. I am gonna merge it after Richard's final approve. Thanks for checking, but no need to wait for a second ack from me! Please go ahead and commit. Richard

Re: [PATCH 2/3] Refactor widen_plus as internal_fn

2023-06-02 Thread Richard Sandiford via Gcc-patches
Just some very minor things. "Andre Vieira (lists)" writes: > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc > index > 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..348bee35a35ae4ed9a8652f5349f430c2733e1cb > 100644 > --- a/gcc/internal-fn.cc > +++ b/gcc/internal-fn.cc > @@ -90,6 +90,71 @@ loo

Re: [PATCH] VECT: Add SELECT_VL support

2023-06-04 Thread Richard Sandiford via Gcc-patches
Sorry for the slow review. I don't know the IV-related parts well enough to review those properly, but they looked reasonable to me. Hopefully Richi can comment. I'm curious though. For: > + tree step = vect_dr_behavior (vinfo, dr_info)->step; > + > + [...] > + poly_uint64 bytesize = GET_MO

Re: [PATCH V2] VECT: Add SELECT_VL support

2023-06-04 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > + /* If we're using decrement IV approach in loop control, we can use output > of > + SELECT_VL to adjust IV of loop control and data reference when it > satisfies > + the following checks: > + > + (a) SELECT_VL is supported by the target. > + (b) L

Re: [PATCH V2] VECT: Add SELECT_VL support

2023-06-05 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. Thanks for the comments. > >>> If we use SELECT_VL to refer only to the target-independent ifn, I don't >>> see why this last bit is true. > Could you give me more details and information about this since I am not sure > whether I catch up with you.

Re: [PATCH V2] VECT: Add SELECT_VL support

2023-06-05 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > "juzhe.zh...@rivai.ai" writes: >> Hi, Richard. Thanks for the comments. >> If we use SELECT_VL to refer only to the target-independent ifn, I don't see why this last bit is true. >> Could you give me more details and information about this since I am not >>

Re: [PATCH V2] VECT: Add SELECT_VL support

2023-06-05 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. > >>> No, I meant that the comment I quoted seemed to be saying that solution >>> 3 wasn't possible. The comment seemed to say that we would need to do >>> solution 1. > I am so sorry that I didn't write the comments accurately. > Could you help me wi

Re: [PATCH] New wi::bitreverse function.

2023-06-05 Thread Richard Sandiford via Gcc-patches
"Roger Sayle" writes: > This patch provides a wide-int implementation of bitreverse, that > implements both of Richard Sandiford's suggestions from the review at > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/618215.html of an > improved API (as a stand-alone function matching the bswap refa

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-05 Thread Richard Sandiford via Gcc-patches
Looks good! Just some minor comments: Tamar Christina writes: > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index > 6a435eb44610960513e9739ac9ac1e8a27182c10..1437ab55b260ab5c876e92d59ba39d24bffc6276 > 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -27,6 +27,7 @@ See the next

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-06 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: >> diff --git a/gcc/gensupport.h b/gcc/gensupport.h >> index >> a1edfbd71908b6244b40f801c6c01074de56777e..7925e22ed418767576567cad583bddf83c0846b1 >> 100644 >> --- a/gcc/gensupport.h >> +++ b/gcc/gensupport.h >> @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-06 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> >int operand_number; /* Operand index in the big array. */ >> >int output_format; /* INSN_OUTPUT_FORMAT_*. */ >> > + bool compact_syntax_p; >> >struct operand_data operand[MAX_MAX_OPERANDS]; }; >> > >> > @@ -700,12 +702,57 @@ proc

Re: [PATCH 1/2] Missed opportunity to use [SU]ABD

2023-06-06 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This adds a recognition pattern for the non-widening > absolute difference (ABD). > > gcc/ChangeLog: > > * doc/md.texi (sabd, uabd): Document them. > * internal-fn.def (ABD): Use new optab. > * optabs.def (sabd_optab, uabd_optab
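
As a scalar reference for what a non-widening absolute difference computes, here is a minimal sketch. The function names are hypothetical and the excerpt does not show which source idiom the recognizer matches, so this only models the result: the difference between the larger and the smaller input, produced in the same width as the inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned absolute difference: larger minus smaller, never wraps.  */
static uint32_t
uabd32 (uint32_t a, uint32_t b)
{
  return a > b ? a - b : b - a;
}

/* Signed absolute difference, kept in 32 bits (non-widening).  The
   subtraction is done in unsigned arithmetic to avoid signed overflow;
   for extreme inputs the true difference does not fit in int32_t and
   the final conversion is implementation-defined.  */
static int32_t
sabd32 (int32_t a, int32_t b)
{
  uint32_t ua = (uint32_t) a;
  uint32_t ub = (uint32_t) b;
  return (int32_t) (a > b ? ua - ub : ub - ua);
}
```

Applied element-wise over two arrays, this is the operation that a vector `sabd`/`uabd` instruction would perform in one step.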

Re: [PATCH] rtl: AArch64: New RTL for ABD

2023-06-06 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: >> It would be good to mark all of these functions with __attribute__((noipa)), >> since I think interprocedural optimisations might otherwise defeat the >> runtime test in abd_run_1.c (in the sense that we might end up folding >> things at compile time and not testin

Re: [PATCH 0/3] aarch64: ls64 builtin fixes [PR110100,PR110132]

2023-06-07 Thread Richard Sandiford via Gcc-patches
Alex Coplan writes: > Hi, > > This patch series fixes various defects with the FEAT_LS64 ACLE > implementation in the AArch64 backend. > > The series is organised as follows: > > - Patch 1/3 fixes whitespace errors in the existing code. > - Patch 2/3 fixes PR110100 where we generate wrong code f

Re: vect: Don't pass subtype to vect_widened_op_tree where not needed [PR 110142]

2023-06-07 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Hi, > > This patch fixes an issue introduced by > g:2f482a07365d9f4a94a56edd13b7f01b8f78b5a0, where a subtype was beeing > passed to vect_widened_op_tree, when no subtype was to be used. This > lead to an errorneous use of IFN_VEC_WIDEN_MINUS. > > gcc/ChangeLog:

Re: [PATCH] Add COMPLEX_VECTOR_INT modes

2023-06-07 Thread Richard Sandiford via Gcc-patches
Andrew Stubbs writes: > On 30/05/2023 07:26, Richard Biener wrote: >> On Fri, May 26, 2023 at 4:35 PM Andrew Stubbs wrote: >>> >>> Hi all, >>> >>> I want to implement a vector DIVMOD libfunc for amdgcn, but I can't just >>> do it because the GCC middle-end models DIVMOD's return value as >>> "com

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-08 Thread Richard Sandiford via Gcc-patches
In addition to Andreas's and Richard's comments: Tamar Christina writes: > +@item > +@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:} > followed by > +a list of @code{match_operand}/@code{match_scratch} comma operand numbers, > then a How about: a comma-separated lis

Re: [PATCH] Make sure SCALAR_INT_MODE_P before invoke try_const_anchors

2023-06-09 Thread Richard Sandiford via Gcc-patches
guojiufu writes: > Hi, > > On 2023-06-09 16:00, Richard Biener wrote: >> On Fri, 9 Jun 2023, Jiufu Guo wrote: >> >>> Hi, >>> >>> As checking the code, there is a "gcc_assert (SCALAR_INT_MODE_P >>> (mode))" >>> in "try_const_anchors". >>> This assert seems correct because the function try_const_

Re: [PATCH] Add COMPLEX_VECTOR_INT modes

2023-06-09 Thread Richard Sandiford via Gcc-patches
Andrew Stubbs writes: > On 07/06/2023 20:42, Richard Sandiford wrote: >> I don't know if this helps (probably not), but we have a similar >> situation on AArch64: a 64-bit mode like V8QI can be doubled to a >> 128-bit vector or to a pair of 64-bit vectors. We used V16QI for >> the former and "V2x

Re: [PATCH] Add COMPLEX_VECTOR_INT modes

2023-06-09 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Fri, Jun 9, 2023 at 11:45 AM Andrew Stubbs wrote: >> >> On 09/06/2023 10:02, Richard Sandiford wrote: >> > Andrew Stubbs writes: >> >> On 07/06/2023 20:42, Richard Sandiford wrote: >> >>> I don't know if this helps (probably not), but we have a similar >> >>> situatio

Re: [PATCH V6] VECT: Add SELECT_VL support

2023-06-09 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks, Richi. > > Should I wait for Richard ACK gain ? > Since the last email of this patch, he just asked me to adjust comment no > codes change. > I am not sure whether he is ok. Yeah, OK from my POV too, thanks. Richard

Re: [PATCH] simplify-rtx: Implement constant folding of SS_TRUNCATE, US_TRUNCATE

2023-06-09 Thread Richard Sandiford via Gcc-patches
Kyrylo Tkachov via Gcc-patches writes: > Hi all, > > This patch implements RTL constant-folding for the SS_TRUNCATE and > US_TRUNCATE codes. > The semantics are a clamping operation on the argument with the min and max > of the narrow mode, > followed by a truncation. The signedness of the clamp
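
The clamp-then-truncate semantics described above can be sketched for one concrete pair of modes. The example below narrows 32-bit values to 8 bits; the function names are hypothetical, the widths are chosen only for illustration, and the code follows the usual reading of the two codes (signed saturation for SS_TRUNCATE, unsigned saturation for US_TRUNCATE) rather than the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturating truncation: clamp to [INT8_MIN, INT8_MAX], then
   narrow.  After the clamp the conversion cannot lose information.  */
static int8_t
ss_truncate_32_to_8 (int32_t x)
{
  if (x < INT8_MIN)
    x = INT8_MIN;
  if (x > INT8_MAX)
    x = INT8_MAX;
  return (int8_t) x;
}

/* Unsigned saturating truncation: clamp to [0, UINT8_MAX], then narrow.
   The input is already unsigned, so only the upper bound matters.  */
static uint8_t
us_truncate_32_to_8 (uint32_t x)
{
  if (x > UINT8_MAX)
    x = UINT8_MAX;
  return (uint8_t) x;
}
```

Constant folding then amounts to evaluating such a function at compile time when the operand is a known constant.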

Re: [PATCH v2] [PR96339] Optimise svlast[ab]

2023-06-12 Thread Richard Sandiford via Gcc-patches
Tejas Belagod writes: > From: Tejas Belagod > > This PR optimizes an SVE intrinsics sequence where > svlasta (svptrue_pat_b8 (SV_VL1), x) > a scalar is selected based on a constant predicate and a variable vector. > This sequence is optimized to return the corresponding element of a NEON

Re: [PATCH V2] RISC-V: Rework Phase 5 && Phase 6 of VSETVL PASS

2023-06-12 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 6/9/23 04:41, juzhe.zh...@rivai.ai wrote: >> @@ -4342,135 +4510,81 @@ pass_vsetvl::cleanup_insns (void) const >> } >> } >> >> +/* Return true if the SET result is not used by any instructions. */ >> +static bool >> +has_no_uses (basic_block cfg_bb,

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > AFAIU this special instruction is only supposed to prevent > code motion (of stack memory accesses?) across this instruction? > I'd say a > > (may_clobber (mem:BLK (reg:DI 1 1))) > > might be more to the point? I've used "may_clobber" which doesn't > exist since I'm not

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 14 Jun 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > AFAIU this special instruction is only supposed to prevent >> > code motion (of stack memory accesses?) across this instruction? >> > I'd say a >> > >> > (may_clobber (mem:BLK (reg:DI 1 1)))

Re: [PATCH] rs6000: replace '(const_int 0)' to 'unspec:BLK [(const_int 0)]' for stack_tie

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 14 Jun 2023, Richard Sandiford wrote: > >> Richard Biener writes: >> > On Wed, 14 Jun 2023, Richard Sandiford wrote: >> > >> >> Richard Biener writes: >> >> > AFAIU this special instruction is only supposed to prevent >> >> > code motion (of stack memory accesses

Re: [PATCH 1/2] Missed opportunity to use [SU]ABD

2023-06-14 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This adds a recognition pattern for the non-widening > absolute difference (ABD). > > gcc/ChangeLog: > > * doc/md.texi (sabd, uabd): Document them. > * internal-fn.def (ABD): Use new optab. > * optabs.def (sabd_optab, uabd_optab

Re: [PATCH] [RFC] main loop masked vectorization with --param vect-partial-vector-usage=1

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > Currently vect_determine_partial_vectors_and_peeling will decide > to apply fully masking to the main loop despite > --param vect-partial-vector-usage=1 when the currently analyzed > vector mode results in a vectorization factor that's bigger > than the num

Re: [PATCH 1/3] Inline vect_get_max_nscalars_per_iter

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > The function is only meaningful for LOOP_VINFO_MASKS processing so > inline it into the single use. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > * tree-vect-loop.cc (vect_get_max_nscalars_per_iter): Inline > into ... >

Re: [PATCH 3/3] AVX512 fully masked vectorization

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > This implements fully masked vectorization or a masked epilog for > AVX512 style masks which single themselves out by representing > each lane with a single bit and by using integer modes for the mask > (both are much like GCN). > > AVX512 is also special in

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-14 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > +The syntax rules are as follows: > +@itemize @bullet > +@item > +Templates must start with @samp{@{@@} to use the new syntax. > + > +@item > +@samp{@{@@} is followed by a layout in parentheses which is @samp{cons:} s/parentheses/square brackets/ > +followed by a comma-

Re: [PATCH 1/3] Inline vect_get_max_nscalars_per_iter

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 14 Jun 2023, Richard Sandiford wrote: > >> Richard Biener via Gcc-patches writes: >> > The function is only meaningful for LOOP_VINFO_MASKS processing so >> > inline it into the single use. >> > >> > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? >> > >>

Re: [PATCH v2] machine descriptor: New compact syntax for insn and insn_split in Machine Descriptions.

2023-06-14 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: >> + >> + /* Skip any newlines or whitespaces needed. */ >> + while (ISSPACE(*templ)) >> + templ++; >> + continue; >> + } >> + else if (templ[0] == '/' && templ[1] == '*') >> + { >> + templ += 2; >> + /*

Re: [PATCH 1/2] Missed opportunity to use [SU]ABD

2023-06-14 Thread Richard Sandiford via Gcc-patches
Oluwatamilore Adebayo writes: > From: oluade01 > > This adds a recognition pattern for the non-widening > absolute difference (ABD). > > gcc/ChangeLog: > > * doc/md.texi (sabd, uabd): Document them. > * internal-fn.def (ABD): Use new optab. > * optabs.def (sabd_optab, uabd_optab

Re: [PATCH 3/3] AVX512 fully masked vectorization

2023-06-15 Thread Richard Sandiford via Gcc-patches
Andrew Stubbs writes: > One > comment: building a vector constant {0, 1, 2, 3, ..., 63} results in a > very large entry in the constant pool and an unnecessary memory load (it > literally has to use this sequence to generate the addresses to load the > constant!) Generating the sequence via V

Re: [PATCH] VECT: Apply COND_LEN_* into vectorizable_operation

2023-07-12 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > As we discussed before, COND_LEN_* patterns were added for multiple > situations. > This patch applies COND_LEN_* for the following situation: > > Support for the situation that in "vectorizable_operation": > /* If ope

Re: [PATCH V5] RISC-V: Support gather_load/scatter RVV auto-vectorization

2023-07-12 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 12 Jul 2023, juzhe.zh...@rivai.ai wrote: > >> Thanks Richard. >> >> Is it correct that the better way is to add optabs >> (len_strided_load/len_strided_store), >> then expand LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE to >> len_strided_load/len_strided_store op

Re: [pushed][LRA][PR110372]: Refine reload pseudo class

2023-07-12 Thread Richard Sandiford via Gcc-patches
Vladimir Makarov via Gcc-patches writes: > The following patch solves > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110372 > > The patch was successfully bootstrapped and tested on x86-64. > > commit 1f7e5a7b91862b999aab88ee0319052aaf00f0f1 > Author: Vladimir N. Makarov > Date: Fri Jul 7 09:

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-12 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Mon, Jul 10, 2023 at 1:01 PM Uros Bizjak wrote: >> >> On Mon, Jul 10, 2023 at 11:47 AM Richard Biener >> wrote: >> > >> > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote: >> > > >> > > On Mon, Jul 10, 2023 at 11:17 AM Richard Biener >> > > wrote:

Re: [PATCH V2] VECT: Apply COND_LEN_* into vectorizable_operation

2023-07-12 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > As we discussed before, COND_LEN_* patterns were added for multiple > situations. > This patch applies COND_LEN_* for the following situation: > > Support for the situation that in "vectorizable_operation": > /* If ope

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-12 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jul 12, 2023 at 1:05 PM Uros Bizjak wrote: >> >> On Wed, Jul 12, 2023 at 12:58 PM Uros Bizjak wrote: >> > >> > On Wed, Jul 12, 2023 at 12:23 PM Richard Sandiford >> > wrote: >> > > >> > > Richard Biener via Gcc-patches writes: >> > > > On Mon, Jul 10, 2023 at 1

Re: [PATCH V3] VECT: Apply COND_LEN_* into vectorizable_operation

2023-07-12 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > As we discussed before, COND_LEN_* patterns were added for multiple > situations. > This patch applies COND_LEN_* for the following situation: > > Support for the situation that in "vectorizable_operation": > /* If ope

Re: [PATCH] tree-optimization/94864 - vector insert of vector extract simplification

2023-07-12 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The PRs ask for optimizing of > > _1 = BIT_FIELD_REF ; > result_4 = BIT_INSERT_EXPR ; > > to a vector permutation. The following implements this as > match.pd pattern, improving code generation on x86_64. > > On the RTL level we face the issue that backend patterns in

Re: [pushed][LRA][PR110372]: Refine reload pseudo class

2023-07-12 Thread Richard Sandiford via Gcc-patches
Vladimir Makarov writes: > On 7/12/23 06:07, Richard Sandiford wrote: >> Vladimir Makarov via Gcc-patches writes: >>> diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc >>> index 73fbef29912..2f95121df06 100644 >>> --- a/gcc/lra-assigns.cc >>> +++ b/gcc/lra-assigns.cc >>> @@ -1443,10 +1443,11 @

[WIP RFC] Add support for keyword-based attributes

2023-07-14 Thread Richard Sandiford via Gcc-patches
Summary: We'd like to be able to specify some attributes using keywords, rather than the traditional __attribute__ or [[...]] syntax. Would that be OK? In more detail: We'd like to add some new target-specific attributes for Arm SME. These attributes affect semantics and code generation and so t

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-16 Thread Richard Sandiford via Gcc-patches
Thanks for the feedback. Nathan Sidwell writes: > On 7/14/23 11:56, Richard Sandiford wrote: >> Summary: We'd like to be able to specify some attributes using >> keywords, rather than the traditional __attribute__ or [[...]] >> syntax. Would that be OK? >> >> In more detail: >> >> We'd like to

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-16 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Fri, Jul 14, 2023 at 04:56:18PM +0100, Richard Sandiford via Gcc-patches > wrote: >> Summary: We'd like to be able to specify some attributes using >> keywords, rather than the traditional __attribute__ or [[...]] >> syntax. Would that

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-17 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Fri, Jul 14, 2023 at 5:58 PM Richard Sandiford via Gcc-patches > wrote: >> >> Summary: We'd like to be able to specify some attributes using >> keywords, rather than the traditional __attribute__ or [[...]] >> syntax. Wou

Re: [WIP RFC] Add support for keyword-based attributes

2023-07-17 Thread Richard Sandiford via Gcc-patches
Jason Merrill writes: > On Sun, Jul 16, 2023 at 6:50 AM Richard Sandiford > wrote: > >> Jakub Jelinek writes: >> > On Fri, Jul 14, 2023 at 04:56:18PM +0100, Richard Sandiford via >> Gcc-patches wrote: >> >> Summary: We'd like to be able to speci

Re: [PATCH] RTL_SSA: Relax PHI_MODE in phi_setup

2023-07-17 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, Richard. > > The RISC-V port needs to add a bunch of VLS modes (V16QI, V32QI, V64QI, etc.) > They share the same REG_CLASS with VLA modes (VNx16QI, VNx32QI, etc.) > > When I am adding those VLS modes, the RTL_SSA initialization in VSETVL PASS > (inserted after RA) ICE: > rvv.

Re: [PATCH V2] RTL_SSA: Relax PHI_MODE in phi_setup

2023-07-17 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard. > > The RISC-V port needs to add a bunch of VLS modes (V16QI, V32QI, V64QI, etc.) > They share the same REG_CLASS with VLA modes (VNx16QI, VNx32QI, etc.) > > When I am adding those VLS modes, the RTL_SSA initialization in VSETVL PASS >

Re: [PATCH v2 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets

2023-07-17 Thread Richard Sandiford via Gcc-patches
Manolis Tsamis writes: > noce_convert_multiple_sets has been introduced and extended over time to > handle > if conversion for blocks with multiple sets. Currently this is focused on > register moves and rejects any sort of arithmetic operations. > > This series is an extension to allow more sequ

Re: [x86-64] RFC: Add nosse abi attribute

2023-07-17 Thread Richard Sandiford via Gcc-patches
Michael Matz via Gcc-patches writes: > Hello, > > the ELF psABI for x86-64 doesn't have any callee-saved SSE > registers (there were actual reasons for that, but those don't > matter anymore). This starts to hurt some uses, as it means that > as soon as you have a call (say to memmove/memcpy, eve

Re: [PATCH v2 0/2] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets

2023-07-18 Thread Richard Sandiford via Gcc-patches
Manolis Tsamis writes: > On Tue, Jul 18, 2023 at 1:12 AM Richard Sandiford > wrote: >> >> Manolis Tsamis writes: >> > noce_convert_multiple_sets has been introduced and extended over time to >> > handle >> > if conversion for blocks with multiple sets. Currently this is focused on >> > register

Re: [PATCH]AArch64 fix regexp for live_1.c sve test

2023-07-19 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > The resulting predicate register of a whilelo is not > restricted to the lower half of the predicate register file. > > As such these tests started failing after recent changes > because the whilelo outside the loop is getting assigned p15. It's the whilelo i

Re: [GCC 13 PATCH] aarch64: Remove architecture dependencies from intrinsics

2023-07-19 Thread Richard Sandiford via Gcc-patches
Andrew Carlotti writes: > Updated patch to fix the fp16 intrinsic pragmas, and pushed to master. > OK to backport to GCC 13? OK, thanks. Richard > Many intrinsics currently depend on both an architecture version and a > feature, despite the corresponding instructions being available within > GC

Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible

2023-07-20 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 7/19/23 04:25, Richard Biener wrote: >> On Wed, 19 Jul 2023, YunQiang Su wrote: >> >>> Eric Botcazou wrote on Wed, 19 Jul 2023 at 17:45: > I don't see that. That's definitely not what GCC expects here, > the left-most word of the doubleword should be unchan

Re: [PATCH]AArch64 fix regexp for live_1.c sve test

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, 20 Jul 2023, Richard Sandiford wrote: > >> Tamar Christina writes: >> > Hi All, >> > >> > The resulting predicate register of a whilelo is not >> > restricted to the lower half of the predicate register file. >> > >> > As such these tests started failing after rec

Re: [PATCH]AArch64 fix regexp for live_1.c sve test

2023-07-20 Thread Richard Sandiford via Gcc-patches
Jan Hubicka writes: >> Tamar Christina writes: >> > Hi All, >> > >> > The resulting predicate register of a whilelo is not >> > restricted to the lower half of the predicate register file. >> > >> > As such these tests started failing after recent changes >> > because the whilelo outside the loop

Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Thu, 20 Jul 2023, Richard Sandiford wrote: > >> Jeff Law via Gcc-patches writes: >> > On 7/19/23 04:25, Richard Biener wrote: >> >> On Wed, 19 Jul 2023, YunQiang Su wrote: >> >> >> >>> Eric Botcazou wrote on Wed, 19 Jul 2023 at 17:45: >> >> > I don't see that. That's d

Re: [PATCH] testsuite: Add a test case for PR110729

2023-07-20 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > Hi, > > As PR110729 reported, there was one issue for .section > __patchable_function_entries with -ffunction-sections, that > is we put the same symbol as link_to section symbol for all > functions wrongly. The commit r13-4294 for PR99889 has > fixed this with the correspon

Re: [PATCH] sccvn: Correct the index of bias for IFN_LEN_STORE [PR110744]

2023-07-20 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > Hi, > > Commit r14-2267-gb8806f6ffbe72e adjusts the arguments order > of LEN_STORE from {len,vector,bias} to {len,bias,vector}, > in order to make them consistent with LEN_MASK_STORE and > MASK_STORE. But it missed to update the related handlings > in tree-ssa-sccvn.cc, it c

Re: [PATCH] tree-optimization/110742 - fix latent issue with permuting existing vectors

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > When we materialize a layout we push edge permutes to constant/external > defs without checking we can actually do so. For externals defined > by vector stmts rather than scalar components we can't. > > Bootstrapped and tested on x86_64-unknown-linux-gnu.

Re: [PATCH] tree-optimization/110742 - fix latent issue with permuting existing vectors

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> Am 20.07.2023 um 16:09 schrieb Richard Sandiford : >> >> Richard Biener via Gcc-patches writes: >>> When we materialize a layout we push edge permutes to constant/external >>> defs without checking we can actually do so. For externals defined >>> by vector stmts rathe

Re: [PATCH] tree-optimization/110742 - fix latent issue with permuting existing vectors

2023-07-20 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: >> Am 20.07.2023 um 18:59 schrieb Richard Sandiford : >> >> Richard Biener writes: > Am 20.07.2023 um 16:09 schrieb Richard Sandiford > : Richard Biener via Gcc-patches writes: > When we materialize a layout we push edge permutes to constant/exte

Re: [PATCH]AArch64 fix regexp for live_1.c sve test

2023-07-21 Thread Richard Sandiford via Gcc-patches
Jan Hubicka writes: > Avoid scaling flat loop profiles of vectorized loops > > As discussed, when vectorizing a loop with a static profile, it is not always > a good idea to divide the header frequency by the vectorization factor because the profile may > not realistically represent the expected number o

Re: [PATCH] Remove SLP_TREE_VEC_STMTS in favor of SLP_TREE_VEC_DEFS

2023-07-24 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > The following unifies SLP_TREE_VEC_STMTS into SLP_TREE_VEC_DEFS > which can handle all cases we need. > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. Nice! Just curious... > @@ -149,6 +147,20 @@ _slp_tree::~_slp_tree () > free (failed); > } > > +/*

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-24 Thread Richard Sandiford via Gcc-patches
Hao Liu OS writes: > This only affects the new costs in aarch64 backend. Currently, the reduction > latency of vector body is too large as it is multiplied by stmt count. As the > scalar reduction latency is small, the new costs model may think "scalar code > would issue more quickly" and increa

Re: [PATCH 2/2] AARCH64: Turn off unwind tables for crtbeginT.o

2023-07-24 Thread Richard Sandiford via Gcc-patches
Andrew Pinski via Gcc-patches writes: > The problem is that -fasynchronous-unwind-tables is on by default for aarch64. > We need to turn it off for crt*.o because it would make __EH_FRAME_BEGIN__ point > to .eh_frame data from crtbeginT.o instead of the user-defined object > during static linking. Could you

Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*

2023-07-25 Thread Richard Sandiford via Gcc-patches
钟居哲 writes: > Hi, Richi. Thank you so much for review. > >>> This function doesn't seem to care about conditional vectorization >>> support, so why are you changing it? > > I debug and analyze the code here: > > Breakpoint 1, vectorizable_call (vinfo=0x3d358d0, stmt_info=0x3dcc820, > gsi=0x0, vec

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-25 Thread Richard Sandiford via Gcc-patches
Hao Liu OS writes: > Hi, > > Thanks for the suggestion. I tested it and found a gcc_assert failure: > gcc.target/aarch64/sve/cost_model_13.c (internal compiler error: in > info_for_reduction, at tree-vect-loop.cc:5473) > > It is caused by empty STMT_VINFO_REDUC_DEF. When was STMT_VINFO_REDU

Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*

2023-07-25 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Thanks Richard. > > Do you suggest we should add a macro like this first: > > #ifndef DEF_INTERNAL_COND_FN > #define DEF_INTERNAL_COND_FN(NAME, FLAGS, OPTAB, TYPE) \ > DEF_INTERNAL_OPTAB_FN (COND_##NAME, FLAGS, cond_##optab, cond_##TYPE) > DEF_INTERNAL_OPTAB_FN

Re: [PATCH] VECT: Support CALL vectorization for COND_LEN_*

2023-07-25 Thread Richard Sandiford via Gcc-patches
"juzhe.zh...@rivai.ai" writes: > Hi, Richard. >>> I think we should have an internal-fn helper that returns IFN_COND_LEN_* >>> for a given IFN_COND_*. It could handle IFN_MASK_LOAD -> IFN_MASK_LEN_LOAD >>> etc. too. > Could you name this helper function for me? Does it call > "get_conditional_le

Re: [RFC] [v2] Extend fold_vec_perm to handle VLA vectors

2023-07-25 Thread Richard Sandiford via Gcc-patches
Hi, Thanks for the rework and sorry for the slow review. Prathamesh Kulkarni writes: > Hi Richard, > This is reworking of patch to extend fold_vec_perm to handle VLA vectors. > The attached patch unifies handling of VLS and VLA vector_csts, while > using fallback code > for ctors. > > For VLS ve

Re: vectorizer: Avoid an OOB access from vectorization

2023-07-25 Thread Richard Sandiford via Gcc-patches
Was leaving a bit of time in case Richi had any comments, but: Matthew Malcomson writes: > Our checks for whether the vectorization of a given loop would make an > out of bounds access miss the case when the vector we load is so large > as to span multiple iterations worth of data (while only bei

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches > wrote: >> >> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that we're >> > not papering over an issue elsewhere. >> >> Yes, I also wonder if this is an issue in vectorizable_reduction. Below

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jul 26, 2023 at 11:14 AM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches >> > wrote: >> >> >> >> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that >> >> > we're not pape

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-28 Thread Richard Sandiford via Gcc-patches
Sorry for the slow response. Hao Liu OS writes: >> Ah, thanks. In that case, Hao, I think we can avoid the ICE by changing: >> >> if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar) >> && vect_is_reduction (stmt_info)) >> >> to: >> >> if ((kind == scalar_stmt || k

Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]

2023-07-31 Thread Richard Sandiford via Gcc-patches
Hao Liu OS writes: >> Which test case do you see this for? The two tests in the patch still >> seem to report correct latencies for me if I make the change above. > > Not the newly added tests. It is still the existing case causing the > previous ICE (i.e. assertion problem): gcc.target/aarch64

Re: [PATCH] Add POLY_INT_CST support to fold_ctor_reference in gimple-fold.cc

2023-07-31 Thread Richard Sandiford via Gcc-patches
Richard Ball writes: > Add POLY_INT_CST support to code within > fold_ctor_reference. This code previously > only supported INTEGER_CST which caused a > bug when using VEC_PERM_EXPR with SVE vectors. Just to add for others: this is a prerequisite for a follow-on patch, so the change will be teste

Re: [PATCH] internal-fn: Refine macro define of COND_* and COND_LEN_* internal functions

2023-07-31 Thread Richard Sandiford via Gcc-patches
juzhe.zh...@rivai.ai writes: > From: Ju-Zhe Zhong > > Hi, Richard and Richi. > > Based on previous discussions, we should make COND_* and COND_LEN_* > consistent. > > So, this patch defines these internal functions together by these 2 > wrappers: > > #ifndef DEF_INTERNAL_COND_FN > #define DEF_INTERN

Re: [PATCH V2] VECT: Support CALL vectorization for COND_LEN_*

2023-07-31 Thread Richard Sandiford via Gcc-patches
Juzhe-Zhong writes: > Hi, Richard and Richi. > > Based on the suggestions from Richard: > https://gcc.gnu.org/pipermail/gcc-patches/2023-July/625396.html > > This patch chooses approach (1) that Richard provided, meaning: > > RVV implements cond_* optabs as expanders. RVV therefore supports > both

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-01 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > The following makes sure to limit the shift operand when vectorizing > (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift > operand otherwise invokes undefined behavior. When we determine > whether we can demote the operand we know we at m
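
As a rough illustration of the clamping described in the snippet above (a sketch only, not the code from the patch): once the value is narrowed to a 16-bit lane, a shift count of 31 is out of range, so the count has to be limited to 15. The helper names `shift_wide` and `shift_narrow` are hypothetical, and the sketch assumes that `>>` on a negative signed value is an arithmetic shift, as it is with GCC.

```c
#include <stdint.h>

/* Reference form: widen to 32 bits, shift, then narrow again.
   Valid for shift counts 0..31.  */
static int16_t
shift_wide (int16_t x, unsigned int n)
{
  return (int16_t) ((int32_t) x >> n);
}

/* Narrowed form, modelling a single 16-bit vector lane.  A count above
   15 would be out of bounds for the lane, so it is clamped to 15.  For
   an arithmetic shift the result is unchanged, because every count of
   15 or more leaves only copies of the sign bit.  (Scalar C promotes x
   to int here, so the clamp is what stands in for the lane limit.)  */
static int16_t
shift_narrow (int16_t x, unsigned int n)
{
  unsigned int max_count = 15;
  unsigned int m = n > max_count ? max_count : n;
  return (int16_t) (x >> m);
}
```

For any 16-bit `x` and any count from 0 to 31, the two helpers are expected to agree, which is why the narrowed form is a safe replacement once the count is limited.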

Re: [PATCH] tree-optimization/110838 - vectorization of widened shifts

2023-08-01 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes: > Richard Biener via Gcc-patches writes: >> The following makes sure to limit the shift operand when vectorizing >> (short)((int)x >> 31) via (short)x >> 31 as the out of bounds shift >> operand otherwise invokes undefined behavior. When we determine >> whether we can d

Re: [PATCH 2/5] [RISC-V] Generate Zicond instruction for basic semantics

2023-08-01 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 7/19/23 04:11, Xiao Zeng wrote: >> This patch completes the recognition of the basic semantics >> defined in the spec, namely: >> >> Conditional zero, if condition is equal to zero >>rd = (rs2 == 0) ? 0 : rs1 >> Conditional zero, if condition is non zero
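
The semantics quoted in the snippet above can be modelled in plain C. This is only a sketch: `czero_eqz` and `czero_nez` are hypothetical helper names standing in for the two Zicond instructions (the second definition is cut off in the snippet and is filled in here from the Zicond specification), and `select_via_czero` shows one common way of composing a general conditional select from them, which is not necessarily what the patch generates.

```c
#include <stdint.h>

/* Model of czero.eqz: rd = (rs2 == 0) ? 0 : rs1.  */
static uint64_t
czero_eqz (uint64_t rs1, uint64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* Model of czero.nez: rd = (rs2 != 0) ? 0 : rs1.  */
static uint64_t
czero_nez (uint64_t rs1, uint64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* cond ? a : b, built from the two conditional-zero operations.
   Exactly one of the two terms is zeroed, so OR-ing them yields
   the selected value.  */
static uint64_t
select_via_czero (uint64_t cond, uint64_t a, uint64_t b)
{
  return czero_eqz (a, cond) | czero_nez (b, cond);
}
```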

Re: [PATCH 2/5] [RISC-V] Generate Zicond instruction for basic semantics

2023-08-02 Thread Richard Sandiford via Gcc-patches
Jeff Law via Gcc-patches writes: > On 8/1/23 05:18, Richard Sandiford wrote: >> >> Where were you seeing the requirement for pointer equality? genrecog.cc >> at least uses rtx_equal_p, and I think it has to. E.g. some patterns >> use (match_dup ...) to match output and input mems, and mem rtxes
