Re: [PATCH 4/5] aarch64: Refactor aarch64_qshrn_n RTL pattern

2021-05-19 Thread Richard Sandiford via Gcc-patches
Jonathan Wright writes: > Hi, > > As subject, this patch splits the aarch64_qshrn_n > pattern into separate scalar and vector variants. It further splits the vector > pattern into big/little endian variants that model the zero-high-half > semantics of the underlying instruction - allowing for more

Re: [PATCH 5/5] testsuite: aarch64: Add tests for high-half narrowing instructions

2021-05-19 Thread Richard Sandiford via Gcc-patches
Jonathan Wright writes: > Hi, > > As subject, this patch adds tests to confirm that a *2 (write to high-half) > Neon instruction is generated from vcombine* of a narrowing intrinsic > sequence. > > Ok for master? OK, thanks. Richard > Thanks, > Jonathan > > --- > > gcc/testsuite/ChangeLog: > >

Re: [PATCH] aarch64: Use an expander for quad-word vec_pack_trunc pattern

2021-05-19 Thread Richard Sandiford via Gcc-patches
Jonathan Wright writes: > Hi, > > The existing vec_pack_trunc RTL pattern emits an opaque two- > instruction assembly code sequence that prevents proper instruction > scheduling. This commit changes the pattern to an expander that emits > individual xtn and xtn2 instructions. > > This commit also

Re: [PATCH] aarch64: Use correct type attributes for RTL generating XTN(2)

2021-05-19 Thread Richard Sandiford via Gcc-patches
Jonathan Wright writes: > Hi, > > As subject, this patch corrects the type attribute in RTL patterns that > generate XTN/XTN2 instructions to be "neon_move_narrow_q". > > This makes a material difference because these instructions can be > executed on both SIMD pipes in the Cortex-A57 core model,

Re: [PATCH] aarch64: Add attributes for builtins specified in aarch64-builtins.c

2021-05-21 Thread Richard Sandiford via Gcc-patches
Kyrylo Tkachov writes: > Hi all, > > Besides the builtins in aarch64-simd-builtins.def there are a number of > builtins defined in aarch64-builtins.c itself. > They could also benefit from the attributes generated by > aarch64_get_attributes. > However aarch64_get_attributes and its helpers are

Re: [PATCH][vect] Use main loop's thresholds and vectorization factor to narrow upper_bound of epilogue

2021-05-24 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > Hi, > > When vectorizing with --param vect-partial-vector-usage=1 the vectorizer > uses an unpredicated (all-true predicate for SVE) main loop and a > predicated tail loop. The way this was implemented seems to mean it > re-uses the same vector-mode for both loo

Re: RFA: fix gcc.dg/tree-ssa/popcount4l.c 16 bit failure, improve 64 bit popcount expansion for 32 bit target

2021-05-26 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Mon, May 17, 2021 at 3:18 PM Joern Wolfgang Rennecke > wrote: >> >> Attached is the updated version of the patch. >> Bootstrapped and regtested on x86_64-pc-linux-gnu. >> >> OK to apply? > > + machine_mode m = mode_for_size ((prec + 1) / 2, MODE_INT, 1).require

Re: [PATCH 2/4]AArch64: Add support for sign differing dot-product usdot for NEON and SVE.

2021-05-26 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Monday, May 10, 2021 5:49 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; Kyrylo Tkachov >> Subject: Re: [PATCH 2/4]AArch64: Add support for sign

Re: [PATCH]AArch64: Correct dot-product auto-vect optab RTL

2021-05-26 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > The current RTL for the vectorizer patterns for dot-product is incorrect. > Operand3 isn't an output parameter so we can't write to it. > > This fixes this issue and reduces the number of RTL. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues

Re: [PATCH v2] AArch64: Improve GOT addressing

2021-05-26 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > @@ -23746,6 +23767,24 @@ aarch_macro_fusion_pair_p (rtx_insn *prev, rtx_insn > *curr) > } > } > > + /* Always treat GOT accesses as a pair to ensure they can be easily > + identified and optimized in linkers. */ Sorry, I guess I'd not given enough wei

Re: [PATCH] arm: Auto-vectorization for MVE: vaddv

2021-05-26 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > This patch adds support for the reduc_plus_scal optab with MVE, which > maps to the vaddv instruction. > > It moves the reduc_plus_scal_ expander from neon.md to > vec-common.md and adds support for MVE to it. > > Since vaddv uses a 32-bit accumulator, we have to truncat

Re: [PATCH] [i386] Fix _mm256_zeroupper to notify LRA that vzeroupper will kill sse registers. [PR target/82735]

2021-05-27 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek writes: > On Thu, May 27, 2021 at 01:07:09PM +0800, Hongtao Liu via Gcc-patches wrote: >> + /* Flag used for call_insn indicates it's a fake call. */ >> + RTX_FLAG (insn, used) = 1; > >> + /* CALL_INSN use "used" flag to indicate it's a fake call. */ >> + if (i == STACK

Re: [PATCH 0/11] warning control by group and location (PR 74765)

2021-05-27 Thread Richard Sandiford via Gcc-patches
Thanks for doing this. Martin Sebor via Gcc-patches writes: > […] > On 5/24/21 5:08 PM, David Malcolm wrote: >> On Mon, 2021-05-24 at 16:02 -0600, Martin Sebor wrote: >>> Subsequent patches then replace invocations of the TREE_NO_WARNING() >>> macro and the gimple_no_warning_p() and gimple_set_no

Re: RFA: Fix match_scratch bug in define_subst

2021-05-27 Thread Richard Sandiford via Gcc-patches
Joern Rennecke writes: > Bootstrapped on x86_64-pc-linux-gnu. > > 2020-12-10 Joern Rennecke > > Fix bug in the define_subst handling that made match_scratch unusable for > multi-alternative patterns. OK, and sorry for the slow response. The changelog won't pass, but I'll leave you to

Re: none

2021-05-27 Thread Richard Sandiford via Gcc-patches
Joern Rennecke writes: > At the moment, for a match_dup in a define_cond_exec, you'd have to > give the number in the > resulting pattern(s) rather than in the substitute pattern. That's > not only wrong, but can also > be impossible when the pattern should apply to multiple patterns with > diffe

Re: [PATCH v2] forwprop: Support vec perm fed by CTOR and CTOR/CST [PR99398]

2021-05-27 Thread Richard Sandiford via Gcc-patches
Sorry for the slow response. "Kewen.Lin" writes: > diff --git a/gcc/vec-perm-indices.c b/gcc/vec-perm-indices.c > index ede590dc5c9..57dd11d723c 100644 > --- a/gcc/vec-perm-indices.c > +++ b/gcc/vec-perm-indices.c > @@ -101,6 +101,70 @@ vec_perm_indices::new_expanded_vector (const > vec_perm_indi

Re: [PATCH v2] Add vec_const_duplicate optab and TARGET_GEN_MEMSET_SCRATCH_RTX

2021-05-31 Thread Richard Sandiford via Gcc-patches
"H.J. Lu via Gcc-patches" writes: > On Mon, May 31, 2021 at 06:32:04AM -0700, H.J. Lu wrote: >> On Mon, May 31, 2021 at 6:26 AM Richard Biener >> wrote: >> > >> > On Mon, May 31, 2021 at 3:12 PM H.J. Lu wrote: >> > > >> > > On Mon, May 31, 2021 at 5:46 AM Richard Biener >> > > wrote: >> > > > >

Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-02 Thread Richard Sandiford via Gcc-patches
Kewen Lin writes: > Hi all, > > define_insn_and_split should avoid using an empty split condition > if the condition for define_insn isn't empty, otherwise it can > sometimes result in unexpected consequences, since the split > will always be done even if the insn condition doesn't hold. > > To avoid

Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-02 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > Hi Richard, > > on 2021/6/2 锟斤拷锟斤拷4:11, Richard Sandiford wrote: >> Kewen Lin writes: >>> Hi all, >>> >>> define_insn_and_split should avoid to use empty split condition >>> if the condition for define_insn isn't empty, otherwise it can >>> sometimes result in unexpected con

Re: [PATCH] rtl: constm64_rtx..const64_rtx

2021-06-02 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool writes: > Since times immemorial there has been const_int_rtx for all values from > -64 to 64, but only constm1_rtx..const2_rtx have been available for > convenient use. Change this, so that we can use all values in > {-64,...,64} in RTL easily. This matters, because then we w
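As a rough illustration of the idea in this snippet (a shared table of small integer constants, with named shorthands for a few of them), here is a minimal C sketch. It is a toy model, not GCC's rtl.h: the struct, the table, the bound macro and the shorthand names are all hypothetical.

```c
#include <assert.h>

/* Hypothetical bound: constants in [-64, 64] are shared. */
#define SAVED_BOUND 64

/* Toy stand-in for an rtx holding an integer constant. */
struct const_int { long value; };

static struct const_int const_table[2 * SAVED_BOUND + 1];
static int table_ready;

/* Fill the table so that slot i holds the value i - SAVED_BOUND. */
static void init_table(void)
{
    for (int i = 0; i <= 2 * SAVED_BOUND; i++)
        const_table[i].value = i - SAVED_BOUND;
    table_ready = 1;
}

/* Return the shared object for V, which must lie in [-64, 64]. */
const struct const_int *saved_const_int(long v)
{
    assert(v >= -SAVED_BOUND && v <= SAVED_BOUND);
    if (!table_ready)
        init_table();
    return &const_table[v + SAVED_BOUND];
}

/* Named shorthands, in the spirit of constm1_rtx .. const2_rtx. */
#define CONSTM1 (saved_const_int(-1))
#define CONST0  (saved_const_int(0))
#define CONST2  (saved_const_int(2))
```

Because every request for the same value returns the same object, callers in this model can compare constants by pointer; offering a shorthand for every table entry only changes how convenient the lookup is to write.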

Re: [PATCH] predcom: Enabled by loop vect at O2 [PR100794]

2021-06-02 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin via Gcc-patches" writes: > Hi, > > As PR100794 shows, in the current implementation PRE bypasses > some optimization to avoid introducing loop carried dependence > which stops loop vectorizer to vectorize the loop. At -O2, > there is no downstream pass to re-catch this kind of opportun

Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-02 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, Jun 2, 2021 at 12:01 PM Kewen.Lin wrote: >> >> on 2021/6/2 5:13 PM, Richard Sandiford wrote: >> > "Kewen.Lin" writes: >> >> Hi Richard, >> >> >> >> on 2021/6/2 4:11, Richard Sandiford wrote: >> >>> Kewen Lin writes: >> Hi all, >> >> define_in

Re: [PATCH] arm: Auto-vectorization for MVE and Neon: vhadd/vrhadd

2021-06-02 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > This patch adds support for auto-vectorization of average value > computation using vhadd or vrhadd, for both MVE and Neon. > > The patch adds the needed [u]avg3_[floor|ceil] patterns to > vec-common.md, I'm not sure how to factorize them without introducing > an unspec i

Re: [PATCH] arm: Auto-vectorization for MVE: vabs

2021-06-02 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > This patch adds support for auto-vectorization of absolute value > computation using vabs. > > We use a similar pattern to what is used in neon.md and extend the > existing neg2 expander to match both 'neg' and 'abs'. This > implies renaming the existing abs2 define_insn

Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-03 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin" writes: > Hi Richi/Richard/Jeff/Segher, > > Thanks for the comments! > > on 2021/6/3 锟斤拷锟斤拷7:52, Segher Boessenkool wrote: >> On Wed, Jun 02, 2021 at 06:32:13PM +0100, Richard Sandiford wrote: >>> Richard Biener writes: So what Richard suggests would be to disallow split conditio

Re: [RFC/PATCH 00/11] Fix up some unexpected empty split conditions

2021-06-03 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool writes: > On Thu, Jun 03, 2021 at 09:05:02AM +0100, Richard Sandiford via Gcc-patches > wrote: >> Right. Plus it creates less make-work. If we didn't have it, someone >> would need to split the define_insn_and_splits that don't currently >>

Re: [PATCH] rtl: constm64_rtx..const64_rtx

2021-06-03 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool writes: > On Wed, Jun 02, 2021 at 06:07:28PM +0100, Richard Sandiford wrote: >> Segher Boessenkool writes: >> > Since times immemorial there has been const_int_rtx for all values from >> > -64 to 64, but only constm1_rtx..const2_rtx have been available for >> > convenient use.

Re: [PATCH v2 1/2] Allow vec_duplicate_optab to fail

2021-06-07 Thread Richard Sandiford via Gcc-patches
"H.J. Lu" writes: > Update vec_duplicate to allow to fail so that backend can only allow > broadcasting an integer constant to a vector when broadcast instruction > is available. I'm not sure why we need this to fail though. Once the optab is defined for target X, the optab should handle all dup

Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-06-07 Thread Richard Sandiford via Gcc-patches
Sorry for the slow response. Tamar Christina writes: > […] > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c > index > 441d6cd28c4eaded7abd756164890dbcffd2f3b8..82123b96313e6783ea214b9259805d65c07d8858 > 100644 > --- a/gcc/tree-vect-patterns.c > +++ b/gcc/tree-vect-patterns.c >

Re: [PATCH 02/57] Support scanning of build-time GC roots in gengtype

2021-06-07 Thread Richard Sandiford via Gcc-patches
Bill Schmidt via Gcc-patches writes: > On 5/20/21 5:24 PM, Segher Boessenkool wrote: >> On Tue, May 11, 2021 at 11:01:22AM -0500, Bill Schmidt wrote: >>> Hi! I'd like to ping this specific patch from the series, which is the >>> only one remaining that affects common code. I confess that I don't

Re: [PATCH] arm: Auto-vectorization for MVE and Neon: vhadd/vrhadd

2021-06-08 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > On Wed, 2 Jun 2021 at 20:19, Richard Sandiford > wrote: >> >> Christophe Lyon writes: >> > This patch adds support for auto-vectorization of average value >> > computation using vhadd or vrhadd, for both MVE and Neon. >> > >> > The patch adds the needed [u]avg3_[floor|c

Re: [PATCH 1/2] arm: Auto-vectorization for MVE: vclz

2021-06-08 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > This patch adds support for auto-vectorization of clz for MVE. > > It does so by removing the unspec from mve_vclzq_ and uses > 'clz' instead. It moves the neon_vclz expander from neon.md to > vec-common.md and renames it into the standard name clz2. > > 2021-06-03 Christ

Re: [PATCH 2/2] arm: Auto-vectorization for MVE: add pack/unpack patterns

2021-06-08 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > This patch adds vec_unpack_hi_, vec_unpack_lo_, > vec_pack_trunc_ patterns for MVE. > > It does so by moving the unpack patterns from neon.md to > vec-common.md, while adding them support for MVE. The pack expander is > derived from the Neon one (which in turn is renamed

Re: [PATCH] rtl: Join the insn and split conditions in define_insn_and_split

2021-06-08 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool writes: > On Tue, Jun 08, 2021 at 02:48:11PM +0200, Richard Biener wrote: >> > So yeah, patch withdrawn. This on one hand is proof we do want to make >> > such a change, but on the other hand shows it needs more preparatory >> > steps. >> >> I wonder if it makes sense to provi

Re: [PATCH] rtl: Join the insn and split conditions in define_insn_and_split

2021-06-08 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool writes: > On Tue, Jun 08, 2021 at 04:50:56PM +0100, Richard Sandiford wrote: >> Segher Boessenkool writes: >> > On Tue, Jun 08, 2021 at 02:48:11PM +0200, Richard Biener wrote: >> >> > So yeah, patch withdrawn. This on one hand is proof we do want to make >> >> > such a change,

Re: [PATCH 1/2] arm: Fix vcond_mask expander for MVE (PR target/100757)

2021-06-09 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > The problem in this PR is that we call VPSEL with a mask of vector > type instead of HImode. This happens because operand 3 in vcond_mask > is the pre-computed vector comparison and has vector type. The fix is > to transfer this value to VPR.P0 by comparing operand 3 with

Re: [PATCH] arm: Auto-vectorization for MVE and Neon: vhadd/vrhadd

2021-06-09 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > This patch adds support for auto-vectorization of average value > computation using vhadd or vrhadd, for both MVE and Neon. > > The patch adds the needed [u]avg3_[floor|ceil] patterns to > vec-common.md, I'm not sure how to factorize them without introducing > an unspec i

Re: [PATCH 1/2] arm: Auto-vectorization for MVE: vclz

2021-06-09 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > On Tue, 8 Jun 2021 at 13:58, Richard Sandiford > wrote: >> >> Christophe Lyon writes: >> > This patch adds support for auto-vectorization of clz for MVE. >> > >> > It does so by removing the unspec from mve_vclzq_ and uses >> > 'clz' instead. It moves to neon_vclz expan

Re: [PATCH 2/2] arm: Auto-vectorization for MVE: add pack/unpack patterns

2021-06-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > On Tue, 8 Jun 2021 at 14:10, Richard Sandiford > wrote: >> >> Christophe Lyon writes: >> > This patch adds vec_unpack_hi_, vec_unpack_lo_, >> > vec_pack_trunc_ patterns for MVE. >> > >> > It does so by moving the unpack patterns from neon.md to >> > vec-common.md, while

Re: [PATCH 3/4] remove %K from error() calls in the aarch64/arm back ends (PR 98512)

2021-06-11 Thread Richard Sandiford via Gcc-patches
Martin Sebor via Gcc-patches writes: > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index 7b37e1b602c..7cdc824730c 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -13242,13 +13242,8 @@ bounds_check (rtx operand, HOST_WIDE_INT low, > HOST_WIDE_INT high, >lan

Re: [PATCH 2/2] arm: Auto-vectorization for MVE: add pack/unpack patterns

2021-06-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > In the meantime, I tried to make some progress, and looked at how > things work on aarch64. > > This led me to define (in mve.md): > > (define_insn "@mve_vec_pack_trunc_lo_" > [(set (match_operand: 0 "register_operand" "=w") >(truncate: (match_operand:MVE_5 1 "re

Re: [PATCH 2/2] arm: Auto-vectorization for MVE: add pack/unpack patterns

2021-06-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon writes: > Thanks for the feedback. How about v2 attached? > Do you want me to merge neon_vec_unpack and > mve_vec_unpack and only have different assembly? > if (TARGET_HAVE_MVE) > return "vmovlb.%# %q0, %q1"; > else > return "vmovlb.%# %q0, %q1"; I think it'd be better to kee

Re: [PING^2][PATCH] libgcc, emutls: Allow building weak definitions of the emutls functions.

2021-10-04 Thread Richard Sandiford via Gcc-patches
Iain Sandoe writes: > Hi, > > So let’s ignore the questions for now - OK for the non-Darwin parts of the > patch ? Looks OK to me. Thanks, Richard > >> On 24 Sep 2021, at 17:57, Iain Sandoe wrote: >> > >> as noted below the non-Darwin parts of this are trivial (and a no-OP). >> I’d like to ap

Re: [PATCH 1v2/3][vect] Add main vectorized loop unrolling

2021-10-04 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Thu, 30 Sep 2021, Andre Vieira (lists) wrote: > >> Hi, >> >> >> >> That just forces trying the vector modes we've tried before. Though I >> >> might >> >> need to revisit this now I think about it. I'm afraid it might be possible >> >> for >> >> this

Re: [PATCH] middle-end/102587 - avoid auto-init for VLA vectors

2021-10-05 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > On Mon, 4 Oct 2021, Qing Zhao wrote: > >> >> >> > On Oct 4, 2021, at 12:19 PM, Richard Biener wrote: >> > >> > On October 4, 2021 7:00:10 PM GMT+02:00, Qing Zhao >> > wrote: >> >> I have several questions on this fix: >> >> >> >> 1. This fix avoided

[PATCH] loop: Fix profile updates after unrolling [PR102385]

2021-10-05 Thread Richard Sandiford via Gcc-patches
In g:62acc72a957b5614 I'd stopped the unroller from using an epilogue loop in cases where the iteration count was known to be a multiple of the unroll factor. The epilogue and non-epilogue cases still shared this (preexisting) code to update the edge frequencies: basic_block exit_bb = single_pr

Re: [PATCH #2] Introduce smul_highpart and umul_highpart RTX for high-part multiplications

2021-10-06 Thread Richard Sandiford via Gcc-patches
"Roger Sayle" writes: > Hi Richard, > > All excellent suggestions. The revised patch below implements all of > your (and Andreas') recommendations. I'm happy to restrict GCC's support > for saturating arithmetic to integer types, even though I do know of one > target (nvptx) that supports satura

Re: [PATCH 1/7]AArch64 Add combine patterns for right shift and narrow

2021-10-06 Thread Richard Sandiford via Gcc-patches
(Nice optimisations!) Kyrylo Tkachov writes: > Hi Tamar, > >> -Original Message- >> From: Tamar Christina >> Sent: Wednesday, September 29, 2021 5:19 PM >> To: gcc-patches@gcc.gnu.org >> Cc: nd ; Richard Earnshaw ; >> Marcus Shawcroft ; Kyrylo Tkachov >> ; Richard Sandiford >> >> Subjec

Re: [SVE] [gimple-isel] PR93183 - SVE does not use neg as conditional

2021-10-08 Thread Richard Sandiford via Gcc-patches
Thanks for looking at this. Prathamesh Kulkarni writes: > Hi, > As mentioned in PR, for the following test-case: > > typedef unsigned char uint8_t; > > static inline uint8_t > x264_clip_uint8(uint8_t x) > { > uint8_t t = -x; > uint8_t t1 = x & ~63; > return (t1 != 0) ? t : x; > } > > void >
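For readers who want to see what the quoted helper computes, here is a small self-contained C sketch. The first function repeats the logic from the test case; the second is an assumed, equivalent "conditional negate" phrasing in which an explicit predicate selects between -x and x. The function names and the check routine are hypothetical and are not part of the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Same logic as the quoted helper: return -x (modulo 256) when any bit
   above the low six is set, otherwise return x unchanged. */
uint8_t clip_uint8(uint8_t x)
{
    uint8_t t = -x;
    uint8_t t1 = x & ~63;
    return (t1 != 0) ? t : x;
}

/* Assumed equivalent form: an explicit predicate guards the negation. */
uint8_t cond_neg_uint8(uint8_t x)
{
    int pred = x > 63;               /* true exactly when x & ~63 is nonzero */
    return pred ? (uint8_t) -x : x;
}

/* Compare the two forms over every possible input. */
void check_all_inputs(void)
{
    for (int v = 0; v < 256; v++)
        assert(clip_uint8((uint8_t) v) == cond_neg_uint8((uint8_t) v));
}
```

The second form makes the structure of the select visible: the condition acts as a predicate on a single negation, rather than choosing between two separately computed values.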

Re: [PATCH]AArch64[RFC] Force complicated constant to memory when beneficial

2021-10-08 Thread Richard Sandiford via Gcc-patches
Catching up on backlog, sorry for the very late response: Tamar Christina writes: > Hi All, > > Consider the following case > > #include > > uint64_t > test4 (uint8x16_t input) > { > uint8x16_t bool_input = vshrq_n_u8(input, 7); > poly64x2_t mask = vdupq_n_p64(0x0102040810204080UL); >

Re: [PATCH]AArch64 Make use of FADDP in simple reductions.

2021-10-08 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > This is a respin of an older patch which never got upstream reviewed by a > maintainer. It's been updated to fit the current GCC codegen. > > This patch adds a pattern to support the (F)ADDP (scalar) instruction. > > Before the patch, the C code > > typedef f

Re: [PATCH 05/13] arm: Add support for VPR_REG in arm_class_likely_spilled_p

2021-10-11 Thread Richard Sandiford via Gcc-patches
Sorry for the very long delay in reviewing this. Things have been a bit hectic recently. Christophe Lyon via Gcc-patches writes: > VPR_REG is the only register in its class, so it should be handled by > TARGET_CLASS_LIKELY_SPILLED_P. No test fails without this patch, but > it seems it should be

Re: [PATCH 06/13] arm: Fix mve_vmvnq_n_ argument mode

2021-10-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > The vmvnq_n* intrinsics have [u]int[16|32]_t arguments, so use > iterator instead of HI in mve_vmvnq_n_. > > 2021-09-03 Christophe Lyon > > gcc/ > * config/arm/mve.md (mve_vmvnq_n_): Use V_elem mode > for operand 1. > > diff --git

Re: [PATCH 07/13] arm: Implement MVE predicates as vectors of booleans

2021-10-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > This patch implements support for vectors of booleans to support MVE > predicates, instead of HImode. Since the ABI mandates pred16_t (aka > uint16_t) to represent predicates in intrinsics prototypes, we > introduce a new "predicate" type qualifier so tha

Re: [PATCH 08/13] arm: Implement auto-vectorized MVE comparisons with vectors of boolean predicates

2021-10-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > We make use of qualifier_predicate to describe MVE builtins > prototypes, restricting to auto-vectorizable vcmp* and vpsel builtins, > as they are exercised by the tests added earlier in the series. > > Special handling is needed for mve_vpselq because it

Re: [PATCH 09/13] arm: Fix vcond_mask expander for MVE (PR target/100757)

2021-10-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > From: Christophe Lyon > > The problem in this PR is that we call VPSEL with a mask of vector > type instead of HImode. This happens because operand 3 in vcond_mask > is the pre-computed vector comparison and has vector type. > > This patch fixes it by imp

Re: [PATCH 10/13] arm: Convert remaining MVE vcmp builtins to predicate qualifiers

2021-10-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > This is mostly a mechanical change, only tested by the intrinsics > expansion tests. > > 2021-09-02 Christophe Lyon > > gcc/ > PR target/100757 > PR target/101325 > * config/arm/arm-builtins.c (BINOP_UNONE_NONE_NONE_QUALIFIERS):

Re: [PATCH 12/13] arm: Convert more load/store MVE builtins to predicate qualifiers

2021-10-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > This patch covers a few builtins where we do not use the > iterator and thus we cannot use . > > However this introduces a problem for the v2di instructions, because > there is no predicate for this case. For instance, changing > STRSBS_P_QUALIFIERS bre

Re: [PATCH 13/13] arm: Convert more MVE/CDE builtins to predicate qualifiers

2021-10-11 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > This patch covers a few non-load/store builtins where we do not use > the iterator and thus we cannot use . > > We need to update the expected code in cde-mve-full-assembly.c because > we now use mve_movv16qi instead of movhi to generate the vmsr > instru

Re: [PATCH 12/13] arm: Convert more load/store MVE builtins to predicate qualifiers

2021-10-11 Thread Richard Sandiford via Gcc-patches
Richard Sandiford via Gcc-patches writes: > Christophe Lyon via Gcc-patches writes: >> This patch covers a few builtins where we do not use the >> iterator and thus we cannot use . >> >> However this introduces a problem for the v2di instructions, because >> the

Re: [SVE] [gimple-isel] PR93183 - SVE does not use neg as conditional

2021-10-11 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Fri, 8 Oct 2021 at 21:19, Richard Sandiford > wrote: >> >> Thanks for looking at this. >> >> Prathamesh Kulkarni writes: >> > Hi, >> > As mentioned in PR, for the following test-case: >> > >> > typedef unsigned char uint8_t; >> > >> > static inline uint8_t >> > x

Re: *PING* [PATCH] doc: improve -fsanitize=undefined description

2021-10-11 Thread Richard Sandiford via Gcc-patches
Diane Meirowitz via Gcc-patches writes: > Please review my patch. It is tiny. Thank you. Thanks for the patch and sorry for the very slow response. I've now pushed this to master and all active branches. Thanks, Richard > Diane > > On 9/15/21, 5:02 PM, "Diane Meirowitz" wrote: > > > d

Re: [PATCH 1/5]AArch64 sve: combine inverted masks into NOTs

2021-10-11 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi, > > Sending a new version of the patch because I noticed the pattern was > overriding the nor pattern. > > A second pattern is needed to capture the nor case as combine will match the > longest sequence first. So without this pattern we end up de-optimizing nor > an

Re: [PATCH 2/5]AArch64 sve: combine nested if predicates

2021-10-11 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> > Note: This patch series is working incrementally towards generating the >> most >> > efficient code for this and other loops in small steps. >> >> It looks like this could be done in the vectoriser via an extension of the >> scalar_cond_masked_set mechanism. We

Re: [SVE] [gimple-isel] PR93183 - SVE does not use neg as conditional

2021-10-13 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Mon, 11 Oct 2021 at 20:42, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> > On Fri, 8 Oct 2021 at 21:19, Richard Sandiford >> > wrote: >> >> >> >> Thanks for looking at this. >> >> >> >> Prathamesh Kulkarni writes: >> >> > Hi, >> >> > As mentio

Re: [PATCH] tree-optimization/102659 - avoid undefined overflow after if-conversion

2021-10-13 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches writes: > The following makes sure to rewrite arithmetic with undefined behavior > on overflow to a well-defined variant when moving them to be always > executed as part of doing if-conversion for loop vectorization. > > Bootstrapped and tested on x86_64-unknown-linu
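To make the issue in this snippet concrete, here is a minimal C sketch of what if-conversion does to a guarded signed addition. It is a hand-written model under stated assumptions, not the GIMPLE the patch produces, and the function names are hypothetical. Once the addition is executed unconditionally, it can overflow for inputs where the original code never evaluated it, so the sketch performs it in unsigned arithmetic, which wraps instead of being undefined.

```c
#include <assert.h>

/* Original shape: the signed add only runs when cond is true. */
int guarded_add(int cond, int a, int b, int fallback)
{
    if (cond)
        return a + b;
    return fallback;
}

/* If-converted shape: the add always runs, so it is done in unsigned
   arithmetic (well defined on overflow) and the result is selected
   afterwards.  Converting an out-of-range unsigned value back to int is
   implementation-defined in C; GCC wraps modulo 2^N. */
int ifconverted_add(int cond, int a, int b, int fallback)
{
    unsigned int sum = (unsigned int) a + (unsigned int) b;
    return cond ? (int) sum : fallback;
}

/* The two shapes agree whenever the guarded form is well defined. */
void check_same(int cond, int a, int b, int fallback)
{
    assert(guarded_add(cond, a, b, fallback)
           == ifconverted_add(cond, a, b, fallback));
}
```

With cond equal to 0 and operands whose sum does not fit in int, guarded_add never performs the addition, while a naive if-converted version using a plain signed add would invoke undefined behaviour; the unsigned variant avoids that.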

Re: [PATCH] tree-optimization/102659 - avoid undefined overflow after if-conversion

2021-10-13 Thread Richard Sandiford via Gcc-patches
Richard Biener writes: > On Wed, 13 Oct 2021, Richard Sandiford wrote: > >> Richard Biener via Gcc-patches writes: >> > The following makes sure to rewrite arithmetic with undefined behavior >> > on overflow to a well-defined variant when moving them to be always >> > executed as part of doing if

Re: [PATCH] Add debug helpers for auto_bitmap.

2021-10-14 Thread Richard Sandiford via Gcc-patches
Aldy Hernandez via Gcc-patches writes: > Using debug() on an auto_bitmap from gdb doesn't work because the > implicit conversion from auto_bitmap to bitmap_head doesn't work > from within a debugging session. This patch adds the convenience > functions for auto_bitmap. > > OK? OK, thanks. Richa

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-10-14 Thread Richard Sandiford via Gcc-patches
Hi Robin, Thanks for the update and sorry for the late response. Robin Dapp writes: > Hi Richard, > >> Don't we still need this code (without the REG_DEAD handling) for the >> case in which… >> >>> + /* As we are transforming >>> +if (x > y) >>> + { >>> +a = b; >>> +

Re: [PATCH] combine: Check for paradoxical subreg

2021-10-14 Thread Richard Sandiford via Gcc-patches
Sorry for the slow review. Robin Dapp via Gcc-patches writes: > Hi, > > while evaluating another patch that introduces more lvalue paradoxical > subregs I ran into an ICE in combine at > > wide_int o = wi::insert (rtx_mode_t (outer, temp_mode), > rtx_mode

Re: [PATCH 3/5]AArch64 sve: do not keep negated mask and inverse mask live at the same time

2021-10-14 Thread Richard Sandiford via Gcc-patches
Sorry for the slow reply. Tamar Christina writes: > Hi All, > > The following example: > > void f11(double * restrict z, double * restrict w, double * restrict x, >double * restrict y, int n) > { > for (int i = 0; i < n; i++) { > z[i] = (w[i] > 0) ? w[i] : y[i]; > } > } >

Re: [PATCH 4/5]AArch64 sve: optimize add reduction patterns

2021-10-14 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > The following loop does a conditional reduction using an add: > > #include > > int32_t f (int32_t *restrict array, int len, int min) > { > int32_t iSum = 0; > > for (int i=0; i if (array[i] >= min) >iSum += array[i]; > } > return iSum; > }

Ping: [PATCH] Add a simulate_record_decl lang hook

2021-10-14 Thread Richard Sandiford via Gcc-patches
Ping Richard Sandiford via Gcc-patches writes: > This patch adds a lang hook for defining a struct/RECORD_TYPE > “as if” it had appeared directly in the source code. It follows > the similar existing hook for enums. > > It's the caller's responsibility to create the

[PATCH] arm: Remove add_stmt_cost hook

2021-10-14 Thread Richard Sandiford via Gcc-patches
The arm implementation of add_stmt_cost was added alongside arm_builtin_vectorization_cost. At that time it was necessary to override the latter when overriding the former, since default_add_stmt_cost didn't indirect through the builtin_vectorization_cost target hook: int stmt_cost = defaul

[committed] aarch64: Remove redundant flag_vect_cost_model test

2021-10-14 Thread Richard Sandiford via Gcc-patches
The aarch64 version of add_stmt_cost has a redundant test of flag_vect_cost_model. The current structure was based on the contemporaneous definition of default_add_stmt_cost, but g:d6d1127249564146429009e0682f25bd58d7a791 later removed the flag_vect_cost_model test from the default version. Teste

Re: [PATCH] aarch64: Fix pointer parameter type in LD1 Neon intrinsics

2021-10-14 Thread Richard Sandiford via Gcc-patches
Jonathan Wright via Gcc-patches writes: > The pointer parameter to load a vector of signed values should itself > be a signed type. This patch fixes two instances of this unsigned- > signed implicit conversion in arm_neon.h. > > Tested relevant intrinsics with -Wpointer-sign and warnings no longer

[PATCH] rs6000: Fix memory leak in rs6000_density_test

2021-10-14 Thread Richard Sandiford via Gcc-patches
rs6000_density_test has an early exit test between a call to get_loop_body and the corresponding free. This would lead to a memory leak if the early exit is taken. Tested on powerpc64le-linux-gnu. It's obvious that moving the test avoids the leak, but there are multiple ways to write it, so: OK
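The shape of the bug and of one possible fix can be sketched in a few lines of C. This is an illustration only: make_items is a hypothetical stand-in for a function that, like get_loop_body, returns heap memory the caller must free, and the early-exit condition is reduced to a flag.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for get_loop_body: returns a heap array of N
   items that the caller must free. */
static int *make_items(size_t n)
{
    int *items = malloc((n ? n : 1) * sizeof *items);
    assert(items != NULL);
    for (size_t i = 0; i < n; i++)
        items[i] = (int) i;
    return items;
}

/* Leaky shape, shown only as a comment:

     int *items = make_items (n);
     if (skip)
       return 0;            // items is never freed on this path
     ...
     free (items);  */

/* Fixed shape: take the early exit before allocating anything. */
long sum_items(size_t n, int skip)
{
    if (skip)
        return 0;

    int *items = make_items(n);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += items[i];
    free(items);
    return total;
}
```

Moving the test ahead of the allocation is only one option; freeing the array on the early-exit path would also remove the leak.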

[RFC] vect: Convert cost hooks to classes

2021-10-14 Thread Richard Sandiford via Gcc-patches
The current vector cost interface has quite a bit of redundancy built in. Each target that defines its own hooks has to replicate the basic unsigned[3] management. Currently each target also duplicates the cost adjustment for inner loops. This patch instead defines a vector_costs class for hol

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-10-14 Thread Richard Sandiford via Gcc-patches
Robin Dapp writes: > Hi Richard, > >> (2) Insert: >> >> if (SUBREG_P (src)) >>src = SUBREG_REG (src); >> >> here. >> >> OK with those changes if that works. Let me know if they don't — >> I'll try to be quicker with the next review. > > thank you, this looks good in a first testsu

Re: [PATCH]AArch64 Lower intrinsics shift to GIMPLE when possible.

2021-10-15 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: > Hi All, > > This lowers shifts to GIMPLE when the C interpretation of the shift > operations > matches that of AArch64. > > In C shifting right by BITSIZE is undefined, but the behavior is defined in > AArch64. Additionally negative left shifts are undefined in C but
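
As a small illustration of the C side of that mismatch: a plain `x >> n` on a 32-bit value is undefined when n is 32 or more, so portable code has to handle the full-width shift explicitly. The helper below is a hypothetical sketch, not a GCC or ACLE function, and it assumes that the desired result for an unsigned right shift by the full width is zero.

```c
#include <stdint.h>

/* Hypothetical helper: an unsigned right shift that is defined for every
   shift count.  The zero result for n >= 32 is an assumption made for this
   sketch.  */
uint32_t lshr32_total (uint32_t x, unsigned int n)
{
  /* x >> n would be undefined in C for n >= 32, so test the count first.  */
  return n >= 32 ? 0 : x >> n;
}
```

A lowering to a generic shift is only safe for counts where both interpretations agree, which here means counts below the bit size.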

Re: [PATCH v2 03/14] arm: Add tests for PR target/101325

2021-10-15 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > These tests are derived from the one provided in the PR: there is a > compile-only test because I did not have access to anything that could > execute MVE code until recently. > I have been able to add an executable test since QEMU supports MVE. > > Instea

Re: [PATCH v2 09/14] arm: Fix vcond_mask expander for MVE (PR target/100757)

2021-10-15 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > @@ -31086,36 +31087,20 @@ arm_expand_vector_compare (rtx target, rtx_code > code, rtx op0, rtx op1, > case NE: >if (TARGET_HAVE_MVE) > { > - rtx vpr_p0; > - if (vcond_mve) > - vpr_p0 = target; > - else > -

Re: [PATCH v2 12/14] arm: Convert more load/store MVE builtins to predicate qualifiers

2021-10-15 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > This patch covers a few builtins where we do not use the > iterator and thus we cannot use . > > For v2di instructions, we use the V8BI mode for predicates. Why V8BI though, when VPRED uses HI? Would it make sense to define a V2BI? Or doesn't that work

Re: [PATCH v2 00/14] ARM/MVE use vectors of boolean for predicates

2021-10-15 Thread Richard Sandiford via Gcc-patches
Christophe Lyon via Gcc-patches writes: > This is v2 of this patch series, addressing the comments I received. > The changes v1 -> v2 are: > > - Patch 3: added an executable test, and updated > check_effective_target_arm_mve_hw > - Patch 4: split into patch 4 and patch 14 (to keep numbering the

Re: [PATCH]AArch64 Lower intrinsics shift to GIMPLE when possible.

2021-10-15 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, October 15, 2021 1:26 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; Kyrylo Tkachov >> Subject: Re: [PATCH]AArch64 Lower intrinsics shift

Re: [PATCH v2 12/14] arm: Convert more load/store MVE builtins to predicate qualifiers

2021-10-15 Thread Richard Sandiford via Gcc-patches
Christophe LYON writes: > On 15/10/2021 17:08, Richard Sandiford wrote: >> Christophe Lyon via Gcc-patches writes: >>> This patch covers a few builtins where we do not use the >>> iterator and thus we cannot use . >>> >>> For v2di instructions, we use the V8BI mode for predicates. >> Why V8BI th

Re: [SVE] [gimple-isel] PR93183 - SVE does not use neg as conditional

2021-10-18 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c > b/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c > index 4604365fbef..cedc5b7c549 100644 > --- a/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_4.c > +++ b/gcc/testsuite/gcc.target/aarch64/sve/c

Re: [PATCH][RFC] Introduce TREE_AOREFWRAP to cache ao_ref in the IL

2021-10-18 Thread Richard Sandiford via Gcc-patches
Michael Matz via Gcc-patches writes: > Hello, > > On Thu, 14 Oct 2021, Richard Biener wrote: > >> > So, at _this_ write-through of the email I think I like the above idea >> > best: make ao_ref be a tree (at least its storage, because it currently >> > is a one-member-function class), make ao_re

Re: [PATCH] AArch64: Enable fast shifts on Neoverse V1/N2

2021-10-18 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > Enable the fast shift feature in Neoverse V1 and N2 tunings as well. > > ChangeLog: > 2021-10-18 Wilco Dijkstra > > * config/aarch64/aarch64.c (neoversev1_tunings): > Enable AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND. > (neoversen2_tunings): Likewise.

Re: [PATCH] AArch64: Tune case-values-threshold

2021-10-18 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > Tune the case-values-threshold setting for modern cores. A value of 11 > improves > SPECINT2017 by 0.2% and reduces codesize by 0.04%. With -Os use value 8 which > reduces codesize by 0.07%. > > Passes regress, OK for commit? > > ChangeLog: > > 2021-10-18 Wilco Dijkstr

Re: [PATCH] Add a simulate_record_decl lang hook

2021-10-18 Thread Richard Sandiford via Gcc-patches
Jason Merrill writes: > On 9/24/21 13:53, Richard Sandiford wrote: >> This patch adds a lang hook for defining a struct/RECORD_TYPE >> “as if” it had appeared directly in the source code. It follows >> the similar existing hook for enums. >> >> It's the caller's responsibility to create the fiel

Re: [SVE] Adjust PR93183 test-case to compile with -march=armv8.3-a+sve

2021-10-19 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi, > The attached patch removes "-mcpu=generic+sve" from dg-options, > because it conflicts > with -march=armv8.3-a+sve, and resulted in: > > cc1: warning: switch '-mcpu=generic+sve' conflicts with > '-march=armv8.3-a+sve' switch > FAIL: gcc.target/aarch64/sve/pr93

Re: [PATCH][aarch64] target: Support whitespaces in target attr/pragma.

2021-10-19 Thread Richard Sandiford via Gcc-patches
Martin Liška writes: > Hello. > > The patch does the same as g:df592811f950301ed3b10a08e476dad0f2eff26a for > aarch64. > > Tested locally with cross compiler. > > Ready for master? > > Thanks, > Martin > > PR target/102375 > > gcc/ChangeLog: > > * config/aarch64/aarch64.c (aarch64_pro

Re: [PATCH][aarch64] target: Support whitespaces in target attr/pragma.

2021-10-19 Thread Richard Sandiford via Gcc-patches
Martin Liška writes: > On 10/19/21 12:52, Richard Sandiford wrote: >> It looks like this ought to happen after the alloca and copy, since it >> modifies the string. > > Oh yeah, good point. > > Ready to be installed with the change? > Thanks, > Martin > > From 68df4cba3bccb714a14e3c795e6d9e4a44c54

Re: [PATCH] AArch64: Tune case-values-threshold

2021-10-19 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > Hi Richard, > >> I'm just concerned that here we're using the same explanation but with >> different numbers. Why are the new numbers more right than the old ones >> (especially when it comes to code size, where the trade-off hasn't >> really changed)? > > Like all tuning

Re: [aarch64] PR102376 - Emit better diagnostic for arch extensions in target attr

2021-10-19 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > Hi, > The attached patch emits a more verbose diagnostic for target attribute that > is an architecture extension needing a leading '+'. > > For the following test, > void calculate(void) __attribute__ ((__target__ ("sve"))); > > With patch, the compiler now emits: >

Re: [RFC] Partial vectors for s390

2021-10-20 Thread Richard Sandiford via Gcc-patches
Robin Dapp via Gcc-patches writes: > Hi, > > I have been playing around with making Kewen's partial vector changes > workable with s390: > > We have a vll instruction that can be passed the highest byte to load. > The rather unfortunate consequence of this is that a length of zero > cannot be s

Re: [aarch64] PR102376 - Emit better diagnostic for arch extensions in target attr

2021-10-20 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni writes: > On Tue, 19 Oct 2021 at 19:58, Richard Sandiford > wrote: >> >> Prathamesh Kulkarni writes: >> > Hi, >> > The attached patch emits a more verbose diagnostic for target attribute >> > that >> > is an architecture extension needing a leading '+'. >> > >> > For the fol

Re: [PATCH] AArch64: Tune case-values-threshold

2021-10-20 Thread Richard Sandiford via Gcc-patches
Wilco Dijkstra writes: > Hi Richard, > >> The problem is that you're effectively asking for these values to be >> taken on faith without providing any analysis and without describing >> how you arrived at the new numbers. Did you try other values too? >> If so, how did they compare with the numbe

Re: [PATCH 1v2/3][vect] Add main vectorized loop unrolling

2021-10-22 Thread Richard Sandiford via Gcc-patches
"Andre Vieira (lists)" writes: > On 15/10/2021 09:48, Richard Biener wrote: >> On Tue, 12 Oct 2021, Andre Vieira (lists) wrote: >> >>> Hi Richi, >>> >>> I think this is what you meant, I now hide all the unrolling cost >>> calculations >>> in the existing target hooks for costs. I did need to adj
