Re: [PATCH] aarch64: Improve on ldp-stp policies code structure.

2023-09-29 Thread Richard Sandiford
Thanks for the update. Manos Anagnostakis writes: > Improves on: 834fc2bf > > This improves the code structure of the ldp-stp policies > patch introduced in 834fc2bf > > Bootstrapped and regtested on aarch64-linux. > > gcc/ChangeLog: > * config/aarch64/aarch64-opts.h (enum aarch64_ldp_polic

Re: [RFC] > WIDE_INT_MAX_PREC support in wide_int and widest_int

2023-09-29 Thread Richard Sandiford
Richard Biener writes: > On Thu, 28 Sep 2023, Jakub Jelinek wrote: > >> Hi! >> >> On Tue, Aug 29, 2023 at 05:09:52PM +0200, Jakub Jelinek via Gcc-patches >> wrote: >> > On Tue, Aug 29, 2023 at 11:42:48AM +0100, Richard Sandiford wrote: >> > >

Re: [PATCH v2] aarch64: Improve on ldp-stp policies code structure.

2023-09-29 Thread Richard Sandiford
Manos Anagnostakis writes: > Improves on: 834fc2bf > > This improves the code structure of the ldp-stp policies > patch introduced in 834fc2bf > > Bootstrapped and regtested on aarch64-linux. > > gcc/ChangeLog: > * config/aarch64/aarch64-opts.h (enum aarch64_ldp_policy): Removed. > (en

[pushed] Fix profiledbootstrap poly_int fallout [PR111642]

2023-10-01 Thread Richard Sandiford
rtl-tests.cc and simplify-rtx.cc used partial specialisation to try to restrict the NUM_POLY_INT_COEFFS>1 tests without resorting to preprocessor tests. That now triggers an error in some configurations, since the NUM_POLY_INT_COEFFS>1 tests used the global poly_int64, whose definition does not de

Re: [PATCH] Remove poly_int_pod

2023-10-02 Thread Richard Sandiford
Jan-Benedict Glaw writes: > Hi Richard, > > On Thu, 2023-09-28 10:55:46 +0100, Richard Sandiford > wrote: >> poly_int was written before the switch to C++11 and so couldn't >> use explicit default constructors. This led to an awkward split >> between poly_int
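
For readers unfamiliar with the C++11 feature mentioned in the quoted text, here is a minimal sketch of a defaulted default constructor. The type `poly_sketch` and the helper `make_poly` are hypothetical stand-ins invented for illustration; this is not GCC's poly_int and not the code from the patch. It only shows that a class can offer a defaulted constructor alongside a user-provided one and still be trivially default-constructible.

```cpp
#include <type_traits>

// Hypothetical stand-in for a polynomial integer with N coefficients.
// This is NOT GCC's poly_int; it only illustrates the language feature.
template <unsigned N, typename T>
struct poly_sketch
{
  // C++11 defaulted default constructor: the class stays trivially
  // default-constructible even though another constructor is declared.
  poly_sketch () = default;

  // User-provided constructor: sets the first coefficient, zeroes the rest.
  explicit poly_sketch (T c0) : coeffs{} { coeffs[0] = c0; }

  T coeffs[N];
};

// Compile-time checks that the type keeps its "plain data" properties.
static_assert (std::is_trivially_default_constructible<poly_sketch<2, long>>::value,
               "defaulted constructor keeps the type trivially constructible");
static_assert (std::is_trivially_copyable<poly_sketch<2, long>>::value,
               "the type can still be copied bytewise");

// Hypothetical helper used only to exercise the converting constructor.
inline poly_sketch<2, long> make_poly (long c0)
{
  return poly_sketch<2, long> (c0);
}
```

Before C++11, declaring any constructor suppressed the implicit default constructor, which is presumably why a separate plain-data variant of such a type would be needed.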

Re: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-05 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This adds an implementation for masked copysign along with an optimized > pattern for masked copysign (x, -1). It feels like we're ending up with a lot of AArch64-specific code that just hard-codes the observation that changing the sign is equivalent to chang

Re: [PATCH]AArch64 Add special patterns for creating DI scalar and vector constant 1 << 63 [PR109154]

2023-10-05 Thread Richard Sandiford
Tamar Christina writes: > Hi, > >> The lowpart_subreg should simplify this back into CONST0_RTX (mode), >> making it no different from: >> >> emit_move_insn (target, CONST0_RTX (mode)); >> >> If the intention is to share zeros between modes (sounds good!), then I think >> the subreg needs to

Re: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-05 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Thursday, October 5, 2023 8:29 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; Kyrylo Tkach

Re: [PATCH V2] Emit funcall external declarations only if actually used.

2023-10-05 Thread Richard Sandiford
"Jose E. Marchesi" writes: > ping I don't know this code very well, and have AFAIR haven't worked with an assembler that requires external declarations, but since it's at a second ping :) > >> ping >> >>> [Differences from V1: >>> - Prototype for call_from_call_insn moved before comment block. >

Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 << signbit(x)) [PR109154]

2023-10-07 Thread Richard Sandiford
Richard Biener writes: > On Thu, 5 Oct 2023, Tamar Christina wrote: > >> > I suppose the idea is that -abs(x) might be easier to optimize with other >> > patterns (consider a - copysign(x,...), optimizing to a + abs(x)). >> > >> > For abs vs copysign it's a canonicalization, but (negate (abs @0))
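
The identity in the subject line, that negating an absolute value amounts to setting the sign bit, can be checked with a short sketch. It assumes `double` is an IEEE 754 binary64 value with the sign in bit 63; the function names are hypothetical and this is not the match.pd pattern itself.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reference form: negate the absolute value.
inline double neg_abs_reference (double x)
{
  return -std::fabs (x);
}

// Bitwise form: OR in the sign bit.
// Assumption: double is IEEE 754 binary64 with the sign in bit 63.
inline double neg_abs_via_or (double x)
{
  std::uint64_t bits;
  std::memcpy (&bits, &x, sizeof bits);   // read the bit pattern without aliasing problems
  bits |= std::uint64_t (1) << 63;        // force the sign bit on
  double result;
  std::memcpy (&result, &bits, sizeof result);
  return result;
}
```

Both functions should agree for ordinary inputs, including the sign of zero. NaN payloads are not compared here.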

Re: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-07 Thread Richard Sandiford
Richard Biener writes: > On Thu, Oct 5, 2023 at 10:46 PM Tamar Christina > wrote: >> >> > -Original Message----- >> > From: Richard Sandiford >> > Sent: Thursday, October 5, 2023 9:26 PM >> > To: Tamar Christina >> > Cc: gcc

Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 << signbit(x)) [PR109154]

2023-10-07 Thread Richard Sandiford
Richard Biener writes: >> On 07.10.2023 at 11:23, Richard Sandiford wrote: >> >> Richard Biener writes: >>> On Thu, 5 Oct 2023, Tamar Christina wrote: >>> >>>>> I suppose the idea is that -abs(x) might be easier to optimize with other

Re: [PATCH 6/6] aarch64: Add front-end argument type checking for target builtins

2023-10-07 Thread Richard Sandiford
Richard Earnshaw writes: > On 03/10/2023 16:18, Victor Do Nascimento wrote: >> In implementing the ACLE read/write system register builtins it was >> observed that leaving argument type checking to be done at expand-time >> meant that poorly-formed function calls were being "fixed" by certain >> o

Re: [PATCH v2][GCC] aarch64: Enable Cortex-X4 CPU

2023-10-07 Thread Richard Sandiford
Saurabh Jha writes: > On 10/6/2023 2:24 PM, Saurabh Jha wrote: >> Hey, >> >> This patch adds support for the Cortex-X4 CPU to GCC. >> >> Regression testing for aarch64-none-elf target and found no regressions. >> >> Okay for gcc-master? I don't have commit access so if it looks okay, >> could som

Re: [PATCH] RFC: Add late-combine pass [PR106594]

2023-10-07 Thread Richard Sandiford
Robin Dapp writes: > Hi Richard, > > cool, thanks. I just gave it a try with my test cases and it does what > it is supposed to do, at least if I disable the register pressure check :) > A cursory look over the test suite showed no major regressions and just > some overly specific tests. > > My t

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-08 Thread Richard Sandiford
Robin Dapp writes: > Hi Tamar, > >> The only comment I have is whether you actually need this helper >> function? It looks like all the uses of it are in cases you have, or >> will call conditional_internal_fn_code directly. > removed the cond_fn_p entirely in the attached v3. > > Bootstrapped and

Re: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-09 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Saturday, October 7, 2023 10:58 AM >> To: Richard Biener >> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; >> nd ; Richard Earnshaw ; >> Marcus Shawcroft ; Kyrylo

Re: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-09 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Monday, October 9, 2023 10:56 AM >> To: Tamar Christina >> Cc: Richard Biener ; gcc-patches@gcc.gnu.org; >> nd ; Richard Earnshaw ; >> Marcus Shawcroft ; Kyrylo

Re: PR111648: Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding

2023-10-09 Thread Richard Sandiford
Prathamesh Kulkarni writes: > Hi, > The attached patch attempts to fix PR111648. > As mentioned in PR, the issue is when a1 is a multiple of vector > length, we end up creating following encoding in result: { base_elem, > arg[0], arg[1], ... } (assuming S = 1), > where arg is chosen input vector,

Re: [PATCH] wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]

2023-10-09 Thread Richard Sandiford
Jakub Jelinek writes: > Hi! > > As mentioned in the _BitInt support thread, _BitInt(N) is currently limited > by the wide_int/widest_int maximum precision limitation, which is depending > on target 191, 319, 575 or 703 bits (one less than WIDE_INT_MAX_PRECISION). > That is fairly low limit for _Bi

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-09 Thread Richard Sandiford
Robin Dapp writes: >> It'd be good to expand on this comment a bit. What kind of COND are you >> anticipating? A COND with the neutral op as the else value, so that the >> PLUS_EXPR (or whatever) can remain unconditional? If so, it would be >> good to sketch briefly how that happens, and why it

Re: [PATCH] wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]

2023-10-10 Thread Richard Sandiford
Jakub Jelinek writes: > On Mon, Oct 09, 2023 at 03:44:10PM +0200, Jakub Jelinek wrote: >> Thanks, just quick answers, will work on patch adjustments after trying to >> get rid of rwide_int (seems dwarf2out has very limited needs from it, just >> some routine to construct it in GCed memory (and nev

Re: [PATCH 02/11] Handle epilogues that contain jumps

2023-10-12 Thread Richard Sandiford
Richard Biener writes: > On Tue, Aug 22, 2023 at 12:42 PM Szabolcs Nagy via Gcc-patches > wrote: >> >> From: Richard Sandiford >> >> The prologue/epilogue pass allows the prologue sequence >> to contain jumps. The sequence is then pa

Re: [PATCH 6/6] aarch64: Add front-end argument type checking for target builtins

2023-10-12 Thread Richard Sandiford
"Richard Earnshaw (lists)" writes: > On 09/10/2023 14:12, Victor Do Nascimento wrote: >> >> >> On 10/7/23 12:53, Richard Sandiford wrote: >>> Richard Earnshaw writes: >>>> On 03/10/2023 16:18, Victor Do Nascimento wrote: >>>>>

Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-12 Thread Richard Sandiford
Robin Dapp writes: >> It wasn't very clear, sorry, but it was the last sentence I was asking >> for clarification on, not the other bits. Why do we want to avoid >> generating a COND_ADD when the operand is a vectorisable call? > > Ah, I see, apologies. Upon thinking about it a bit more (thanks)

Re: [PATCH] wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]

2023-10-12 Thread Richard Sandiford
Jakub Jelinek writes: > @@ -2036,11 +2075,20 @@ wi::lrshift_large (HOST_WIDE_INT *val, c > unsigned int xlen, unsigned int xprecision, > unsigned int precision, unsigned int shift) > { > - unsigned int len = rshift_large_common (val, xval, xlen, xprecision, > s

Re: [PATCH-1v2, expand] Enable vector mode for compare_by_pieces [PR111449]

2023-10-12 Thread Richard Sandiford
HAO CHEN GUI writes: > Hi, > Vector mode instructions are efficient on some targets (e.g. ppc64). > This patch enables vector mode for compare_by_pieces. The non-member > function widest_fixed_size_mode_for_size takes by_pieces_operation > as the second argument and decide whether vector mode is

Re: [PATCH] wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]

2023-10-12 Thread Richard Sandiford
Jakub Jelinek writes: > On Thu, Oct 12, 2023 at 11:54:14AM +0100, Richard Sandiford wrote: >> Jakub Jelinek writes: >> > @@ -2036,11 +2075,20 @@ wi::lrshift_large (HOST_WIDE_INT *val, c >> > unsigned int xlen, unsigned int xprecision, >> >

Re: [PATCH V2] Emit funcall external declarations only if actually used.

2023-10-12 Thread Richard Sandiford
"Jose E. Marchesi" writes: > Hi Richard. > Thanks for looking at this! :) > > >> "Jose E. Marchesi" writes: >>> ping >> >> I don't know this code very well, and have AFAIR haven't worked >> with an assembler that requires external declarations, but since >> it's at a second ping :) >> >>> pi

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-12 Thread Richard Sandiford
Robin Dapp via Gcc-patches writes: > Hi, > > as Juzhe noticed in gcc.dg/pr92301.c there was still something missing in > the last patch. The attached v2 makes sure we always have a COND_LEN > operation > before returning true and initializes len and bias even if they are unused. > > Bootstrapped

Re: [PATCH V2] Emit funcall external declarations only if actually used.

2023-10-12 Thread Richard Sandiford
"Jose E. Marchesi" writes: >> "Jose E. Marchesi" writes: >>> Hi Richard. >>> Thanks for looking at this! :) >>> >>> "Jose E. Marchesi" writes: > ping I don't know this code very well, and have AFAIR haven't worked with an assembler that requires external declarations, but

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-12 Thread Richard Sandiford
Richard Sandiford writes: > Robin Dapp via Gcc-patches writes: >> [...] >> @@ -386,9 +390,29 @@ try_conditional_simplification (internal_fn ifn, >> gimple_match_op *res_op, >> default: >>gcc_unreachable (); >> } >> - *res_op = c

Re: [PATCH 5/6]AArch64: Fix Armv9-a warnings that get emitted whenever a ACLE header is used.

2023-10-12 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > At the moment, trying to use -march=armv9-a with any ACLE header such as > arm_neon.h results in rows and rows of warnings saying: > > : warning: "__ARM_ARCH" redefined > : note: this is the location of the previous definition > > This is obviously not useful

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-15 Thread Richard Sandiford
"Roger Sayle" writes: > I'd like to ping my patch for restoring bootstrap using g++ 4.8.5 > (the system compiler on RHEL 7 and later systems). > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632008.html > > Note the preprocessor #ifs can be removed; they are only there to document > why t

Re: PR111648: Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding

2023-10-16 Thread Richard Sandiford
Prathamesh Kulkarni writes: > On Wed, 11 Oct 2023 at 16:57, Prathamesh Kulkarni > wrote: >> >> On Wed, 11 Oct 2023 at 16:42, Prathamesh Kulkarni >> wrote: >> > >> > On Mon, 9 Oct 2023 at 17:05, Richard Sandiford >> > wrote: >> > >

Re: [PATCH V3] VECT: Enhance SLP of MASK_LEN_GATHER_LOAD[PR111721]

2023-10-16 Thread Richard Sandiford
Juzhe-Zhong writes: > This patch fixes this following FAILs in RISC-V regression: > > FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects scan-tree-dump > vect "Loop contains only SLP stmts" > FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP > stmts" > FAIL: g

Re: [PATCH-1v3, expand] Enable vector mode for compare_by_pieces [PR111449]

2023-10-16 Thread Richard Sandiford
Thanks for the update. The comments below are mostly asking for cosmetic changes. HAO CHEN GUI writes: > Hi, > Vector mode instructions are efficient for compare on some targets. > This patch enables vector mode for compare_by_pieces. Currently, > vector mode is enabled for compare, set and cl

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-16 Thread Richard Sandiford
Robin Dapp writes: >> Why are the contents of this if statement wrong for COND_LEN? >> If the "else" value doesn't matter, then the masked form can use >> the "then" value for all elements. I would have expected the same >> thing to be true of COND_LEN. > > Right, that one was overly pessimistic.

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Richard Sandiford
Robin Dapp writes: >>> I don't know much about valueisation either :) But it does feel >>> like we're working around the lack of a LEN form of COND_EXPR. >>> In other words, it seems odd that we can do: >>> >>> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) >>> >>> but we can't do: >>> >>> IFN_C
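
As a rough aid for following this thread, here is a scalar model of a length-and-mask-controlled add. It assumes the operand order suggested by the quoted call (mask, first addend, second addend, else value, len, bias) and the usual reading that lane i is active when i < len + bias and the mask bit is set. The function name is hypothetical, and the model is a sketch, not GCC's implementation.

```cpp
#include <cstddef>
#include <vector>

// Scalar model of a COND_LEN-style add (assumed semantics, see the text above):
// lane i yields a[i] + b[i] when it is both within len + bias and enabled by
// the mask; otherwise it yields else_vals[i].
inline std::vector<int>
cond_len_add_model (const std::vector<bool> &mask,
                    const std::vector<int> &a,
                    const std::vector<int> &b,
                    const std::vector<int> &else_vals,
                    std::ptrdiff_t len, std::ptrdiff_t bias)
{
  std::vector<int> result (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    {
      bool active = static_cast<std::ptrdiff_t> (i) < len + bias && mask[i];
      result[i] = active ? a[i] + b[i] : else_vals[i];
    }
  return result;
}
```

With an all-zero second addend, the model degenerates into a select between `a` and the else value, which appears to be the point of the quoted example.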

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Richard Sandiford
Richard Biener writes: > On Mon, Oct 16, 2023 at 11:59 PM Richard Sandiford > wrote: >> >> Robin Dapp writes: >> >> Why are the contents of this if statement wrong for COND_LEN? >> >> If the "else" value doesn't matter, then the masked

Re: [PATCH] gimple-match: Do not try UNCOND optimization with COND_LEN.

2023-10-17 Thread Richard Sandiford
Robin Dapp writes: > Thank you for the explanation. > > So, assuming I added an IFN_VCOND_MASK and IFN_VCOND_MASK_LEN along > with the respective helper and expand functions, what would be the > way forward? IMO it'd be worth starting with the _LEN form only. > Generate an IFN_VCOND_MASK(_LEN) h

[PATCH 1/2] aarch64: Use vecs to store register save order

2023-10-17 Thread Richard Sandiford
aarch64_save/restore_callee_saves looped over registers in register number order. This in turn meant that we could only use LDP and STP for registers that were consecutive both number-wise and offset-wise (after unsaved registers are excluded). This patch instead builds lists of the registers tha
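
The effect of walking saved registers in offset order rather than register-number order can be sketched as follows. Everything here is hypothetical and simplified: the types `saved_reg` and `save_op`, the function `plan_saves`, and the fixed 8-byte slot size are assumptions for illustration, and real LDP/STP formation has further constraints (register class, offset range) that the sketch ignores.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical description of one callee-saved register and its frame slot.
struct saved_reg
{
  unsigned regno;
  long offset;
};

// One planned store: either a pair (second >= 0) or a single (second == -1).
struct save_op
{
  unsigned first;
  int second;
};

// Sort by frame offset, then greedily pair neighbours whose slots are adjacent.
// Assumption: every slot is 8 bytes wide.
inline std::vector<save_op> plan_saves (std::vector<saved_reg> regs)
{
  std::sort (regs.begin (), regs.end (),
             [] (const saved_reg &a, const saved_reg &b)
             { return a.offset < b.offset; });

  std::vector<save_op> ops;
  for (std::size_t i = 0; i < regs.size ();)
    {
      if (i + 1 < regs.size () && regs[i + 1].offset - regs[i].offset == 8)
        {
          ops.push_back ({regs[i].regno, static_cast<int> (regs[i + 1].regno)});
          i += 2;
        }
      else
        {
          ops.push_back ({regs[i].regno, -1});
          i += 1;
        }
    }
  return ops;
}
```

For example, with x30 at offset 0 and x19, x20, x21 at offsets 8, 16 and 24, offset order gives two pairs, whereas a walk in register-number order would pair only x19 with x20.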

[PATCH 2/2] aarch64: Put LR save slot first in more cases

2023-10-17 Thread Richard Sandiford
Now that the prologue and epilogue code iterates over saved registers in offset order, we can put the LR save slot first without compromising LDP/STP formation. This isn't worthwhile when shadow call stacks are enabled, since the first two registers are also push/pop candidates, and LR cannot be p

Re: [PATCH] Support g++ 4.8 as a host compiler.

2023-10-18 Thread Richard Sandiford
Jakub Jelinek writes: > On Sun, Oct 15, 2023 at 12:43:10PM +0100, Richard Sandiford wrote: >> It seemed like there was considerable support for bumping the minimum >> to beyond 4.8. I think we should wait until a decision has been made >> before adding more 4.8 workarounds.

[Backport RFA] lra: Avoid unfolded plus-0

2023-10-18 Thread Richard Sandiford
Vlad, is it OK if I backport the patch below to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111528 ? Jakub has given a conditional OK on irc. Thanks, Richard Richard Sandiford writes: > While backporting another patch to an earlier release, I hit a > situation in

Re: PR111648: Fix wrong code-gen due to incorrect VEC_PERM_EXPR folding

2023-10-18 Thread Richard Sandiford
Prathamesh Kulkarni writes: > On Tue, 17 Oct 2023 at 02:40, Richard Sandiford > wrote: >> Prathamesh Kulkarni writes: >> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc >> > index 4f8561509ff..55a6a68c16c 100644 >> > --- a/gcc/fold-const.cc

Re: [PATCH 01/11] rtl-ssa: Fix bug in function_info::add_insn_after

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > In the case that !insn->is_debug_insn () && next->is_debug_insn (), this > function was missing an update of the prev pointer on the first nondebug > insn following the sequence of debug insns starting at next. > > This can lead to corruption of the insn chain, in that we end

Re: [PATCH 02/11] rtl-ssa: Add drop_memory_access helper

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > Add a helper routine to access-utils.h which removes the memory access > from an access_array, if it has one. > > Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? > > gcc/ChangeLog: > > * rtl-ssa/access-utils.h (drop_memory_access): New. > --- > g

Re: [PATCH 03/11] rtl-ssa: Add entry point to allow re-parenting uses

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > This is needed by the upcoming aarch64 load pair pass, as it can > re-order stores (when alias analysis determines this is safe) and thus > change which mem def a given use consumes (in the RTL-SSA view, there is > no alias disambiguation of memory). > > Bootstrapped/regteste

Re: [PATCH 04/11] rtl-ssa: Support inferring uses of mem in change_insns

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > Currently, rtl_ssa::change_insns requires all new uses and defs to be > specified explicitly. This turns out to be rather inconvenient for > forming load pairs in the new aarch64 load pair pass, as the pass has to > determine which mem def the final load pair consumes, and t

Re: [PATCH 07/11] aarch64, testsuite: Prevent stp in lr_free_1.c

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > The test is looking for individual stores which are able to be merged > into stp instructions. The test currently passes -fno-schedule-fusion > -fno-peephole2, presumably to prevent these stores from being turned > into stps, but this is no longer sufficient with the new ldp

Re: [PATCH 08/11] aarch64, testsuite: Tweak sve/pcs/args_9.c to allow stps

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > With the new ldp/stp pass enabled, there is a change in the codegen for > this test as follows: > > add x8, sp, 16 > ptrue p3.h, mul3 > str p3, [x8] > - str x8, [sp, 8] > - str x9, [sp] > + stp x9, x8, [sp] >

Re: [PATCH 09/11] aarch64, testsuite: Fix up pr71727.c

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > The test is trying to check that we don't use q-register stores with > -mstrict-align, so actually check specifically for that. > > This is a prerequisite to avoid regressing: > > scan-assembler-not "add\tx0, x0, :" > > with the upcoming ldp fusion pass, as we change where th

Re: [PATCH 10/11] aarch64: Generalise TFmode load/store pair patterns

2023-10-18 Thread Richard Sandiford
Alex Coplan writes: > This patch generalises the TFmode load/store pair patterns to TImode and > TDmode. This brings them in line with the DXmode patterns, and uses the > same technique with separate mode iterators (TX and TX2) to allow for > distinct modes in each arm of the load/store pair. > >

Re: [PATCH V2 3/7] aarch64: Implement system register validation tools

2023-10-18 Thread Richard Sandiford
Generally looks really good. Some comments below. Victor Do Nascimento writes: > Given the implementation of a mechanism of encoding system registers > into GCC, this patch provides the mechanism of validating their use by > the compiler. In particular, this involves: > > 1. Ensuring a suppli

Re: [PATCH V2 5/7] aarch64: Implement system register r/w arm ACLE intrinsic functions

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > Implement the aarch64 intrinsics for reading and writing system > registers with the following signatures: > > uint32_t __arm_rsr(const char *special_register); > uint64_t __arm_rsr64(const char *special_register); > void* __arm_rsrp(const char *spe

Re: [PATCH V2 2/7] aarch64: Add support for aarch64-sys-regs.def

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > This patch defines the structure of a new .def file used for > representing the aarch64 system registers, what information it should > hold and the basic framework in GCC to process this file. > > Entries in the aarch64-system-regs.def file should be as follows: > >

Re: [PATCH V2 4/7] aarch64: Add basic target_print_operand support for CONST_STRING

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > Motivated by the need to print system register names in output > assembly, this patch adds the required logic to > `aarch64_print_operand' to accept rtxs of type CONST_STRING and > process these accordingly. > > Consequently, an rtx such as: > > (set (reg/i:DI 0 x0

Re: [PATCH V2 6/7] aarch64: Add front-end argument type checking for target builtins

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > In implementing the ACLE read/write system register builtins it was > observed that leaving argument type checking to be done at expand-time > meant that poorly-formed function calls were being "fixed" by certain > optimization passes, meaning bad code wasn't being p

Re: [PATCH V2 7/7] aarch64: Add system register duplication check selftest

2023-10-18 Thread Richard Sandiford
Victor Do Nascimento writes: > Add a build-time test to check whether system register data, as > imported from `aarch64-sys-reg.def' has any duplicate entries. > > Duplicate entries are defined as any two SYSREG entries in the .def > file which share the same encoding values (as specified by its `
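
A duplicate check of the kind described could look roughly like the sketch below. The struct `sysreg_entry`, its single `encoding` field and the function `has_duplicate_encoding` are hypothetical; the actual selftest and the exact definition of "same encoding" in the patch may well differ, since the quoted description is cut off.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical table entry: a register name plus one combined encoding value.
struct sysreg_entry
{
  std::string name;
  std::uint32_t encoding;
};

// Returns true if any two entries share an encoding.
// Sorting a copy brings equal encodings next to each other, so a single
// adjacent_find pass is enough.
inline bool has_duplicate_encoding (std::vector<sysreg_entry> regs)
{
  std::sort (regs.begin (), regs.end (),
             [] (const sysreg_entry &a, const sysreg_entry &b)
             { return a.encoding < b.encoding; });

  return std::adjacent_find (regs.begin (), regs.end (),
                             [] (const sysreg_entry &a, const sysreg_entry &b)
                             { return a.encoding == b.encoding; })
         != regs.end ();
}
```

Sorting costs O(n log n) and avoids comparing every pair of entries, which matters little for a table of this size but keeps the check simple.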

Re: [PATCH, aarch64 3/4] aarch64: Add movprfx alternatives for predicate patterns

2018-07-02 Thread Richard Sandiford
Richard Henderson writes: > @@ -2687,34 +2738,60 @@ >aarch64_sve_prepare_conditional_op (operands, 5, ); > }) > > -;; Predicated floating-point operations. > -(define_insn "*cond_" > - [(set (match_operand:SVE_F 0 "register_operand" "=w") > +;; Predicated floating-point operations with sel

Re: [PATCH, aarch64 1/4] aarch64: Add movprfx alternatives for unpredicated patterns

2018-07-02 Thread Richard Sandiford
Richard Henderson writes: > * config/aarch64/aarch64.md (movprfx): New attr. > (length): Default movprfx to 8. > * config/aarch64/aarch64-sve.md (*mul3): Add movprfx alt. > (*madd, *msub (*mul3_highpart): Likewise. > (*3): Likewise. > (*v3): Likewise. >

Re: [PATCH, aarch64 2/4] aarch64: Remove predicate from inside SVE_COND_FP_BINARY

2018-07-02 Thread Richard Sandiford
Richard Henderson writes: > The predicate is present within the containing UNSPEC_SEL; > there is no need to duplicate it. > > * config/aarch64/aarch64-sve.md (cond_): > Remove match_dup 1 from the inner unspec. > (*cond_): Likewise. OK, thanks. Richard > --- > gcc/config/aar

Re: [PATCH, aarch64 4/4] aarch64: Add movprfx patterns for zero and unmatched select

2018-07-02 Thread Richard Sandiford
Richard Henderson writes: > * config/aarch64/aarch64-protos.h, config/aarch64/aarch64.c > (aarch64_sve_prepare_conditional_op): Remove. > * config/aarch64/aarch64-sve.md (cond_): > Allow aarch64_simd_reg_or_zero as select operand; remove > the aarch64_sve_prepare_cond

Re: [14/n] PR85694: Rework overwidening detection

2018-07-02 Thread Richard Sandiford
Christophe Lyon writes: > On Fri, 29 Jun 2018 at 13:36, Richard Sandiford > wrote: >> >> Richard Sandiford writes: >> > This patch is the main part of PR85694. The aim is to recognise at least: >> > >> > signed char *a, *b, *c; >> >

Re: [PATCH] [RFC] Higher-level reporting of vectorization problems

2018-07-02 Thread Richard Sandiford
Richard Biener writes: > On Fri, 22 Jun 2018, David Malcolm wrote: > >> NightStrike and I were chatting on IRC last week about >> issues with trying to vectorize the following code: >> >> #include >> std::size_t f(std::vector> const & v) { >> std::size_t ret = 0; >> for (auto const & w

Avoid matching the same pattern statement twice

2018-07-03 Thread Richard Sandiford
an up the PATTERN_DEF_SEQ handling, but they only apply after the complete PR85694 sequence, whereas this needs to go in before 14/n. Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu. OK to install? Richard 2018-07-03 Richard Sandiford gcc/ * tree-vect-patte

Clean up interface to vector pattern recognisers

2018-07-03 Thread Richard Sandiford
passing a single statement instead of a vector. It also gets rid of the clearing of STMT_VINFO_RELATED_STMT on failure, since no recognisers use it now. Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu. OK to install? Richard 2018-07-03 Richard Sandiford gcc

Ensure PATTERN_DEF_SEQ is empty before recognising patterns

2018-07-03 Thread Richard Sandiford
install? Richard 2018-07-03 Richard Sandiford gcc/ * tree-vect-patterns.c (new_pattern_def_seq): Delete. (vect_recog_dot_prod_pattern, vect_recog_sad_pattern) (vect_recog_widen_op_pattern, vect_recog_over_widening_pattern) (vect_recog_rotate_pattern

Pass more vector types to append_pattern_def_seq

2018-07-03 Thread Richard Sandiford
The PR85694 series added a vectype argument to append_pattern_def_seq. This patch makes more callers use it. Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu. OK to install? Richard 2018-07-03 Richard Sandiford gcc/ * tree-vect-patterns.c

Re: [14/n] PR85694: Rework overwidening detection

2018-07-03 Thread Richard Sandiford
Richard Biener writes: > On Fri, Jun 29, 2018 at 1:36 PM Richard Sandiford > wrote: >> >> Richard Sandiford writes: >> > This patch is the main part of PR85694. The aim is to recognise at least: >> > >> > signed char *a, *b, *c; >> >

Re: [14/n] PR85694: Rework overwidening detection

2018-07-04 Thread Richard Sandiford
Christophe Lyon writes: > On Tue, 3 Jul 2018 at 12:02, Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Fri, Jun 29, 2018 at 1:36 PM Richard Sandiford >> > wrote: >> >> >> >> Richard Sandiford writes: >> >>

Re: Extend tree code folds to IFN_COND_*

2018-07-04 Thread Richard Sandiford
Finally getting back to this... Richard Biener writes: > On Wed, Jun 6, 2018 at 10:16 PM Richard Sandiford > wrote: >> >> > On Thu, May 24, 2018 at 11:36 AM Richard Sandiford >> > wrote: >> >> >> >> This patch adds match.pd support for

Re: [RFC, testsuite/guality] Use relative line numbers in gdb-test

2018-07-04 Thread Richard Sandiford
Tom de Vries writes: > [ was: [PATCH, testsuite/guality] Use line number vars in gdb-test ] > On Thu, Jun 28, 2018 at 07:49:30PM +0200, Tom de Vries wrote: >> Hi, >> >> I played around with pr45882.c and ran into FAILs. It took me a while to >> realize that the FAILs were due to the gdb-test (a

Re: [PATCH] doc clarification: DONE and FAIL in define_split and define_peephole2

2018-07-06 Thread Richard Sandiford
Paul Koning writes: > Currently DONE and FAIL are documented only for define_expand, but > they also work in essentially the same way for define_split and > define_peephole2. > > If FAIL is used in a define_insn_and_split, the output pattern cannot > be the usual "#" dummy value. > > This patch up

Re: [PATCH] doc clarification: DONE and FAIL in define_split and define_peephole2

2018-07-06 Thread Richard Sandiford
Paul Koning writes: > @@ -8615,6 +8639,34 @@ so here's a silly made-up example: >"") > @end smallexample > > +There are two special macros defined for use in the preparation statements: > +@code{DONE} and @code{FAIL}. Use them with a following semicolon, > +as a statement. > + > +@table @c

Re: calculate overflow type in wide int arithmetic

2018-07-07 Thread Richard Sandiford
Richard Biener writes: > On Fri, Jul 6, 2018 at 9:50 AM Aldy Hernandez wrote: >> >> >> >> On 07/05/2018 05:50 AM, Richard Biener wrote: >> > On Thu, Jul 5, 2018 at 9:35 AM Aldy Hernandez wrote: >> >> >> >> The reason for this patch are the changes showcased in tree-vrp.c. >> >> Basically I'd lik

Re: [PATCH 0/5] [RFC v2] Higher-level reporting of vectorization problems

2018-07-11 Thread Richard Sandiford
David Malcolm writes: > On Mon, 2018-06-25 at 11:10 +0200, Richard Biener wrote: >> On Fri, 22 Jun 2018, David Malcolm wrote: >> >> > NightStrike and I were chatting on IRC last week about >> > issues with trying to vectorize the following code: >> > >> > #include >> > std::size_t f(std::vector

Re: abstract wide int binop code from VRP

2018-07-11 Thread Richard Sandiford
Richard Biener writes: > On Wed, Jul 11, 2018 at 8:48 AM Aldy Hernandez wrote: >> >> Hmmm, I think we can do better, and since this hasn't been reviewed yet, >> I don't think anyone will mind the adjustment to the patch ;-). >> >> I really hate int_const_binop_SOME_RANDOM_NUMBER. We should abstr

Re: abstract wide int binop code from VRP

2018-07-11 Thread Richard Sandiford
Aldy Hernandez writes: > On 07/11/2018 08:52 AM, Richard Biener wrote: >> On Wed, Jul 11, 2018 at 8:48 AM Aldy Hernandez wrote: >>> >>> Hmmm, I think we can do better, and since this hasn't been reviewed yet, >>> I don't think anyone will mind the adjustment to the patch ;-). >>> >>> I really hat

Re: RFC: lra-constraints.c and TARGET_HARD_REGNO_CALL_PART_CLOBBERED question/patch

2018-07-11 Thread Richard Sandiford
Jeff Law writes: > On 07/11/2018 02:07 PM, Steve Ellcey wrote: >> I have a reload/register allocation question and possible patch.  While >> working on the Aarch64 SIMD ABI[1] I ran into a problem where GCC was >> saving and restoring registers that it did not need to.  I tracked it >> down to lra

Re: abstract wide int binop code from VRP

2018-07-12 Thread Richard Sandiford
Aldy Hernandez writes: > On 07/11/2018 01:33 PM, Richard Sandiford wrote: >> Aldy Hernandez writes: >>> On 07/11/2018 08:52 AM, Richard Biener wrote: >>>> On Wed, Jul 11, 2018 at 8:48 AM Aldy Hernandez wrote: >>>>> >>>>> Hmmm, I thi

Re: [PATCH][GCC][AARCH64] Canonicalize aarch64 widening simd plus insns

2018-07-12 Thread Richard Sandiford
Looks good to me FWIW (not a maintainer), just a minor formatting thing: Matthew Malcomson writes: > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > aac5fa146ed8dde4507a0eb4ad6a07ce78d2f0cd..67b29cbe2cad91e031ee23be656ec61a403f2cf9 > 100644 > --

Re: Add IFN_COND_FMA functions

2018-07-12 Thread Richard Sandiford
Richard Biener writes: > On Thu, May 24, 2018 at 2:08 PM Richard Sandiford < > richard.sandif...@linaro.org> wrote: > >> This patch adds conditional equivalents of the IFN_FMA built-in functions. >> Most of it is just a mechanical extension of the binary stuff. >

[gen/AArch64] Generate helpers for substituting iterator values into pattern names

2018-07-13 Thread Richard Sandiford
but OK for the AArch64 parts? Any objections to this approach or syntax? Richard 2018-07-13 Richard Sandiford gcc/ * doc/md.texi: Expand the documentation of instruction names to mention port-local uses. Document '@' in pattern names. * read-md.h (overloaded_insta

Re: [PATCH]Use MIN/MAX_EXPR for intrinsics or __builtin_fmin/max when appropriate

2018-07-18 Thread Richard Sandiford
Richard Biener writes: > On Wed, Jul 18, 2018 at 11:50 AM Kyrill Tkachov > wrote: >> >> >> On 18/07/18 10:44, Richard Biener wrote: >> > On Tue, Jul 17, 2018 at 3:46 PM Kyrill Tkachov >> > wrote: >> >> Hi Richard, >> >> >> >> On 17/07/18 14:27, Richard Biener wrote: >> >>> On Tue, Jul 17, 2018 a

Re: [PATCH][Fortran][v2] Use MIN/MAX_EXPR for min/max intrinsics

2018-07-18 Thread Richard Sandiford
Thanks for doing this. Kyrill Tkachov writes: > + calc = build_call_expr_internal_loc (input_location, ifn, type, > + 2, mvar, convert (type, val)); (indentation looks off) > diff --git a/gcc/testsuite/gfortran.dg/max_fmaxl_aarch64.f90 > b/gcc/testsuite

[wwwdocs] Document new sve-acle-branch

2018-07-18 Thread Richard Sandiford
E ACLE implementation. + The branch is based off and merged with trunk. Please send patches to + gcc-patches with an [SVE ACLE] tag in the subject line. + There's no need to use changelogs; the changelogs will instead be + written when the work is ready to be merged into trunk. The branch i

[AArch64] Add support for 16-bit FMOV immediates

2018-07-18 Thread Richard Sandiford
gives: fmov h0, 1.328125e-1 Tested on aarch64-linux-gnu, both with and without SVE. OK to install? Richard 2018-07-18 Richard Sandiford gcc/ * config/aarch64/aarch64.c (aarch64_float_const_representable_p): Allow HFmode constants if TARGET_FP_F16INST. gcc/test

[SVE ACLE] Add initial support for arm_sve.h

2018-07-18 Thread Richard Sandiford
This patch adds the target framework for handling the SVE ACLE, starting with four functions: svadd, svptrue, svsub and svsubr. The ACLE has both overloaded and non-overloaded names. Without the equivalent of clang's __attribute__((overloadable)), a header file that declared all functions would n

Re: RFC: Patch to implement Aarch64 SIMD ABI

2018-07-19 Thread Richard Sandiford
Hi, Thanks for doing this. Steve Ellcey writes: > This is a patch to support the Aarch64 SIMD ABI [1] in GCC.  I intend > to eventually follow this up with two more patches; one to define the > TARGET_SIMD_CLONE* macros and one to improve the GCC register > allocation/usage when calling SIMD fun

Re: [SVE ACLE] Add initial support for arm_sve.h

2018-07-19 Thread Richard Sandiford
Richard Biener writes: > On Wed, Jul 18, 2018 at 8:08 PM Richard Sandiford > wrote: >> >> This patch adds the target framework for handling the SVE ACLE, >> starting with four functions: svadd, svptrue, svsub and svsubr. >> >> The ACLE has both overloade

Handle SLP of call pattern statements

2018-07-20 Thread Richard Sandiford
re handled correctly. The patch therefore just removes the whole if block. The loop also needed commutative swapping to be extended to at least AVG_FLOOR. This gives +3.9% on 525.x264_r at -O3. Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf and x86_64-li

Fold pointer range checks with equal spans

2018-07-20 Thread Richard Sandiford
alternative would be not to do this in match.pd and instead get tree-data-ref.c to do it itself. I started out that way but thought the match.pd approach seemed cleaner. Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf and x86_64-linux-gnu. OK to install? Richard 2018-

Make the vectoriser drop to strided accesses for stores with gaps

2018-07-20 Thread Richard Sandiford
"interleaved store with gaps\n"); return false; } But I think we should do that separately and see what the fallout from this change is first. Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf and x86_64-linux-gnu. OK to install? Richard 2018-0

Re: Make the vectoriser drop to strided accesses for stores with gaps

2018-07-20 Thread Richard Sandiford
Richard Biener writes: > On Fri, Jul 20, 2018 at 12:57 PM Richard Sandiford > wrote: >> >> We could vectorise: >> >> for (...) >>{ >> a[0] = ...; >> a[1] = ...; >> a[2] = ...; >> a[3]

Re: Fold pointer range checks with equal spans

2018-07-23 Thread Richard Sandiford
Marc Glisse writes: > On Fri, 20 Jul 2018, Richard Sandiford wrote: > >> --- gcc/match.pd 2018-07-18 18:44:22.565914281 +0100 >> +++ gcc/match.pd 2018-07-20 11:24:33.692045585 +0100 >> @@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) >>(if

[00/46] Remove vinfo_for_stmt etc.

2018-07-24 Thread Richard Sandiford
The aim of this series is to: (a) make the vectoriser refer to statements using its own expanded stmt_vec_info rather than the underlying gimple stmt. This reduces the number of stmt lookups from 480 in current sources to under 100. (b) make the remaining lookups relative to the owning vec_

[01/46] Move special cases out of get_initial_def_for_reduction

2018-07-24 Thread Richard Sandiford
This minor clean-up avoids repeating the test for double reductions and also moves the vect_get_vec_def_for_operand call to the same function as the corresponding vect_get_vec_def_for_stmt_copy. 2018-07-24 Richard Sandiford gcc/ * tree-vect-loop.c (get_initial_def_for_reduction

[02/46] Remove dead vectorizable_reduction code

2018-07-24 Thread Richard Sandiford
nt is to remove the only path through vectorizable_reduction in which stmt and stmt_info refer to different statements. 2018-07-24 Richard Sandiford gcc/ * tree-vect-loop.c (vectorizable_reduction): Assert that the function is not called for second and subsequent members of
