Thanks for the update.
Manos Anagnostakis writes:
> Improves on: 834fc2bf
>
> This improves the code structure of the ldp-stp policies
> patch introduced in 834fc2bf
>
> Bootstrapped and regtested on aarch64-linux.
>
> gcc/ChangeLog:
> * config/aarch64/aarch64-opts.h (enum aarch64_ldp_polic
Richard Biener writes:
> On Thu, 28 Sep 2023, Jakub Jelinek wrote:
>
>> Hi!
>>
>> On Tue, Aug 29, 2023 at 05:09:52PM +0200, Jakub Jelinek via Gcc-patches
>> wrote:
>> > On Tue, Aug 29, 2023 at 11:42:48AM +0100, Richard Sandiford wrote:
>> > >
Manos Anagnostakis writes:
> Improves on: 834fc2bf
>
> This improves the code structure of the ldp-stp policies
> patch introduced in 834fc2bf
>
> Bootstrapped and regtested on aarch64-linux.
>
> gcc/ChangeLog:
> * config/aarch64/aarch64-opts.h (enum aarch64_ldp_policy): Removed.
> (en
rtl-tests.cc and simplify-rtx.cc used partial specialisation
to try to restrict the NUM_POLY_INT_COEFFS>1 tests without
resorting to preprocessor tests. That now triggers an error
in some configurations, since the NUM_POLY_INT_COEFFS>1 tests
used the global poly_int64, whose definition does not de
Jan-Benedict Glaw writes:
> Hi Richard,
>
> On Thu, 2023-09-28 10:55:46 +0100, Richard Sandiford
> wrote:
>> poly_int was written before the switch to C++11 and so couldn't
>> use explicit default constructors. This led to an awkward split
>> between poly_int
Tamar Christina writes:
> Hi All,
>
> This adds an implementation for masked copysign along with an optimized
> pattern for masked copysign (x, -1).
It feels like we're ending up with a lot of AArch64-specific code that
just hard-codes the observation that changing the sign is equivalent to
chang
Tamar Christina writes:
> Hi,
>
>> The lowpart_subreg should simplify this back into CONST0_RTX (mode),
>> making it no different from:
>>
>> emit_move_insn (target, CONST0_RTX (mode));
>>
>> If the intention is to share zeros between modes (sounds good!), then I think
>> the subreg needs to
Tamar Christina writes:
>> -----Original Message-----
>> From: Richard Sandiford
>> Sent: Thursday, October 5, 2023 8:29 PM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkach
"Jose E. Marchesi" writes:
> ping
I don't know this code very well, and AFAIR haven't worked
with an assembler that requires external declarations, but since
it's at a second ping :)
>
>> ping
>>
>>> [Differences from V1:
>>> - Prototype for call_from_call_insn moved before comment block.
>
Richard Biener writes:
> On Thu, 5 Oct 2023, Tamar Christina wrote:
>
>> > I suppose the idea is that -abs(x) might be easier to optimize with other
>> > patterns (consider a - copysign(x,...), optimizing to a + abs(x)).
>> >
>> > For abs vs copysign it's a canonicalization, but (negate (abs @0))
Richard Biener writes:
> On Thu, Oct 5, 2023 at 10:46 PM Tamar Christina
> wrote:
>>
>> > -----Original Message-----
>> > From: Richard Sandiford
>> > Sent: Thursday, October 5, 2023 9:26 PM
>> > To: Tamar Christina
>> > Cc: gcc
Richard Biener writes:
>> Am 07.10.2023 um 11:23 schrieb Richard Sandiford
>> >> Richard Biener writes:
>>> On Thu, 5 Oct 2023, Tamar Christina wrote:
>>>
>>>>> I suppose the idea is that -abs(x) might be easier to optimize with other
>>
Richard Earnshaw writes:
> On 03/10/2023 16:18, Victor Do Nascimento wrote:
>> In implementing the ACLE read/write system register builtins it was
>> observed that leaving argument type checking to be done at expand-time
>> meant that poorly-formed function calls were being "fixed" by certain
>> o
Saurabh Jha writes:
> On 10/6/2023 2:24 PM, Saurabh Jha wrote:
>> Hey,
>>
>> This patch adds support for the Cortex-X4 CPU to GCC.
>>
>> Regression tested on the aarch64-none-elf target and found no regressions.
>>
>> Okay for gcc-master? I don't have commit access so if it looks okay,
>> could som
Robin Dapp writes:
> Hi Richard,
>
> cool, thanks. I just gave it a try with my test cases and it does what
> it is supposed to do, at least if I disable the register pressure check :)
> A cursory look over the test suite showed no major regressions and just
> some overly specific tests.
>
> My t
Robin Dapp writes:
> Hi Tamar,
>
>> The only comment I have is whether you actually need this helper
>> function? It looks like all the uses of it are in cases you have, or
>> will call conditional_internal_fn_code directly.
> removed the cond_fn_p entirely in the attached v3.
>
> Bootstrapped and
Tamar Christina writes:
>> -----Original Message-----
>> From: Richard Sandiford
>> Sent: Saturday, October 7, 2023 10:58 AM
>> To: Richard Biener
>> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org;
>> nd ; Richard Earnshaw ;
>> Marcus Shawcroft ; Kyrylo
Tamar Christina writes:
>> -----Original Message-----
>> From: Richard Sandiford
>> Sent: Monday, October 9, 2023 10:56 AM
>> To: Tamar Christina
>> Cc: Richard Biener ; gcc-patches@gcc.gnu.org;
>> nd ; Richard Earnshaw ;
>> Marcus Shawcroft ; Kyrylo
Prathamesh Kulkarni writes:
> Hi,
> The attached patch attempts to fix PR111648.
> As mentioned in PR, the issue is when a1 is a multiple of vector
> length, we end up creating following encoding in result: { base_elem,
> arg[0], arg[1], ... } (assuming S = 1),
> where arg is chosen input vector,
Jakub Jelinek writes:
> Hi!
>
> As mentioned in the _BitInt support thread, _BitInt(N) is currently limited
> by the wide_int/widest_int maximum precision limitation, which is depending
> on target 191, 319, 575 or 703 bits (one less than WIDE_INT_MAX_PRECISION).
> That is fairly low limit for _Bi
Robin Dapp writes:
>> It'd be good to expand on this comment a bit. What kind of COND are you
>> anticipating? A COND with the neutral op as the else value, so that the
>> PLUS_EXPR (or whatever) can remain unconditional? If so, it would be
>> good to sketch briefly how that happens, and why it
Jakub Jelinek writes:
> On Mon, Oct 09, 2023 at 03:44:10PM +0200, Jakub Jelinek wrote:
>> Thanks, just quick answers, will work on patch adjustments after trying to
>> get rid of rwide_int (seems dwarf2out has very limited needs from it, just
>> some routine to construct it in GCed memory (and nev
Richard Biener writes:
> On Tue, Aug 22, 2023 at 12:42 PM Szabolcs Nagy via Gcc-patches
> wrote:
>>
>> From: Richard Sandiford
>>
>> The prologue/epilogue pass allows the prologue sequence
>> to contain jumps. The sequence is then pa
"Richard Earnshaw (lists)" writes:
> On 09/10/2023 14:12, Victor Do Nascimento wrote:
>>
>>
>> On 10/7/23 12:53, Richard Sandiford wrote:
>>> Richard Earnshaw writes:
>>>> On 03/10/2023 16:18, Victor Do Nascimento wrote:
>>>>>
Robin Dapp writes:
>> It wasn't very clear, sorry, but it was the last sentence I was asking
>> for clarification on, not the other bits. Why do we want to avoid
>> generating a COND_ADD when the operand is a vectorisable call?
>
> Ah, I see, apologies. Upon thinking about it a bit more (thanks)
Jakub Jelinek writes:
> @@ -2036,11 +2075,20 @@ wi::lrshift_large (HOST_WIDE_INT *val, c
> unsigned int xlen, unsigned int xprecision,
> unsigned int precision, unsigned int shift)
> {
> - unsigned int len = rshift_large_common (val, xval, xlen, xprecision,
> s
HAO CHEN GUI writes:
> Hi,
> Vector mode instructions are efficient on some targets (e.g. ppc64).
> This patch enables vector mode for compare_by_pieces. The non-member
> function widest_fixed_size_mode_for_size takes by_pieces_operation
> as the second argument and decides whether vector mode is
Jakub Jelinek writes:
> On Thu, Oct 12, 2023 at 11:54:14AM +0100, Richard Sandiford wrote:
>> Jakub Jelinek writes:
>> > @@ -2036,11 +2075,20 @@ wi::lrshift_large (HOST_WIDE_INT *val, c
>> > unsigned int xlen, unsigned int xprecision,
>> >
"Jose E. Marchesi" writes:
> Hi Richard.
> Thanks for looking at this! :)
>
>
>> "Jose E. Marchesi" writes:
>>> ping
>>
>> I don't know this code very well, and AFAIR haven't worked
>> with an assembler that requires external declarations, but since
>> it's at a second ping :)
>>
>>>
pi
Robin Dapp via Gcc-patches writes:
> Hi,
>
> as Juzhe noticed in gcc.dg/pr92301.c there was still something missing in
> the last patch. The attached v2 makes sure we always have a COND_LEN
> operation
> before returning true and initializes len and bias even if they are unused.
>
> Bootstrapped
"Jose E. Marchesi" writes:
>> "Jose E. Marchesi" writes:
>>> Hi Richard.
>>> Thanks for looking at this! :)
>>>
>>>
"Jose E. Marchesi" writes:
> ping
I don't know this code very well, and AFAIR haven't worked
with an assembler that requires external declarations, but
Richard Sandiford writes:
> Robin Dapp via Gcc-patches writes:
>> [...]
>> @@ -386,9 +390,29 @@ try_conditional_simplification (internal_fn ifn,
>> gimple_match_op *res_op,
>> default:
>>gcc_unreachable ();
>> }
>> - *res_op = c
Tamar Christina writes:
> Hi All,
>
> At the moment, trying to use -march=armv9-a with any ACLE header such as
> arm_neon.h results in rows and rows of warnings saying:
>
> : warning: "__ARM_ARCH" redefined
> : note: this is the location of the previous definition
>
> This is obviously not useful
"Roger Sayle" writes:
> I'd like to ping my patch for restoring bootstrap using g++ 4.8.5
> (the system compiler on RHEL 7 and later systems).
> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632008.html
>
> Note the preprocessor #ifs can be removed; they are only there to document
> why t
Prathamesh Kulkarni writes:
> On Wed, 11 Oct 2023 at 16:57, Prathamesh Kulkarni
> wrote:
>>
>> On Wed, 11 Oct 2023 at 16:42, Prathamesh Kulkarni
>> wrote:
>> >
>> > On Mon, 9 Oct 2023 at 17:05, Richard Sandiford
>> > wrote:
>> > >
Juzhe-Zhong writes:
> This patch fixes this following FAILs in RISC-V regression:
>
> FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects scan-tree-dump
> vect "Loop contains only SLP stmts"
> FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP
> stmts"
> FAIL: g
Thanks for the update. The comments below are mostly asking for
cosmetic changes.
HAO CHEN GUI writes:
> Hi,
> Vector mode instructions are efficient for compare on some targets.
> This patch enables vector mode for compare_by_pieces. Currently,
> vector mode is enabled for compare, set and cl
Robin Dapp writes:
>> Why are the contents of this if statement wrong for COND_LEN?
>> If the "else" value doesn't matter, then the masked form can use
>> the "then" value for all elements. I would have expected the same
>> thing to be true of COND_LEN.
>
> Right, that one was overly pessimistic.
Robin Dapp writes:
>>> I don't know much about valueisation either :) But it does feel
>>> like we're working around the lack of a LEN form of COND_EXPR.
>>> In other words, it seems odd that we can do:
>>>
>>> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias)
>>>
>>> but we can't do:
>>>
>>> IFN_C
Richard Biener writes:
> On Mon, Oct 16, 2023 at 11:59 PM Richard Sandiford
> wrote:
>>
>> Robin Dapp writes:
>> >> Why are the contents of this if statement wrong for COND_LEN?
>> >> If the "else" value doesn't matter, then the masked
Robin Dapp writes:
> Thank you for the explanation.
>
> So, assuming I added an IFN_VCOND_MASK and IFN_VCOND_MASK_LEN along
> with the respective helper and expand functions, what would be the
> way forward?
IMO it'd be worth starting with the _LEN form only.
> Generate an IFN_VCOND_MASK(_LEN) h
aarch64_save/restore_callee_saves looped over registers in register
number order. This in turn meant that we could only use LDP and STP
for registers that were consecutive both number-wise and
offset-wise (after unsaved registers are excluded).
This patch instead builds lists of the registers tha
Now that the prologue and epilogue code iterates over saved
registers in offset order, we can put the LR save slot first
without compromising LDP/STP formation.
This isn't worthwhile when shadow call stacks are enabled, since the
first two registers are also push/pop candidates, and LR cannot be
p
Jakub Jelinek writes:
> On Sun, Oct 15, 2023 at 12:43:10PM +0100, Richard Sandiford wrote:
>> It seemed like there was considerable support for bumping the minimum
>> to beyond 4.8. I think we should wait until a decision has been made
>> before adding more 4.8 workarounds.
Vlad, is it OK if I backport the patch below to fix
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111528 ? Jakub has
given a conditional OK on irc.
Thanks,
Richard
Richard Sandiford writes:
> While backporting another patch to an earlier release, I hit a
> situation in
Prathamesh Kulkarni writes:
> On Tue, 17 Oct 2023 at 02:40, Richard Sandiford
> wrote:
>> Prathamesh Kulkarni writes:
>> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
>> > index 4f8561509ff..55a6a68c16c 100644
>> > --- a/gcc/fold-const.cc
>>
Alex Coplan writes:
> In the case that !insn->is_debug_insn () && next->is_debug_insn (), this
> function was missing an update of the prev pointer on the first nondebug
> insn following the sequence of debug insns starting at next.
>
> This can lead to corruption of the insn chain, in that we end
Alex Coplan writes:
> Add a helper routine to access-utils.h which removes the memory access
> from an access_array, if it has one.
>
> Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
>
> gcc/ChangeLog:
>
> * rtl-ssa/access-utils.h (drop_memory_access): New.
> ---
> g
Alex Coplan writes:
> This is needed by the upcoming aarch64 load pair pass, as it can
> re-order stores (when alias analysis determines this is safe) and thus
> change which mem def a given use consumes (in the RTL-SSA view, there is
> no alias disambiguation of memory).
>
> Bootstrapped/regteste
Alex Coplan writes:
> Currently, rtl_ssa::change_insns requires all new uses and defs to be
> specified explicitly. This turns out to be rather inconvenient for
> forming load pairs in the new aarch64 load pair pass, as the pass has to
> determine which mem def the final load pair consumes, and t
Alex Coplan writes:
> The test is looking for individual stores which are able to be merged
> into stp instructions. The test currently passes -fno-schedule-fusion
> -fno-peephole2, presumably to prevent these stores from being turned
> into stps, but this is no longer sufficient with the new ldp
Alex Coplan writes:
> With the new ldp/stp pass enabled, there is a change in the codegen for
> this test as follows:
>
> add x8, sp, 16
> ptrue p3.h, mul3
> str p3, [x8]
> - str x8, [sp, 8]
> - str x9, [sp]
> + stp x9, x8, [sp]
>
Alex Coplan writes:
> The test is trying to check that we don't use q-register stores with
> -mstrict-align, so actually check specifically for that.
>
> This is a prerequisite to avoid regressing:
>
> scan-assembler-not "add\tx0, x0, :"
>
> with the upcoming ldp fusion pass, as we change where th
Alex Coplan writes:
> This patch generalises the TFmode load/store pair patterns to TImode and
> TDmode. This brings them in line with the DXmode patterns, and uses the
> same technique with separate mode iterators (TX and TX2) to allow for
> distinct modes in each arm of the load/store pair.
>
>
Generally looks really good. Some comments below.
Victor Do Nascimento writes:
> Given the implementation of a mechanism of encoding system registers
> into GCC, this patch provides the mechanism of validating their use by
> the compiler. In particular, this involves:
>
> 1. Ensuring a suppli
Victor Do Nascimento writes:
> Implement the aarch64 intrinsics for reading and writing system
> registers with the following signatures:
>
> uint32_t __arm_rsr(const char *special_register);
> uint64_t __arm_rsr64(const char *special_register);
> void* __arm_rsrp(const char *spe
Victor Do Nascimento writes:
> This patch defines the structure of a new .def file used for
> representing the aarch64 system registers, what information it should
> hold and the basic framework in GCC to process this file.
>
> Entries in the aarch64-system-regs.def file should be as follows:
>
>
Victor Do Nascimento writes:
> Motivated by the need to print system register names in output
> assembly, this patch adds the required logic to
> `aarch64_print_operand' to accept rtxs of type CONST_STRING and
> process these accordingly.
>
> Consequently, an rtx such as:
>
> (set (reg/i:DI 0 x0
Victor Do Nascimento writes:
> In implementing the ACLE read/write system register builtins it was
> observed that leaving argument type checking to be done at expand-time
> meant that poorly-formed function calls were being "fixed" by certain
> optimization passes, meaning bad code wasn't being p
Victor Do Nascimento writes:
> Add a build-time test to check whether system register data, as
> imported from `aarch64-sys-reg.def' has any duplicate entries.
>
> Duplicate entries are defined as any two SYSREG entries in the .def
> file which share the same encoding values (as specified by its `
Richard Henderson writes:
> @@ -2687,34 +2738,60 @@
>aarch64_sve_prepare_conditional_op (operands, 5, );
> })
>
> -;; Predicated floating-point operations.
> -(define_insn "*cond_"
> - [(set (match_operand:SVE_F 0 "register_operand" "=w")
> +;; Predicated floating-point operations with sel
Richard Henderson writes:
> * config/aarch64/aarch64.md (movprfx): New attr.
> (length): Default movprfx to 8.
> * config/aarch64/aarch64-sve.md (*mul3): Add movprfx alt.
> (*madd, *msub (*mul3_highpart): Likewise.
> (*3): Likewise.
> (*v3): Likewise.
>
Richard Henderson writes:
> The predicate is present within the containing UNSPEC_SEL;
> there is no need to duplicate it.
>
> * config/aarch64/aarch64-sve.md (cond_):
> Remove match_dup 1 from the inner unspec.
> (*cond_): Likewise.
OK, thanks.
Richard
> ---
> gcc/config/aar
Richard Henderson writes:
> * config/aarch64/aarch64-protos.h, config/aarch64/aarch64.c
> (aarch64_sve_prepare_conditional_op): Remove.
> * config/aarch64/aarch64-sve.md (cond_):
> Allow aarch64_simd_reg_or_zero as select operand; remove
> the aarch64_sve_prepare_cond
Christophe Lyon writes:
> On Fri, 29 Jun 2018 at 13:36, Richard Sandiford
> wrote:
>>
>> Richard Sandiford writes:
>> > This patch is the main part of PR85694. The aim is to recognise at least:
>> >
>> > signed char *a, *b, *c;
>> >
Richard Biener writes:
> On Fri, 22 Jun 2018, David Malcolm wrote:
>
>> NightStrike and I were chatting on IRC last week about
>> issues with trying to vectorize the following code:
>>
>> #include <vector>
>> std::size_t f(std::vector<std::vector<std::size_t>> const & v) {
>> std::size_t ret = 0;
>> for (auto const & w
an up the PATTERN_DEF_SEQ handling, but they
only apply after the complete PR85694 sequence, whereas this needs
to go in before 14/n.
Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
OK to install?
Richard
2018-07-03 Richard Sandiford
gcc/
* tree-vect-patte
passing a single
statement instead of a vector. It also gets rid of the clearing of
STMT_VINFO_RELATED_STMT on failure, since no recognisers use it now.
Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
OK to install?
Richard
2018-07-03 Richard Sandiford
gcc
install?
Richard
2018-07-03 Richard Sandiford
gcc/
* tree-vect-patterns.c (new_pattern_def_seq): Delete.
(vect_recog_dot_prod_pattern, vect_recog_sad_pattern)
(vect_recog_widen_op_pattern, vect_recog_over_widening_pattern)
(vect_recog_rotate_pattern
The PR85694 series added a vectype argument to append_pattern_def_seq.
This patch makes more callers use it.
Tested on aarch64-linux-gnu, arm-linux-gnueabihf and x86_64-linux-gnu.
OK to install?
Richard
2018-07-03 Richard Sandiford
gcc/
* tree-vect-patterns.c
Richard Biener writes:
> On Fri, Jun 29, 2018 at 1:36 PM Richard Sandiford
> wrote:
>>
>> Richard Sandiford writes:
>> > This patch is the main part of PR85694. The aim is to recognise at least:
>> >
>> > signed char *a, *b, *c;
>> >
Christophe Lyon writes:
> On Tue, 3 Jul 2018 at 12:02, Richard Sandiford
> wrote:
>>
>> Richard Biener writes:
>> > On Fri, Jun 29, 2018 at 1:36 PM Richard Sandiford
>> > wrote:
>> >>
>> >> Richard Sandiford writes:
>> >>
Finally getting back to this...
Richard Biener writes:
> On Wed, Jun 6, 2018 at 10:16 PM Richard Sandiford
> wrote:
>>
>> > On Thu, May 24, 2018 at 11:36 AM Richard Sandiford
>> > wrote:
>> >>
>> >> This patch adds match.pd support for
Tom de Vries writes:
> [ was: [PATCH, testsuite/guality] Use line number vars in gdb-test ]
> On Thu, Jun 28, 2018 at 07:49:30PM +0200, Tom de Vries wrote:
>> Hi,
>>
>> I played around with pr45882.c and ran into FAILs. It took me a while to
>> realize that the FAILs were due to the gdb-test (a
Paul Koning writes:
> Currently DONE and FAIL are documented only for define_expand, but
> they also work in essentially the same way for define_split and
> define_peephole2.
>
> If FAIL is used in a define_insn_and_split, the output pattern cannot
> be the usual "#" dummy value.
>
> This patch up
Paul Koning writes:
> @@ -8615,6 +8639,34 @@ so here's a silly made-up example:
>"")
> @end smallexample
>
> +There are two special macros defined for use in the preparation statements:
> +@code{DONE} and @code{FAIL}. Use them with a following semicolon,
> +as a statement.
> +
> +@table @c
Richard Biener writes:
> On Fri, Jul 6, 2018 at 9:50 AM Aldy Hernandez wrote:
>>
>>
>>
>> On 07/05/2018 05:50 AM, Richard Biener wrote:
>> > On Thu, Jul 5, 2018 at 9:35 AM Aldy Hernandez wrote:
>> >>
>> >> The reason for this patch are the changes showcased in tree-vrp.c.
>> >> Basically I'd lik
David Malcolm writes:
> On Mon, 2018-06-25 at 11:10 +0200, Richard Biener wrote:
>> On Fri, 22 Jun 2018, David Malcolm wrote:
>>
>> > NightStrike and I were chatting on IRC last week about
>> > issues with trying to vectorize the following code:
>> >
>> > #include <vector>
>> > std::size_t f(std::vector
Richard Biener writes:
> On Wed, Jul 11, 2018 at 8:48 AM Aldy Hernandez wrote:
>>
>> Hmmm, I think we can do better, and since this hasn't been reviewed yet,
>> I don't think anyone will mind the adjustment to the patch ;-).
>>
>> I really hate int_const_binop_SOME_RANDOM_NUMBER. We should abstr
Aldy Hernandez writes:
> On 07/11/2018 08:52 AM, Richard Biener wrote:
>> On Wed, Jul 11, 2018 at 8:48 AM Aldy Hernandez wrote:
>>>
>>> Hmmm, I think we can do better, and since this hasn't been reviewed yet,
>>> I don't think anyone will mind the adjustment to the patch ;-).
>>>
>>> I really hat
Jeff Law writes:
> On 07/11/2018 02:07 PM, Steve Ellcey wrote:
>> I have a reload/register allocation question and possible patch. While
>> working on the Aarch64 SIMD ABI[1] I ran into a problem where GCC was
>> saving and restoring registers that it did not need to. I tracked it
>> down to lra
Aldy Hernandez writes:
> On 07/11/2018 01:33 PM, Richard Sandiford wrote:
>> Aldy Hernandez writes:
>>> On 07/11/2018 08:52 AM, Richard Biener wrote:
>>>> On Wed, Jul 11, 2018 at 8:48 AM Aldy Hernandez wrote:
>>>>>
>>>>> Hmmm, I thi
Looks good to me FWIW (not a maintainer), just a minor formatting thing:
Matthew Malcomson writes:
> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> index
> aac5fa146ed8dde4507a0eb4ad6a07ce78d2f0cd..67b29cbe2cad91e031ee23be656ec61a403f2cf9
> 100644
> --
Richard Biener writes:
> On Thu, May 24, 2018 at 2:08 PM Richard Sandiford <
> richard.sandif...@linaro.org> wrote:
>
>> This patch adds conditional equivalents of the IFN_FMA built-in functions.
>> Most of it is just a mechanical extension of the binary stuff.
>
but OK for the AArch64 parts?
Any objections to this approach or syntax?
Richard
2018-07-13 Richard Sandiford
gcc/
* doc/md.texi: Expand the documentation of instruction names
to mention port-local uses. Document '@' in pattern names.
* read-md.h (overloaded_insta
Richard Biener writes:
> On Wed, Jul 18, 2018 at 11:50 AM Kyrill Tkachov
> wrote:
>>
>>
>> On 18/07/18 10:44, Richard Biener wrote:
>> > On Tue, Jul 17, 2018 at 3:46 PM Kyrill Tkachov
>> > wrote:
>> >> Hi Richard,
>> >>
>> >> On 17/07/18 14:27, Richard Biener wrote:
>> >>> On Tue, Jul 17, 2018 a
Thanks for doing this.
Kyrill Tkachov writes:
> + calc = build_call_expr_internal_loc (input_location, ifn, type,
> + 2, mvar, convert (type, val));
(indentation looks off)
> diff --git a/gcc/testsuite/gfortran.dg/max_fmaxl_aarch64.f90
> b/gcc/testsuite
E ACLE implementation.
+ The branch is based off and merged with trunk. Please send patches to
+ gcc-patches with an [SVE ACLE] tag in the subject line.
+ There's no need to use changelogs; the changelogs will instead be
+ written when the work is ready to be merged into trunk. The branch i
gives:
fmov    h0, 1.328125e-1
Tested on aarch64-linux-gnu, both with and without SVE. OK to install?
Richard
2018-07-18 Richard Sandiford
gcc/
* config/aarch64/aarch64.c (aarch64_float_const_representable_p):
Allow HFmode constants if TARGET_FP_F16INST.
gcc/test
This patch adds the target framework for handling the SVE ACLE,
starting with four functions: svadd, svptrue, svsub and svsubr.
The ACLE has both overloaded and non-overloaded names. Without
the equivalent of clang's __attribute__((overloadable)), a header
file that declared all functions would n
Hi,
Thanks for doing this.
Steve Ellcey writes:
> This is a patch to support the Aarch64 SIMD ABI [1] in GCC. I intend
> to eventually follow this up with two more patches; one to define the
> TARGET_SIMD_CLONE* macros and one to improve the GCC register
> allocation/usage when calling SIMD fun
Richard Biener writes:
> On Wed, Jul 18, 2018 at 8:08 PM Richard Sandiford
> wrote:
>>
>> This patch adds the target framework for handling the SVE ACLE,
>> starting with four functions: svadd, svptrue, svsub and svsubr.
>>
>> The ACLE has both overloade
re handled correctly.
The patch therefore just removes the whole if block.
The loop also needed commutative swapping to be extended to at least
AVG_FLOOR.
This gives +3.9% on 525.x264_r at -O3.
Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
and x86_64-li
alternative would be not to do this in match.pd and instead get
tree-data-ref.c to do it itself. I started out that way but thought
the match.pd approach seemed cleaner.
Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
and x86_64-linux-gnu. OK to install?
Richard
2018-
"interleaved store with gaps\n");
return false;
}
But I think we should do that separately and see what the fallout
from this change is first.
Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
and x86_64-linux-gnu. OK to install?
Richard
2018-0
Richard Biener writes:
> On Fri, Jul 20, 2018 at 12:57 PM Richard Sandiford
> wrote:
>>
>> We could vectorise:
>>
>> for (...)
>>{
>> a[0] = ...;
>> a[1] = ...;
>> a[2] = ...;
>> a[3]
Marc Glisse writes:
> On Fri, 20 Jul 2018, Richard Sandiford wrote:
>
>> --- gcc/match.pd 2018-07-18 18:44:22.565914281 +0100
>> +++ gcc/match.pd 2018-07-20 11:24:33.692045585 +0100
>> @@ -4924,3 +4924,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>(if
The aim of this series is to:
(a) make the vectoriser refer to statements using its own expanded
stmt_vec_info rather than the underlying gimple stmt. This reduces
the number of stmt lookups from 480 in current sources to under 100.
(b) make the remaining lookups relative to the owning vec_
This minor clean-up avoids repeating the test for double reductions
and also moves the vect_get_vec_def_for_operand call to the same
function as the corresponding vect_get_vec_def_for_stmt_copy.
2018-07-24 Richard Sandiford
gcc/
* tree-vect-loop.c (get_initial_def_for_reduction
nt is to remove the only path through vectorizable_reduction
in which stmt and stmt_info refer to different statements.
2018-07-24 Richard Sandiford
gcc/
* tree-vect-loop.c (vectorizable_reduction): Assert that the
function is not called for second and subsequent members of