https://github.com/ARM-software/acle/pull/199 adds a new feature
macro for RCPC, for use in things like inline assembly. This patch
adds the associated support to GCC.
Also, RCPC is required for Armv8.3-A and later, but the armv8.3-a
entry didn't include it. This was probably harmless in practice.
Philipp Tomsich writes:
> This brings the extensions detected by -mcpu=native on Ampere-1 systems
> in sync with the defaults generated for -mcpu=ampere1.
>
> Note that some kernel versions may misreport the presence of PAUTH and
> PREDRES (i.e., -mcpu=native will add 'nopauth' and 'nopredres').
>
Wilco Dijkstra writes:
> Improve immediate expansion of immediates which can be created from a
> bitmask immediate and 2 MOVKs. This reduces the number of 4-instruction
> immediates in SPECINT/FP by 10-15%.
>
> Passes regress, OK for commit?
>
> gcc/ChangeLog:
>
> PR target/106583
>
Wilco Dijkstra writes:
> Since AArch64 sets all flags on logical operations, comparisons with zero
> can be combined into an AND even if the condition is LE or GT.
>
> Passes regress, OK for commit?
>
> gcc:
> PR target/105773
> * config/aarch64/aarch64.cc (aarch64_select_cc_mode):
Philipp Tomsich writes:
> Fixes: 341573406b39
>
> Don't subtract one from the result of strnlen() when trying to point
> to the first character after the current string. This issue would
> cause individual characters (where the 128 byte buffers are stitched
> together) to be lost.
>
> gcc/ChangeLog:
Philipp Tomsich writes:
> This brings the extensions detected by -mcpu=native on Ampere-1 systems
> in sync with the defaults generated for -mcpu=ampere1.
>
> Note that some early kernel versions on Ampere1 may misreport the
> presence of PAUTH and PREDRES (i.e., -mcpu=native will add 'nopauth'
>
Wilco Dijkstra via Gcc-patches writes:
> Hi Richard,
>
>> Did you consider handling the case where the movks aren't for
>> consecutive bitranges? E.g. the patch handles:
>
>> but it looks like it would be fairly easy to extend it to:
>>
>> 0x12345678
>
> Yes, with a more general search loop
Wilco Dijkstra writes:
> Hi Richard,
>
>>> Yes, with a more general search loop we can get that case too -
>>> it doesn't trigger much though. The code that checks for this is
>>> now refactored into a new function. Given there are now many
>>> more calls to aarch64_bitmask_imm, I added a streamli
Richard Biener writes:
> On Mon, 10 Oct 2022, Andrew Stubbs wrote:
>> On 10/10/2022 12:03, Richard Biener wrote:
>> > The following picks up the prototype by Ju-Zhe Zhong for vectorizing
>> > first order recurrences. That solves two TSVC missed optimization PRs.
>> >
>> > There's a new scalar cy
Richard Biener writes:
> + /* First-order recurrence autovectorization needs to handle permutation
> + with indices = [nunits-1, nunits, nunits+1, ...]. */
> + vec_perm_builder sel (nunits, 1, 3);
> + for (int i = 0; i < 3; ++i)
> +sel.quick_push (nunits - dist + i);
> + vec_perm_indi
Jakub Jelinek writes:
> On Wed, Oct 05, 2022 at 04:02:25PM -0400, Jason Merrill wrote:
>> > > > @@ -5716,7 +5716,13 @@ emit_store_flag_1 (rtx target, enum rtx_
>> > > >{
>> > > > machine_mode optab_mode = mclass == MODE_CC ? CCmode :
>> > > > compare_mode;
>> > > > icode =
Various parts of the omp code checked whether the size of a decl
was an INTEGER_CST in order to determine whether the decl was
variable-sized or not. If it was variable-sized, it was expected
to have a DECL_VALUE_EXPR replacement, as for VLAs.
This patch uses poly_int_tree_p instead, so that vari
Hi,
Thanks for the submission. Some comments below on this patch,
but otherwise it looks good. I hope to get to the other patches
in the series soon.
I haven't followed all of the previous discussion, so some of these
points might already have been discussed. Sorry in advance if so.
xucheng..
Hi,
Some comments below, but otherwise it looks good to me.
xucheng...@loongson.cn writes:
> […]
> +(define_memory_constraint "k"
> + "A memory operand whose address is formed by a base register and
> (optionally scaled)
> + index register."
> + (and (match_code "mem")
> + (not (match_
Hi,
Some comments below, but otherwise it looks good to me.
A few of the comments are about removing hook or macro definitions
that are the same as the default. Doing that helps people who want
to update a hook interface in future, since there are then fewer
places to adjust.
xucheng...@loongso
Richard Biener writes:
> This adds a --param to allow disabling of vectorization of
> floating point inductions. Ontop of -Ofast this should allow
> 549.fotonik3d_r to not miscompare.
>
> While I thought of a more elaborate way of disabling certain
> vectorization kinds (reductions also came to m
Xi Ruoyao writes:
> I think this one obvious. Ok for trunk?
OK, thanks.
Richard
>
> gcc/
>
> PR target/104842
> * config/mips/mips.h (LUI_OPERAND): Cast the input to an unsigned
> value before adding an offset.
> ---
> gcc/config/mips/mips.h | 2 +-
> 1 file changed, 1 inser
xucheng...@loongson.cn writes:
> +#ifndef _GCC_LOONGARCH_BASE_INTRIN_H
> +#define _GCC_LOONGARCH_BASE_INTRIN_H
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +typedef struct drdtime
> +{
> + unsigned long dvalue;
> + unsigned long dtimeid;
> +} __drdtime_t;
> +
> +typedef struct rdtime
xucheng...@loongson.cn writes:
> diff --git a/libgcc/config/loongarch/crti.S b/libgcc/config/loongarch/crti.S
> new file mode 100644
> index 000..27b7eab3626
> --- /dev/null
> +++ b/libgcc/config/loongarch/crti.S
> @@ -0,0 +1,43 @@
> +/* Copyright (C) 2021-2022 Free Software Foundation, Inc
xucheng...@loongson.cn writes:
> diff --git a/gcc/testsuite/lib/target-supports.exp
> b/gcc/testsuite/lib/target-supports.exp
> index 737e1a8913b..843b508b010 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -286,6 +286,10 @@ proc check_config
xucheng...@loongson.cn writes:
> From: chenglulu
>
> 2022-03-04 Chenghua Xu
> Lulu Cheng
>
> * contrib/config-list.mk: Add LoongArch triplet.
> * gcc/doc/install.texi: Add LoongArch options section.
> * gcc/doc/invoke.texi: Add LoongArch options section.
> *
Xi Ruoyao via Gcc-patches writes:
> On Fri, 2022-03-04 at 15:17 +0800, xucheng...@loongson.cn wrote:
>
>> The binutils has been merged into trunk:
>> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=560b3fe208255ae909b4b1c88ba9c28b09043307
>>
>> Note: We split -mabi= into -mabi=lp64d/f/s
Richard Biener via Gcc-patches writes:
> On Wed, Mar 2, 2022 at 10:18 PM H.J. Lu wrote:
>>
>> On Wed, Mar 02, 2022 at 09:51:26AM +0100, Richard Biener wrote:
>> > On Tue, Mar 1, 2022 at 11:41 PM H.J. Lu via Gcc-patches
>> > wrote:
>> > >
>> > > Add TARGET_FOLD_MEMCPY_MAX for the maximum number o
Xi Ruoyao writes:
> Bootstrapped and regtested on mips64el-linux-gnuabi64.
>
> I'm not sure if it's "correct" to clobber other registers during the
> zeroing of scratch registers. But I can't really come up with a better
> idea: on MIPS there is no simple way to clear one bit in FCSR (i. e.
> FCC
"Roger Sayle" writes:
> Hi Richard,
>> Yes, which is why I think the target should claim argument passing happens
> in reg:HI.
>
> Unfortunately, this hits another "feature" of the nvptx backend; it's a
>
> /* Implement TARGET_MODES_TIEABLE_P. */
> bool nvptx_modes_tieable_p (machine_mode, machi
Richard Biener writes:
> On Wed, Mar 9, 2022 at 7:04 PM Richard Sandiford
> wrote:
>>
>> Richard Biener via Gcc-patches writes:
>> > On Wed, Mar 2, 2022 at 10:18 PM H.J. Lu wrote:
>> >>
>> >> On Wed, Mar 02, 2022 at 09:51:26AM +0100, Richard Biener wrote:
>> >> > On Tue, Mar 1, 2022 at 11:41 PM
Sorry for the slow response, was out for a few days.
Xi Ruoyao writes:
> On Sat, 2022-03-12 at 18:48 +0800, Xi Ruoyao via Gcc-patches wrote:
>> On Fri, 2022-03-11 at 21:26 +, Qing Zhao wrote:
>> > Hi, Ruoyao,
>> >
>> > (I might not be able to reply to this thread till next Wed due to a
>> >
Xi Ruoyao writes:
> libsanitizer/
>
> * sanitizer_common/sanitizer_atomic_clang.h: Ensures to only
> include sanitizer_atomic_clang_mips.h for O32.
OK, thanks.
Richard
> ---
> libsanitizer/sanitizer_common/sanitizer_atomic_clang.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 dele
Xi Ruoyao writes:
> Bootstrapped and regtested on mips64el-linux-gnuabi64.
>
> bootstrap-ubsan revealed 3 bugs (PR 104842, 104843, 104851).
> bootstrap-asan did not reveal any new bug.
>
> gcc/
>
> * config/mips/mips.h (SUBTARGET_SHADOW_OFFSET): Define.
> * config/mips/mips.cc (mips_op
Jakub Jelinek writes:
> Hi!
>
> We unshare all RTL created during expansion, but when
> aarch64_load_symref_appropriately is called after expansion like in the
> following testcases, we use imm in both HIGH and LO_SUM operands.
> If imm is some RTL that shouldn't be shared like a non-sharable CONS
Richard Biener via Gcc-patches writes:
> On Mon, Mar 14, 2022 at 8:26 PM Roger Sayle
> wrote:
>> I've been wondering about the possible performance/missed-optimization
>> impact of my patch for PR middle-end/98420 and similar IEEE correctness
>> fixes that disable constant folding optimizations
"Andre Vieira (lists)" writes:
> Hi,
>
> This patch adds tuning structures for Neoverse N2.
>
> 2022-03-16 Tamar Christina
> Andre Vieira
>
> * config/aarch64/aarch64.cc (neoversen2_addrcost_table,
> neoversen2_regmove_cost,
> neoversen2_advsimd_vector_cost,
"Andre Vieira (lists)" writes:
> Hi,
>
> This patch adds tuning structs for -mcpu/-mtune=demeter.
>
>
> 2022-03-16 Tamar Christina
> Andre Vieira
>
> * config/aarch64/aarch64.cc (demeter_addrcost_table,
> demeter_regmove_cost,
> demeter_advsimd_vector
"Andre Vieira (lists)" writes:
> This patch introduces a struct to differentiate between different
> memmove costs to enable a better modeling of memory operations. These
> have been modelled for
> -mcpu/-mtune=neoverse-v1/neoverse-n1/neoverse-n2/neoverse-512tvb, for
> all other tunings all en
"Andre Vieira (lists)" writes:
> Hi,
>
> This patch updates the register move tunings for
> -mcpu/-mtune={neoverse-v1,neoverse-512tvb}.
>
> 2022-03-16 Tamar Christina
> Andre Vieira
>
> * config/aarch64/aarch64.cc (neoversev1_regmove_cost): New
> tuning struc
"Andre Vieira (lists)" writes:
> Hi,
>
> This patch implements the costing function
> determine_suggested_unroll_factor for aarch64.
> It determines the unrolling factor by dividing the number of X
> operations we can do per cycle by the number of X operations in the loop
> body, taking this in
Xi Ruoyao writes:
>>
>> If we have to go this way, I think it’s better to make the change you
>> suggested above,
>> and then also update the documentation, both internal documentation on
>> how to define
>> the hook and the user level documentation on what the user might
>> expect when using
Richard Biener writes:
> The following arranges for the GIMPLE frontend to parse an
> additional loops(...) specification, currently consisting of
> 'normal' and 'lcssa'. The intent is to allow unit testing
> of passes inside the GIMPLE loop optimization pipeline where
> we keep the IL in loop-cl
Thanks, this addresses most of my comments from the v8 review.
There were a couple left over though:
chenglulu writes:
> +(define_attr "compression" "none,all"
> + (const_string "none"))
I still don't understand the purpose of keeping this for LoongArch.
> +(define_insn "truncdisi2_extended"
>
rtl-ssa chains definitions into an RPO list. It also groups
sequences of clobbers together into a single node, so that it's
possible to skip over the clobbers in constant time in order to
get the next or previous set.
When adding a clobber to an insn, the main DF barriers for that
clobber are the
Hi,
Thanks for the update. It looks like there are some unaddressed
comments from the v8 review:
chenglulu writes:
> gcc/
>
> * config/loongarch/larchintrin.h: New file.
> * config/loongarch/loongarch-builtins.cc: New file.
> ---
> gcc/config/loongarch/larchintrin.h | 409 +
chenglulu writes:
> diff --git a/gcc/testsuite/lib/target-supports.exp
> b/gcc/testsuite/lib/target-supports.exp
> index 737e1a8913b..843b508b010 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -286,6 +286,10 @@ proc check_configured_with {
chenglulu writes:
> +@item -msmall-data-limit=@var{number}
> +@opindex -msmall-data-limit
> +Put global and static data smaller than @code{number} bytes into a special
> +section (on some targets). The default value is 0.
One minor left-over from v8: this should be @var{number}
rather than @code
chenglulu writes:
> Hi, all:
>
> This is the v9 version of LoongArch Port based on
> 9fc8f278ebebc57537dc0cb9d33e36d932be0bc3.
> Please review.
Thanks for the update. I've sent follows-up for parts 4, 6, 11 and 12,
but otherwise v9 addresses all the comments I had. The series LGTM
with those i
chenglulu writes:
> Hi, all:
>
> This is the v10 version of LoongArch Port based on
> d1ca63a1b7d5986913b14567a4950b055a5a3f07.
OK for trunk. Thanks for the updates.
Richard
> Please review.
>
> We know it is stage4, I think it is ok for a new port.
> The kernel side upstream waiting for a a
Richard Biener writes:
> Since we're now vectorizing by default at -O2 issues like PR101908
> become more important where we apply basic-block vectorization to
> parts of the function covering loads from function parameters passed
> on the stack. Since we have no good idea how the stack pushing
>
Richard Biener writes:
> On Mon, 28 Mar 2022, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > Since we're now vectorizing by default at -O2 issues like PR101908
>> > become more important where we apply basic-block vectorization to
>> > parts of the function covering loads from function
"Andre Vieira (lists)" writes:
> Hi,
>
> Addressed all of your comments bar the pred ops one.
>
> Is this OK?
>
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.cc (aarch64_vector_costs): Define
> determine_suggested_unroll_factor and m_nosve_pattern.
> (determine_suggested_unrol
Alexandre Oliva via Gcc-patches writes:
> These tests require a target that supports arm soft-float. The
> problem is that the test checks for compile-time soft-float support,
> but they may hit a problem when the linker complains that it can't
> combine the testcase's object file with hard-float
Alexandre Oliva via Gcc-patches writes:
> When the mode of regno_reg_rtx is not hard_regno_mode_ok for the
> target, try grouping the register with subsequent ones. This enables
> s16 to s31 and their hidden pairs to be zeroed with the default logic
> on some arm variants.
>
> Regstrapped on x86_
Richard Biener writes:
> The following makes sure that when we build the versioning condition
> for vectorization including the cost model check, we check for the
> cost model and branch over other versioning checks. That is what
> the cost modeling assumes, since the cost model check is the only
Qing Zhao writes:
> Hi,
>
> Per our discussion on:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592002.html
>
> I come up with the following patch to:
>
> 1. Update the documentation for TARGET_ZERO_CALL_USED_REGS hook;
> 2. Add an assertion in function.cc to make sure the actually zer
Jakub Jelinek writes:
> Hi!
>
> As I wrote in the PR, our Fedora trunk gcc builds likely after r12-7842
> change are now failing (lto1 crashes).
> What happens is that when one bootstraps into an empty build directory
> (or set of them), mddeps.mk doesn't exist yet and so Makefile doesn't
> includ
Jakub Jelinek writes:
> Hi!
>
> Normally updates to the source directory files are guarded with
> --enable-maintainer-mode, e.g. we don't regenerate configure, config.h,
> Makefile.in in directories that use automake etc. unless gcc is configured
> that way. Otherwise the source tree can't be e.g
Alexandre Oliva writes:
> Hello, Richard,
>
> Thanks for the review!
>
> On Mar 31, 2022, Richard Sandiford wrote:
>
>>> + /* If the natural mode doesn't work, try some wider mode. */
>>> + if (!targetm.hard_regno_mode_ok (regno, mode))
>>> + {
>>> + for (int nregs = 2;
>>> +
check_load_store_for_partial_vectors predates the support for SLP
gathers and so had a hard-coded assumption that gathers/scatters
(and load/stores lanes) would be non-SLP operations. This patch
passes down the slp_node so that the routine can work out how
many vectors are needed in both the SLP a
Use error_n rather than error_at for “%d vectors”, so that
translators can pick different translations based on the
number (2 vs more than 2, etc.)
Tested on aarch64-linux-gnu & pushed.
Richard
gcc/
PR target/104897
* config/aarch64/aarch64-sve-builtins.cc
(function_reso
This PR is about -fpack-struct causing a crash when
is included. The new register_tuple_type code was expecting a
normal unpacked structure layout instead of a packed one.
For SVE we got around this by temporarily suppressing -fpack-struct,
so that the tuple types always have their normal ABI.
The mops cpy* patterns take three registers: a destination address,
a source address, and a size. The patterns clobber all three registers
as part of the operation. The set* patterns take a destination address,
a size, and a store value, and they clobber the first two registers as
part of the ope
Richard Biener via Gcc-patches writes:
> On Wed, 6 Apr 2022, Jakub Jelinek wrote:
>
>> On Wed, Apr 06, 2022 at 08:13:24AM +0200, Richard Biener wrote:
>> > On Tue, 5 Apr 2022, Jakub Jelinek wrote:
>> >
>> > > On Tue, Apr 05, 2022 at 11:28:53AM +0200, Richard Biener wrote:
>> > > > > In GIMPLE, we
Jakub Jelinek writes:
> On Wed, Apr 06, 2022 at 11:52:23AM +0200, Richard Biener wrote:
>> On Wed, 6 Apr 2022, Jakub Jelinek wrote:
>>
>> > On Wed, Apr 06, 2022 at 09:41:44AM +0100, Richard Sandiford wrote:
>> > > But it seems like the magic incantation to detect “real” built-in
>> > > function c
Tamar Christina writes:
> Hi All,
>
> The LS64 intrinsics used a machinery that's not safe to use unless being
> called from a pragma instantiation.
>
> This moves the initialization code to a new pragma for arm_acle.h.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> I didn
"Andre Vieira (lists)" writes:
> Hi,
>
> This addresses the compile-time increase seen in the PR target/105157.
> This was being caused by selecting the wrong core tuning, as when we
> added the latest AArch64 the TARGET_CPU_generic tuning was pushed beyond
> the 0x3f mask we used to encode bot
"Andre Vieira (lists)" writes:
> On 08/04/2022 08:04, Richard Sandiford wrote:
>> I think this would be better as a static assert at the top level:
>>
>>static_assert (TARGET_CPU_generic < TARGET_CPU_MASK,
>> "TARGET_CPU_NBITS is big enough");
> The motivation being that you want
Xi Ruoyao writes:
> Another brown paper bag fix for MIPS :(.
>
> This failure was not detected running mips.exp=pr102024-* with a cross
> compiler, so I just spotted it now running the test natively.
>
> ---
>
> The body of func is optimized away with -flto -fno-fat-lto-objects, so
> the psABI inf
Dan Li writes:
> Gentle ping for this :), thanks.
>
> Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590906.html
Sorry, I should have realised this at the time, but I don't think
we can do this after all. The ABI requires us to set up the frame
chain before assigning to the frame
Richard Biener via Gcc-patches writes:
> The following reverts the original PR105140 fix and goes for instead
> applying the additional fold_convert constraint for VECTOR_TYPE
> conversions also to fold_convertible_p. I did not try sanitizing
> all of this at this point.
>
> Bootstrapped on x86_6
Richard Biener writes:
> On Wed, 13 Apr 2022, Richard Sandiford wrote:
>
>> Richard Biener via Gcc-patches writes:
>> > The following reverts the original PR105140 fix and goes for instead
>> > applying the additional fold_convert constraint for VECTOR_TYPE
>> > conversions also to fold_convertib
In this PR, we were trying to set the unroll factor to a value higher
than the minimum VF (or more specifically, to a value that doesn't
divide the VF). I guess there are two approaches to this: let the
target pick any value it likes and make target-independent code pare
it back to something that
Richard Biener writes:
> When doing BB vectorization the scalar cost compute is derailed
> by patterns, causing lanes to be considered live and thus not
> costed on the scalar side. For the testcase in PR104010 this
> prevents vectorization which was done by GCC 11. PR103941
> shows similar case
Richard Biener writes:
> On Thu, 14 Apr 2022, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> > When doing BB vectorization the scalar cost compute is derailed
>> > by patterns, causing lanes to be considered live and thus not
>> > costed on the scalar side. For the testcase in PR104010
"Kewen.Lin" writes:
> on 2022/1/18 11:06, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> As discussed in PR104015, the test case slp-perm-9.c can be
>> fragile when vectorizer tries to use different vectorisation
>> strategies.
>>
>> As Richard suggested, this patch tries to make the check
In g:526e1639aa76b0a8496b0dc3a3ff2c450229544e I'd added support
for finding more consecutive MEMs. However, the check was too
eager, in that it matched MEM_REFs with the same base address
even if that base address was an arbitrary SSA name. This can
give wrong results if a MEM_REF from one loop i
In this PR the waccess pass was fed:
D.10779 ={v} {CLOBBER};
VIEW_CONVERT_EXPR(D.10779) = .MASK_LOAD_LANES (addr_5(D), 64B,
_2);
_7 = D.10779.__val[0];
However, the tracking of m_clobbers only looked at gassigns,
so it missed that the clobber on the first line was overwritten
by the call o
Richard Biener writes:
> On Tue, Jan 18, 2022 at 2:40 PM Richard Sandiford via Gcc-patches
> wrote:
>>
>> In this PR the waccess pass was fed:
>>
>> D.10779 ={v} {CLOBBER};
>> VIEW_CONVERT_EXPR(D.10779) = .MASK_LOAD_LANES (addr_5(D),
>> 64B, _2);
Richard Biener writes:
> This adds a missing check for the availability of intermediate vector
> types required to re-use the accumulator of a vectorized reduction
> in the vectorized epilogue. For SVE and VNx2DF vs V2DF with
> -msve-vector-bits=512 for example V4DF is not available.
>
> In addit
Martin Sebor writes:
> On 1/19/22 03:09, Richard Sandiford wrote:
>> Richard Biener writes:
>>> On Tue, Jan 18, 2022 at 2:40 PM Richard Sandiford via Gcc-patches
>>> wrote:
>>>>
>>>> In this PR the waccess pass was fed:
>>>>
>
Richard Biener via Gcc-patches writes:
> Currently we diagnose vector lowering of V1mode operations that
> are not natively supported into V_C_E, scalar op plus CTOR with
> -Wvector-operation-performance but that's hardly useful behavior
> even though the way we lower things can be improved.
>
> T
"Andre Vieira (lists)" writes:
> On 20/01/2022 09:14, Christophe Lyon wrote:
>>
>>
>> On Wed, Jan 19, 2022 at 7:18 PM Andre Vieira (lists) via Gcc-patches
>> wrote:
>>
>> Hi Christophe,
>>
>> On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
>> > At some point during the de
"Andre Vieira (lists)" writes:
> On 13/01/2022 14:56, Christophe Lyon via Gcc-patches wrote:
>> The vmvnq_n* intrinsics and have [u]int[16|32]_t arguments, so use
>> iterator instead of HI in mve_vmvnq_n_.
>>
>> 2022-01-13 Christophe Lyon
>>
>> gcc/
>> * config/arm/mve.md (mve_vmvnq_
Thanks for the patch and sorry for the (very) slow review.
Dan Li writes:
> diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> index 007b928c54b..9b3a35c06bf 100644
> --- a/gcc/c-family/c-attribs.c
> +++ b/gcc/c-family/c-attribs.c
> @@ -56,6 +56,8 @@ static tree handle_cold_attrib
Sorry for the slow response.
Alex Coplan writes:
> On 20/12/2021 13:19, Richard Sandiford wrote:
>> Alex Coplan via Gcc-patches writes:
>> > Hi,
>> >
>> > This fixes PR103500 i.e. ensuring that stack slots for
>> > passed-by-reference overaligned types are appropriately aligned. For the
>> > tes
cc:ing the x86 and s390 maintainers
soeren--- via Gcc-patches writes:
> From: Sören Tempel
>
> The -fsplit-stack option requires the pthread_t TCB definition in the
> libc to provide certain struct fields at specific hardcoded offsets. As
> far as I know, only glibc provides these fields at the
Andreas Krebbel via Gcc-patches writes:
> On 1/14/22 20:41, Andreas Krebbel via Gcc-patches wrote:
>> On 1/14/22 08:37, Richard Biener wrote:
>> ...
>>> Can the gist of this bug be put into the GCC bugzilla so the rev can
>>> refer to it?
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034
>>
Richard Sandiford via Gcc-patches writes:
> How about instead:
>
> (1) Define a new ASLK_* flag for assign_stack_local_1.
>
> (2) When the flag is set, make:
>
> if (alignment_in_bits > MAX_SUPPORTED_STACK_ALIGNMENT)
> {
> alignment_in_bits =
Richard Biener writes:
> The PR complains that when we only partially BB vectorize an
> if-converted loop body that this can leave unvectorized code
> unconditionally executed and thus effectively slow down code.
> For -O2 we already mitigated the issue by not doing BB vectorization
> when not all
soe...@soeren-tempel.net writes:
> From: Sören Tempel
>
> The -fsplit-stack option requires the pthread_t TCB definition in the
> libc to provide certain struct fields at specific hardcoded offsets. As
> far as I know, only glibc provides these fields at the required offsets.
> Most notably, musl
Dan Li writes:
>>> +
>>> if (flag_stack_usage_info)
>>>current_function_static_stack_size = constant_lower_bound
>>> (frame_size);
>>>
>>> @@ -9066,6 +9089,10 @@ aarch64_expand_epilogue (bool for_sibcall)
>>> RTX_FRAME_RELATED_P (insn) = 1;
>>>}
>>>
>>> + /*
Thanks for the discussion and sorry for the slow reply, was out most of
last week.
Dan Li writes:
> Thanks, Ard,
>
> On 1/26/22 00:10, Ard Biesheuvel wrote:
>> On Wed, 26 Jan 2022 at 08:53, Dan Li wrote:
>>>
>>> Hi, all,
>>>
>>> Sorry for bothering.
>>>
>>> I'm trying to commit aarch64 scs code
Dan Li writes:
> Shadow Call Stack can be used to protect the return address of a
> function at runtime, and clang already supports this feature[1].
>
> To enable SCS in user mode, in addition to compiler, other support
> is also required (as discussed in [2]). This patch only adds basic
> support
Sorry for the slow response, was out last week.
Christophe Lyon via Gcc-patches writes:
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index f16d320..5f559f8fd93 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -6239,9 +6239,14 @@ init_emit_once (void)
>
>/* For BImode, 1 and
Christophe Lyon via Gcc-patches writes:
> On Mon, Jan 31, 2022 at 7:01 PM Richard Sandiford via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
>> Sorry for the slow response, was out last week.
>>
>> Christophe Lyon via Gcc-patches writes:
>> > di
Hans-Peter Nilsson via Gcc-patches writes:
> I'm not seriously submitting this patch for approval. I just thought
> it'd be interesting to some people, at least those maintaining ports
> still using reload; I know it's reload and major ports don't really
> care about that anymore. TL;DR: scroll
Kyrylo Tkachov writes:
> Hi Richard,
>
> Sorry for the delay in getting back to this. I'm now working on a patch to
> adjust this.
>
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Tuesday, December 14, 2021 10:48 AM
>> To: Kyrylo Tkachov via Gcc-patches
>> Cc: Kyrylo Tkachov
apinski--- via Gcc-patches writes:
> From: Andrew Pinski
>
> The problem here is that aarch64_expand_setmem does not change the alignment
> for the strict alignment case. This is version 3 of this patch; it is based on
> version 2 and moves the check for the number of instructions from the
> optimizi
Tamar Christina writes:
>> -Original Message-
>> From: Richard Sandiford
>> Sent: Friday, December 17, 2021 4:49 PM
>> To: Tamar Christina
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov
>> Subject: Re: [2/3 PATCH]AArch64 use canonical ord
Hans-Peter Nilsson writes:
>> From: Richard Sandiford
>> Hans-Peter Nilsson via Gcc-patches writes:
>> > The mystery isn't so much that there's code mismatching comments or
>> > intent, but that this code has been there "forever". There has been a
>> > function reg_classes_intersect_p, in gcc s
Richard Sandiford writes:
> Hans-Peter Nilsson writes:
>>> From: Richard Sandiford
>>> Hans-Peter Nilsson via Gcc-patches writes:
>>> > The mystery isn't so much that there's code mismatching comments or
>>> > intent, but that this code has been there "forever". There has been a
>>> > function
Following on from GCC 11 patch g:f31ddad8ac8, this one gives clean
guality.exp test results for aarch64-linux-gnu with modern gdb
(this time gdb 11.2).
The justification is the same as previously:
--
For people using older gdbs, it will trade one set of noisy results for
another set. I still
Many of the XFAILed TSVC tests pass for SVE. This patch updates
the markup accordingly.
Tested on aarch64-linux-gnu & pushed.
Richard
gcc/testsuite/
* gcc.dg/vect/tsvc/vect-tsvc-s1115.c: Don't XFAIL for SVE.
* gcc.dg/vect/tsvc/vect-tsvc-s114.c: Likewise.
* gcc.dg/vect/t