Hi Feng:
Thanks for the patch! A few inline comments below. Also, please don't include
all test files from the doc generator; including only a few within the patch
is fine, e.g. pick one for each group, so that it doesn't bloat the GCC
source tree too much.
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/
From: Pan Li
Bugzilla bug 112813 has been fixed recently; add the test case below
for the bug.
PR target/112813
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/pr112813-1.c: New test.
Signed-off-by: Pan Li
---
.../gcc.target/riscv/rvv/vsetvl/pr112813-1.c | 32 +++
LGTM Thanks.
juzhe.zh...@rivai.ai
From: pan2.li
Date: 2023-12-04 16:09
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Add test case for bug PR112813
From: Pan Li
Bugzilla bug 112813 has been fixed recently; add the test case below
for the bug.
Committed, thanks Juzhe.
Pan
From: juzhe.zh...@rivai.ai
Sent: Monday, December 4, 2023 4:10 PM
To: Li, Pan2 ; gcc-patches
Cc: Li, Pan2 ; Wang, Yanzhang ;
kito.cheng
Subject: Re: [PATCH v1] RISC-V: Add test case for bug PR112813
LGTM Thanks.
juzhe.zh...@rivai
Hi!
I'd like to ping this patch.
Thanks
On Sat, Nov 25, 2023 at 11:17:48AM +0100, Jakub Jelinek wrote:
> The middle-end has been changed quite recently to canonicalize
> -abs (x) to copysign (x, -1) rather than the other way around.
> While I agree with that at GIMPLE level, since it matches the
2023-12-04 16:01 Kito Cheng wrote:
>Hi Feng:
>
>Thanks for the patch! A few inline comments below. Also, please don't include
>all test files from the doc generator; including only a few within the patch
>is fine, e.g. pick one for each group, so that it doesn't bloat the GCC
>source tree too much.
>
OK. All
Since the destination of reduction is not a vector register group, there
is no need to apply overlap constraint.
Also confirm Clang:
The mir in LLVM has early clobber:
early-clobber %49:vrm2 = PseudoVWADD_VX_M1 $noreg(tied-def 0), killed %17:vr,
%48:gpr, %0:gprnox0, 3, 0; example.c:59:24
The mi
>> mcmodel=large is not supported (yet) on any Darwin arch [PR90698], so
> the test needs skipping or xfailing, I think (either way with a
> reference to the PR).
>
> Pushed as
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=b74981b5cf32ebf4bfffd25e7174b5c80243447a
Thanks for fixing this.
Hi Haochen,
on 2023/12/1 10:42, HAO CHEN GUI wrote:
> Hi,
>   The "fctid" is supported on 64-bit Power processors and powerpc 476. It
> needs a guard to check it. The patch fixes the issue.
>
> Bootstrapped and tested on x86 and powerpc64-linux BE and LE with
> no regressions. Is this OK for tru
On Thu, 23 Nov 2023 at 17:06, Prathamesh Kulkarni
wrote:
>
> Hi Richard,
> For the test-case mentioned in PR111702, compiling with -O2
> -frounding-math -fstack-protector-all results in the following ICE during
> the cse2 pass:
>
> test.c: In function 'foo':
> test.c:119:1: internal compiler error: in ins
On Mon, Dec 4, 2023 at 8:48 AM Kito Cheng wrote:
LGTM
I've double-checked this in the Zc-1.0.4-3.pdf:
* Zcmp is incompatible with Zcd
* Zcmp depends on Zca
* Zcmt is incompatible with Zcd
* Zcmt depends on Zca and Zicsr
The implies-relations are already implemented.
This patch enforces the inco
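The four rules listed above can be sketched in C as follows. This is only an illustration under stated assumptions: the struct `zc_exts` and the functions `zc_apply_implies` and `zc_compatible` are hypothetical names, not GCC's internal representation or the code in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extension set; the field names mirror the extension
   names in the message, not GCC's internal data structures.  */
struct zc_exts
{
  bool zca, zcd, zcmp, zcmt, zicsr;
};

/* Apply the implies-relations: Zcmp -> Zca; Zcmt -> Zca and Zicsr.  */
void
zc_apply_implies (struct zc_exts *e)
{
  assert (e != NULL);
  if (e->zcmp || e->zcmt)
    e->zca = true;
  if (e->zcmt)
    e->zicsr = true;
}

/* Return true if no incompatible pair is enabled: Zcd conflicts with
   both Zcmp and Zcmt.  */
bool
zc_compatible (const struct zc_exts *e)
{
  assert (e != NULL);
  return !(e->zcd && (e->zcmp || e->zcmt));
}
```

Keeping the implication step separate from the conflict check mirrors the split described in the message: the dependencies are resolved first, and only the incompatibilities are rejected.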
Hi Jakub,
on 2023/11/25 18:17, Jakub Jelinek wrote:
> Hi!
>
> The middle-end has been changed quite recently to canonicalize
> -abs (x) to copysign (x, -1) rather than the other way around.
> While I agree with that at GIMPLE level, since it matches the GIMPLE
> goal of as few operations as possi
On Wed, Nov 29, 2023 at 1:25 PM Richard Biener
wrote:
>
> On Wed, Nov 29, 2023 at 10:35 AM Uros Bizjak wrote:
> >
> > The compiler, configured with --enable-checking=yes,rtl,extra ICEs with:
> >
> > internal compiler error: RTL check: expected elt 0 type 'e' or 'u',
> > have 'E' (rtx unspec) in t
The following makes sure we are not losing address-space info
when expanding a __builtin_memcpy (synthesized by gimplification,
which _might_ be the other actual problem). The issue is with
get_memory_rtx which is also used by other builtin expansions
but is not aware of address-spaces. The follo
The following avoids turning aggregate copy or initialization involving
non-default address-spaces to memcpy or memset since they are not
prepared for that.
GIMPLE verification no longer(?) accepts WITH_SIZE_EXPR in aggregate
copies, the following re-allows that.
So far untested, will test on x86_
Hi,
As PR112788 shows, on rs6000 with -mabi=ieeelongdouble, type _Float128
has a different type precision (128) from that of type long double (127),
but they actually have the same underlying mode, so they have
the same precision, as the mode indicates the same real type format,
ieee_quad_format.
I
Hi,
Gentle ping this series:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html
BR,
Kewen
>
>> on 2022/11/24 17:15, Kewen Lin wrote:
>>> Hi,
>>>
>>> Following Segher's suggestion, this patch series is to rework
>>> function rs6000_emit_vector_compare for ve
This patch documents the optimization parameter
riscv-strcmp-inline-limit, which can be used to tweak the behaviour
of -minline-strcmp and -minline-strncmp.
gcc/ChangeLog:
PR target/112650
* doc/invoke.texi: Document riscv-strcmp-inline-limit.
Signed-off-by: Christoph Müllner
--
On Mon, Dec 4, 2023 at 4:46 AM Kito Cheng wrote:
>
> Wait, I got this on my machine?
>
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/doc/invoke.texi:29774:
> misplaced }
> ../../../../riscv-gnu-toolchain-trunk/gcc/gcc/doc/invoke.texi:29786:
> misplaced }
@{n} should be @var{n}.
I was too opti
Hi,
Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609993.html
BR,
Kewen
>
on 2023/1/16 17:08, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> As Honza pointed out in [1], the current uses of function
> optimize_function_for_speed_p in rs6000_option_ov
Loop vectorization cannot optimize the following case due to a "SCEV is not
affine" failure (i+offset may overflow):
int A[1024 * 2];
int foo (unsigned offset, unsigned N)
{
int sum = 0;
for (unsigned i = 0; i < N; i++)
sum += A[i + offset];
return sum;
}
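To make the overflow concern concrete, here is a minimal sketch in C. The helper `indices_do_not_wrap` and the self-check `wrap_demo` are hypothetical names, not part of the patch. Unsigned arithmetic is performed modulo UINT_MAX + 1, so `i + offset` can wrap back to a small value, and the index is then no longer an affine function of `i`.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not part of the patch): return 1 if every index
   i + offset for 0 <= i < n is free of unsigned wrap-around.  */
int
indices_do_not_wrap (unsigned offset, unsigned n)
{
  /* The largest index is (n - 1) + offset; it wraps exactly when
     offset > UINT_MAX - (n - 1).  An empty loop never wraps.  */
  return n == 0 || offset <= UINT_MAX - (n - 1);
}

/* Small self-check illustrating both outcomes.  */
void
wrap_demo (void)
{
  assert (indices_do_not_wrap (16, 1024));      /* small offset: safe */
  assert (!indices_do_not_wrap (UINT_MAX, 2));  /* 1 + UINT_MAX wraps to 0 */
  assert (indices_do_not_wrap (UINT_MAX, 1));   /* only index UINT_MAX */
}
```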
Consider this example:
#include "riscv_vector.h"
void
foo6 (void *in, void *out)
{
vfloat64m8_t accum = __riscv_vle64_v_f64m8 (in, 4);
vfloat64m4_t high_eew64 = __riscv_vget_v_f64m8_f64m4 (accum, 1);
vint64m4_t high_eew64_i = __riscv_vreinterpret_v_f64m4_i64m4 (high_eew64);
vint32m4_t high
Add missing "s390" while expanding vec_step to __builtin_s390_vec_step.
gcc/ChangeLog:
* config/s390/vecintrin.h (vec_step): Expand vec_step to
__builtin_s390_vec_step.
---
gcc/config/s390/vecintrin.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/gcc/con
The recent warning patches broke Solaris bootstrap:
/vol/gcc/src/hg/master/local/libiberty/pex-unix.c:326:3: error: initialization
of 'pid_t (*)(struct pex_obj *, pid_t, int *, struct pex_time *, int, const
char **, int *)' {aka 'long int (*)(struct pex_obj *, long int, int *, struct
pex_tim
The recent warning changes broke gm2 bootstrap on Solaris:
/vol/gcc/src/hg/master/local/gcc/m2/mc/mc.flex: In function 'handleFile':
/vol/gcc/src/hg/master/local/gcc/m2/mc/mc.flex:297:21: error: implicit
declaration of function 'alloca' [-Wimplicit-function-declaration]
297 | char *s = (char
On Mon, Nov 20, 2023 at 03:46:06PM +, Richard Sandiford wrote:
> Andrew Carlotti writes:
> > This is added to enable function multiversioning, but can also be used
> > directly. The interface is chosen to match that used in LLVM's
> > compiler-rt, to facilitate cross-compiler compatibility.
>
The recent warning patches broke Ada bootstrap on Solaris:
adaint.c: In function '__gnat_kill':
adaint.c:3597:3: error: implicit declaration of function 'kill'
[-Wimplicit-function-declaration]
3597 | kill (pid, sig);
| ^~~~
expect.c: In function '__gnat_expect_poll':
expect.c:409:5:
LGTM.
Regards
Robin
> The recent warning patches broke Ada bootstrap on Solaris:
>
> adaint.c: In function '__gnat_kill':
> adaint.c:3597:3: error: implicit declaration of function 'kill'
> [-Wimplicit-function-declaration]
> 3597 | kill (pid, sig);
> | ^~~~
>
> expect.c: In function '__gnat_expect_poll'
"Roger Sayle" writes:
> The recent change to represent language and target attribute tables using
> vec.h's array_slice template class triggers an issue/bug in older g++
> compilers, specifically the g++ 4.8.5 system compiler of older RedHat
> distributions. This exhibits as the following compila
The recent warning patches broke the libssp build on Solaris:
/vol/gcc/src/hg/master/local/libssp/gets-chk.c: In function '__gets_chk':
/vol/gcc/src/hg/master/local/libssp/gets-chk.c:67:12: error: implicit
declaration of function 'gets'; did you mean 'getw'?
[-Wimplicit-function-declaration]
Hi,
this changes the vec_extract path of extract_bit_field to use QImode
instead of BImode when extracting from mask vectors and changes
GET_MODE_BITSIZE to GET_MODE_PRECISION. This fixes an ICE on riscv
where we did not find a vec_extract optab and continued with the generic
code that requires 1
Jakub Jelinek writes:
> On Sat, Dec 02, 2023 at 11:04:04AM +, Richard Sandiford wrote:
>> I still maintain that so much stuff relies on the lack of false-positive
>> REG_UNUSED notes that (whatever the intention might have been) we need
>> to prevent the false positive. Like Andrew says, any
On Wed, Nov 29, 2023 at 05:53:56PM +, Richard Sandiford wrote:
> Andrew Carlotti writes:
> > This patch adds support for the "target_version" attribute to the middle
> > end and the C++ frontend, which will be used to implement function
> > multiversioning in the aarch64 backend.
> >
> > On ta
The following avoids corrupting the SCEV cache by my last change
to propagate constant final values immediately. The easiest fix
is to keep a dead initialization around.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/112827
* tree-scalar-evoluti
On Sat, 2 Dec 2023, Hans-Peter Nilsson wrote:
> > Date: Fri, 1 Dec 2023 08:07:14 +0100 (CET)
> > From: Richard Biener
>
> > On Fri, 1 Dec 2023, Hans-Peter Nilsson wrote:
> >
> > > > From: Hans-Peter Nilsson
> > > > Date: Thu, 30 Nov 2023 18:09:10 +0100
> > >
> > > Richard B.:
> > > > > >
> +(define_mode_attr widen_ternop_dest_constraint [
> + (RVVM8QI "=vd, vr, vd, vr, vd, vr, ?&vr")
> + (RVVM4QI "=vd, vr, vd, vr, vd, vr, ?&vr")
> + (RVVM2QI "=vd, vr, vd, vr, vd, vr, ?&vr")
> + (RVVM1QI "=vd, vr, vd, vr, vd, vr, ?&vr")
> + (RVVMF2QI "=vd, vr, vd, vr, vd, vr, ?&vr")
> + (RVVMF
On the LoongArch architecture, regression testing with the latest GCC 14
shows FAIL entries for the vector test cases in the vector directory,
caused by mismatched pointer types. In order to solve this kind of
problem, the type of the variable in the check result is modified with
the parameter type
On Sat, 2 Dec 2023 at 21:24, Costas Argyris wrote:
>
> Use std::vector instead of malloc'd pointer
> to get automatic freeing of memory.
You can't include <vector> there. Instead you need to define
INCLUDE_VECTOR before "system.h"
Shouldn't you be using resize, not reserve? Otherwise mdswitches[i] is
und
On Thu, 30 Nov 2023 at 19:23, Patrick Palka wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
OK, thanks for simplifying it.
>
> -- >8 --
>
> Use the existing _Partial range adaptor closure object in the
> definition of ranges::to instead of essentially open coding it.
>
> li
On Mon, 2023-12-04 at 20:14 +0800, chenxiaolong wrote:
> On the LoongArch architecture, regression testing with the latest GCC 14
> shows FAIL entries for the vector test cases in the vector directory,
> caused by mismatched pointer types. In order to solve this kind of
> problem, the type of the v
The following adjusts the C FE specific qualified type building
to preserve address-space info also for ARRAY_TYPE.
Bootstrap / regtest running on x86_64-unknown-linux-gnu, OK?
Thanks,
Richard.
PR c/86869
gcc/c/
* c-typeck.cc (c_build_qualified_type): Preserve address-space
On Mon, 2023-12-04 at 20:31 +0800, Xi Ruoyao wrote:
> On Mon, 2023-12-04 at 20:14 +0800, chenxiaolong wrote:
> > On the LoongArch architecture, regression testing with the latest GCC 14
> > shows FAIL entries for the vector test cases in the vector directory,
> > caused by mismatched pointer types.
On Sat, Dec 2, 2023 at 4:53 PM Arsen Arsenović wrote:
>
> contrib/ChangeLog:
>
> * download_prerequisites
> : Parse --only-gettext.
> (echo_archives): Check only_gettext and stop early if true.
> (helptext): Document --only-gettext.
> ---
> Afternoon,
>
> This patch
On Sat, Dec 2, 2023 at 5:03 PM Arsen Arsenović wrote:
>
> This fixes issues reported by David Edelsohn and by
> Eric Gallager.
>
> ChangeLog:
>
> * Makefile.def (gettext): Disable (via missing)
> {install-,}{pdf,html,info,dvi} and TAGS targets. Set no_install
> to true.
LGTM
On Mon, Dec 4, 2023 at 5:55 PM Christoph Müllner
wrote:
>
> This patch documents the optimization parameter
> riscv-strcmp-inline-limit, which can be used to tweak the behaviour
> of -minline-strcmp and -minline-strncmp.
>
> gcc/ChangeLog:
>
> PR target/112650
> * doc/invoke.
Hi,
recent -std changes caused testsuite failures. Fix those by adding
-std=gnu99 and -Wno-incompatible-pointer-types.
Going to commit as obvious.
Regards
Robin
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr112552.c: Add
-Wno-incompatible-pointer-types.
*
Richard Sandiford writes:
> "Roger Sayle" writes:
>> The recent change to represent language and target attribute tables using
>> vec.h's array_slice template class triggers an issue/bug in older g++
>> compilers, specifically the g++ 4.8.5 system compiler of older RedHat
>> distributions. This
On Fri, Nov 24, 2023 at 04:22:54PM +, Richard Sandiford wrote:
> Andrew Carlotti writes:
> > This adds initial support for function multiversioning on aarch64 using
> > the target_version and target_clones attributes. This loosely follows
> > the Beta specification in the ACLE [1], although w
The following fixes the intermediate conversions inserted by
convert_to_integer when facing address-spaces and converts
to their effective [u]intptr_t when they are registered_builtin_types
by considering those also from c_common_type_for_size and not
only from c_common_type_for_mode.
Bootstrap an
For __builtin_bswap vectorization we still require an equal vector
type size. Re-instantiate that check.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR tree-optimization/112818
* tree-vect-stmts.cc (vectorizable_bswap): Check input and
output vector types
Consider this example:
#include "riscv_vector.h"
void
foo6 (void *in, void *out)
{
vfloat64m8_t accum = __riscv_vle64_v_f64m8 (in, 4);
vfloat64m4_t high_eew64 = __riscv_vget_v_f64m8_f64m4 (accum, 1);
vint64m4_t high_eew64_i = __riscv_vreinterpret_v_f64m4_i64m4 (high_eew64);
vint32m4_t high
LGTM.
Regards
Robin
Adapted the patch in V2 to write the constraints explicitly in the pattern:
[V2] RISC-V: Support highest-number regno overlap for widen ternary - Patchwork
(sourceware.org)
Thanks.
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2023-12-04 20:13
To: Juzhe-Zhong; gcc-patches
CC: rdapp.gcc; kito.cheng; kit
I'd suggest the same thing as in the other patch, i.e. not having
the large number of identical lines in the iterator. That's just
my opinion, though. Rest LGTM.
Regards
Robin
In a case of severe register pressure (appended in this patch),
we see vluxei8.v v0,(s1),v1,v0.t, which is not allowed.
Since according to RVV ISA:
+;; The destination vector register group for a masked vector instruction
cannot overlap the source mask register (v0),
+;; unless the destina
On Mon, Dec 4, 2023 at 6:32 AM liuhongt wrote:
>
> i.e. for the cases below.
>a[0] = b1;
>a[1] = b2;
>..
>a[n] = bn;
>
> There are extra dependences when constructing the vector, but not for
> scalar stores. According to experiments, it's generally worse.
>
> The patch adds a cut-off he
On Mon, 4 Dec 2023, Robin Dapp wrote:
> Hi,
>
> this changes the vec_extract path of extract_bit_field to use QImode
> instead of BImode when extracting from mask vectors and changes
> GET_MODE_BITSIZE to GET_MODE_PRECISION. This fixes an ICE on riscv
> where we did not find a vec_extract optab
On Sat, Dec 2, 2023 at 7:38 AM Andrew Pinski wrote:
>
> When I moved two_value to match.pd, I removed the check for the {0,+-1}
> as I had placed it after the {0,+-1} case for cond in match.pd.
> In the case of {0,+-1} and non-boolean, before we would optimize those
> cases to just `(convert)a` but
On Sat, Dec 2, 2023 at 7:38 AM Andrew Pinski wrote:
>
> While working on PR 111972, I was getting a regression
> due to zero_one_valued_p matching a signed 1 bit integer
> when it came to convert. This patch fixes that by checking
> the outer type too.
>
> Bootstrapped and tested on x86_64-linux-g
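As a rough illustration of why a signed 1-bit integer is a special case, here is a sketch in C. The struct `flag` and the functions `widen_flag` and `flag_demo` are hypothetical and not taken from the patch. Assuming two's complement, a signed 1-bit field can hold only 0 and -1, so converting it to a wider type never yields 1.

```c
#include <assert.h>

/* Hypothetical type: a signed bit-field of width 1 can represent only
   the values 0 and -1 (assuming two's complement).  */
struct flag
{
  signed int bit : 1;
};

/* Store "set" or "clear" in the 1-bit field and widen it to int.  */
int
widen_flag (int set)
{
  struct flag f;
  f.bit = set ? -1 : 0;
  return f.bit;   /* sign-extends: the result is -1 or 0, never 1 */
}

/* Small self-check of the two possible values.  */
void
flag_demo (void)
{
  assert (widen_flag (0) == 0);
  assert (widen_flag (1) == -1);
}
```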
The PR shows that we'll ICE eventually when last_clique wraps. The
following avoids this by refusing to hand out new cliques after
exhausting them. We then use zero (no clique) as conservative
fallback.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
PR middle-end/112785
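The fallback described above can be sketched in C as follows. This is an illustration only: the function `next_clique` is a hypothetical name, and the sketch assumes the counter is an `unsigned short`; it is not the code from the commit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical allocator: hand out the next clique number, or 0
   ("no clique") once the counter is exhausted, instead of letting
   the counter wrap around.  */
unsigned short
next_clique (unsigned short *last_clique)
{
  assert (last_clique != NULL);
  if (*last_clique == USHRT_MAX)
    return 0;             /* exhausted: conservative fallback */
  return ++*last_clique;  /* otherwise hand out a fresh clique */
}
```

Returning zero is the conservative choice here because, per the message, zero means "no clique", so callers that receive it lose precision but not correctness.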
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/eh_return-3.c: Fix when retaa is available.
---
gcc/testsuite/gcc.target/aarch64/eh_return-3.c | 4
1 file changed, 4 insertions(+)
diff --git a/gcc/testsuite/gcc.target/aarch64/eh_return-3.c
b/gcc/testsuite/gcc.target/aarch64/eh_return
On 12/4/23 06:17, Robin Dapp wrote:
Hi,
recent -std changes caused testsuite failures. Fix those by adding
-std=gnu99 and -Wno-incompatible-pointer-types.
Going to commit as obvious.
Regards
Robin
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr112552.c: Add
-
Hi Andrew,
On 03.12.23 01:32, Andrew Stubbs wrote:
This patch series is a rework of the patch series posted in August.
https://patchwork.sourceware.org/project/gcc/list/?series=23045&state=%2A&archive=both
The series implements device-specific allocators and adds a low-latency
allocator for bot
I cannot "grep" – all three patches do contain .texi changes. I have a
comment to them, but I will comment individually on them.
Hence, scratch:
On 04.12.23 16:34, Tobias Burnus wrote:
On 03.12.23 01:32, Andrew Stubbs wrote:
This patch series is a rework of the patch series posted in August.
h
On 03.12.23 01:32, Andrew Stubbs wrote:
This patch adds support for allocating low-latency ".shared" memory on
NVPTX GPU device, via the omp_low_lat_mem_space and omp_alloc. The memory
can be allocated, reallocated, and freed using a basic but fast algorithm,
is thread safe and the size of the l
On 2023-12-02 04:42, Martin Uecker wrote:
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
-- >8 --
It came up that a good hardening strategy is to disable trampolines
which may require executable stack. Therefore the following patch
adds -Werror=trampolines to -fhardened.
This
On Dez 04 2023, Siddhesh Poyarekar wrote:
> For hardened code in C, I think we really should look to step away from
> nested functions instead of adding ways to continue supporting it. There's
> probably a larger conversation to be had about the utility of nested
> functions in general for C (and
Hello!
Thank you, as always, for the great work that you do on libstdc++. The
inout_ptr implementation properly handles the issue raised in LWG 3897
but it seems like having an explicit test might be a good idea.
I hope that this helps!
Will
-- >8 --
Add a test to verify that the implementation
On Mon, Dec 04, 2023 at 05:39:04PM +0100, Andreas Schwab wrote:
> On Dez 04 2023, Siddhesh Poyarekar wrote:
>
> > For hardened code in C, I think we really should look to step away from
> > nested functions instead of adding ways to continue supporting it. There's
> > probably a larger conversatio
On 2023-12-04 11:39, Andreas Schwab wrote:
On Dez 04 2023, Siddhesh Poyarekar wrote:
For hardened code in C, I think we really should look to step away from
nested functions instead of adding ways to continue supporting it. There's
probably a larger conversation to be had about the utility of n
Rainer Orth writes:
> The recent warning patches broke Solaris bootstrap:
>
> /vol/gcc/src/hg/master/local/libiberty/pex-unix.c:326:3: error:
> initialization of 'pid_t (*)(struct pex_obj *, pid_t, int *, struct pex_time
> *, int, const char **, int *)' {aka 'long int (*)(struct pex_obj *, lon
Rainer Orth writes:
> The recent warning changes broke gm2 bootstrap on Solaris:
>
> /vol/gcc/src/hg/master/local/gcc/m2/mc/mc.flex: In function 'handleFile':
> /vol/gcc/src/hg/master/local/gcc/m2/mc/mc.flex:297:21: error: implicit
> declaration of function 'alloca' [-Wimplicit-function-declarat
Szabolcs Nagy writes:
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/eh_return-3.c: Fix when retaa is available.
OK, thanks.
Richard
> ---
> gcc/testsuite/gcc.target/aarch64/eh_return-3.c | 4
> 1 file changed, 4 insertions(+)
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/e
The tester recently started failing va-arg-22.c on microblaze-linux:
gcc.c-torture/execute/va-arg-22.c -O0 (test for excess errors)
It was failing with an undefined reference to "r7" at link time. This
was ultimately tracked down to a HImode load using (reg+reg) addressing
mode, but which
On Wed, 8 Nov 2023, Kito Cheng wrote:
> OK, then LGTM, thanks for the explanation :)
Please don't top-post on a GCC mailing list (and preferably in off-list
replies to such mailing list messages unless it's been agreed to somehow
with the participants), as it makes it difficult to make context
Am Montag, dem 04.12.2023 um 11:46 -0500 schrieb Siddhesh Poyarekar:
> On 2023-12-04 11:39, Andreas Schwab wrote:
> > On Dez 04 2023, Siddhesh Poyarekar wrote:
> >
> > > For hardened code in C, I think we really should look to step away from
> > > nested functions instead of adding ways to continu
Richard Sandiford writes:
> Jakub Jelinek writes:
>> On Sat, Dec 02, 2023 at 11:04:04AM +, Richard Sandiford wrote:
>>> I still maintain that so much stuff relies on the lack of false-positive
>>> REG_UNUSED notes that (whatever the intention might have been) we need
>>> to prevent the false
Richard Biener writes:
> OK.
Thanks. I'll wait for the Binutils and GDB maintainers to weigh in
before pushing (plus, I can't push there).
Have a lovely day!
--
Arsen Arsenović
This is an RTL pass that detects store forwarding from stores to larger loads
(load pairs).
This optimization is SPEC2017-driven and was found to be beneficial for some
benchmarks, through testing on ampere1/ampere1a machines.
For example, it can transform cases like
str d5, [sp, #320]
fmul d
Hi Richard,
>> Enable lock-free 128-bit atomics on AArch64. This is backwards compatible
>> with
>> existing binaries, gives better performance than locking atomics and is what
>> most users expect.
>
> Please add a justification for why it's backwards compatible, rather
> than just stating that
[Branching this into a separate conversation to avoid derailing the
patch, which isn't directly related]
On 2023-12-04 12:21, Martin Uecker wrote:
I do not really agree with that. Nested functions can substantially
improve code quality and in C can avoid type unsafe use of
void* pointers in ca
> "Arsen" == Arsen Arsenović writes:
Arsen> Thanks. I'll wait for the Binutils and GDB maintainers to weigh in
Arsen> before pushing (plus, I can't push there).
Seems fine to me. Thank you.
Tom
Am Montag, dem 04.12.2023 um 13:27 -0500 schrieb Siddhesh Poyarekar:
> [Branching this into a separate conversation to avoid derailing the
> patch, which isn't directly related]
>
> On 2023-12-04 12:21, Martin Uecker wrote:
> > I do not really agree with that. Nested functions can substantially
On Mon, Dec 04, 2023 at 01:27:32PM -0500, Siddhesh Poyarekar wrote:
> [Branching this into a separate conversation to avoid derailing the patch,
> which isn't directly related]
>
> On 2023-12-04 12:21, Martin Uecker wrote:
> > I do not really agree with that. Nested functions can substantially
>
The recently-installed patch for interprocedural value-range propagation
enabled some folding that was not expected by the strub-const testcases,
causing them to fail.
I'm making the following adjustments to them to restore the behavior
they tested for, and to make them more future-proof to future
From: Kong Lingling
For *one_cmplsi2_2_zext, it will be split to xor, so its NDD form will be
added together with xor NDD support.
gcc/ChangeLog:
* config/i386/i386.md (one_cmpl2): Add new constraints for NDD
and adjust output template.
(*one_cmpl2_1): Likewise.
On Monday, December 4th, 2023 at 9:39 PM, waffl3x
wrote:
> On Monday, December 4th, 2023 at 9:35 PM, waffl3x waff...@protonmail.com
> wrote:
>
>
>
> > > > @@ -15402,6 +15450,8 @@ tsubst_decl (tree t, tree args, tsubst_flags_t
> > > > complain,
> >
> > > > gcc_checking_assert (TYPE_MAIN_VARIANT
Hi,
APX NDD patches have been posted at
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636604.html
Thanks to Hongtao's review, the V2 patch adds support for zext semantics with
memory input, as NDD by default clears the upper bits of dest for any operand size.
Also we support TImode shift with n
Manos Anagnostakis writes:
> This is an RTL pass that detects store forwarding from stores to larger loads
> (load pairs).
>
> This optimization is SPEC2017-driven and was found to be beneficial for some
> benchmarks,
> through testing on ampere1/ampere1a machines.
>
> For example, it can transf
From: Kong Lingling
gcc/ChangeLog:
* config/i386/i386-expand.cc (ix86_expand_unary_operator): Add use_ndd
parameter and adjust for NDD.
* config/i386/i386-protos.h: Add use_ndd parameter for
ix86_unary_operator_ok and ix86_expand_unary_operator.
* config/i
On Tue, Dec 5, 2023 at 10:32 AM Hongyu Wang wrote:
>
> Hi,
>
> APX NDD patches have been posted at
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636604.html
>
> Thanks to Hongtao's review, the V2 patch adds support for zext semantics with
> memory input, as NDD by default clears the upper bits
The process of creating BTF_KIND_DATASEC records involves iterating
through variable declarations, determining which section they will be
placed in, and creating an entry in the appropriate DATASEC record
accordingly.
For variables without e.g. an explicit __attribute__((section)), we use
categori
1. Rebase Xi Ruoyao's patch to the latest commit.
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636798.html
2. Remove the #if
!defined(IN_LIBGCC2) && !defined(IN_TARGET_LIBS) && !defined(IN_RTS)
guards in loongarch-def.h and loongarch-opts.h as they'll be unneeded.
3. Described in Loo
Under APX NDD, the previous TImode allocation has an issue: it was
originally allocated using a consecutive register pair, like rax:rdi, rdi:rdx.
This causes issues for all TImode NDD patterns. For NDD we do not
assume that arithmetic operations like add have a dependency between dest
and src1, then writ
The instructions defined in LoongArch Reference Manual v1.1 do not constitute
version 1.1 of the instruction set. A CPU defined later may support only some
of the instructions in LoongArch Reference Manual v1.1. Therefore, the macro
ISA_BASE_LA64V110 and related definitions are removed here.
gcc/ChangeLog:
From: Xi Ruoyao
We'll use HOST_WIDE_INT in LoongArch static properties in the following patches.
To keep the same readability as C99 designated initializers, create a
std::array-like data structure with a position setter function, and add
field setter functions for the structs used in loongarch-def.cc.
R
From: Kong Lingling
gcc/ChangeLog:
* config/i386/i386.md: (addsi_1_zext): Add new alternatives for
NDD and adjust output templates.
(*add_2): Likewise.
(*addsi_2_zext): Likewise.
(*add_3): Likewise.
(*addsi_3_zext): Likewise.
(*adddi_4): Li
gcc/ChangeLog:
* config/i386/i386.md (*movcc_noc): Extend with new constraints
to support NDD.
(*movsicc_noc_zext): Likewise.
(*movsicc_noc_zext_1): Likewise.
(*movqicc_noc): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/apx-ndd-cmov.c: New
NDD uses the EVEX prefix, so when a segment prefix is also applied, the instruction
could exceed its 15-byte limit, especially when adding immediates. This could happen
when the "e" constraint accepts any UNSPEC_TPOFF/UNSPEC_NTPOFF constant and it will
add the offset to the segment register, which will be encoded usi