On 5/5/21 1:32 AM, Richard Biener wrote:
On Wed, May 5, 2021 at 4:20 AM Martin Sebor via Gcc-patches
wrote:
Even when explicitly enabled, -Walloca-larger-than doesn't run
unless optimization is enabled as well. This prevents diagnosing
alloca calls with constant arguments in excess of the lim
Hi All,
Currently, when using -mcpu=native or -march=native on a CPU that is unknown to
the compiler, the compiler just uses -march=armv8-a and enables none
of the extensions.
To make this a bit more useful this patch changes it to still use -march=armv8-a
but to enable the extensions. W
Hi All,
There's no reason that the signs of the operands of a dot-product all have to be
the same. The only restriction really is that the signs of the multiplicands
are the same; however, the signs of the multiplier and the accumulator need
not be the same.
The type of the overall operations sho
Hi All,
This patch adds support for a dot product where the sign of the multiplication
arguments differ. i.e. one is signed and one is unsigned but the precisions are
the same.
#define N 480
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#define SIGNEDNESS_4
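A scalar reference for the kind of loop these patches target may help here. This is only a sketch: the function name `usdot_ref` and the choice of an unsigned-by-signed 8-bit product accumulated into a signed 32-bit value are assumptions made for illustration, not the testcase from the patch.

```c
#include <stdint.h>

/* Hypothetical scalar reference: multiply unsigned 8-bit elements by
   signed 8-bit elements and accumulate into a signed 32-bit result.
   Both multiplicands have the same precision but differ in sign.  */
static int32_t
usdot_ref (int32_t acc, const uint8_t *a, const int8_t *b, int n)
{
  for (int i = 0; i < n; i++)
    /* Widen each operand before multiplying so that the sign of each
       input is preserved in the product.  */
    acc += (int32_t) a[i] * (int32_t) b[i];
  return acc;
}
```

Calling `usdot_ref (0, a, b, n)` on two equally long arrays gives the value a vectorized version of the loop would be expected to reproduce.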
Hi All,
This adds optabs implementing usdot_prod.
The following testcase:
#define N 480
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#define SIGNEDNESS_4 unsigned
SIGNEDNESS_1 int __attribute__ ((noipa))
f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restri
Hi All,
This adds optabs implementing usdot_prod.
The following testcase:
#define N 480
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#define SIGNEDNESS_4 unsigned
SIGNEDNESS_1 int __attribute__ ((noipa))
f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restri
Hi All,
This adds testcases to test for auto-vect detection of the new sign differing
dot product.
Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* doc/sourcebuild.texi (arm_v8_2a_i8mm_neon_hw): Document.
gcc/testsuite/Chan
Forgot to CC maintainers..
-----Original Message-----
From: Tamar Christina
Sent: Wednesday, May 5, 2021 6:39 PM
To: gcc-patches@gcc.gnu.org
Cc: nd
Subject: [PATCH 3/4][AArch32]: Add support for sign differing dot-product usdot
for NEON.
Hi All,
This adds optabs implementing usdot_prod.
Th
On Tue, Jan 19, 2021 at 04:10:33PM +, Richard Sandiford via Gcc-patches
wrote:
> Ah, ok, thanks for the extra context.
>
> So AIUI the problem when recording xmm2<-di isn't just:
>
> [A] partial_subreg_p (vd->e[sr].mode, GET_MODE (src))
>
> but also that:
>
> [B] partial_subreg_p (vd->e[
On Wed, May 5, 2021 at 4:50 PM H.J. Lu wrote:
>
> On Wed, May 05, 2021 at 09:36:16AM +0200, Richard Biener wrote:
> > On Mon, May 3, 2021 at 11:31 AM Ivan Sorokin via Gcc-patches
> > wrote:
> > >
> > > Prior to this commit GCC -O2 generated quite bad code for this
> > > function:
> > >
> > > bool
> On May 5, 2021, at 8:45 AM, Segher Boessenkool
> wrote:
>
> Hi~
>
> On Tue, May 04, 2021 at 04:08:22PM +0100, Richard Earnshaw wrote:
>> On 03/05/2021 23:55, Segher Boessenkool wrote:
>>> CC_STATUS_INIT is suggested in final.c to also be useful for ports that
>>> are not CC0, and at least
Ping!
On Tue, 20 Apr 2021 at 12:51, Paul Richard Thomas <
paul.richard.tho...@gmail.com> wrote:
> Hi All,
>
> This is another PDT warm-up patch before tackling the real beast: PR82649.
>
> As the contributor wrote in the PR, "The F08 standard clearly
> distinguishes between type parameter definit
On RISC-V we face the fact that our conditional branches
require Pmode conditions. Currently, we generate them explicitly
with a check for Pmode and then call the proper generator
(i.e. gen_cbranchdi4 on RV64 and gen_cbranchsi4 on RV32).
Let's simplify this code by generating the INSN hel
On Mon, Apr 26, 2021 at 4:40 PM Kito Cheng wrote:
>
> This patch is a good and simple improvement which could be an independent
> patch.
>
> There is only one comment from me on this patch: could you also add @
> to the cbranch pattern for floating-point modes? I would prefer to make the
> gen_cbranch4 could ha
This series provides a cleanup of the current atomics implementation
of RISC-V:
* PR100265: Use proper fences for atomic load/store
* PR100266: Provide programmatic implementation of CAS
As both are closely related, I merged the patches into one series.
The first patch could be squashed into the fo
We don't have any special treatment of MEMMODEL_SYNC_* values,
so let's hide them behind the memmodel_base() function.
gcc/
PR 100265
* config/riscv/riscv.c (riscv_memmodel_needs_amo_acquire):
Ignore MEMMODEL_SYNC_* values.
* config/riscv/riscv.c (riscv_memmod
The ratified A extension supports '.aq', '.rl' and '.aqrl' as
memory ordering suffixes. Let's emit them in case we get a '%A'
conversion specifier for riscv_print_operand().
As '%A' was already used for a similar, but restricted, purpose
(only '.aq' was emitted so far), this does not require any o
A previous patch ensured that the proper memory ordering suffixes
for AMOs are emitted. Therefore, there is no reason to keep the fence
generation mechanism for release operations.
gcc/
PR 100265
* config/riscv/riscv.c (riscv_memmodel_needs_release_fence):
Remove fu
Using AMOSWAP as atomic store does not allow us to do sub-word accesses.
Further, it is not consistent with our atomic_load () implementation.
The benefit of AMOSWAP is that the resulting code sequence will be
smaller (compared to FENCE+STORE); however, this does not outweigh
the lack of sub-
mem_thread_fence gets the desired memory model as operand.
Let's emit fences according to this value (as defined in section
"Code Porting and Mapping Guidelines" of the unpriv spec).
gcc/
PR 100265
* config/riscv/sync.md (mem_thread_fence):
Emit fences according t
A recent commit introduced a mechanism to emit proper fences
for RISC-V. Additionally, we already have emit_move_insn ().
Let's reuse this code and provide atomic_load and
atomic_store for RISC-V (as defined in section
"Code Porting and Mapping Guidelines" of the unpriv spec).
Note that this works
In order to emit LR/SC sequences, let's provide INSNs that
take care of memory ordering constraints.
gcc/
PR 100266
* config/riscv/sync.md (UNSPEC_LOAD_RESERVED): New.
* config/riscv/sync.md (UNSPEC_STORE_CONDITIONAL): New.
* config/riscv/sync.md (riscv_load_r
The current model of the LR and SC INSNs requires a sign-extension
to use the generated SImode value for conditional branches, which
only operate on XLEN registers.
However, the sign-extension is actually not required in either case,
therefore this patch introduces additional INSNs that consume
the
The existing CAS implementation uses an INSN definition that provides
the core LR/SC sequence. In addition, there is follow-up code
that evaluates the results and calculates the return values.
This has two drawbacks: a) an extension to sub-word CAS implementations
is not possible (eve
Atomic instructions require zero-offset memory addresses.
If we allow all addresses, the nonzero-offset addresses will
be prepared in an extra register in an extra instruction before
the actual atomic instruction.
This patch introduces the predicate "riscv_sync_memory_operand",
which restricts the
On 05/05/21 2:01 pm, Jonathan Wakely via Libstdc++ wrote:
Passing plain char to isdigit is undefined if the value is negative.
libstdc++-v3/ChangeLog:
* include/std/charconv (__from_chars_alnum): Pass unsigned
char to std::isdigit.
Tested powerpc64le-linux. Committed to trunk.
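The rule behind this fix can be shown in a few lines of C. This is a minimal sketch, not the libstdc++ change itself: the helper name `is_digit_char` is hypothetical, and the patch works on `std::isdigit` in C++ rather than on the C function used here.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: digit test that is safe for any plain char.  */
static bool
is_digit_char (char c)
{
  /* isdigit requires an argument that is representable as unsigned
     char or equal to EOF.  Converting first keeps negative plain char
     values from ever reaching it.  */
  return isdigit ((unsigned char) c) != 0;
}
```

On targets where plain char is signed, a byte such as 0xE9 becomes a negative value, which is the case the conversion guards against.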
Tested on x86_64-pc-linux-gnu, does this look OK for trunk/10/11?
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (move_iterator::base): Make the
const& overload return a const reference and remove its
constraint as per LWG 3391. Make unconditionally noexcept.
Tested on x86_64-pc-linux-gnu, does this look OK for trunk/10/11?
libstdc++-v3/ChangeLog:
* include/std/ranges (filter_view::_Iterator::base): Make the
const& overload return a const reference and remove its
constraint as per LWG 3533. Make unconditionally noexcept.
/i386.c (ix86_compute_frame_layout): For a SEH target,
always return the establisher frame for __builtin_frame_address (0).
2021-05-05 Eric Botcazou
* gcc.c-torture/execute/20210505-1.c: New test.
--
Eric Botcazou
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index
On 05/05/21 21:57 +0200, François Dumont via Libstdc++ wrote:
On 05/05/21 2:01 pm, Jonathan Wakely via Libstdc++ wrote:
Passing plain char to isdigit is undefined if the value is negative.
libstdc++-v3/ChangeLog:
* include/std/charconv (__from_chars_alnum): Pass unsigned
char t
The libiberty hash table includes a helper function for strings, but
no equality function. Consequently, this equality function has been
reimplemented a number of times in both the gcc and binutils-gdb
source trees. This patch adds the function to the libiberty hash
table, as a step toward the go
The libiberty hash table defines a hash function for strings, but not
an equality function. This means that various files have had to
implement their own comparison function over the years.
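For reference, a string equality callback of the kind described here is typically only a few lines long. The sketch below assumes the usual hash table callback shape of two const void pointers returning nonzero on equality; the name `string_eq_sketch` is hypothetical and this is not the code from the patch.

```c
#include <string.h>

/* Hypothetical equality callback for a hash table keyed by
   NUL-terminated strings: returns nonzero when the two keys hold the
   same characters.  */
static int
string_eq_sketch (const void *a, const void *b)
{
  return strcmp ((const char *) a, (const char *) b) == 0;
}
```

Having one shared function of this shape is what lets the per-file copies be removed.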
This series resolves this for gcc. Once this is in, I plan to import
the change into binutils-gdb and appl
This changes one spot in GCC to use the new htab_eq_string function.
gcc
* gengtype-state.c (read_state): Use htab_eq_string.
(string_eq): Remove.
---
gcc/gengtype-state.c | 11 +--
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/gcc/gengtype-state.c b/gcc/g
This changes godump to use the new htab_eq_string function.
gcc
* godump.c (string_hash_eq): Remove.
(go_finish): Use htab_eq_string.
---
gcc/godump.c | 14 +++---
1 file changed, 3 insertions(+), 11 deletions(-)
diff --git a/gcc/godump.c b/gcc/godump.c
index 7864d9d63e5
On Wed, 5 May 2021, David Laight via Libc-alpha wrote:
> > __u64 can't be formatted with %llu on all architectures. That's not
> > true for uint64_t, where you have to use %lu on some architectures to
> > avoid compiler warnings (and technically undefined behavior). There are
> > preprocessor ma
On Tue, 4 May 2021, Christophe Lyon via Gcc-patches wrote:
> The new test gcc.c-torture/execute/ieee/cdivchkld.c needs fmaxl(),
> which may not be available, for instance on aarch64-elf with newlib.
> As discussed in the PR, requiring c99_runtime allows the test to be skipped
> in this case.
>
> 2021-
Hello,
Over the last year, we have discussed and agreed that in order to support
multiple debug formats, we keep DWARF as the default internal debug format; any
new debug format to be supported feeds off DWARF dies. This requirement
specification has worked well for the addition of CTF/BTF overall.
To support multiple debug formats, we need to move away from explicit
enumeration of each individual combination of debug formats.
gcc/c-family/ChangeLog:
* c-opts.c (c_common_post_options): Adjust access to debug_type_names.
* c-pch.c (struct c_pch_validity): Use type uint32_t.
This patch introduces a dwarf_debuginfo_p predicate that abstracts and
replaces complex checks on write_symbols.
gcc/c-family/ChangeLog:
* c-lex.c (init_c_lex): Use dwarf_debuginfo_p.
gcc/ChangeLog:
* config/c6x/c6x.c (c6x_output_file_unwind): Use dwarf_debuginfo_p.
* dw
Hi,
The attached patch replaces __builtin_neon_vtst* (a, b) with (a & b) != 0.
Bootstrapped and tested on arm-linux-gnueabihf and cross-tested on arm*-*-*.
OK to commit?
Thanks,
Prathamesh
vtst-1.diff
Description: Binary data
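The equivalence the patch relies on can be modelled one lane at a time. This is a scalar sketch under the assumption that a "true" lane is represented as all bits set; the function name `vtst_lane_u8` is hypothetical and is not a NEON intrinsic.

```c
#include <stdint.h>

/* Hypothetical per-lane model of the bit test: the result lane is all
   ones when the two inputs share at least one set bit, and zero
   otherwise.  This is what (a & b) != 0 yields per lane.  */
static uint8_t
vtst_lane_u8 (uint8_t a, uint8_t b)
{
  return (a & b) != 0 ? UINT8_MAX : 0;
}
```

Expressing the operation this way, rather than through an opaque builtin, gives the compiler ordinary operators to fold and combine.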
On Wed, May 5, 2021 at 12:37 PM Christoph Muellner
wrote:
> The existing CAS implementation uses an INSN definition, which provides
> the core LR/SC sequence. In addition, there is follow-up code
> that evaluates the results and calculates the return values.
> This has two drawbacks:
On 2021-05-01 00:27, Jeff Law wrote:
On 4/29/2021 3:50 AM, Jiufu Guo via Gcc-patches wrote:
When there is the possibility that overflow may happen on the loop index,
a few optimizations would not happen. For example code:
foo (int *a, int *b, unsigned k, unsigned n)
{
while (++k != n)
On Wed, May 5, 2021 at 12:23 PM Christoph Muellner
wrote:
> gcc/
> PR 100266
> * config/riscv/riscv.c (riscv_block_move_loop): Simplify.
> * config/riscv/riscv.md (cbranch4): Generate helpers.
>
OK. Committed. Though I had to fix the ChangeLog entry. It was indente
On 2021-05-01 05:37, Segher Boessenkool wrote:
Hi!
On Thu, Apr 29, 2021 at 05:50:48PM +0800, Jiufu Guo wrote:
When there is the possibility that overflow may happen on the loop index,
a few optimizations would not happen. For example code:
foo (int *a, int *b, unsigned k, unsigned n)
{
whil
Gentle ping, thanks.
On 2021/4/16 15:10, Xiong Hu Luo wrote:
fmod/fmodf and remainder/remainderf can be expanded inline instead of
calling the library in fast-math builds, which is much faster.
fmodf:
fdivs f0,f1,f2
friz f0,f0
fnmsubs f1,f2,f0,f1
remainderf:
fdivs f0,f1,f2
On Fri, Apr 30, 2021 at 4:10 PM Christoph Müllner via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:
> On Sat, May 1, 2021 at 12:48 AM Jeff Law wrote:
> > On 4/26/2021 5:38 AM, Christoph Muellner via Gcc-patches wrote:
> > > [ree] PR rtl-optimization/100264: Handle more PARALLEL SET expressions
>
Hi All,
Although I had undertaken to concentrate on PDTs, PR99819 so intrigued me
that I became locked into it :-( After extensive, fruitless rummaging
through decl.c and trans-decl.c, I realised that the problem was far
simpler than it seemed and that it lay in class.c. After that PR was fixed,
P