Re: [PATCH 1/5] vec: Add quick_emplace_push/safe_emplace_push

2024-10-23 Thread Andrew Pinski
On Tue, Oct 22, 2024 at 11:49 PM Richard Biener wrote: > > On Tue, Oct 22, 2024 at 5:31 PM Andrew Pinski > wrote: > > > > This adds quick_emplace_push and safe_emplace_push to vec. > > These are like std::vector's emplace_back so you don't need an extra > > copy of the struct around. > > > > Sin
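The message compares the new vec members to std::vector's emplace_back. As a minimal sketch of that comparison only, the following uses std::vector rather than GCC's vec (whose exact signatures are not shown in the truncated message); the `entry` type and `make_entries` helper are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type, used only for this illustration.
struct entry
{
  int id;
  std::string name;
  entry (int i, std::string n) : id (i), name (std::move (n)) {}
};

// Hypothetical helper: fills a std::vector in two ways.
inline std::vector<entry>
make_entries ()
{
  std::vector<entry> v;
  v.reserve (2);
  // push_back needs an already-built object: a temporary is constructed
  // and then moved into the vector.
  v.push_back (entry (1, "temporary"));
  // emplace_back forwards the constructor arguments and builds the
  // element directly in the vector's storage, with no extra object.
  v.emplace_back (2, "in-place");
  return v;
}
```

Calling `make_entries ()` yields two elements; the second one never existed as a separate struct outside the vector, which is the saving the message describes.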

Re: [Patch, fortran] PR116733: Generic processing of assumed rank objects (f202y)

2024-10-23 Thread Tobias Burnus
Regarding 202y: I think it is in general useful to have an implementation of features before the standard is released, also to find issues before the standard is released. The downside I currently see is that none of the features is really ready (in the sense that there are explicit edits).

Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-23 Thread Richard Sandiford
Kyrylo Tkachov writes: > Hi all, > > Some vector rotate operations can be implemented in a single instruction > rather than using the fallback SHL+USRA sequence. > In particular, when the rotate amount is half the bitwidth of the element > we can use a REV64,REV32,REV16 instruction. > This patch a
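The identity behind this optimization can be checked on scalars: rotating an element by half its bit width swaps its two halves. The sketch below is a scalar model only, not the patch's RTL or the AArch64 instructions themselves; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 16-bit element left by 8 bits (half its width).
inline uint16_t
rotl16_by_8 (uint16_t x)
{
  return static_cast<uint16_t> ((x << 8) | (x >> 8));
}

// Swap the two bytes of a 16-bit element.
inline uint16_t
swap_bytes16 (uint16_t x)
{
  uint8_t lo = static_cast<uint8_t> (x & 0xff);
  uint8_t hi = static_cast<uint8_t> (x >> 8);
  return static_cast<uint16_t> ((lo << 8) | hi);
}

// Rotate a 32-bit element left by 16 bits (half its width).
inline uint32_t
rotl32_by_16 (uint32_t x)
{
  return (x << 16) | (x >> 16);
}

// Swap the two 16-bit halves of a 32-bit element, lane by lane.
inline uint32_t
swap_halfwords32 (uint32_t x)
{
  uint16_t lanes[2] = { static_cast<uint16_t> (x & 0xffff),
                        static_cast<uint16_t> (x >> 16) };
  return (static_cast<uint32_t> (lanes[0]) << 16) | lanes[1];
}
```

Because the rotate amount is exactly half the element width, the direction of the rotate does not matter, which is why a pure reversal of the two halves is enough.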

Re: [PATCH] match.pd: Add std::pow folding optimizations.

2024-10-23 Thread Jennifer Schmitz
> On 22 Oct 2024, at 13:14, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Tue, 22 Oct 2024, Jennifer Schmitz wrote: > >> >> >>> On 22 Oct 2024, at 11:05, Richard Biener wrote: >>> >>> External email: Use caution opening links or attachments >

[PATCH 1/2 v2] Match: Simplify unsigned scalar sat_sub(x, 1) to (x - x != 0)

2024-10-23 Thread Li Xu
From: xuli When the imm operand op1=1 in the unsigned scalar sat_sub form2 below, we can simplify (x != 0 ? x + max : 0) to (x - x != 0), thereby eliminating a branch instruction. Form2: T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \ {
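A minimal scalar sketch of this simplification for an 8-bit unsigned type, assuming the intended grouping of the simplified form is x - (x != 0), and taking "max" to be the all-ones value of the type so that x + max wraps to x - 1. The function names are hypothetical and are not the ones generated by the test macro.

```cpp
#include <cassert>
#include <cstdint>

// Branching form from the description: (x != 0 ? x + max : 0).
// Adding the all-ones value wraps around to x - 1.
inline uint8_t
sat_sub1_branch (uint8_t x)
{
  return x != 0 ? static_cast<uint8_t> (x + UINT8_MAX) : 0;
}

// Simplified form: subtract the 0/1 result of the comparison,
// so zero stays zero and every other value drops by one.
inline uint8_t
sat_sub1_branchless (uint8_t x)
{
  return static_cast<uint8_t> (x - (x != 0));
}
```

Both forms agree on every 8-bit input, and the second needs no conditional select or branch.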

Re: [PATCH v4 3/7] OpenMP: C front-end support for dispatch + adjust_args

2024-10-23 Thread Paul-Antoine Arras
Here is an updated patch following these comments. On 09/10/2024 19:15, Tobias Burnus wrote: First comments; I need to have a deeper look, but now I need to fetch some victuals. Paul-Antoine Arras wrote: This patch adds support to the C front-end to parse the `dispatch` construct and the `adjust_args

Re: [PATCH 5/6] aarch64: Emit XAR for vector rotates where possible

2024-10-23 Thread Richard Sandiford
Kyrylo Tkachov writes: > Hi all, > > We can make use of the integrated rotate step of the XAR instruction > to implement most vector integer rotates, as long we zero out one > of the input registers for it. This allows for a lower-latency sequence > than the fallback SHL+USRA, especially when we

[committed] Fortran: Minor follow-up cleanup to error.cc

2024-10-23 Thread Tobias Burnus
Committed attached patch as r15-4565-g0ecc45a88d7722. It removes 'terminal_width', an unused leftover before switching to the common diagnostic, which I missed when doing the last cleanup. Best regards, Tobias commit 0ecc45a88d772268a3bd83af02759857da0826d4 Author: Tobias Burnus Date: Wed Oct

[PATCH v2 0/4] aarch64: add minimal support of AEABI build attributes for GCS

2024-10-23 Thread Matthieu Longo
The primary focus of this patch series is to add support for build attributes in the context of GCS (Guarded Control Stack, an Armv9.4-a extension) to the AArch64 backend. It addresses comments from revision 1 [2] and 2 [3], and proposes a different approach compared to the previous implementati

[PATCH v2 2/4] aarch64: add minimal support of AEABI build attributes for GCS.

2024-10-23 Thread Matthieu Longo
From: Srinath Parvathaneni GCS (Guarded Control Stack, an Armv9.4-a extension) requires some caution at runtime. The runtime linker needs to reason about the compatibility of a set of relocatable object files that might not have been compiled with the same compiler. Up until now, GNU properties are

[PATCH v2 4/4] aarch64: encapsulate note.gnu.property emission into a class

2024-10-23 Thread Matthieu Longo
gcc/ChangeLog: * config.gcc: Add aarch64-dwarf-metadata.o to extra_objs. * config/aarch64/aarch64-dwarf-metadata.h (class section_note_gnu_property): Encapsulate GNU properties code into a class. * config/aarch64/aarch64.cc (GNU_PROPERTY_AARCH64_FEAT

[PATCH v2 1/4] aarch64: add debug comments to feature properties in .note.gnu.property

2024-10-23 Thread Matthieu Longo
GNU properties are emitted to provide some information about the features used in the generated code like BTI, GCS, or PAC. However, no debug comments are emitted in the generated assembly even if -dA is provided. It makes understanding the information stored in the .note.gnu.property section more d

[PATCH v2 3/4] aarch64: improve assembly debug comments for AEABI build attributes

2024-10-23 Thread Matthieu Longo
The previous implementation to emit AEABI build attributes did not support string values (asciz) in aeabi_subsection, and was not emitting values associated to tags in the assembly comments. This new approach provides a more user-friendly interface relying on typing, and improves the emitted assem

Re: [PATCH 2/6] aarch64: Use canonical RTL representation for SVE2 XAR and extend it to fixed-width modes

2024-10-23 Thread Richard Sandiford
Kyrylo Tkachov writes: > Hi all, > > The MD pattern for the XAR instruction in SVE2 is currently expressed with > non-canonical RTL by using a ROTATERT code with a constant rotate amount. > Fix it by using the left ROTATE code. This necessitates splitting out the > expander separately to translat

[PATCH v2 1/2] aarch64: Add support for mfloat8x{8|16}_t types

2024-10-23 Thread Andrew Carlotti
Compared to v1, I've split changes that aren't used for the type definitions into a separate patch. I've also added some tests, mostly along the lines suggested by Richard S. Bootstrapped and regression tested on aarch64; ok for master? gcc/ChangeLog: * config/aarch64/aarch64-builtins.c

[PATCH 1/5] Internal-fn: Introduce new IFN MASK_LEN_STRIDED_LOAD{STORE}

2024-10-23 Thread pan2 . li
From: Pan Li This patch introduces new IFNs for strided load and store. LOAD: v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias) STORE: MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias) The IFNs target code similar to the example below: void foo (int * a, int * b, int

[PATCH 2/5] Vect: Introduce MASK_LEN_STRIDED_LOAD{STORE} to loop vectorizer

2024-10-23 Thread pan2 . li
From: Pan Li This patch would like to allow generation of MASK_LEN_STRIDED_LOAD{STORE} IR for invariant stride memory access. For example as below void foo (int * __restrict a, int * __restrict b, int stride, int n) { for (int i = 0; i < n; i++) a[i*stride] = b[i*stride] + 100; } Bef
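For reference, the loop quoted in the message can be exercised as plain scalar code. In the sketch below, `foo` is taken from the message; the `run_strided_example` wrapper and its data are hypothetical additions, and nothing here shows the IR that the vectorizer emits.

```cpp
#include <cassert>
#include <vector>

// Loop from the patch description: every access is separated by a
// loop-invariant stride, so only elements 0, stride, 2*stride, ...
// are read from b and written to a.
void
foo (int *__restrict a, int *__restrict b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i * stride] = b[i * stride] + 100;
}

// Hypothetical wrapper: runs foo over two 8-element buffers with
// stride 2 and n = 4, and returns the destination buffer.
inline std::vector<int>
run_strided_example ()
{
  std::vector<int> a (8, 0);
  std::vector<int> b = { 0, 1, 2, 3, 4, 5, 6, 7 };
  foo (a.data (), b.data (), 2, 4);
  return a;
}
```

With stride 2, only the even-indexed elements of the destination are written; the odd-indexed ones keep their initial value, which is the access pattern a strided load and store pair has to reproduce.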

[PATCH 3/5] RISC-V: Adjust the gather-scatter testcases due to middle-end change

2024-10-23 Thread pan2 . li
From: Pan Li After we have MASK_LEN_STRIDED_LOAD{STORE} in the middle-end, the strided cases need to be adjusted for the IR check. The following test suites pass with this patch: * The riscv full regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/gather-scatter/strided

[PATCH 5/5] RISC-V: Add testcases for form 1 of MASK_LEN_STRIDED_LOAD{STORE}

2024-10-23 Thread pan2 . li
From: Pan Li Form 1: void __attribute__((noinline))\ vec_strided_load_store_##T##_form_1 (T *restrict out, T *restrict in, \ long stride, size_t size)\ {

Re: [PATCH v4 2/7] OpenMP: middle-end support for dispatch + adjust_args

2024-10-23 Thread Tobias Burnus
Hi PA, thanks for the update. Paul-Antoine Arras wrote: […] Please find attached a revised patch. LGTM, except: * The update to builtins.cc's builtin_fnspec  is lacking in the changelog list. * And the new testcase, new gcc/testsuite/c-c++-common/gomp/dispatch-10.c, has to be put into 3/

[PATCH 1/2 v3] Match: Simplify unsigned scalar sat_sub(x, 1) to (x - x != 0)

2024-10-23 Thread Li Xu
From: xuli When the imm operand op1=1 in the unsigned scalar sat_sub form2 below, we can simplify (x != 0 ? x + max : 0) to (x - x != 0), thereby eliminating a branch instruction. Form2: T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \ {

Re: [PATCH] Implement Fortran diagnostic buffering for non-textual formats [PR105916]

2024-10-23 Thread Tobias Burnus
David Malcolm wrote: In order to handle various awkward parsing issues, the Fortran frontend implements buffering of diagnostics, so that diagnostics reported to global_dc can be either: (a) immediately issued, or (b) speculatively reported to global_dc, and stored in a buffer, to either be issue

[PATCH 4/5] RISC-V: Implement the MASK_LEN_STRIDED_LOAD{STORE}

2024-10-23 Thread pan2 . li
From: Pan Li This patch would like to implement the MASK_LEN_STRIDED_LOAD{STORE} in the RISC-V backend by leveraging the vector strided load/store insn. For example: void foo (int * __restrict a, int * __restrict b, int stride, int n) { for (int i = 0; i < n; i++) a[i*stride] = b[i*stri

[PATCH v2 2/2] aarch64: Add mfloat vreinterpret intrinsics

2024-10-23 Thread Andrew Carlotti
This patch splits out some of the qualifier handling from the v1 patch, and adjusts the VREINTERPRET* macros to include support for mf8 intrinsics. Bootstrapped and regression tested on aarch64; ok for master? gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc (MODE_d_mf8): New.

Re: [PATCH] match: Reject non-const internal functions [PR117260]

2024-10-23 Thread Richard Biener
On Wed, Oct 23, 2024 at 8:50 AM Richard Biener wrote: > > On Tue, Oct 22, 2024 at 7:21 PM Andrew Pinski > wrote: > > > > When internal functions support was added to match > > (r6-4979-gc9e926ce2bdc8b), > > the check for ECF_CONST was the builtin function side. Though before > > r15-4503-g8d6d

[PATCH 01/22] aarch64: Add -mbranch-protection=gcs option

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy This enables Guarded Control Stack (GCS) compatible code generation. The "standard" branch-protection type enables it, and the default depends on the compiler default. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch_gcs_enabled): Declare. * config/aa

[PATCH 08/22] aarch64: Add __builtin_aarch64_gcs* tests

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/gcspopm-1.c: New test. * gcc.target/aarch64/gcspr-1.c: New test. * gcc.target/aarch64/gcsss-1.c: New test. --- gcc/testsuite/gcc.target/aarch64/gcspopm-1.c | 69 gcc/testsuite/gcc.targ

[PATCH 06/22] aarch64: Add GCS instructions

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Add instructions for the Guarded Control Stack extension. GCSSS1 and GCSSS2 are modelled as a single GCSSS unspec, because they are always used together in the compiler. Before GCSPOPM and GCSSS2 an extra "mov xn, 0" is added to clear the output register, this is needed to g

[PATCH 11/22] aarch64: Add ACLE feature macros for GCS

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define macros for GCS. --- gcc/config/aarch64/aarch64-c.cc | 3 +++ 1 file changed, 3 insertions(+) diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc ind

[PATCH 04/22] aarch64: Add __builtin_aarch64_chkfeat

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Builtin for chkfeat: the input argument is used to initialize x16 then execute chkfeat and return the updated x16. Note: ACLE __chkfeat(x) plans to flip the bits to be more intuitive (xor the input to output), but for the builtin that seems unnecessary complication. gcc/Chan

[PATCH 09/22] aarch64: Add GCS support for nonlocal stack save

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Nonlocal stack save and restore has to also save and restore the GCS pointer. This is used in __builtin_setjmp/longjmp and nonlocal goto. The GCS specific code is only emitted if GCS branch-protection is enabled and the code always checks at runtime if GCS is enabled. The ne

[PATCH 13/22] aarch64: Add target pragma tests for gcs

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/pragma_cpp_predefs_4.c: Add gcs specific tests. --- .../gcc.target/aarch64/pragma_cpp_predefs_4.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/gcc/testsuite/gcc.target/aarch64/pragma_cp

[PATCH 15/22] aarch64: Emit GNU property NOTE for GCS

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/ChangeLog: * config/aarch64/aarch64.cc (GNU_PROPERTY_AARCH64_FEATURE_1_GCS): Define. (aarch64_file_end_indicate_exec_stack): Set GCS property bit. --- gcc/config/aarch64/aarch64.cc | 5 + 1 file changed, 5 insertions(+) diff --git a/gcc/confi

[PATCH 21/22] aarch64: Fix tests incompatible with GCS

2024-10-23 Thread Yury Khrustalev
From: Matthieu Longo gcc/testsuite/ChangeLog: * g++.target/aarch64/return_address_sign_ab_exception.C: Update. * gcc.target/aarch64/eh_return.c: Update. --- .../return_address_sign_ab_exception.C| 19 +-- gcc/testsuite/gcc.target/aarch64/eh_return.c | 13

[PATCH 18/22] aarch64: libitm: Add GCS support

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Transaction begin and abort use setjmp/longjmp like operations that need to be updated for GCS compatibility. We use similar logic to libc setjmp/longjmp that support switching stack and thus switching GCS (e.g. due to longjmp out of a makecontext stack), this is kept even tho

[PATCH 12/22] aarch64: Add test for GCS ACLE defs

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/pragma_cpp_predefs_1.c: GCS test. --- .../gcc.target/aarch64/pragma_cpp_predefs_1.c | 30 +++ 1 file changed, 30 insertions(+) diff --git a/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_1.c b/gcc/t

[PATCH 00/22] aarch64: Add support for Guarded Control Stack extension

2024-10-23 Thread Yury Khrustalev
This patch series adds support for the Guarded Control Stack extension [1]. GCS marking for binaries is specified in [2]. Regression tested on AArch64 and no regressions have been found. Is this OK for trunk? Sources and branches: - binutils-gdb: sourceware.org/git/binutils-gdb.git users/ARM/g

[PATCH 19/22] aarch64: Introduce indirect_return attribute

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Tail calls of indirect_return functions from non-indirect_return functions are disallowed even if BTI is disabled, since the call site may have BTI enabled. Following x86, mismatching attribute on function pointers is not a type error even though this can lead to bugs. Neede

[PATCH 14/22] aarch64: Add GCS support to the unwinder

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Follows the current linux ABI that uses single signal entry token and shared shadow stack between thread and alt stack. Could be behind __ARM_FEATURE_GCS_DEFAULT ifdef (only do anything special with gcs compat codegen) but there is a runtime check anyway. Change affected test

[PATCH 22/22] aarch64: Fix nonlocal goto tests incompatible with GCS

2024-10-23 Thread Yury Khrustalev
gcc/testsuite/ChangeLog: * gcc.target/aarch64/gcs-nonlocal-3.c: New test. * gcc.target/aarch64/sme/nonlocal_goto_4.c: Update. * gcc.target/aarch64/sme/nonlocal_goto_5.c: Update. * gcc.target/aarch64/sme/nonlocal_goto_6.c: Update. --- .../gcc.target/aarch64/gcs-nonlo

[PATCH 02/22] aarch64: Add branch-protection target pragma tests

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/pragma_cpp_predefs_4.c: Add branch-protection tests. --- .../gcc.target/aarch64/pragma_cpp_predefs_4.c | 50 +++ 1 file changed, 50 insertions(+) diff --git a/gcc/testsuite/gcc.target/aarch64/prag

[PATCH 03/22] aarch64: Add support for chkfeat insn

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy This is a hint space instruction to check for enabled HW features and update the x16 register accordingly. Use unspec_volatile to prevent reordering it around calls since calls can enable or disable HW features. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_ch

[PATCH 05/22] aarch64: Add __builtin_aarch64_chkfeat tests

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/chkfeat-1.c: New test. * gcc.target/aarch64/chkfeat-2.c: New test. --- gcc/testsuite/gcc.target/aarch64/chkfeat-1.c | 75 gcc/testsuite/gcc.target/aarch64/chkfeat-2.c | 15 2 files change

[PATCH 16/22] aarch64: libgcc: add GCS marking to asm

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy libgcc/ChangeLog: * config/aarch64/aarch64-asm.h (FEATURE_1_GCS): Define. (GCS_FLAG): Define if GCS is enabled. (GNU_PROPERTY): Add GCS_FLAG. --- libgcc/config/aarch64/aarch64-asm.h | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(

[PATCH 20/22] aarch64: Add tests and docs for indirect_return attribute

2024-10-23 Thread Yury Khrustalev
From: Richard Ball This patch adds a new testcase and docs for the indirect_return attribute. gcc/ChangeLog: * doc/extend.texi: Add AArch64 docs for indirect_return attribute. gcc/testsuite/ChangeLog: * gcc.target/aarch64/indirect_return.c: New test. Co-authore

[PATCH 07/22] aarch64: Add GCS builtins

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Add new builtins for GCS: void *__builtin_aarch64_gcspr (void) uint64_t __builtin_aarch64_gcspopm (void) void *__builtin_aarch64_gcsss (void *) The builtins are always enabled, but should be used behind runtime checks in case the target does not support GCS. They are t

[PATCH 17/22] aarch64: libatomic: add GCS marking to asm

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy libatomic/ChangeLog: * config/linux/aarch64/atomic_16.S (FEATURE_1_GCS): Define. (GCS_FLAG): Define if GCS is enabled. (GNU_PROPERTY): Add GCS_FLAG. --- libatomic/config/linux/aarch64/atomic_16.S | 11 +-- 1 file changed, 9 insertions(+), 2 de

[PATCH 10/22] aarch64: Add non-local goto and jump tests for GCS

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy These are scan asm tests only, relying on existing execution tests for runtime coverage. gcc/testsuite/ChangeLog: * gcc.target/aarch64/gcs-nonlocal-1.c: New test. * gcc.target/aarch64/gcs-nonlocal-2.c: New test. --- .../gcc.target/aarch64/gcs-nonlocal-1.c

Re: [PATCH 1/2] aarch64: Use standard names for saturating arithmetic

2024-10-23 Thread Richard Sandiford
Akram Ahmad writes: > This renames the existing {s,u}q{add,sub} instructions to use the > standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and > IFN_SAT_SUB. > > The NEON intrinsics for saturating arithmetic and their corresponding > builtins are changed to use these standard names to

Re: [PATCH v3] AArch64: Fix copysign patterns

2024-10-23 Thread Richard Sandiford
Wilco Dijkstra writes: > The current copysign pattern has a mismatch in the predicates and constraints > - > operand[2] is a register_operand but also has an alternative X which allows > any > operand. Since it is a floating point operation, having an integer > alternative > makes no sense. C

[PATCH] libstdc++: Replace std::__to_address in C++20 branch in

2024-10-23 Thread Jonathan Wakely
As noted by Patrick, r15-4546-g85e5b80ee2de80 should have changed the usage of std::__to_address to std::to_address in the C++20-specific branch that works on types satisfying std::contiguous_iterator. libstdc++-v3/ChangeLog: * include/bits/basic_string.h (assign(Iter, Iter)): Call

[PATCH] libstdc++: Add GLIBCXX_TESTSUITE_STDS example to docs

2024-10-23 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * doc/xml/manual/test.xml: Add GLIBCXX_TESTSUITE_STDS example. * doc/html/manual/test.html: Regenerate. --- This patch is also available as a pull request in the forge: https://forge.sourceware.org/gcc/gcc-TEST/pulls/1 libstdc++-v3/doc/html/manual/test.ht

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-23 Thread Richard Sandiford
Soumya AR writes: > diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc > b/gcc/config/aarch64/aarch64-sve-builtins.cc > index 41673745cfe..aa556859d2e 100644 > --- a/gcc/config/aarch64/aarch64-sve-builtins.cc > +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc > @@ -1143,11 +1143,14 @@ aarch6

[PATCH 1/2] Relax vect_check_scalar_mask check

2024-10-23 Thread Richard Biener
When the mask is not a constant or external def there's no need to check the scalar type, in particular with SLP and the mask being a VEC_PERM_EXPR there isn't a scalar operand ready to check (not one vect_is_simple_use will get you). We later check the vector type and reject non-mask types there.

Re: [PATCH 1/2] aarch64: Use standard names for saturating arithmetic

2024-10-23 Thread Richard Sandiford
Richard Sandiford writes: > Akram Ahmad writes: >> This renames the existing {s,u}q{add,sub} instructions to use the >> standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and >> IFN_SAT_SUB. >> >> The NEON intrinsics for saturating arithmetic and their corresponding >> builtins are cha

Re: [PATCH v3] aarch64: Improve scalar mode popcount expansion by using SVE [PR113860]

2024-10-23 Thread Richard Sandiford
Pengxuan Zheng writes: > This is similar to the recent improvements to the Advanced SIMD popcount > expansion by using SVE. We can utilize SVE to generate more efficient code for > scalar mode popcount too. > > Changes since v1: > * v2: Add a new VNx1BI mode and a new test case for V1DI. > * v3: A

[PATCH 2/2] tree-optimization/116575 - SLP masked load-lanes discovery

2024-10-23 Thread Richard Biener
The following implements masked load-lane discovery for SLP. The challenge here is that a masked load has a full-width mask with group-size number of elements when this becomes a masked load-lanes instruction one mask element gates all group members. We already have some discovery hints in place,

Re: [PATCH v4 2/7] OpenMP: middle-end support for dispatch + adjust_args

2024-10-23 Thread Paul-Antoine Arras
Here is the updated patch. On 23/10/2024 11:41, Tobias Burnus wrote: * The update to builtins.cc's builtin_fnspec  is lacking in the changelog list. Added missing items to the ChangeLog. * And the new testcase, new gcc/testsuite/c-c++-common/gomp/ dispatch-10.c, has to be put into 3/7 or lat

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jason Merrill
On 10/22/24 2:17 PM, Jakub Jelinek wrote: Hi! The following testcase shows that the previous get_member_function_from_ptrfunc changes weren't sufficient and we still have cases where -fsanitize=undefined with pointers to member functions can cause wrong code being generated and related false pos

[COMMITTED] PR tree-optimization/117222 - Implement operator_pointer_diff::fold_range

2024-10-23 Thread Andrew MacLeod
pointer_diff depends on range_operator::fold_range to do the generic fold, which invokes wi_fold on subranges.  It also in turn invokes op1_op2_relation_effect for relation effects. This worked fine when pointers were implemented with irange, but when the transition to prange was made, a new

Re: [PATCH v2 2/4] aarch64: add minimal support of AEABI build attributes for GCS.

2024-10-23 Thread Richard Sandiford
Matthieu Longo writes: > @@ -24803,6 +24834,16 @@ aarch64_start_file (void) > asm_fprintf (asm_out_file, "\t.arch %s\n", > aarch64_last_printed_arch_string.c_str ()); > > + /* Check whether the current assembly supports gcs build attributes, if not > + fallback to .note.gn

Re: [PATCH v2 0/4] aarch64: add minimal support of AEABI build attributes for GCS

2024-10-23 Thread Richard Sandiford
Matthieu Longo writes: > The primary focus of this patch series is to add support for build attributes > in the context of GCS (Guarded Control Stack, an Armv9.4-a extension) to the > AArch64 backend. > It addresses comments from revision 1 [2] and 2 [3], and proposes a different > approach com

Re: [PATCH v2 9/9] aarch64: Handle alignment when it is bigger than BIGGEST_ALIGNMENT

2024-10-23 Thread Richard Sandiford
Evgeny Karpov writes: > Tuesday, October 22, 2024 > Richard Sandiford wrote: > >>> If ASM_OUTPUT_ALIGNED_LOCAL uses an alignment less than BIGGEST_ALIGNMENT, >>> it might trigger a relocation issue. >>> >>> relocation truncated to fit: IMAGE_REL_ARM64_PAGEOFFSET_12L >> >> Sorry to press the issue

Re: [PATCH v7] Target-independent store forwarding avoidance.

2024-10-23 Thread Jakub Jelinek
On Wed, Oct 23, 2024 at 04:27:29PM +0200, Konstantinos Eleftheriou wrote: Just random ChangeLog formatting nits, not actual patch review: > gcc/ChangeLog: > > * Makefile.in: Add avoid-store-forwarding.o Missing . at the end. Though, you should really also mention what you're changing, so

[PATCH] ginclude: stdalign.h should define __xxx_is_defined macros for C++

2024-10-23 Thread Jonathan Wakely
The __alignas_is_defined macro has been required by C++ since C++11, and C++ Library DR 4036 clarified that __alignof_is_defined should be defined too. The macros alignas and alignof should not be defined, as they're keywords in C++. Technically it's implementation-defined whether __STDC_VERSION_

Re: [PATCH] SVE intrinsics: Add constant folding for svindex.

2024-10-23 Thread Richard Sandiford
Jennifer Schmitz writes: > This patch folds svindex with constant arguments into a vector series. > We implemented this in svindex_impl::fold using the function build_vec_series. > For example, > svuint64_t f1 () > { > return svindex_u642 (10, 3); > } > compiled with -O2 -march=armv8.2-a+sve, is

Re: [PATCH 3/9] Simplify X /[ex] Y cmp Z -> X cmp (Y * Z)

2024-10-23 Thread Andrew MacLeod
On 10/18/24 12:48, Richard Sandiford wrote: [+ranger folks, who I forgot to CC originally, sorry!] This patch applies X /[ex] Y cmp Z -> X cmp (Y * Z) when Y * Z is representable. The closest check for "is representable" on range operations seemed to be overflow_free_p. However, that is desi

Re: [PATCH] libstdc++: Replace std::__to_address in C++20 branch in

2024-10-23 Thread Jonathan Wakely
On Wed, 23 Oct 2024 at 13:18, Jonathan Wakely wrote: > > As noted by Patrick, r15-4546-g85e5b80ee2de80 should have changed the > usage of std::__to_address to std::to_address in the C++20-specific > branch that works on types satisfying std::contiguous_iterator. > > libstdc++-v3/ChangeLog: > >

Re: [PATCH 2/2] tree-optimization/116575 - SLP masked load-lanes discovery

2024-10-23 Thread Richard Sandiford
Richard Biener writes: > The following implements masked load-lane discovery for SLP. The > challenge here is that a masked load has a full-width mask with > group-size number of elements when this becomes a masked load-lanes > instruction one mask element gates all group members. We already > h

Re: [PATCH] c++: Implement P2662R3, Pack Indexing [PR113798]

2024-10-23 Thread Patrick Palka
On Tue, 22 Oct 2024, Marek Polacek wrote: > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > -- >8 -- > This patch implements C++26 Pack Indexing, as described in > . > > The issue discussing how to mangle pack indexes has not been resolved > yet

[pushed] doc: remove obsolete deprecated info

2024-10-23 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- These formerly deprecated features eventually made it into the C++ standard. gcc/ChangeLog: * doc/extend.texi (Deprecated Features): Remove text about some no-longer-deprecated features. --- gcc/doc/extend.texi | 10 --

Re: counted_by attribute and type compatibility

2024-10-23 Thread Qing Zhao
> On Oct 22, 2024, at 15:16, Martin Uecker wrote: > >>> >>> It doesn't really make sense when they are inconsistent. >>> Still, we could just warn and pick one of the attributes >>> when forming the composite type. >> >> If both are defined locally, such inconsistencies should be very ea

Re: [PATCH v6] Target-independent store forwarding avoidance.

2024-10-23 Thread Konstantinos Eleftheriou
Hi Jeff, thanks for the feedback. Indeed, there was an issue with copying back the load register when the load is eliminated. I just sent a new version (https://gcc.gnu.org/pipermail/gcc-patches/2024-October/666230.html). On Fri, Oct 18, 2024 at 9:55 PM Jeff Law wrote: > > > > On 10/18/24 3:57 AM

Re: [PATCH 3/3] aarch64: Add SVE support for simd clones [PR 96342]

2024-10-23 Thread Victor Do Nascimento
On 2/1/24 21:59, Richard Sandiford wrote: Andre Vieira writes: This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum amount of elements in a length agnostic vector. gcc/Ch

Re: [Bug libstdc++/115285] [12/13/14/15 Regression] std::unordered_set can have duplicate value

2024-10-23 Thread François Dumont
Sorry but I'm not sure, is it also ok for the 3 backports ? On 22/10/2024 22:43, Jonathan Wakely wrote: On Tue, 22 Oct 2024 at 18:28, François Dumont wrote: Hi libstdc++: Always instantiate key_type to compute hash code [PR115285] Even if it is possible to compute a hash code fro

RE: [PATCH v3] aarch64: Improve scalar mode popcount expansion by using SVE [PR113860]

2024-10-23 Thread Pengxuan Zheng (QUIC)
> Pengxuan Zheng writes: > > This is similar to the recent improvements to the Advanced SIMD > > popcount expansion by using SVE. We can utilize SVE to generate more > > efficient code for scalar mode popcount too. > > > > Changes since v1: > > * v2: Add a new VNx1BI mode and a new test case for V

Re: [Bug libstdc++/115285] [12/13/14/15 Regression] std::unordered_set can have duplicate value

2024-10-23 Thread Jonathan Wakely
On Wed, 23 Oct 2024 at 18:37, François Dumont wrote: > > Sorry but I'm not sure, is it also ok for the 3 backports ? Yeah, I should have said - OK for the branches too, thanks. > > On 22/10/2024 22:43, Jonathan Wakely wrote: > > On Tue, 22 Oct 2024 at 18:28, François Dumont wrote: > >> Hi > >>

testsuite: Use -std=gnu17 in gcc.dg/pr114115.c

2024-10-23 Thread Joseph Myers
One test failing with a -std=gnu23 default that I wanted to investigate further is gcc.dg/pr114115.c. Building with -std=gnu23 produces a warning: pr114115.c:18:8: warning: 'ifunc' resolver for 'foo_ifunc2' should return 'void * (*)(void)' [-Wattribute-alias=] It turns out that this warning (fr

[PATCH v2 9/9] aarch64: Handle alignment when it is bigger than BIGGEST_ALIGNMENT

2024-10-23 Thread Evgeny Karpov
Tuesday, October 22, 2024 Richard Sandiford wrote: >> If ASM_OUTPUT_ALIGNED_LOCAL uses an alignment less than BIGGEST_ALIGNMENT, >> it might trigger a relocation issue. >> >> relocation truncated to fit: IMAGE_REL_ARM64_PAGEOFFSET_12L > > Sorry to press the issue, but: why does that happen? #def

Re: [PATCH] c++: Implement P2662R3, Pack Indexing [PR113798]

2024-10-23 Thread Jason Merrill
On 10/23/24 10:20 AM, Patrick Palka wrote: On Tue, 22 Oct 2024, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- This patch implements C++26 Pack Indexing, as described in . The issue discussing how to mangle pack indexes ha

[committed] libstdc++: Add -D_GLIBCXX_ASSERTIONS default for -O0 to API history

2024-10-23 Thread Jonathan Wakely
Excuse the huge diff, it's because it adds a new section heading so all the TOC pages and section listings change. Pushed to trunk. -- >8 -- libstdc++-v3/ChangeLog: * doc/xml/manual/evolution.xml: Document that assertions are enabled for unoptimized builds. * doc/html/*:

Re: [PATCH] Implement Fortran diagnostic buffering for non-textual formats [PR105916]

2024-10-23 Thread David Malcolm
On Wed, 2024-10-23 at 11:03 +0200, Tobias Burnus wrote: > David Malcolm wrote: > > In order to handle various awkward parsing issues, the Fortran > > frontend > > implements buffering of diagnostics, so that diagnostics reported > > to > > global_dc can be either: > > (a) immediately issued, or > >

[PATCH v7] Target-independent store forwarding avoidance.

2024-10-23 Thread Konstantinos Eleftheriou
From: kelefth This pass detects cases of expensive store forwarding and tries to avoid them by reordering the stores and using suitable bit insertion sequences. For example it can transform this: strb w2, [x1, 1] ldr x0, [x1] # Expensive store forwarding to larger load. To

[PATCH] top-level: Add pull request template for Forgejo

2024-10-23 Thread Jonathan Wakely
This complements the existing .github/PULL_REQUEST_TEMPLATE.md file, which is used when somebody opens a pull request for an unofficial mirror/fork of GCC on GitHub. The text in the existing file is very specific to GitHub and doesn't make much sense to include on every PR created on forge.sourcewa

Re: [PATCH 1/2] aarch64: Use standard names for saturating arithmetic

2024-10-23 Thread Akram Ahmad
On 23/10/2024 12:20, Richard Sandiford wrote: Thanks for doing this. The approach looks good. My main question is: are we sure that we want to use the Advanced SIMD instructions for signed saturating SI and DI arithmetic on GPRs? E.g. for addition, we only saturate at the negative limit if bot

Re: [PATCH 1/2] aarch64: Use standard names for saturating arithmetic

2024-10-23 Thread Richard Sandiford
Akram Ahmad writes: > On 23/10/2024 12:20, Richard Sandiford wrote: >> Thanks for doing this. The approach looks good. My main question is: >> are we sure that we want to use the Advanced SIMD instructions for >> signed saturating SI and DI arithmetic on GPRs? E.g. for addition, >> we only satu
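For readers unfamiliar with the operation under discussion, here is a minimal scalar sketch of signed saturating 32-bit addition. It is not taken from the patch and does not reflect how the aarch64 backend expands it; `sat_add_i32` is a hypothetical name, and widening to 64 bits is simply one straightforward way to detect overflow.

```cpp
#include <cstdint>
#include <limits>

// Signed saturating add: clamp to the int32_t range instead of wrapping.
std::int32_t sat_add_i32(std::int32_t a, std::int32_t b)
{
    constexpr std::int64_t lo = std::numeric_limits<std::int32_t>::min();
    constexpr std::int64_t hi = std::numeric_limits<std::int32_t>::max();

    // The exact sum always fits in 64 bits, so no overflow can occur here.
    const std::int64_t wide = std::int64_t{a} + std::int64_t{b};

    if (wide > hi) return static_cast<std::int32_t>(hi);  // positive overflow
    if (wide < lo) return static_cast<std::int32_t>(lo);  // negative overflow
    return static_cast<std::int32_t>(wide);
}
```

With this definition, positive overflow is only reachable when both operands are positive, and negative overflow only when both are negative.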

Re: [PATCH] ginclude: stdalign.h should define __xxx_is_defined macros for C++

2024-10-23 Thread Jason Merrill
On 10/23/24 10:39 AM, Jonathan Wakely wrote: The __alignas_is_defined macro has been required by C++ since C++11, and C++ Library DR 4036 clarified that __alignof_is_defined should be defined too. The macros alignas and alignof should not be defined, as they're keywords in C++. Technically it's
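A small C++ sketch of what the message describes: after including `<stdalign.h>`, the `__alignas_is_defined` and `__alignof_is_defined` macros are expected to exist, while `alignas` and `alignof` remain keywords rather than macros. The constant names below are hypothetical, and the first check is written so that it also compiles with headers that predate the change.

```cpp
#include <stdalign.h>

// True when the header provides both feature macros in C++ mode.
#if defined(__alignas_is_defined) && defined(__alignof_is_defined)
constexpr bool has_is_defined_macros = true;
#else
constexpr bool has_is_defined_macros = false;  // e.g. an older header
#endif

// In C++ these are keywords, so the header should not define them as macros.
#if defined(alignas) || defined(alignof)
constexpr bool keywords_are_macros = true;
#else
constexpr bool keywords_are_macros = false;
#endif

// The keywords themselves work regardless of the header.
struct alignas(16) Block { char data[16]; };
static_assert(alignof(Block) == 16, "alignas keyword should be honoured");
```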

Re: [PATCH] SVE intrinsics: Fold division and multiplication by -1 to neg.

2024-10-23 Thread Richard Sandiford
Jennifer Schmitz writes: > Because a neg instruction has lower latency and higher throughput than > sdiv and mul, svdiv and svmul by -1 can be folded to svneg. For svdiv, > this is already implemented on the RTL level; for svmul, the > optimization was still missing. > This patch implements foldin
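The fold relies on a simple identity: multiplying or dividing by -1 gives the same result as negation. The scalar sketch below illustrates that identity with wrap-around arithmetic. It does not use the SVE intrinsics, and the helper names are hypothetical. Arithmetic is done in unsigned or widened types to avoid undefined behaviour for the most negative input; the final conversion back to `int32_t` is modular in C++20 and, as an assumption for earlier standards, on GCC.

```cpp
#include <cstdint>

// Negation with two's-complement wrap-around (what a neg instruction does).
std::int32_t neg_wrap(std::int32_t x)
{
    return static_cast<std::int32_t>(0u - static_cast<std::uint32_t>(x));
}

// Multiplication by -1, computed modulo 2^32.
std::int32_t mul_by_minus_one(std::int32_t x)
{
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(x) * 0xffffffffu);
}

// Division by -1, widened so that the most negative input cannot overflow,
// then truncated back to 32 bits.
std::int32_t div_by_minus_one(std::int32_t x)
{
    const std::int64_t q = std::int64_t{x} / -1;
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(q));
}
```

All three helpers agree for every 32-bit input, which is why the cheaper negation can stand in for the other two.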

Re: [PATCH v2 2/2] aarch64: Add mfloat vreinterpret intrinsics

2024-10-23 Thread Richard Sandiford
Andrew Carlotti writes: > This patch splits out some of the qualifier handling from the v1 patch, and > adjusts the VREINTERPRET* macros to include support for mf8 intrinsics. > > Bootstrapped and regression tested on aarch64; ok for master? > > gcc/ChangeLog: > > * config/aarch64/aarch64-bu

Re: [PATCH v3] Remove sys/user time in -ftime-report

2024-10-23 Thread Richard Biener
On Wed, Oct 9, 2024 at 6:18 PM Andi Kleen wrote: > > From: Andi Kleen > > Retrieving sys/user time in timevars is quite expensive because it > always needs a system call. Only getting the wall time is much > cheaper because operating systems have optimized paths for this. > > The sys time isn't t

Re: [PATCH] Add 'cobol' to Makefile.def, take 2

2024-10-23 Thread Richard Biener
On Tue, Oct 15, 2024 at 1:10 AM James K. Lowden wrote: > > Consequent to advice, I'm preparing the Cobol front-end patches as a > small number of hopefully meaningful patches covering many files. > > 1. meta files used by autotools etc. > 2. gcc/cobol/*.h > 3. gcc/cobol/*.{y,l,cc} > 4. libgcob

Re: [PATCH v2 1/2] aarch64: Add support for mfloat8x{8|16}_t types

2024-10-23 Thread Richard Sandiford
Andrew Carlotti writes: > Compared to v1, I've split changes that aren't used for the type definitions > into a separate patch. I've also added some tests, mostly along the lines > suggested by Richard S. > > Bootstrapped and regression tested on aarch64; ok for master? > > gcc/ChangeLog: > >

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jason Merrill
On 10/23/24 12:33 PM, Jakub Jelinek wrote: On Wed, Oct 23, 2024 at 12:27:32PM -0400, Jason Merrill wrote: On 10/22/24 2:17 PM, Jakub Jelinek wrote: The following testcase shows that the previous get_member_function_from_ptrfunc changes weren't sufficient and we still have cases where -fsanitize

[Pushed] aarch64: Fix warning in aarch64_ptrue_reg

2024-10-23 Thread Andrew Pinski
After r15-4579-g9ffcf1f193b477, we get the following warning/error while bootstrapping on aarch64: ``` ../../gcc/gcc/config/aarch64/aarch64.cc: In function ‘rtx_def* aarch64_ptrue_reg(machine_mode, unsigned int)’: ../../gcc/gcc/config/aarch64/aarch64.cc:3643:21: error: comparison of integer expr

Re: [PATCH 2/4] RISC-V: Implement TARGET_SCHED_PRESSURE_PREFER_NARROW [PR/114729]

2024-10-23 Thread Vineet Gupta
On 10/22/24 12:02, rep.dot@gmail.com wrote: >> +/* { dg-final { scan-assembler-times "%sfp" 0 } } */ > scan-assembler-not, please Fixed and also in the other patch. Thx, -Vineet

Re: [PATCH v2 3/4] aarch64: improve assembly debug comments for AEABI build attributes

2024-10-23 Thread Richard Sandiford
Matthieu Longo writes: > The previous implementation to emit AEABI build attributes did not > support string values (asciz) in aeabi_subsection, and was not > emitting values associated to tags in the assembly comments. > > This new approach provides a more user-friendly interface relying on > typ

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jakub Jelinek
On Wed, Oct 23, 2024 at 12:27:32PM -0400, Jason Merrill wrote: > On 10/22/24 2:17 PM, Jakub Jelinek wrote: > > The following testcase shows that the previous > > get_member_function_from_ptrfunc > > changes weren't sufficient and we still have cases where > > -fsanitize=undefined with pointers to

RE: [Pushed] aarch64: Fix warning in aarch64_ptrue_reg

2024-10-23 Thread Pengxuan Zheng (QUIC)
My bad. Thanks for fixing this quickly, Andrew! Thanks, Pengxuan > > After r15-4579-g9ffcf1f193b477, we get the following warning/error while > bootstrapping on aarch64: > ``` > ../../gcc/gcc/config/aarch64/aarch64.cc: In function ‘rtx_def* > aarch64_ptrue_reg(machine_mode, unsigned int)’: > ../.

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jason Merrill
On 10/23/24 3:07 PM, Jakub Jelinek wrote: On Wed, Oct 23, 2024 at 08:53:36PM +0200, Jakub Jelinek wrote: save_expr has been doing that at least since 1992, likely before that. Though, that 4073 /* Array ref is const/volatile if the array elements are 4074 or if the array is.

[PATCH 3/2] c++: remove WILDCARD_DECL

2024-10-23 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- This tree code was added as part of the initial Concepts TS implementation to support type-constraints introducing any kind of template-parameter, not just type template-parameters, e.g. template concept C
