Re: [PATCH] aarch64: Use SVE for V2DImode integer min/max operations

2025-09-04 Thread Andrew Pinski
On Thu, Sep 4, 2025 at 5:50 AM Kyrylo Tkachov wrote: > > Hi all, > > Unlike Advanced SIMD, SVE has instructions to perform smin, smax, umin, umax > on 64-bit elements. Thus, we can use them with the fixed-width V2DImode > expander. Most of the machinery is already there on the define_insn side, >

Re: [PATCH] match: simplify `VCE(a) ==/!= 0` to `a ==/!= 0` [PR105749]

2025-09-04 Thread Richard Biener
On Fri, Sep 5, 2025 at 1:50 AM Andrew Pinski wrote: > > SRA likes to create VCE(a) when it comes to bool. This confuses > a few different passes including jump threading and uninitialization > warning. This removes the VCE in one case where it will help. > Values outside of 0/1 with the VCE will p

Re: [Patch, fortran] PR87362 - [PDT] ICE on variable declaration with undefined PDT parameter

2025-09-04 Thread Paul Richard Thomas
Duly swung! Pushed as r16-3589. Thanks Paul On Thu, 4 Sept 2025 at 17:21, Jerry D wrote: > On 9/4/25 6:22 AM, Paul Richard Thomas wrote: > > Hi All, > > > > Although PR87362 is marked as fixed, the error becomes rather more > explicit with > > this patch, which I actually developed for PR1024

[PATCH] bitint: Lower the partial limbs of extended _BitInts with m_limb_type.

2025-09-04 Thread Yang Yujie
Lower the partial limbs of extended _BitInts like the full limbs for most operations, so that explicit extensions can be inserted only where they are really needed. gcc/ChangeLog: * gimple-lower-bitint.cc (struct bitint_large_huge): Remove the abi_load_p parameter of limb_access.

Re: [PATCH 1/2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-04 Thread Richard Biener
On Thu, Sep 4, 2025 at 7:17 PM Jakub Jelinek wrote: > > On Thu, Sep 04, 2025 at 05:59:25PM +0100, Iain Sandoe wrote: > > PR c++/119060 > > > > gcc/ChangeLog: > > > > * builtins.cc (expand_builtin): Handle BUILT_IN_OBSERVABLE_CHKPT. > > * builtins.def (BUILT_IN_OBSERVABLE_CHKPT):

[PATCH 1/1] RISC-V: Only Save/Restore required registers for ILP32E/LP64E

2025-09-04 Thread Jim Lin
Previously, the spec change at https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/70 modified the save/restore routines to save/restore only the registers that are actually used for ILP32E/LP64E, rather than always saving/restoring all of ra/s0/s1. I also found that the implementation is lacking here for lp

[PATCH v2 2/2] [x86] Use vpermil{ps, pd} instead of vperm{d, q} when permutation is in-lane.

2025-09-04 Thread liuhongt
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ready to push to trunk. gcc/ChangeLog: * config/i386/i386-expand.cc (expand_vec_perm_vpermil): Extend to handle V8SImode. (avx_vpermilp_parallel): Extend to handle vector integer modes with same vector size and

Re: [PATCH v5 2/2] RISC-V: Allow VLS types using up to LMUL 8

2025-09-04 Thread Kito Cheng
On Fri, Sep 5, 2025 at 9:21 AM Kito Cheng wrote: > > On Thu, Sep 4, 2025 at 11:50 PM Robin Dapp wrote: > > > > > The layout will be different between VLEN=128 and VLEN=256 (and also > > > any larger VLEN) > > > > > > Give a practical example: > > > vec1 allocated into v8, and v9, the reg layout w

[PATCH v2 6/7] riscv: Add RISC-V Kernel Control Flow Integrity implementation

2025-09-04 Thread Kees Cook
Implement RISC-V-specific KCFI backend. - Function preamble generation using .word directives for type ID storage at offset from function entry point (no alignment NOPs needed due to fixed 4-byte instruction size). - Scratch register allocation using t1/t2 (x6/x7) following RISC-V procedure c

Re: [PATCH 09/14 v2] lto: Add toplevel assembly heuristics

2025-09-04 Thread Sam James
Michal Jireš writes: > On 9/4/25 8:53 PM, Sam James wrote: >> Sam James writes: >> >>> Andi Kleen writes: >>> Sam James writes: > Michal Jires writes: > >> I did handle node->iterate_referring, but forgot cnode->callers. >> >> Only change are contents of the new

Re: [PATCH v5 2/2] RISC-V: Allow VLS types using up to LMUL 8

2025-09-04 Thread Kito Cheng
On Thu, Sep 4, 2025 at 11:50 PM Robin Dapp wrote: > > > The layout will be different between VLEN=128 and VLEN=256 (and also > > any larger VLEN) > > > > Give a practical example: > > vec1 allocated into v8, and v9, the reg layout will be: > > > > VLEN = 128 > > v8 = [0, 1, 2, 3] > > v9 = [4, 5, 6

Re: [PATCH v2 1/7] mangle: Introduce C typeinfo mangling API

2025-09-04 Thread Kees Cook
On Thu, Sep 04, 2025 at 05:50:45PM -0700, Andrew Pinski wrote: > On Thu, Sep 4, 2025 at 5:27 PM Kees Cook wrote: > > + > > + /* Unknown builtin type - this should never happen in a well-formed C > > program. */ > > + debug_tree (type); > > + internal_error ("mangle: Unknown builtin type in fu

[PATCH v2 1/7] mangle: Introduce C typeinfo mangling API

2025-09-04 Thread Kees Cook
To support the KCFI type-id which needs to convert unique function prototypes into unique 32-bit values, add a subset of the Itanium C++ mangling ABI for C typeinfo of function prototypes, but then do hashing, which is needed by KCFI to get a 32-bit hash value for a given function prototype. Option

Re: [PATCH v2 1/7] mangle: Introduce C typeinfo mangling API

2025-09-04 Thread Andrew Pinski
On Thu, Sep 4, 2025 at 5:27 PM Kees Cook wrote: > > To support the KCFI type-id which needs to convert unique function > prototypes into unique 32-bit values, add a subset of the Itanium C++ > mangling ABI for C typeinfo of function prototypes, but then do > hashing, which is needed by KCFI to get

Re: [PATCH v2 1/3] LoongArch: Fix the semantic of 16B CAS

2025-09-04 Thread Lulu Cheng
On 2025/9/4 at 7:48 PM, Lulu Cheng wrote: On 2025/8/22 at 4:14 PM, Xi Ruoyao wrote: In a CAS operation, even if expected != *memory we still need to do an atomic load of *memory into output. But I made a mistake in the initial implementation, causing the output to contain junk in this situation. Like a nor

[PATCH v2 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure

2025-09-04 Thread Kees Cook
Implements the Linux Kernel Control Flow Integrity ABI, which provides a function prototype based forward edge control flow integrity protection by instrumenting every indirect call to check for a hash value before the target function address. If the hash at the call site and the hash at the target

[PATCH] match: simplify `VCE(a) ==/!= 0` to `a ==/!= 0` [PR105749]

2025-09-04 Thread Andrew Pinski
SRA likes to create VCE(a) when it comes to bool. This confuses a few different passes including jump threading and uninitialization warning. This removes the VCE in one case where it will help. Values outside of 0/1 with the VCE will produce undefined code so removing the VCE is always fine as we
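A rough illustration of the equivalence the patch description relies on, not the match.pd pattern itself: for a bool holding only 0 or 1, reinterpreting its storage as an integer and comparing with zero gives the same answer as comparing the bool directly. The helper names are hypothetical, memcpy stands in for the VIEW_CONVERT_EXPR, and the sketch assumes sizeof (bool) == 1.

```c
#include <stdbool.h>
#include <string.h>

/* Assumption of this sketch: bool occupies exactly one byte.  */
_Static_assert (sizeof (bool) == 1, "sketch assumes a one-byte bool");

/* Hypothetical stand-in for VCE<unsigned char>(a) != 0: reinterpret the
   storage of a bool as an unsigned char, then compare with zero.  */
int vce_ne_zero (bool a)
{
  unsigned char raw;
  memcpy (&raw, &a, sizeof raw);
  return raw != 0;
}

/* The simplified form: compare the bool directly.  */
int plain_ne_zero (bool a)
{
  return a != 0;
}
```

For both valid bool values the two helpers agree, which is the case the simplification covers.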

[PATCH v2 5/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation

2025-09-04 Thread Kees Cook
Implement ARM 32-bit KCFI backend supporting ARMv7+: - Function preamble generation using .word directives for type ID storage at -4 byte offset from function entry point (no prefix NOPs needed due to 4-byte instruction alignment). - Use movw/movt instructions for 32-bit immediate loading. -

[PATCH v2 3/7] x86: Add x86_64 Kernel Control Flow Integrity implementation

2025-09-04 Thread Kees Cook
Implement x86_64-specific KCFI backend: - Implies -mindirect-branch-register since KCFI needs call target in a register for typeid hash loading. - Function preamble generation with type IDs positioned at -(4+prefix_nops) offset from function entry point. - Function-aligned KCFI preambles usi

[PATCH v2 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048]

2025-09-04 Thread Kees Cook
Hi! Here is v2, which is substantially changed compared to the RFC[1]. This series implements[2][3] the Linux Kernel Control Flow Integrity ABI, which provides a function prototype based forward edge control flow integrity protection by instrumenting every indirect call to check for a hash value

[PATCH v2 4/7] aarch64: Add AArch64 Kernel Control Flow Integrity implementation

2025-09-04 Thread Kees Cook
Implement AArch64-specific KCFI backend. - Function preamble generation using .word directives for type ID storage at offset from function entry point (no default alignment NOPs needed due to fixed 4-byte instruction size). - Trap debugging through ESR (Exception Syndrome Register) encoding

Re: Fix ICE with auto-fdo and -fpartial-profiling

2025-09-04 Thread Kugan Vivekanandarajah
Hi Honza, > On 5 Sep 2025, at 1:30 am, Jan Hubicka wrote: > > Hi, > with -fpartial-profiling we ICE building perlbench and gcc from spec2k17 since > afdo_annotate_cfg applies knowledge about zero profiles too early. This patch > mo

Re: [Patch fortran] PR84432 & PR114815 - test for non-conforming default PDT initializers

2025-09-04 Thread Steve Kargl
On Thu, Sep 04, 2025 at 10:53:48PM +0200, Harald Anlauf wrote: > Hi Paul! > > On 04.09.25 at 20:46, Paul Richard Thomas wrote: > > Hi All, > > > > PDT components with default initializers must have type parameter and > > length expressions that reduce to compile time integer constants. The chunk

Re: [WIP] C++ vs. -ftrivial-auto-var-init=

2025-09-04 Thread Qing Zhao
> On Sep 4, 2025, at 16:15, Jakub Jelinek wrote: > > On Thu, Sep 04, 2025 at 07:47:17PM +, Qing Zhao wrote: >>> On Sep 4, 2025, at 11:01, Jakub Jelinek wrote: >>> >>> On Thu, Sep 04, 2025 at 02:45:16PM +0200, Jakub Jelinek via Gcc wrote: On Wed, Sep 03, 2025 at 03:38:53PM +0200, Jaku

Re: [Patch fortran] PR84432 & PR114815 - test for non-conforming default PDT initializers

2025-09-04 Thread Harald Anlauf
Hi Paul! On 04.09.25 at 20:46, Paul Richard Thomas wrote: Hi All, PDT components with default initializers must have type parameter and length expressions that reduce to compile time integer constants. The chunk in expr.cc verifies that this is the case for array bounds and character lengths.

Re: [PATCH v2 2/2] testsuite: arm: factorize arm_v8_neon_ok flags

2025-09-04 Thread Christophe Lyon
Hi Torbjorn, Sorry for the delay On Mon, 1 Sept 2025 at 17:14, Torbjorn SVENSSON wrote: > > > > On 2025-09-01 16:59, Christophe Lyon wrote: > > On Wed, 27 Aug 2025 at 13:25, Torbjorn SVENSSON > > wrote: > >> > >> > >> > >> On 2025-08-18 19:24, Christophe Lyon wrote: > >>> Like we do in othe

Re: [PATCH 10/10] i386: Mark a tree node in i386.cc as TREE_SIDE_EFFECTS

2025-09-04 Thread Joseph Myers
On Mon, 11 Aug 2025, mmalcom...@nvidia.com wrote: > + /* Ensure that this doesn't get optimised out of the COMPOUND_EXPR we > + * define below. It appears on first glance that the fact the > + * initialisation argument is a function call would mean this > + * automatically

Re: [PATCH 09/14 v2] lto: Add toplevel assembly heuristics

2025-09-04 Thread Michal Jireš
On 9/4/25 8:53 PM, Sam James wrote: Sam James writes: Andi Kleen writes: Sam James writes: Michal Jires writes: I did handle node->iterate_referring, but forgot cnode->callers. Only change are contents of the newly separated mark_symbol_referenced_from_asm Thanks, I'll try the n

Re: [PATCH 09/14 v2] lto: Add toplevel assembly heuristics

2025-09-04 Thread Sam James
Sam James writes: > Andi Kleen writes: > >> Sam James writes: >> >>> Michal Jires writes: >>> I did handle node->iterate_referring, but forgot cnode->callers. Only change are contents of the newly separated mark_symbol_referenced_from_asm >>> >>> Thanks, I'll try the new pa

Re: [PATCH v2 01/10] Add -mgrow-frame-downwards

2025-09-04 Thread Jeff Law
On 9/1/25 3:53 PM, Aleksandar Rakic wrote: Hi Jeff, [PATCH v2 02/12] Fix unsafe comparison against stack_pointer_rtx: https://sourceware.org/pipermail/gcc-patches/2025-March/677829.html Is this still an issue? I thought this was fixed a while back in the renamer. I don't necessarily thing

[Patch fortran] PR84432 & PR114815 - test for non-conforming default PDT initializers

2025-09-04 Thread Paul Richard Thomas
Hi All, PDT components with default initializers must have type parameter and length expressions that reduce to compile time integer constants. The chunk in expr.cc verifies that this is the case for array bounds and character lengths. This error checking results in pdt_26.f03 segfaulting because

[PATCH v2 3/3] ibstdc++: Reuse _Bind_back_t functor in ranges::_Partial

2025-09-04 Thread Tomasz Kamiński
This patch refactors ranges::_Partial to be implemented using _Bind_back_t. This allows it to benefit from the changes in r16-3398-g250dd5b5604fbc, specifically making the closure trivially copyable. Since _Bind_back_t already provides an optimized implementation for a single bound argument, specia

Re: [PATCH] RISC-V: Add pattern for vector-scalar single-width floating-point add

2025-09-04 Thread Paul-Antoine Arras
Sorry, this patch seems to be causing a regression in the testsuite. I'll come up with a fixed version. On 04/09/2025 13:14, Paul-Antoine Arras wrote: This pattern enables the combine pass (or late-combine, depending on the case) to merge a vec_duplicate into a plus RTL instruction. Before thi

Re: [PATCH 2/2] libstdc++: Implement P1494 and P3641 Partial program correctness [PR119060]

2025-09-04 Thread Jonathan Wakely
On Thu, 4 Sept 2025, 18:02 Iain Sandoe, wrote: > The facility (with the original shorter name) has been in use on the > contracts development branch for almost a year, and has been tested in > isolation on x86_64-darwin and powerpc64le-linux. > OK for trunk? > thanks > Iain > > --- 8< --- > > Thi

Re: [PATCH v1 0/1] c: Add support for array parameters in _Countof

2025-09-04 Thread Joseph Myers
On Wed, 3 Sep 2025, Alejandro Colomar wrote: > Hi Joseph, > > On Wed, Sep 03, 2025 at 03:44:28PM +, Joseph Myers wrote: > > On Wed, 3 Sep 2025, Alejandro Colomar wrote: > > > > > Hi Joseph, > > > > > > I'd like to ping about this thread. > > > > As far as I know, nothing has been resolved

Re: [PATCH 09/14 v2] lto: Add toplevel assembly heuristics

2025-09-04 Thread Andi Kleen
Sam James writes: > Michal Jires writes: > >> I did handle node->iterate_referring, but forgot cnode->callers. >> >> Only change are contents of the newly separated >> mark_symbol_referenced_from_asm > > Thanks, I'll try the new patch now. > > With the workaround I mentioned earlier, I managed t

Re: [PATCH v1 2/2] AArch64: Add LUTv2 intrinsics

2025-09-04 Thread Karl Meakin
On 03/09/2025 13:33, Kyrylo Tkachov wrote: Hi Karl, On 2 Sep 2025, at 16:16, Karl Meakin wrote: gcc/ChangeLog: * config/aarch64/aarch64-sme.md (@aarch64_sme_write_zt): New insn. (aarch64_sme_lut_zt): Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc (parse_type): New type format

Re: [PATCH] tree-optimization/121685 - accesses to *this are not trapping

2025-09-04 Thread Jonathan Wakely
On Thu, 4 Sept 2025 at 09:57, Richard Biener wrote: > > On Thu, 4 Sep 2025, Richard Biener wrote: > > > On Thu, Sep 4, 2025 at 10:27 AM Jonathan Wakely wrote: > > > > > > On Thu, 4 Sept 2025 at 08:26, Richard Biener wrote: > > > > > > > > On Wed, 3 Sep 2025, Jakub Jelinek wrote: > > > > > > > >

Re: [RFC PATCH 3/7] kcfi: Add core Kernel Control Flow Integrity infrastructure

2025-09-04 Thread Peter Zijlstra
On Wed, Sep 03, 2025 at 09:24:22PM -0700, Kees Cook wrote: > > If the hacker knows these, it should be quite easy for them to come up with > > a > > matched typeid, is it? > > The hashes aren't considered secret -- they need to be known/match between > compilation units, and even across languag

Re: [PATCH v1] libstdc++: Add _GLIBCXX_RESOLVE_LIB_DEFECTS for 4314 in .

2025-09-04 Thread Tomasz Kaminski
On Wed, Sep 3, 2025 at 6:03 PM Jonathan Wakely wrote: > On Wed, 3 Sept 2025 at 16:30, Luc Grosheintz > wrote: > > > > In r16-2328-g29d53f6213e0a1 we fixed a bug related to user-defined > > objects that can convert to an integer only via an rvalue reference. > > The same commit also implemented

Re: Do not auto-enable loop optimizations with autoFDO

2025-09-04 Thread Andi Kleen
Jan Hubicka writes: > With -O2 we automatically enable several loop optimizations with > -fprofile-use. > The rationale is that those optimizations are enabled only at -O3 mainly because they may > hurt performance or not pay back in code size when used blindly on all loops. > Profile feedback gives us data o

[PATCH v2 2/3] libstdc++: Move _Binder and related aliases to separate file.

2025-09-04 Thread Tomasz Kamiński
bits/binders.h is already mapped in libstdc++-v3/doc/doxygen/stdheader.cc. libstdc++-v3/ChangeLog: * include/Makefile.am: Add bits/binders.h * include/Makefile.in: Add bits/binders.h * include/std/functional (std::_Indexed_bound_arg, std::_Binder) (std::__make_boun

[PATCH v2 0/3] libstdc++: Reuse bind_back in ranges::_Partial

2025-09-04 Thread Tomasz Kamiński
v2: Moves change from std::indirect to std::__indirect to first patch, so second patch is plain move, without sneaky changes. Patch 3 is not modified. I haven't implemented Nathan's suggestion to use enum instead of bool; in my opinion this adds minuscule compile time for no user benefit. And we alr

[PATCH v4] Fix sanitizer attribute infrastructure to use standard TREE_LIST format [PR113264]

2025-09-04 Thread Kees Cook
The __attribute__((__copy__)) functionality was crashing when copying sanitizer-related attributes because these attributes violated the standard GCC attribute infrastructure by storing INTEGER_CST values directly instead of wrapping them in TREE_LIST like all other attributes. Wrap sanitizer attr

Re: [PATCH v3] Fix sanitizer attribute infrastructure to use standard TREE_LIST format [PR113264]

2025-09-04 Thread Kees Cook
On Tue, Aug 26, 2025 at 06:22:31PM +, Qing Zhao wrote: > Hi, Kees, > > > On Aug 26, 2025, at 13:25, Kees Cook wrote: > > > > The __attribute__((__copy__)) functionality was crashing when copying > > sanitizer-related attributes because these attributes violated the standard > > GCC attribute

Re: [PATCH v2 0/3] libstdc++: Reuse bind_back in ranges::_Partial

2025-09-04 Thread Jonathan Wakely
On Thu, 4 Sept 2025 at 10:04, Tomasz Kamiński wrote: > > v2: Moves change from std::indirect to std::__indirect to first patch, Ah, invoke ... I was confused what this had to do with std::indirect! :-) > so second patch is plain move, without sneaky changes. > Patch 3 is not modified. > > I have

Re: [PATCH v2] libstdc++: Conditionalize LWG 3569 changes to join_view

2025-09-04 Thread Jonathan Wakely
On Tue, 19 Aug 2025 at 16:24, Patrick Palka wrote: > > On Wed, 16 Jul 2025, Tomasz Kaminski wrote: > > > > > > > On Tue, Jul 15, 2025 at 6:13 PM Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk only > > (since it impacts ABI)? > > > > Changes i

Re: [PATCH] tree-optimization/121685 - accesses to *this are not trapping

2025-09-04 Thread Jonathan Wakely
On Thu, 4 Sept 2025 at 09:40, Richard Biener wrote: > > On Thu, Sep 4, 2025 at 10:27 AM Jonathan Wakely wrote: > > > > On Thu, 4 Sept 2025 at 08:26, Richard Biener wrote: > > > > > > On Wed, 3 Sep 2025, Jakub Jelinek wrote: > > > > > > > On Wed, Sep 03, 2025 at 06:41:08PM +0200, Richard Biener w

Re: [RFC PATCH v2] [AutoFDO] Source filename tracking in GCOV

2025-09-04 Thread Dhruv Chawla
On 18/08/25 13:55, dhr...@nvidia.com wrote: From: Dhruv Chawla This patch is a respin of the RFC originally posted at https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686835.html. The patch reads the file names from the GCOV file an

Re: [PATCH 1/2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-04 Thread Jakub Jelinek
On Thu, Sep 04, 2025 at 05:59:25PM +0100, Iain Sandoe wrote: > PR c++/119060 > > gcc/ChangeLog: > > * builtins.cc (expand_builtin): Handle BUILT_IN_OBSERVABLE_CHKPT. > * builtins.def (BUILT_IN_OBSERVABLE_CHKPT): New. > > gcc/c-family/ChangeLog: > > * c-common.cc: Add __b

Re: [PATCH] gcc: don't default to -gstatement-frontiers, -gvariable-location-views for DWARF

2025-09-04 Thread Richard Biener
On Thu, Sep 4, 2025 at 12:28 AM Jeff Law wrote: > > > > On 9/3/25 1:27 PM, Sam James wrote: > > Back in GCC 8, with r8-5241-g8697bf9f46f361 (-gstatement-frontiers), > > r8-6658-g58006663903200, r8-6657-gbd2b9f1e2d67ec > > (-gvariable-location-views), > > some advanced GNU extensions to DWARF were

[PATCH v2 1/3] libstdc++: Merge bind_front and bind_back binders

2025-09-04 Thread Tomasz Kamiński
The _Bind_front and _Bind_back class templates are now merged into a single _Binder implementation that accepts _Back as a template parameter. This makes the bind_back implementation available in C++20 mode, allowing it to be used for range adaptor closures. With zero bound arguments, bind_back an

Re: [PATCH 1/3]middle-end: clear the user unroll flag if the cost model has overriden it

2025-09-04 Thread Richard Sandiford
Richard Biener writes: > On Wed, 3 Sep 2025, Richard Sandiford wrote: > >> Tamar Christina writes: >> > We also don't ever force unrolling for predicated SVE because for >> > predicated SVE we have to balance predicate throughput limitations >> > of any given CPU. Having the user unroll factor b

Re: [PATCH] gcc: don't default to -gstatement-frontiers, -gvariable-location-views for DWARF

2025-09-04 Thread Sam James
Jakub Jelinek writes: > On Thu, Sep 04, 2025 at 09:58:14AM +0200, Richard Biener wrote: >> I'm in favor of disabling but I also fear the code will bitrot quickly if so? >> We might also want to have a set of testcases for the -fcompare-debug >> issues that are fixed by the change of defaults? >>

Re: [PATCH 09/14 v2] lto: Add toplevel assembly heuristics

2025-09-04 Thread Sam James
Sam James writes: > Michal Jires writes: > >> I did handle node->iterate_referring, but forgot cnode->callers. >> >> Only change are contents of the newly separated >> mark_symbol_referenced_from_asm > > Thanks, I'll try the new patch now. > > With the workaround I mentioned earlier, I managed t

[PATCH v2][PR119702] rs6000: Use vector addition when left shifting by 1

2025-09-04 Thread Avinash Jayakar
Hello, This is the second version of the patch proposed for master aiming to fix PR119702. I request the review of this patch. The following sequence of assembly in powerpc64le vspltisw 0,1 vsld 2,2,0 is replaced by this vaddudm 2,2,2 whenever there is a vector left shi
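The replacement described above rests on a simple identity: shifting an unsigned lane left by one bit gives the same result as adding the lane to itself, including on wraparound. A minimal scalar sketch of that identity follows; the function names and the two-lane layout are illustrative assumptions, not code from the patch.

```c
#include <stdint.h>

#define LANES 2  /* two 64-bit lanes, as in a 128-bit vector register */

/* Lane-wise left shift by one (the vspltisw + vsld form).  */
void shl1_lanes (const uint64_t in[LANES], uint64_t out[LANES])
{
  for (int i = 0; i < LANES; i++)
    out[i] = in[i] << 1;
}

/* Lane-wise self-addition (the vaddudm form); unsigned arithmetic wraps
   modulo 2^64, so this matches the shift for every input.  */
void add_self_lanes (const uint64_t in[LANES], uint64_t out[LANES])
{
  for (int i = 0; i < LANES; i++)
    out[i] = in[i] + in[i];
}
```

Both helpers produce identical lanes for any input, which is why a single add can stand in for the splat-plus-shift pair.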

[PATCH v3 1/2] libstdc++: Implement constant_wrapper, cw from P2781R9.

2025-09-04 Thread Luc Grosheintz
This is a partial implementation of P2781R9. It adds std::cw and std::constant_wrapper, but doesn't modify __integral_constant_like for span/mdspan. libstdc++-v3/ChangeLog: * include/bits/version.def (constant_wrapper): Add. * include/bits/version.h: Regenerate. * include/

[PATCH] TLC to vect_create_epilog_for_reduction

2025-09-04 Thread Richard Biener
The following removes back-and-forth of state in vect_create_epilog_for_reduction and code that's pointless, in particular around double reduction handling which isn't that special as it seems. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. * tree-vect-loop.cc (v

[PATCH 2/2] libstdc++: Implement P1494 and P3641 Partial program correctness [PR119060]

2025-09-04 Thread Iain Sandoe
The facility (with the original shorter name) has been in use on the contracts development branch for almost a year, and has been tested in isolation on x86_64-darwin and powerpc64le-linux. OK for trunk? thanks Iain --- 8< --- This implements the library parts of P1494 as amended by P3641. For G

[PATCH 1/2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-04 Thread Iain Sandoe
This patch (with the original shorter name) has been in use on the contracts development branch for almost a year, and has been tested in isolation on x86_64-darwin and powerpc64le-linux. OK for trunk? thanks Iain --- 8< --- The paper provides a mechanism that serves to demarcate epochs within the c

[PATCH 0/2] Implement P1494/P3641 Partial program correctness.

2025-09-04 Thread Iain Sandoe
This implements the "observable_checkpoint()" functionality that serves as an epoch delimiter in the source code, where it stipulates that any code before this (and the checkpoint) are considered observable. When code that would exhibit undefined behaviour if reached is seen then that can result i

Re: [PATCH 1/2] RISC-V: Add pattern for vector-scalar widening floating-point multiply

2025-09-04 Thread Paul-Antoine Arras
This is a slightly amended patch that fixes modes and instruction type attribute. Here is the relevant snippet: diff --git gcc/config/riscv/autovec-opt.md gcc/config/riscv/autovec-opt.md index d4335dc04ba..67f4d9ce3a8 100644 --- gcc/config/riscv/autovec-opt.md +++ gcc/config/riscv/autovec-opt.md

[PATCH 09/14 v2] lto: Add toplevel assembly heuristics

2025-09-04 Thread Michal Jires
I did handle node->iterate_referring, but forgot cnode->callers. Only change are contents of the newly separated mark_symbol_referenced_from_asm --- This new pass heuristically detects symbols referenced by toplevel assembly to prevent their optimization. Heuristics is done by comparing

Re: [PATCH v3 2/2] libstdc++: Adjust span/mdspan CTAD for P2781R9.

2025-09-04 Thread Tomasz Kaminski
On Thu, Sep 4, 2025 at 2:23 PM Luc Grosheintz wrote: > A usecase for P2781R9 is more ergonomic creation of span and mdspan with > mixed static and dynamic extents, e.g.: > > span(ptr, cw<3>) > extents(cw<3>, 5, cw<7>) > mdspan(ptr, cw<3>, 5, cw<7>) > > should be deduced as: > span

Re: [PATCH] tree-optimization: fabs(a+0.0) -> fabs(a) for non trapping case

2025-09-04 Thread Matteo Nicoli
Dear Richard, No, I don’t have access to gcc git. Anyway, I updated the patch description and attached here the new patch. Best regards, Matteo tree-optimization-121595.patch On Sep 2, 2025, at 1:57 PM, Richard Biener wrote: On Fri, Aug 29, 2025 at 2:20 PM Matteo Nicoli wrote

Re: [PATCH 07/14] lto: Stream out partitioned toplevel assembly

2025-09-04 Thread Jan Hubicka
> Toplevel assembly is now streamed as partitioned instead of into the > first partition. > > gcc/ChangeLog: > > * lto-cgraph.cc (output_symtab): Remove asm_nodes_out. > * lto-streamer-out.cc (lto_output_toplevel_asms): Use > partitioning. > (create_order_remap): Remove as

[PATCH] AArch64: Add isnan expander [PR 66462]

2025-09-04 Thread Wilco Dijkstra
Add an expander for isnan using integer arithmetic. Since isnan is just a compare, enable it only with -fsignaling-nans to avoid generating spurious exceptions. This fixes part of PR66462. int isnan1 (float x) { return __builtin_isnan (x); } Before: fcmp s0, s0 cset w0, v
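To make "isnan using integer arithmetic" concrete, here is a minimal source-level sketch. It is not the expander from the patch: the helper names are hypothetical and it assumes float is IEEE 754 binary32. A value is NaN exactly when, with the sign bit cleared, its bit pattern is greater than that of infinity (0x7f800000).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a float from a raw bit pattern.  */
float float_from_bits (uint32_t bits)
{
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* Hypothetical helper: NaN test using only integer operations, so no
   floating-point compare (and no FP exception) is involved.
   Assumes IEEE 754 binary32 representation for float.  */
int isnan_bits (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  /* Clear the sign bit; NaNs have an all-ones exponent and a non-zero
     mantissa, so they compare above the infinity pattern.  */
  return (bits & 0x7fffffffu) > 0x7f800000u;
}
```

Because the test never executes a floating-point compare, a signaling NaN input cannot raise an invalid-operation exception, which is the concern the message raises.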

Re: [PATCH 06/14] lto: Partition toplevel assembly in 1to1

2025-09-04 Thread Jan Hubicka
> 1to1 partitioning now also partitions toplevel assembly. > Other partitionings keep the old behavior of putting all > toplevel assembly into single partition. > > gcc/ChangeLog: > > * lto-cgraph.cc (compute_ltrans_boundary): Add asm_node. > > gcc/lto/ChangeLog: > > * lto-partition

Re: [PATCH] gcc: don't default to -gstatement-frontiers, -gvariable-location-views for DWARF

2025-09-04 Thread Jeff Law
On 9/4/25 1:58 AM, Richard Biener wrote: I'm in favor of disabling but I also fear the code will bitrot quickly if so? Quite likely. But given consumer support doesn't seem to be on the way for any reasonable time horizon, that may not be that big of a problem. We might also want to hav

Re: [PATCH 1/3] libstdc++: Merge bind_front and bind_back binders

2025-09-04 Thread Tomasz Kaminski
On Wed, Sep 3, 2025 at 11:20 PM Patrick Palka wrote: > On Wed, 3 Sep 2025, Tomasz Kamiński wrote: > > > The _Bind_front and _Bind_back class templates are now merged into a > single > > _Binder implementation that accepts _Back as a template parameter. This > makes > > the bind_back implementatio

Re: [PATCH 05/14] lto: Use toplevel_node in lto_symtab_encoder

2025-09-04 Thread Jan Hubicka
> This patch replaces symtab_node with toplevel_node in lto_symtab_encoder > and modifies all places where lto_symtab_encoder is used to handle > (ignore) asm_node. > > gcc/ChangeLog: > > * ipa-icf.cc (sem_item_optimizer::write_summary): Use > toplevel_node. > (sem_item_optimize

Re: [PATCH 3/3] ibstdc++: Reuse _Bind_back_t functor in ranges::_Partial

2025-09-04 Thread Tomasz Kaminski
On Wed, Sep 3, 2025 at 11:37 PM Patrick Palka wrote: > > On Wed, 3 Sep 2025, Tomasz Kamiński wrote: > > > This patch refactors ranges::_Partial to be implemented using > _Bind_back_t. > > This allows it to benefit from the changes in r16-3398-g250dd5b5604fbc, > > specifically making the closure t

Re: [PATCH 04/14] lto: Simplify control variable in loop of balanced partitioning

2025-09-04 Thread Jan Hubicka
> Minor simplification as preparation for next patch. > > gcc/lto/ChangeLog: > > * lto-partition.cc (lto_balanced_map): Simplify. OK, balanced partitioning needs either cleanups or obsoleting by the cached algorithm. It was really meant as quick and fast implementation of something that wor

Re: [PATCH 03/14] cgraph: Add toplevel_node

2025-09-04 Thread Jan Hubicka
> asm_node and symbol_node will now inherit from toplevel_node. > This is now useful for lto partitioning, in future it should be also > useful for toplevel extended assembly. > > gcc/ChangeLog: > > * cgraph.h (enum symtab_type): Replace with toplevel_type. > (enum toplevel_type): New

Re: [PATCH v3] libstdc++: Implement LWG4222 'expected' constructor from a single value missing a constraint

2025-09-04 Thread Jonathan Wakely
On Wed, 3 Sept 2025 at 19:31, Jonathan Wakely wrote: > > On Wed, 3 Sept 2025 at 19:26, Jonathan Wakely wrote: > > > > On Wed, 3 Sept 2025 at 19:09, Jonathan Wakely wrote: > > > > > > On Tue, 19 Aug 2025 at 16:17, Patrick Palka wrote: > > > > > > > > LGTM! Perhaps we want to backport this, not sur

Re: [PATCH 02/14] lto: Keep lto file data

2025-09-04 Thread Jan Hubicka
> We use lto_file_data in 1to1 partitioning, so we need to not zero it > out. Nothing depends on lto_file_data being NULL. > > gcc/ChangeLog: > > * cgraph.cc (cgraph_node::release_body): Keep lto_file_data. > (cgraph_node::remove): likewise. > * lto-section-in.cc (lto_free_funct

[PATCH] tree-optimization/121768 - bogus double reduction detected

2025-09-04 Thread Richard Biener
The following changes how we detect double reductions, in particular not setting vect_double_reduction_def on the outer PHIs when the inner loop doesn't satisfy double reduction constraints. It also simplifies the setup a bit by not having to detect whether we process an inner loop of a double redu

Re: [PATCH v2 0/3] libstdc++: Reuse bind_back in ranges::_Partial

2025-09-04 Thread Tomasz Kaminski
On Thu, Sep 4, 2025 at 2:49 PM Jonathan Wakely wrote: > On Thu, 4 Sept 2025 at 10:04, Tomasz Kamiński wrote: > > > > v2: Moves change from std::indirect to std::__indirect to first patch, > > Ah, invoke ... I was confused what this had to do with std::indirect! :-) > I have no idea why indirect

Re: [PATCH v1 1/4] RISC-V: Combine vec_duplicate + vmadd.vv to vmadd.vx on GR2VR cost

2025-09-04 Thread Robin Dapp
From: Pan Li This patch would like to combine the vec_duplicate + vmadd.vv to the vmadd.vx. From example as below code. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if t

[PATCH] libstdc++: Use _Drop_iter<_CharT> for formattable concept checking [PR121765]

2025-09-04 Thread Tomasz Kamiński
When producing output, the libstdc++ format implementation only uses _Sink_iter specializations. Since users cannot construct basic_format_context, this is the only iterator type actually used. The __format_padded helper relies on this property to efficiently pad sequences from tuples and ranges.

[committed] arm: wrong code from vset_lane_* [PR121775]

2025-09-04 Thread Richard Earnshaw
Insufficient validation of the operands in vec_set__internal means that the optimizers can transform the expanded code into something that is invalid. We then emit code based on the incorrect RTL assuming that it is still valid. A valid pattern can only have a single bit set in the immediate opera

[pushed] testsuite, darwin: Suppress unwind frames in scantest-lto.c.

2025-09-04 Thread Iain Sandoe
Part of what is needed to fix the BZ, tested on x86_64-darwin and powerpc64le-linux, pushed to trunk, thanks, Iain --- 8< --- Currently, for Darwin unwind and EH frames are emitted without use of .cfi_xxx instructions; the emitted frames also contain the string 'ascii'. For the purpose of this t

[PATCH] aarch64: Use SVE for V2DImode integer min/max operations

2025-09-04 Thread Kyrylo Tkachov
Hi all, Unlike Advanced SIMD, SVE has instructions to perform smin, smax, umin, umax on 64-bit elements. Thus, we can use them with the fixed-width V2DImode expander. Most of the machinery is already there on the define_insn side, supporting V2DImode operands of the SVE pattern. We just need to
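
For illustration, a loop of the following shape computes a signed maximum on 64-bit elements, which is the operation the message is about. This is a sketch, not a test case from the patch: the function name is hypothetical, and whether GCC vectorizes it with an SVE instruction depends on the target options in use and on the patch being applied.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: element-wise signed max of two int64_t arrays.
   Two adjacent 64-bit elements correspond to one V2DImode vector.  */
void
smax_i64 (int64_t *restrict dst, const int64_t *restrict a,
          const int64_t *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = a[i] > b[i] ? a[i] : b[i];
}
```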

Re: [PATCH 01/14] lto: Fix reversed sorting of node order.

2025-09-04 Thread Jan Hubicka
> Sorting by node order in lto partitioning is incorrectly reversed. > For default balanced partitioning this caused all noreorder symbols > to be partitioned into a single partition where they were sorted again, > but correctly. Yep, up to now the order was not really that important, but processi
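
A reversed sort of this kind usually comes down to the sign of a comparator. The following is a minimal sketch under that assumption; struct node and both comparators are hypothetical and are not taken from the LTO partitioning code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a symbol table node with an order field.  */
struct node
{
  int order;
};

/* Ascending comparator for qsort: lower order sorts first.  */
static int
cmp_order_ascending (const void *pa, const void *pb)
{
  const struct node *a = pa;
  const struct node *b = pb;
  return (a->order > b->order) - (a->order < b->order);
}

/* Swapping the arguments yields the reversed (descending) order.  */
static int
cmp_order_reversed (const void *pa, const void *pb)
{
  return cmp_order_ascending (pb, pa);
}
```

Passing the wrong one of these two to qsort is enough to flip the resulting order.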

tree-parloops: Enable runtime thread detection with -ftree-parallelize-loops=0

2025-09-04 Thread Sebastian Pop
This patch adds runtime thread count detection to auto-parallelization. The -ftree-parallelize-loops=0 option generates parallelized loops without specifying a fixed thread count, deferring this decision to program execution time where it is controlled by the OMP_NUM_THREADS environment variable. The

Fix ICE with auto-fdo and -fpartial-profiling

2025-09-04 Thread Jan Hubicka
Hi, with -fpartial-profiling we ICE building perlbench and gcc from spec2k17 since afdo_annotate_cfg applies knowledge about zero profiles too early. This patch moves it after the early exit when profile is 0 everywhere and also fixes a formatting issue in the next block. Bootstrapped/regtested x86_64

Re: [Patch, fortran] PR87362 - [PDT] ICE on variable declaration with undefined PDT parameter

2025-09-04 Thread Jerry D
On 9/4/25 6:22 AM, Paul Richard Thomas wrote: Hi All, Although PR87362 is marked as fixed, the error becomes rather more explicit with this patch, which I actually developed for PR102457. Regtests on FC42/x86_64 - OK for mainline Paul Yes, OK Swing Away. Jerry

Re: [PATCHv8] libstdc++: Add NTTP bind_front, -back, not_fn (P2714) [PR119744]

2025-09-04 Thread Jonathan Wakely
On Fri, 01 Aug 2025 at 19:01 -0400, Patrick Palka wrote: Three small review comments below: On Fri, 1 Aug 2025, Nathan Myers wrote: Changes in v8: * Adjust template indentation to match rest of file * Change std Mandates conditions from "requires" to static_asserts. * Make _Bind_fn_t defini

Re: [PATCH] vect: Set prolog bound to 0 for VLA alignment [PR121523].

2025-09-04 Thread Richard Biener
On Thu, Sep 4, 2025 at 9:05 AM Robin Dapp wrote: > > > Given we use a poly_int64 for bound_epilog elsewhere now the best thing to > > do would be to have a poly_int64 for bound_prolog as well. For the scaling > > we'd use estimated_poly_value (align_in_elems) then (I guess for alignment > > the d

Re: [PATCH 09/14 v2] lto: Add toplevel assembly heuristics

2025-09-04 Thread Sam James
Michal Jires writes: > I did handle node->iterate_referring, but forgot cnode->callers. > > Only change are contents of the newly separated > mark_symbol_referenced_from_asm Thanks, I'll try the new patch now. With the workaround I mentioned earlier, I managed to build but got this when booting

Re: [PATCH v5 2/2] RISC-V: Allow VLS types using up to LMUL 8

2025-09-04 Thread Robin Dapp
The layout will be different between VLEN=128 and VLEN=256 (and also any larger VLEN). To give a practical example: with vec1 allocated into v8 and v9, the register layout will be: VLEN=128 v8 = [0, 1, 2, 3] v9 = [4, 5, 6, 7] VLEN=256 v8 = [0, 1, 2, 3, 4, 5, 6, 7] v9 = [?, ?, ?, ?, ?, ?, ?, ?] Then you c
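
The layout in the example above can be reproduced with a small calculation. This is a sketch only: element_position and struct lane_pos are hypothetical names, and a 32-bit element width is assumed because the example shows four elements per 128-bit register.

```c
/* Hypothetical description of where one vector element lives.  */
struct lane_pos
{
  unsigned reg;   /* vector register number */
  unsigned lane;  /* lane index within that register */
};

/* Map element INDEX of a vector starting at BASE_REG to a register
   and lane, given the register width VLEN_BITS and the element
   width SEW_BITS (assumed to divide VLEN_BITS).  */
static struct lane_pos
element_position (unsigned base_reg, unsigned vlen_bits,
                  unsigned sew_bits, unsigned index)
{
  unsigned per_reg = vlen_bits / sew_bits;
  struct lane_pos pos = { base_reg + index / per_reg, index % per_reg };
  return pos;
}
```

With a base register of v8 and 32-bit elements, element 5 lands in v9 lane 1 for VLEN=128 but in v8 lane 5 for VLEN=256, which matches the two layouts quoted in the message.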

[PATCH] doc: Document missing isinf optab [PR 101852]

2025-09-04 Thread Wilco Dijkstra
Document missing isinf optab. gcc: PR middle-end/101852 * doc/md.texi: Document isinf optab. --- diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 973c0dd302964966a91fa8dbab85930d6dbeec9e..a9c3354891551101d25ba2e6656711dbd9c5dd09 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/m

[WIP] C++ vs. -ftrivial-auto-var-init=

2025-09-04 Thread Jakub Jelinek
On Thu, Sep 04, 2025 at 02:45:16PM +0200, Jakub Jelinek via Gcc wrote: > On Wed, Sep 03, 2025 at 03:38:53PM +0200, Jakub Jelinek via Gcc wrote: > > But there is one thing the paper doesn't care about, which looks like a show > > stopper to me, in particular the stuff -Wtrivial-auto-var-init warning

Re: [PATCH] RISC-V: Add pattern for vector-scalar single-width floating-point add

2025-09-04 Thread Paul-Antoine Arras
Here is an updated patch that fixes scan dumps in the testsuite: diff --git gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/floating-point-add-2.c index 042dd0d5ccc..00b9222e765 100644 --- gcc/testsuite/gcc.target/riscv/rvv/a

Re: [RFC PATCH 3/7] kcfi: Add core Kernel Control Flow Integrity infrastructure

2025-09-04 Thread Qing Zhao
> On Sep 4, 2025, at 00:24, Kees Cook wrote: > >> At the same time, A wrapper type is created for the original function type, >> whose typename is >> “__kcfi_wrapper_type_id”. >> >> I am confused: >> >> 1. Why the additional wrapper type is needed? why the original function >> type + “kcf

Re: [PATCH v3 1/2] libstdc++: Implement constant_wrapper, cw from P2781R9.

2025-09-04 Thread Tomasz Kaminski
On Thu, Sep 4, 2025 at 2:20 PM Luc Grosheintz wrote: > This is a partial implementation of P2781R9. It adds std::cw and > std::constant_wrapper, but doesn't modify __integral_constant_like for > span/mdspan. > > libstdc++-v3/ChangeLog: > > * include/bits/version.def (constant_wrapper): Ad

Re: [PATCH v3 1/2] libstdc++: Implement constant_wrapper, cw from P2781R9.

2025-09-04 Thread Luc Grosheintz
On 9/4/25 2:20 PM, Luc Grosheintz wrote: This is a partial implementation of P2781R9. It adds std::cw and std::constant_wrapper, but doesn't modify __integral_constant_like for span/mdspan. libstdc++-v3/ChangeLog: * include/bits/version.def (constant_wrapper): Add. * include/

[Patch, fortran] PR87362 - [PDT] ICE on variable declaration with undefined PDT parameter

2025-09-04 Thread Paul Richard Thomas
Hi All, Although PR87362 is marked as fixed, the error becomes rather more explicit with this patch, which I actually developed for PR102457. Regtests on FC42/x86_64 - OK for mainline Paul Change.Logs Description: Binary data diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc index 1e91b57

RE: [PATCH v1 1/4] RISC-V: Combine vec_duplicate + vmadd.vv to vmadd.vx on GR2VR cost

2025-09-04 Thread Li, Pan2
> It's not really a big deal but at least a bit surprising. If it's just that > then pre-approved. Thanks Robin, I will rename it and commit if no surprise from test. Pan -Original Message- From: Robin Dapp Sent: Thursday, September 4, 2025 8:34 PM To: Li, Pan2 ; Robin Dapp ; gcc-pa

Re: [PATCH v1 1/4] RISC-V: Combine vec_duplicate + vmadd.vv to vmadd.vx on GR2VR cost

2025-09-04 Thread Robin Dapp
So before we had vmacc_vx and now madd can be included. Is this somehow different to pred_mul_plus (without vx)? "mul then plus" sounds like there is some operand order that differs from the regular order but the multiplication is always first IIRC? The difference is just which operand is bei
