Re: [PATCH v5] [aarch64] Make better use of overflowing operations in max/min(a, add/sub(a, b)) [PR116815]

2025-11-04 Thread Uros Bizjak
On Wed, Nov 5, 2025 at 6:09 AM Dhruv Chawla wrote: > > On 28/10/25 22:03, Alex Coplan wrote: > > > > Hi Dhruv, > > > > Sorry for the long wait on this one. Comments below ... > > Hi Alex, > > Thanks for the review. I have attached a

PING: [PATCH] Add TARGET_VOLATILE_MEM_OK_IN_INSN

2025-11-04 Thread H.J. Lu
On Tue, Oct 28, 2025 at 10:31 AM H.J. Lu wrote: > > On Tue, Oct 28, 2025 at 5:53 AM H.J. Lu wrote: > > > > On Tue, Oct 28, 2025 at 12:04 AM Jeff Law wrote: > > > > > > > > > > > > On 10/27/25 2:37 AM, Richard Biener wrote: > > > ? > > > >> > > > >> There are 2 reasons: > > > >> > > > >> 1. RISC

[PATCH v5] [aarch64] Make better use of overflowing operations in max/min(a, add/sub(a, b)) [PR116815]

2025-11-04 Thread Dhruv Chawla
On 28/10/25 22:03, Alex Coplan wrote: Hi Dhruv, Sorry for the long wait on this one. Comments below ... Hi Alex, Thanks for the review. I have attached a new version of the patch to this email. On 18/08/2025 21:31, [email protected]

[PATCH v3] match.pd: Fold (y << x) <rel> x -> 0 or 1

2025-11-04 Thread Dhruv Chawla
On 27/08/25 18:27, Richard Biener wrote: On Mon, 25 Aug 2025, [email protected] wrote: From: Dhruv Chawla For ==, < and <=, the fold is to 0. For !=, > and >=, the fold is to 1. This only applies when C != 0. So -50 << 1 < 1 is true

[PING^2][RFC PATCH v4 0/3] Source filename tracking in GCOV

2025-11-04 Thread Dhruv Chawla
On 27/10/25 12:04, Dhruv Chawla wrote: On 16/10/25 13:53, [email protected] wrote: From: Dhruv Chawla This patch series is a respin of the RFC originally posted at https://gcc.

Re:[pushed] [PATCH] LoongArch: Avoid unnecessary zero-initialization using LSX for scalar popcount

2025-11-04 Thread Lulu Cheng
Pushed to r16-5036. I'm so sorry it took so long to merge. Thanks! On 2025/2/22 at 3:34 PM, Xi Ruoyao wrote: Now for __builtin_popcountl we are getting things like vrepli.b $vr0,0 vinsgr2vr.d $vr0,$r4,0 vpcnt.d $vr0,$vr0 vpickve2gr.du $r4,$vr0,0 sl

[PATCH] LoongArch: When loading an immediate value, promote mode to word_mode.

2025-11-04 Thread Lulu Cheng
This optimization can eliminate redundant immediate load instructions during CSE optimization. gcc/ChangeLog: * config/loongarch/loongarch.cc (loongarch_legitimize_move): Optimize. gcc/testsuite/ChangeLog: * gcc.target/loongarch/sign-extend-6.c: New test. --- gcc/confi

[PATCH 1/2] LoongArch: Implement sge and sgeu.

2025-11-04 Thread Lulu Cheng
The original implementation of the function loongarch_extend_comparands only prevented op1 from being loaded into the register when op1 was const0_rtx. It has now been modified so that op1 is not loaded into the register as long as op1 is an immediate value. This allows slt{u}i to be generated in

[PATCH 2/2] LoongArch: Redundant sign extension instruction optimization.

2025-11-04 Thread Lulu Cheng
When the mode of the destination operand selected by the condition is SImode, explicit sign extension is applied to both selected source operands, and the destination operand is marked as sign-extended. This method can eliminate some of the sign extension instructions caused by conditional selecti

Re: [to-be-committed][RISC-V][SH][PR rtl-optimization/67731] Improve logical IOR of single bit bitfields

2025-11-04 Thread Oleg Endo
On Sat, 2025-11-01 at 10:34 -0600, Jeff Law wrote: > This is Shreya's work except for the SH testcase which I added after > realizing her work would also fix the testcases for that port. I > bootstrapped and regression tested this on sh4-linux-gnu, x86_64 & > risc-v. It also was tested acro

Re: [PATCH] cobol: Implement the XML PARSE statement; implement POSIX

2025-11-04 Thread Joseph Myers
On Tue, 4 Nov 2025, James K. Lowden wrote: > Maybe you're just saying that the test comes late, so the user won't > find out until late. Technology, right? Yes, it comes late (in libgcobol/configure not gcc/configure). -- Joseph S. Myers [email protected]

[to-be-committed][RISC-V][PR 121136] Improve various tests which only need to examine upper bits in a GPR

2025-11-04 Thread Jeff Law
This addresses the first level issues seen in generating better performing code for testcases derived from pr121136. It likely regresses code size in some cases as in many cases it selects code sequences that should be better performing, though larger to encode. Improving -Os code generation

Re: [PATCH] cobol: Implement the XML PARSE statement; implement POSIX

2025-11-04 Thread James K. Lowden
> On Thu, 23 Oct 2025, Siddhesh Poyarekar wrote: > > > IMO the best way forward here would be adding a configure check for > > libxml2 specifically for libgcobol (and maybe gcc/cobol if the > > frontend needs it too). Siddhesh, thanks. In the specific, you're talking about AC_SEARCH_LI

Re: [PATCH] vect: Reduce group size of consecutive strided accesses.

2025-11-04 Thread Robin Dapp
> This seems a bit "dangerous" to do early. In fact your changes below ... > >> + } >>gap = DR_GROUP_GAP (first_stmt_info); >>single_element_p = (stmt_info == first_stmt_info >> && !DR_GROUP_NEXT_ELEMENT (stmt_info)); >> @@ -9876,7 +9912,14 @@ vector

Re: [RFC 3/9] Implement recording/getting of mask/length for BB SLP

2025-11-04 Thread Christopher Bazley
On 28/10/2025 13:29, Richard Biener wrote: +/* Materialize mask number INDEX for a group of scalar stmts in SLP_NODE that + operate on NVECTORS vectors of type VECTYPE, where 0 <= INDEX < NVECTORS. + Masking is only required for the tail, therefore NULL_TREE is returned for + every value of

Re: [Patch, fortran] PR122501 and 122524 - PDT constructors in ASSOCIATE blocks.

2025-11-04 Thread Jerry D
On 11/4/25 6:23 AM, Paul Richard Thomas wrote: Hi All, It turned out that attempting to pick out specific interfaces for PDT constructors in primary.cc was way too early. This caused a problem in ASSOCIATE blocks simply because the associate name and its selector are not usable until resolution.

[PATCH, committed] Fortran: fix frontend memleak with DO CONCURRENT [PR122564]

2025-11-04 Thread Harald Anlauf
Dear All, I pushed the attached fix for a frontend memleak as obvious after regtesting on x86_64-pc-linux-gnu as r16-5032-g4cad566793d0a2 . The PR also mentions more valgrind issues suggesting lacking initialization of C++ stuff which I have no idea how to fix. Thanks, Harald From 4cad566793d0

[Ada] Fix incorrect legality check in instantiation of child generic unit

2025-11-04 Thread Eric Botcazou
The problem arises when the generic unit has a formal access type parameter, because the manual resolution implemented in Find_Actual_Type does not pick the correct entity for the designated type. The fix replaces it with a bona fide resolution and cleans up the associated code in the callers.

[Ada] Fix explicit raise on subtype of lock-free protected type

2025-11-04 Thread Eric Botcazou
The problem is that the Uses_Lock_Free flag is not propagated to the subtype. Tested on x86-64/Linux, applied on the mainline. 2025-11-04 Eric Botcazou PR ada/84320 * sem_ch3.adb (Analyze_Subtype_Declaration) : Propagate the Uses_Lock_Free flag for protected types.

Re: [PATCH v4 1/4] gcc/: Rename warn_parm_array_mismatch() => warn_parms_array_mismatch()

2025-11-04 Thread Joseph Myers
On Tue, 4 Nov 2025, Alejandro Colomar wrote: > Hi Joseph, > > On Mon, Nov 03, 2025 at 08:59:11PM +, Joseph Myers wrote: > > On Sun, 2 Nov 2025, Alejandro Colomar wrote: > > > > > This function acts on entire parameter declaration lists, and iterates > > > over them. Use plural in the name,

Re: [PATCH] c++/modules: Allow ignoring some TU-local exposure errors in GMF [PR121574]

2025-11-04 Thread Jason Merrill
On 11/4/25 8:50 AM, Nathaniel Shead wrote: On Fri, Oct 31, 2025 at 08:56:30AM +0300, Jason Merrill wrote: On 10/30/25 3:00 PM, Nathaniel Shead wrote: One unfortunate side effect of this is that even with -pedantic-errors, unless the user specifies '-Wtemplate-names-tu-local' when building the m

Re: [PATCH] c++/modules: Allow ignoring some TU-local exposure errors in GMF [PR121574]

2025-11-04 Thread Jason Merrill
On 10/30/25 3:00 PM, Nathaniel Shead wrote: One unfortunate side effect of this is that even with -pedantic-errors, unless the user specifies '-Wtemplate-names-tu-local' when building the module interface there will be no diagnostic at all from instantiating a template that exposes global TU-loca

[PATCH v6 4/7] x86: Add x86_64 Kernel Control Flow Integrity implementation

2025-11-04 Thread Kees Cook
Implement x86_64-specific KCFI backend: - Implies -mindirect-branch-register since KCFI needs call target in a register for typeid hash loading. - Function preamble generation with type IDs positioned at -(4+prefix_nops) offset from function entry point. - Function-aligned KCFI preambles usi

[PATCH v6 5/7] aarch64: Add AArch64 Kernel Control Flow Integrity implementation

2025-11-04 Thread Kees Cook
Implement AArch64-specific KCFI backend. - Trap debugging through ESR (Exception Syndrome Register) encoding in BRK instruction immediate values. - Scratch register allocation using w16/w17 (x16/x17) following AArch64 procedure call standard for intra-procedure-call registers, which already

[PATCH v6 6/7] arm: Add ARM 32-bit Kernel Control Flow Integrity implementation

2025-11-04 Thread Kees Cook
Implement ARM 32-bit KCFI backend: - Use eor instructions for 32-bit immediate loading. - Trap debugging through UDF instruction immediate encoding following AArch64 BRK pattern for encoding registers with useful contents. - Scratch register allocation uses ip by default since it is most com

[PATCH v6 1/7] typeinfo: Introduce KCFI typeinfo mangling API

2025-11-04 Thread Kees Cook
To support the KCFI typeid and future type-based allocators, which need to convert unique types into unique 32-bit values, add a mangling system based on the Itanium C++ mangling ABI, adapted for C types. Introduce __builtin_typeinfo_hash for the hash, and __builtin_typeinfo_name for testing and de

[PATCH v6 2/7] kcfi: Add core Kernel Control Flow Integrity infrastructure

2025-11-04 Thread Kees Cook
Implements the Linux Kernel Control Flow Integrity ABI, which provides a function prototype based forward edge control flow integrity protection by instrumenting every indirect call to check for a hash value before the target function address. If the hash at the call site and the hash at the target

[PATCH v6 3/7] kcfi: Add regression test suite

2025-11-04 Thread Kees Cook
Add test suite for KCFI (Kernel Control Flow Integrity) ABI, covering core functionality, optimization and code generation, addressing, architecture-specific KCFI sequence emission, and integration with patchable function entry. The arch-specific patterns themselves are added with the subsequent a

[PATCH v6 7/7] riscv: Add RISC-V Kernel Control Flow Integrity implementation

2025-11-04 Thread Kees Cook
Implement RISC-V-specific KCFI backend. Nothing is conceptually rv64 specific, but using an alternative set of instructions for rv32 would be needed, and at present the only user of KCFI on riscv is the rv64 build of the Linux kernel. - Scratch register allocation using t1/t2 (x6/x7) following RIS

[PATCH v6 0/7] Introduce Kernel Control Flow Integrity ABI [PR107048]

2025-11-04 Thread Kees Cook
[Added Uros Bizjak to CC for x86 backend review, I'd only added x86_64 maintainers before, not i386] Hi, This series implements[1][2] the Linux Kernel Control Flow Integrity ABI, which provides a function prototype based forward edge control flow integrity protection by instrumenting every indire

Re: Cleanup max of profile_count

2025-11-04 Thread Dimitar Dimitrov
On Wed, Oct 15, 2025 at 11:11:23AM +0200, Jan Hubicka wrote: > Hi, ... > diff --git a/gcc/bb-reorder.cc b/gcc/bb-reorder.cc > index 641b4928ffb..e4efdee0b16 100644 > --- a/gcc/bb-reorder.cc > +++ b/gcc/bb-reorder.cc > @@ -2389,8 +2389,10 @@ edge_order (const void *ve1, const void *ve2) >/* Sinc

[committed] i386: TEST insn should be merged with ADC/SBB insn [PR122390]

2025-11-04 Thread Uros Bizjak
The attached testcase is currently compiled to: f1: cmpl %esi, %edi adcl %esi, %edi testl %edi, %edi js .L4 ... TEST insn should be merged with ADC/SBB insn. The patch provides missing combined insn patterns. PR target/122390 gcc/ChangeLog:

[PATCH] vrp: Infer ranges from loads from constant aggregates with initializers (v2)

2025-11-04 Thread Martin Jambor
Hi, this patch adds the ability to infer ranges from loads from global constant static aggregates which have static initializers. Even when the load has one or more ARRAY_REFs with an unknown index and thus we do not know the particular constant that is being loaded, we can traverse the correpond

[patch,avr] Make attribute "retain" work

2025-11-04 Thread Georg-Johann Lay
Currently, attribute "retain" is ignored since it avoids some quirks in crtstuff -- which avr doesn't even use. This renders attribute "used" pretty much useless: A function will survive till asm, but without the "R" section flag, the linker will kill the code with --gc-sections. defaults.h requ

Re: [PATCH 1/8] builtin: Add builtin types and function declarations for integer atomic fetch min/max

2025-11-04 Thread Jakub Jelinek
On Tue, Nov 04, 2025 at 02:54:44PM +, Wilco Dijkstra wrote: > > There are 2 options. > > Lower the type-generic builtin into a CAS loop and pattern recognize it at > > some late time (e.g. the widening_mul pass, certainly after IPA) into an IFN > > if the corresponding optab is supported. > > O
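For readers following the CAS-loop option discussed in this message, here is a minimal C11 sketch of what such a lowering might look like for a signed 32-bit fetch-min. The name `fetch_min_i32` is hypothetical and is not one of the builtins proposed in the series; the sketch also assumes sequentially consistent ordering, which the thread does not specify.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch series: atomically store
   min (*PTR, VAL) into *PTR and return the previous value, using a
   compare-and-swap loop.  */
static int32_t
fetch_min_i32 (_Atomic int32_t *ptr, int32_t val)
{
  int32_t old = atomic_load (ptr);
  /* Retry until the exchange succeeds or the stored value is already
     no larger than VAL.  On failure, OLD is refreshed with the current
     contents of *PTR.  */
  while (old > val
         && !atomic_compare_exchange_weak (ptr, &old, val))
    ;
  return old;
}
```

A pattern-recognition pass of the kind described in the message would have to match a loop of roughly this shape; the exact form it would look for is not stated in the preview.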

[PATCH] forwprop: allow subvectors in simplify_vector_constructor ()

2025-11-04 Thread Artemiy Volkov
This is an attempt to fix https://gcc.gnu.org/pipermail/gcc-patches/2025-October/697879.html in the middle-end; the motivation in that patch was to teach gcc to compile: int16x8_t foo (int16x8_t x) { return vcombine_s16 (vget_high_s16 (x), vget_low_s16 (x)); } into one instruction: foo:

Re: [PATCH] [RFC][v2] extra SSA immediate use iterator checking

2025-11-04 Thread Richard Biener
On Tue, 4 Nov 2025, Andrew MacLeod wrote: > > On 11/4/25 08:43, Richard Biener wrote: > > The following implements additional checking around > > SSA immediate use iteration. Specifically this prevents > > > > - any nesting of FOR_EACH_IMM_USE_STMT inside another iteration > > via FOR_EACH

[PATCH] vect: Relax gather/scatter scale handling.

2025-11-04 Thread Robin Dapp
Hi, Similar to the signed/unsigned patch before, this one relaxes the gather/scatter restrictions on scale factors. The basic idea is that a natively unsupported scale factor can still be reached by emitting a multiplication before the actual gather operation. As before, we need to make sure that
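The idea of reaching an unsupported scale through a prior multiplication can be illustrated with a scalar model in C. This is only a sketch under stated assumptions: `gather_scaled` and `gather_via_premultiply` are hypothetical stand-ins for a target gather that takes a byte scale, not vectorizer interfaces, and the multiplication is assumed not to overflow the offset type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a gather: for each lane I, load a 32-bit
   element from BASE + OFFSETS[I] * SCALE (in bytes).  */
static void
gather_scaled (int32_t *dst, const unsigned char *base,
               const uint32_t *offsets, uint32_t scale, size_t n)
{
  for (size_t i = 0; i < n; i++)
    memcpy (&dst[i], base + (size_t) offsets[i] * scale, sizeof (int32_t));
}

/* If only SCALE == 1 were supported, the same addresses could be reached
   by multiplying the offsets first.  Assumes OFFSETS[I] * SCALE does not
   overflow uint32_t.  */
static void
gather_via_premultiply (int32_t *dst, const unsigned char *base,
                        const uint32_t *offsets, uint32_t scale, size_t n)
{
  uint32_t scaled[16];          /* Sketch only: N must be at most 16.  */
  for (size_t i = 0; i < n; i++)
    scaled[i] = offsets[i] * scale;
  gather_scaled (dst, base, scaled, 1, n);
}
```

Both functions read the same elements whenever the pre-multiplied offsets fit in the offset type.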

Re: [PATCH 1/8] builtin: Add builtin types and function declarations for integer atomic fetch min/max

2025-11-04 Thread Wilco Dijkstra
Hi Jakub/Matthew, >> 1) Should we still implement the libatomic functions?  And still with the >> unsigned/signed distinction and all sizes? >> - I'd expect so, mostly for the `-fno-inline-atomics` flag. > > Do you really need that?  Can't you just emit a CAS loop for the non-inline > atomics?  Be

Re: [PATCH] OpenMP/Fortran: Rebind labels after metadirective body [PR122369]

2025-11-04 Thread Paul-Antoine Arras
On 04/11/2025 13:21, Tobias Burnus wrote: Hi PA, Paul-Antoine Arras wrote: What about the attached testcase? Why not? However, I was more thinking of defining two format labels or two branch targets with the same value, which gives a different error. But also mixing branch target and format

[PATCH] Add 'num_children' method to relevant pretty-printers

2025-11-04 Thread Tom Tromey
A user pointed out that, in DAP mode, gdb would hang while trying to display a certain vector. See https://sourceware.org/bugzilla/show_bug.cgi?id=33594 This is caused by a combination of things: the vector is uninitialized, DAP requires a count of the number of children of a variable, and l

Re: [PATCH] [RFC][v2] extra SSA immediate use iterator checking

2025-11-04 Thread Andrew MacLeod
On 11/4/25 08:43, Richard Biener wrote: The following implements additional checking around SSA immediate use iteration. Specifically this prevents - any nesting of FOR_EACH_IMM_USE_STMT inside another iteration via FOR_EACH_IMM_USE_STMT or FOR_EACH_IMM_USE_FAST when iterating on th

[Patch, fortran] PR122501 and 122524 - PDT constructors in ASSOCIATE blocks.

2025-11-04 Thread Paul Richard Thomas
Hi All, It turned out that attempting to pick out specific interfaces for PDT constructors in primary.cc was way too early. This caused a problem in ASSOCIATE blocks simply because the associate name and its selector are not usable until resolution. This patch detects the presence of more than on

[PATCH][PR120375] arc: emit clobber of CC for -mcpu=em x >> 31

2025-11-04 Thread Luis Silva
From: Loeka Rogge Address PR target/120375 Devices without a barrel shifter end up using a sequence of instructions. These can use the condition codes and/or loop count register, so those need to be marked as 'clobbered'. These clobbers were previously added only after split1, which is too late.

Re: [RFC 3/9] Implement recording/getting of mask/length for BB SLP

2025-11-04 Thread Christopher Bazley
On 04/11/2025 13:57, Christopher Bazley wrote: On 28/10/2025 13:29, Richard Biener wrote: Isn't SLP_TREE_CAN_USE_PARTIAL_VECTORS_P redundant given SLP_TREE_CAN_USE_MASK_P || SLP_TREE_CAN_USE_LEN_P should be exactly this? SLP_TREE_CAN_USE_PARTIAL_VECTORS_P might be sth for an SLP instance (o

Re: [RFC 3/9] Implement recording/getting of mask/length for BB SLP

2025-11-04 Thread Christopher Bazley
On 28/10/2025 13:29, Richard Biener wrote: Isn't SLP_TREE_CAN_USE_PARTIAL_VECTORS_P redundant given SLP_TREE_CAN_USE_MASK_P || SLP_TREE_CAN_USE_LEN_P should be exactly this? SLP_TREE_CAN_USE_PARTIAL_VECTORS_P might be sth for an SLP instance (or a subgraph with multiple entries (instances)) if w

[PATCH v1] aarch64: Add support for FEAT_SSVE_BitPerm

2025-11-04 Thread Alfie Richards
Hi All, Requires my patch updating the AArch64 cli arch options (https://gcc.gnu.org/pipermail/gcc-patches/2025-November/699488.html). Reg tested on AArch64. Okay for master after prereq lands? Alfie -- >8 -- Adds support for the FEAT_SSVE_BitPerm AArch64 extension. FEAT_SSVE_BitPerm makes t

[PATCH] [RFC][v2] extra SSA immediate use iterator checking

2025-11-04 Thread Richard Biener
The following implements additional checking around SSA immediate use iteration. Specifically this prevents - any nesting of FOR_EACH_IMM_USE_STMT inside another iteration via FOR_EACH_IMM_USE_STMT or FOR_EACH_IMM_USE_FAST when iterating on the same SSA name - modification (for now unlin

Re: [PATCH] OpenMP/Fortran: Rebind labels after metadirective body [PR122369]

2025-11-04 Thread Tobias Burnus
Hi PA, Paul-Antoine Arras wrote: What about the attached testcase? Why not? However, I was more thinking of defining two format labels or two branch targets with the same value, which gives a different error. But also mixing branch target and format label works and in terms of metadirectives

Re: [PATCH] fortran: support .NIL. in conditional arguments

2025-11-04 Thread Tobias Burnus
Hi Yuao, Yuao Ma wrote: === @@ -6709,0 +6733,6 @@ conv_dummy_value (gfc_se * parmse, gfc_expr * e, gfc_symbol * fsym, + else if (e->expr_type == EXPR_CONDITIONAL) + { + gcc_assert (parmse && TREE_CODE (parmse->expr) == COND_EXPR); + tree c

Re: [PATCH V2] RISC-V: Add Andes 25 series pipeline description.

2025-11-04 Thread Kito Cheng
On Tue, Nov 4, 2025 at 8:57 PM Robin Dapp wrote: > > > Sifive core has that optimization for part of the cores like x280, but not > > for p470/p670, and seems like Tenstorrent Ascalon also doing that > > optimization as well? (they set that on both LLVM and GCC). > > Does having that optimization

Re: [PATCH V2] RISC-V: Add Andes 25 series pipeline description.

2025-11-04 Thread Robin Dapp
> Sifive core has that optimization for part of the cores like x280, but not > for p470/p670, and seems like Tenstorrent Ascalon also doing that > optimization as well? (they set that on both LLVM and GCC). Does having that optimization imply that it is indeed as fast or faster than a scalar load

Re: [PATCH V2] RISC-V: Add Andes 25 series pipeline description.

2025-11-04 Thread Kito Cheng
On Tue, Nov 4, 2025 at 15:36, Robin Dapp wrote: > > On 11/3/25 6:06 PM, KuanLin Chen wrote: > >> I'll rename it in the next version. > >> I'm curious why use_zero_stride_load should be 'false'. It seems to be > >> the trigger of 'define_insn_and_split > >> "*pred_strided_broadcast"'. > >> I would appreciate i

Re: [PATCH 1/8] builtin: Add builtin types and function declarations for integer atomic fetch min/max

2025-11-04 Thread Jakub Jelinek
On Tue, Nov 04, 2025 at 11:33:05AM +, Matthew Malcomson wrote: > We've been meaning to send this email for a while after the Cauldron. > IIUC you discussed this with Ramana there -- my understanding of what he > told me is that your main concern is with the explosion of builtins. Yeah. > Simi

[PATCH v4 3/3] AArch64: Update test to reflect new message

2025-11-04 Thread Tejas Belagod
Update test error message as svbool_t is now treated as a GNU vector. gcc/testsuite/ChangeLog * gcc.target/aarch64/sve/acle/general-c/svcount_1.c: Update message. --- gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/svcount_1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) di

[PATCH v4 0/3] AArch64: Support C/C++ operations on svbool_t

2025-11-04 Thread Tejas Belagod
Hi, Thanks for all the reviews so far. Here is v4 of the patch series: https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696741.html This incorporates review comments from Tamar, Jason and Jakub. I'll be doing Tamar's EQ -> NE optimisation suggestion in a follow-up patch. Tested and b

[PATCH v4 1/3] AArch64: Support C/C++ operations on svbool_t

2025-11-04 Thread Tejas Belagod
Support a subset of C/C++ operations (bitwise, conditional etc.) on svbool_t. gcc/ChangeLog: * c-family/c-common.cc (c_build_vec_convert): Support vector boolean types for __builtin_convertvector (). * c/c-typeck.cc (build_binary_op): Support vector boolean types.

[PATCH v4 2/3] AArch64: Update existing test with svbool_t operations

2025-11-04 Thread Tejas Belagod
Update existing compile test with tests to cover C/C++ operations on svbool_t type objects. gcc/testsuite/ChangeLog: * g++.dg/ext/sve-sizeless-1.C: Add new tests. * g++.dg/ext/sve-sizeless-2.C: Add new tests. * g++.target/aarch64/sve/acle/general-c++/gnu_vectors_1.C: Add n

Re: [PATCH 03/14] cgraph: Add toplevel_node

2025-11-04 Thread Michal Jireš
On 10/31/25 10:05 PM, Andrew Pinski wrote: On Wed, Aug 27, 2025 at 6:56 AM Michal Jires wrote: asm_node and symbol_node will now inherit from toplevel_node. This is now useful for lto partitioning, in future it should be also useful for toplevel extended assembly. gcc/ChangeLog: * c

[PATCH] vect: Complete implementation for MULT_EXPR vector lowering [PR122065]

2025-11-04 Thread Avinash Jayakar
Hi, This is a follow-up to the previous patch I raised for fixing PR122065. Here I handle cases when vector constant is uniform, but may not be a power of 2. Bootstrapped and regtests on powerpc64le-linux. Kindly review. Thanks and regards, Avinash Jayakar Use sequences of shifts and add/sub if
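As a scalar illustration of replacing a multiplication by a constant with shifts and add/sub, consider the sketch below. The patch itself concerns vector operands with a uniform constant; the constants 10 and 7 and the function names are chosen here purely for illustration and do not come from the patch.

```c
#include <stdint.h>

/* x * 10 == x * 8 + x * 2, so two shifts and one add suffice.  */
static uint32_t
mul10 (uint32_t x)
{
  return (x << 3) + (x << 1);
}

/* x * 7 == x * 8 - x, so one shift and one subtract suffice.  */
static uint32_t
mul7 (uint32_t x)
{
  return (x << 3) - x;
}
```

Because unsigned arithmetic wraps modulo 2^32, both forms agree with the plain multiplication for every input.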

[PATCH 1/2] Add gather_imm_use_stmts helper

2025-11-04 Thread Richard Biener
The following adds a helper function to gather SSA use stmts without duplicates. It steals the only padding bit in gimple to be an "infrastructure local flag" which should be used only temporarily and kept cleared. I did not add accessor functions for the flag to not encourage (ab-)uses. I have u

[PATCH 2/2] Use gather_imm_use_stmts instead of FOR_EACH_IMM_USE_STMT in forwprop

2025-11-04 Thread Richard Biener
The following fixes forwprop using FOR_EACH_IMM_USE_STMT to iterate over stmts and then eventually removing the active stmt, releasing its defs. This can cause debug stmt insertion with a RHS referencing the SSA name we iterate over, adding to its immediate use list but also adjusting all other de

Re: [PATCH 1/8] builtin: Add builtin types and function declarations for integer atomic fetch min/max

2025-11-04 Thread Matthew Malcomson
On 9/5/25 10:41, Jakub Jelinek wrote: On Fri, Sep 05, 2025 at 10:30:49AM +0100, Matthew Malcomson wrote: Ok -- TBH I don't have any extra details on this argument right now and your point on its feasibility seems quite convincing. (Ti

Re: [PATCH] OpenMP/Fortran: Rebind labels after metadirective body [PR122369]

2025-11-04 Thread Paul-Antoine Arras
On 03/11/2025 21:20, Tobias Burnus wrote: Paul-Antoine Arras wrote: Here is a revamped patch as well as some inline replies. Thanks, On 31/10/2025 17:41, Tobias Burnus wrote: Error: Label 4567 referenced at (1) is never defined Added testcase. Fixed in the attached patch. Can you also add

RE: [PATCH] x86-64: Inline memmove with overlapping unaligned loads and stores

2025-11-04 Thread Kumar, Venkataramanan
Hi HJ, > -Original Message- > From: Hongtao Liu > Sent: Monday, November 3, 2025 11:36 AM > To: H.J. Lu > Cc: GCC Patches ; Uros Bizjak > ; Hongtao Liu > Subject: Re: [PATCH] x86-64: Inline memmove with overlapping unaligned > lo

Re: [PATCH 1/3] [RFC] extra SSA immediate use iterator checking

2025-11-04 Thread Richard Biener
On Mon, 3 Nov 2025, Andrew MacLeod wrote: > > On 11/3/25 13:24, Andrew MacLeod wrote: > > > > On 11/3/25 09:43, Richard Biener wrote: > >> The following implements a prototype for additional checking around > >> SSA immediate use iteration.  Specifically this guards immediate > >> use list modifi

Re: [PATCH 1/3] [RFC] extra SSA immediate use iterator checking

2025-11-04 Thread Richard Biener
On Mon, 3 Nov 2025, Andrew MacLeod wrote: > > On 11/3/25 09:43, Richard Biener wrote: > > The following implements a prototype for additional checking around > > SSA immediate use iteration. Specifically this guards immediate > > use list modifications inside a FOR_EACH_IMM_USE_STMT iteration >

Re: [PATCH] [RFC] libgomp: Removing one barrier in non-nested thread loop

2025-11-04 Thread Matthew Malcomson
Hi all, I'd like to ping on the idea in this patch. I was hoping to post the "finished" patch as the next ping, but hit a few issues (all getting resolved just fine -- but taking time and I wanted to get the idea vetted as early as I could). Biggest questions to raise are: 1) Memory model se

Re: [PATCH 3/3] Fix unsafe operations in FOR_EACH_IMM_USE_STMT

2025-11-04 Thread Richard Biener
On Mon, 3 Nov 2025, Andrew Pinski wrote: > On Mon, Nov 3, 2025 at 7:21 AM Richard Biener wrote: > > > > The following fixes forwprop using FOR_EACH_IMM_USE_STMT to iterate > > over stmts and then eventually removing the active stmt, releasing > > its defs. This can cause debug stmt insertion wit

Re: [PATCH v4] c++: Don't constrain template visibility using no-linkage variables [PR122253]

2025-11-04 Thread Jason Merrill
On 11/4/25 5:48 AM, Nathaniel Shead wrote: On Sat, Nov 01, 2025 at 03:41:50PM +0300, Jason Merrill wrote: On 11/1/25 5:12 AM, Nathaniel Shead wrote: On Sat, Nov 01, 2025 at 01:10:43PM +1100, Nathaniel Shead wrote: On Thu, Oct 30, 2025 at 07:15:01PM +0200, Jason Merrill wrote: On 10/28/25 4:53

Re: [PATCH 2/3] Fix unsafe stmt modifications in FOR_EACH_IMM_USE_STMT

2025-11-04 Thread Richard Biener
On Mon, 3 Nov 2025, Andrew Pinski wrote: > On Mon, Nov 3, 2025 at 7:21 AM Richard Biener wrote: > > > > The following fixes path isolation changing the immediate use list of > > an SSA name that is currently iterated over via FOR_EACH_IMM_USE_STMT. > > This happens when it duplicates a BB within

Re: [PATCH] optabs: Make widen_lshift an IFN.

2025-11-04 Thread Robin Dapp
> On Tue, Nov 4, 2025 at 9:23 AM Robin Dapp wrote: >> >> > We shouldn't have created the IFN in the first place if it isn't >> > supported. >> > So I think whatever did that misses the internal-fn-supported check >> > instead. >> >> We do check whether the IFN is supported, it's a standard >>

Re: [PATCH] optabs: Make widen_lshift an IFN.

2025-11-04 Thread Richard Biener
On Tue, Nov 4, 2025 at 9:23 AM Robin Dapp wrote: > > > We shouldn't have created the IFN in the first place if it isn't supported. > > So I think whatever did that misses the internal-fn-supported check instead. > > We do check whether the IFN is supported, it's a standard direct_optab_handler >

Re: [PATCH] vect: Reduce group size of consecutive strided accesses.

2025-11-04 Thread Richard Biener
On Mon, Nov 3, 2025 at 9:10 AM Robin Dapp wrote: > > Hi, > > new try even though Richi didn't like the previous attempt ;) > I played with vect_transform_slp_perm_load without really getting > anywhere and figured maybe a now clearer version is acceptable. > > Consecutive load permutations like {0

Re: [PATCH] optabs: Make widen_lshift an IFN.

2025-11-04 Thread Robin Dapp
> We shouldn't have created the IFN in the first place if it isn't supported. > So I think whatever did that misses the internal-fn-supported check instead. We do check whether the IFN is supported, it's a standard direct_optab_handler test with the proper optab and its mode. But that doesn't

Re: [PATCH v1] Match: Refactor min based unsigned SAT_MUL pattern by widen mul helper [NFC]

2025-11-04 Thread Richard Biener
On Mon, Nov 3, 2025 at 12:36 PM wrote: > > From: Pan Li > > There are 3 kinds of widen_mul during the unsigned SAT_MUL pattern, aka > * widen_mul directly, like _3 w* _4 > * convert and then widen_mul, like (uint64_t)_3 *w (uint64_t)_4 > * convert and then mul, like (uint64_t)_3 * (uint64_t)_4 > >
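For context, a min-based unsigned saturating multiply can be written at the source level roughly as follows. This is a sketch of the "convert and then mul" form for 32-bit operands; the function name `sat_mul_u32` is an assumption made for illustration and the exact source form matched by the pattern is not shown in the preview.

```c
#include <stdint.h>

/* Hypothetical example: multiply in a 64-bit type, then clamp the
   product to the 32-bit maximum with a min operation.  */
static uint32_t
sat_mul_u32 (uint32_t a, uint32_t b)
{
  uint64_t wide = (uint64_t) a * (uint64_t) b;
  uint64_t capped = wide < UINT32_MAX ? wide : UINT32_MAX;
  return (uint32_t) capped;
}
```

The product of two 32-bit values always fits in 64 bits, so the clamp is the only place where information is discarded.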

Re: [PATCH v3] vect: Relax gather/scatter detection by swapping offset sign.

2025-11-04 Thread Richard Biener
On Thu, Oct 30, 2025 at 2:20 PM Robin Dapp wrote: > > Hi, > > This patch adjusts vect_gather_scatter_fn_p to always check an offset > type with swapped signedness (vs. the original offset argument). > If the target supports the gather/scatter with the new offset type as > well as the conversion of