Re: [PATCH] x86: Properly find the maximum stack slot alignment

2025-04-21 Thread Uros Bizjak
On Sun, Apr 20, 2025 at 11:26 PM H.J. Lu wrote: > > Don't assume that stack slots can only be accessed by stack or frame > registers. We first find all registers defined by stack or frame > registers. Then check memory accesses by such registers, including > stack and frame registers. I've been

[PATCH v2] gcc-15/changes: Document LoongArch changes.

2025-04-21 Thread Lulu Cheng
--- htdocs/gcc-15/changes.html | 20 1 file changed, 20 insertions(+) diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html index a02ba17a..b94802f5 100644 --- a/htdocs/gcc-15/changes.html +++ b/htdocs/gcc-15/changes.html @@ -842,6 +842,26 @@ asm (".text; %cc0:

Re: PING: [PATCH] x86: Add a pass to remove redundant all 0s/1s vector load

2025-04-21 Thread H.J. Lu
On Mon, Apr 21, 2025 at 11:29 AM Hongtao Liu wrote: > > On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote: > > > > On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote: > > > > > > For all different modes of all 0s/1s vectors, we can use the single widest > > > all 0s/1s vector register for all 0s/1s vector

Re: [PATCH] gimple-fold: Improve optimize_memcpy_to_memset by walking back until aliasing says the ref is a may clobber. [PR118947]

2025-04-21 Thread Richard Biener
On Thu, Apr 17, 2025 at 7:51 PM Andrew Pinski wrote: > > The case here is we have: > ``` > char buf[32] = {}; > void* ret = aaa(); > __builtin_memcpy(ret, buf, 32); > ``` > > And buf does not escape. But we don't prop the zeroing from buf to the > memcpy statement > because optimize_
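
For readers skimming the archive, here is a minimal sketch of the equivalence this thread relies on: copying 32 bytes out of a zero-initialized local buffer that never escapes leaves the destination in the same state as a direct memset to zero. The function names below are hypothetical and do not come from the patch. The sketch shows only the observable behaviour, not how optimize_memcpy_to_memset is implemented.

```c
#include <assert.h>  /* for callers that want to assert on the result */
#include <string.h>

enum { BUF_SIZE = 32 };

/* Shape quoted in the thread: memcpy from a zeroed local that does not escape. */
void fill_via_memcpy(void *dst)
{
    char buf[BUF_SIZE] = {0};
    memcpy(dst, buf, BUF_SIZE);
}

/* The form the copy can be reduced to once the zeroing is propagated. */
void fill_via_memset(void *dst)
{
    memset(dst, 0, BUF_SIZE);
}

/* Returns 1 when both forms leave identical bytes in the destination. */
int both_forms_agree(void)
{
    char a[BUF_SIZE];
    char b[BUF_SIZE];

    /* Start from different garbage so the comparison is meaningful. */
    memset(a, 0x5a, sizeof a);
    memset(b, 0x25, sizeof b);

    fill_via_memcpy(a);
    fill_via_memset(b);
    return memcmp(a, b, BUF_SIZE) == 0;
}
```

Calling both_forms_agree() from a test harness is enough to check the claim on a given target.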

Re: [PATCH] except: Don't use the cached value of the gcc_except_table section for comdat functions [PR119507]

2025-04-21 Thread Richard Biener
On Sat, Apr 19, 2025 at 7:10 AM Andrew Pinski wrote: > > On Fri, Mar 28, 2025 at 9:58 PM Andrew Pinski > wrote: > > > > This has been broken since GCC started to put the comdat functions' > > gcc_except_table into their > > own section; r0-118218-g3e6011cfebedfb. What would happen is after a >

Re: [PATCH] gimple: Canonical order for invariants [PR118902]

2025-04-21 Thread Richard Biener
On Thu, Apr 17, 2025 at 7:37 PM Andrew Pinski wrote: > > So unlike constants, address invariants are currently put first if > used with a SSA NAME. > It would be better if address invariants are consistent with constants > and this patch changes that. > gcc.dg/tree-ssa/pr118902-1.c is an example w

Re: [PATCH] avoid-store-forwarding: Fix reg init on load-elimination [PR119160]

2025-04-21 Thread Richard Biener
On Fri, Apr 18, 2025 at 4:51 PM Jeff Law wrote: > > > > On 4/18/25 2:43 AM, Philipp Tomsich wrote: > > Applied to trunk (16.0.0), thank you! > > Should this be backported to the GCC-15 release branch as well? > We don't have this on by default on the branch and it's a new option, so > one could ma

Re: [PATCH] Add assert to array_slice::begin/end

2025-04-21 Thread Richard Biener
On Sat, Apr 19, 2025 at 5:04 PM Andrew Pinski wrote: > > So while debugging PR 118320, I found it was useful to have > an assert inside array_slice::begin/end that the array slice is valid > rather than getting a segfault. This adds an assert that is only > enabled for checking. > > OK? Bootstrapp

Re: [PATCH] cobol: Allow for undefined NAME_MAX [PR119217]

2025-04-21 Thread Richard Biener
On Fri, Apr 18, 2025 at 8:10 PM Jakub Jelinek wrote: > > On Fri, Apr 18, 2025 at 06:04:29PM +0200, Rainer Orth wrote: > > That's one option, but maybe it's better the other way round: instead of > > excluding known-bad targets, restrict cobol to known-good ones > > (i.e. x86_64-*-linux* and aarch6

[PATCH] Skip g++.dg/eh/pr119507.C on arm eabi

2025-04-21 Thread Andrew Pinski
arm eabi emits the exception table using the handlerdata directive and does not use a comdat section for comdat functions. So this testcase should be skipped for arm eabi. Pushed as obvious after a quick test. gcc/testsuite/ChangeLog: * g++.dg/eh/pr119507.C: Skip for arm eabi. Signed-of

[PATCH v2] [PR119765] testsuite: adjust amd64-abi-9.c to check both ms and sysv ABIs

2025-04-21 Thread Peter Damianov
This test was failing because it was checking that eax was being cleared. For sysv abi, eax contains the number of XMM registers used in the call, but msabi just passes the float arguments twice, both in xmm and general purpose registers. This patch adds tests for both sysv and msabi functions be

RE: [EXTERNAL]Re: [PATCH]RISCV :Added MIPS P8700 Subtarget

2025-04-21 Thread Umesh Kalappa
Thank you @Jeff Law for the suggestions and >> Just quickly scanning the insn reservations, I suspect you're missing many >> cases and the compiler will trip assertion failures if you are missing cases. Sure, will look at it. >> You might want to look at these values more closely. If you have

Re: Improve vectorizer costs of min, max, abs, absu and const_expr on x86

2025-04-21 Thread Hongtao Liu
On Tue, Apr 22, 2025 at 12:46 AM Jan Hubicka wrote: > > Hi, > this patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR, > MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_EXPR and > ABSU_EXPR > but it was only correct for FP variant (where it corresponds to andss clea

Re: PING: [PATCH] x86: Add a pass to remove redundant all 0s/1s vector load

2025-04-21 Thread Hongtao Liu
On Mon, Apr 21, 2025 at 4:30 PM H.J. Lu wrote: > > On Mon, Apr 21, 2025 at 11:29 AM Hongtao Liu wrote: > > > > On Sat, Apr 19, 2025 at 1:25 PM H.J. Lu wrote: > > > > > > On Sun, Dec 1, 2024 at 7:50 AM H.J. Lu wrote: > > > > > > > > For all different modes of all 0s/1s vectors, we can use the si

Re: Improve vectorizer costs of min, max, abs, absu and const_expr on x86

2025-04-21 Thread Hongtao Liu
On Tue, Apr 22, 2025 at 10:30 AM Hongtao Liu wrote: > > On Tue, Apr 22, 2025 at 12:46 AM Jan Hubicka wrote: > > > > Hi, > > this patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR, > > MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_EXPR and > > ABSU_EXPR > > but i

[PATCH] gimple-fold: Implement simple copy propagation for aggregates [PR14295]

2025-04-21 Thread Andrew Pinski
This implements a simple copy propagation for aggregates in a similar fashion to what we already do for copy prop of zeroing. Right now this only looks at the previous vdef statement but this allows us to catch a lot of cases that show up in C++ code. Also adds a variant of pr22237.c which was found

Re: [PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-04-21 Thread Jan Hubicka
> On Mon, Apr 21, 2025 at 7:24 AM H.J. Lu wrote: > > > > On Sun, Apr 20, 2025 at 6:31 PM Jan Hubicka wrote: > > > > > > > PR target/102294 > > > > PR target/119596 > > > > * config/i386/x86-tune-costs.h (generic_memcpy): Updated. > > > > (generic_memset): Likewise. > > > >

Re: [PATCH] cobol: Allow for undefined NAME_MAX [PR119217]

2025-04-21 Thread Richard Biener
On Mon, Apr 21, 2025 at 11:16 AM Sam James wrote: > > Sam James writes: > > > Richard Biener writes: > > > >> On Fri, Apr 18, 2025 at 8:10 PM Jakub Jelinek wrote: > >>> > >>> On Fri, Apr 18, 2025 at 06:04:29PM +0200, Rainer Orth wrote: > >>> > That's one option, but maybe it's better the other

Re: [PATCH 43/61] Disable ssa-dom-cse-2.c for MIPS lp64

2025-04-21 Thread Jeff Law
On 2/3/25 2:37 AM, Richard Biener wrote: On Fri, Jan 31, 2025 at 6:57 PM Aleksandar Rakic wrote: From: Matthew Fortune The optimisation to reduce the result to constant 28 still happens but only much later in combine. OK. And pushed to the trunk. jeff

Re: [PATCH 30/61] MSA: Make MSA and microMIPS R5 unsupported

2025-04-21 Thread Jeff Law
On 1/31/25 10:13 AM, Aleksandar Rakic wrote: From: Matthew Fortune There are no platforms nor simulators for MSA and microMIPS R5 so turning off this support for now. gcc/ChangeLog: * config/mips/mips.cc (mips_option_override): Error out for -mmicromips -mmsa. OK and pushed

Re: [PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-04-21 Thread H.J. Lu
On Mon, Apr 21, 2025 at 6:34 PM Jan Hubicka wrote: > > > On Mon, Apr 21, 2025 at 7:24 AM H.J. Lu wrote: > > > > > > On Sun, Apr 20, 2025 at 6:31 PM Jan Hubicka wrote: > > > > > > > > > PR target/102294 > > > > > PR target/119596 > > > > > * config/i386/x86-tune-costs.h (generic

[PATCH v2] x86: Properly find the maximum stack slot alignment

2025-04-21 Thread H.J. Lu
On Mon, Apr 21, 2025 at 3:06 PM Uros Bizjak wrote: > > On Sun, Apr 20, 2025 at 11:26 PM H.J. Lu wrote: > > > > Don't assume that stack slots can only be accessed by stack or frame > > registers. We first find all registers defined by stack or frame > > registers. Then check memory accesses by s

Re: [PATCH 2/2] c++/modules: Remove unnecessary lazy_load_pendings

2025-04-21 Thread Jason Merrill
On 4/21/25 6:22 AM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? OK. -- >8 -- This call is not necessary, as we don't access the bodies of any classes that we instantiate here. gcc/cp/ChangeLog: * name-lookup.cc (lookup_imported_hidden_fri

Re: [PATCH 1/2] c++/modules: Find non-exported reachable decls when instantiating friend classes [PR119863]

2025-04-21 Thread Jason Merrill
On 4/21/25 6:21 AM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? And 15 (I guess after the release has been made)? OK, yes. -- >8 -- In r15-9029-geb26b667518c95, we started checking for conflicting declarations with any reachable decl attached to th

[pushed 3/3] c++: reorder constexpr checks

2025-04-21 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- My coming proposed change to stop setting TREE_STATIC on constexpr heap pseudo-variables led to a diagnostic regression because we would get the generic "not constant" diagnostic before the "allocated storage" diagnostic. So let's move the g

[pushed 1/3] c++: static constexpr strictness [PR99456]

2025-04-21 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- r11-7740 limited constexpr rejection of conversion from pointer to integer to manifestly constant-evaluated contexts; it should instead check whether we're in strict mode. The comment for that commit noted that making this change regressed

[PATCH RFA (fold)] c++: remove TREE_STATIC from constexpr heap vars [PR119162]

2025-04-21 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, OK for trunk? -- 8< -- While working on PR119162 it occurred to me that it would be simpler to detect the problem of a value referring to a heap allocation if we stopped setting TREE_STATIC on them so they naturally are not considered to have a constant address. With

[pushed 2/3] c++: new size folding [PR118775]

2025-04-21 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- r15-7893 added a workaround for a case where we weren't registering (long)&a as invalid in a constant-expression, because build_new_1 had folded away the CONVERT_EXPR that we rely on to diagnose that problem. In general we want to defer mos

Re: [PATCH] sanitizer: Store no_sanitize attribute value in uint32 instead of unsigned

2025-04-21 Thread Kees Cook
On Thu, Apr 10, 2025 at 05:17:51PM -0700, Keith Packard wrote: > A target using 16-bit ints won't have enough bits to hold the whole > flag_sanitize set. Be explicit about using uint32 for the attribute data. > > Signed-off-by: Keith Packard > --- > gcc/c-family/c-attribs.cc | 4 ++-- > 1 file c
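
A minimal sketch of the problem described in this message, assuming a target where `unsigned` is 16 bits wide (emulated here with `uint16_t`). The flag value and function names are hypothetical and are not GCC's actual SANITIZE_* definitions. The point is only that a flag bit above bit 15 is lost in a 16-bit field but survives in an explicit 32-bit one.

```c
#include <assert.h>  /* for callers that want to assert on the results */
#include <stdint.h>

/* Hypothetical flag living above bit 15; not a real GCC sanitizer flag. */
#define EXAMPLE_HIGH_FLAG (UINT32_C(1) << 20)

/* Models storing the flag set in `unsigned` on a 16-bit-int target. */
uint16_t store_in_16_bits(uint32_t flags)
{
    return (uint16_t)flags;  /* bits 16..31 are discarded */
}

/* Models storing the flag set in an explicit 32-bit type. */
uint32_t store_in_32_bits(uint32_t flags)
{
    return flags;            /* all 32 bits are preserved */
}
```

With these definitions, store_in_16_bits(EXAMPLE_HIGH_FLAG) yields 0 while store_in_32_bits(EXAMPLE_HIGH_FLAG) returns the flag unchanged.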

Re: [PATCH] gimple-fold: Implement simple copy propagation for aggregates [PR14295]

2025-04-21 Thread Andrew Pinski
On Mon, Apr 21, 2025 at 9:52 AM Andrew Pinski wrote: > > This implements a simple copy propagation for aggregates in a similar > fashion to what we already do for copy prop of zeroing. > > Right now this only looks at the previous vdef statement but this allows us > to catch a lot of cases that show

[PATCH] c++: Fix OpenMP support with C++20 modules [PR119864]

2025-04-21 Thread Nathaniel Shead
I don't really know how OpenMP works, hopefully this makes sense. Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? And for 15 (I guess after release)? -- >8 -- In r15-2799-gf1bfba3a9b3f31, a new kind of global constructor was added. Unfortunately this broke C++20 modules, as both

Re: [PATCH] cobol: Allow for undefined NAME_MAX [PR119217]

2025-04-21 Thread Sam James
Richard Biener writes: > On Fri, Apr 18, 2025 at 8:10 PM Jakub Jelinek wrote: >> >> On Fri, Apr 18, 2025 at 06:04:29PM +0200, Rainer Orth wrote: >> > That's one option, but maybe it's better the other way round: instead of >> > excluding known-bad targets, restrict cobol to known-good ones >> >

Re: [PATCH] cobol: Allow for undefined NAME_MAX [PR119217]

2025-04-21 Thread Sam James
Sam James writes: > Richard Biener writes: > >> On Fri, Apr 18, 2025 at 8:10 PM Jakub Jelinek wrote: >>> >>> On Fri, Apr 18, 2025 at 06:04:29PM +0200, Rainer Orth wrote: >>> > That's one option, but maybe it's better the other way round: instead of >>> > excluding known-bad targets, restrict co

Re: [PATCH 08/61] Testsuite: Accept jrc for clear cache intrinsic

2025-04-21 Thread Jeff Law
On 1/31/25 10:13 AM, Aleksandar Rakic wrote: From: Matthew Fortune Cherry-picked e8186b2f4b5e843a83775a10f923916c4c9253a5 from https://github.com/MIPS/gcc Signed-off-by: Matthew Fortune Signed-off-by: Faraz Shahbazker Signed-off-by: Aleksandar Rakic --- gcc/testsuite/gcc.target/mips/cl

Re: [PATCH 21/61] Testsuite: Modify the gcc.dg/memcpy-4.c test

2025-04-21 Thread Jeff Law
On 1/31/25 10:13 AM, Aleksandar Rakic wrote: From: Andrew Bennett Firstly, remove the MIPS specific bit of the test. Secondly, create a MIPS specific version in the gcc.target/mips. This will only execute for a MIPS ISA less than R6. Cherry-picked c8b051cdbb1d5b166293513b0360d3d67cf31eb9 fro

[PATCH 2/2] c++/modules: Remove unnecessary lazy_load_pendings

2025-04-21 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? -- >8 -- This call is not necessary, as we don't access the bodies of any classes that we instantiate here. gcc/cp/ChangeLog: * name-lookup.cc (lookup_imported_hidden_friend): Remove unnecessary lazy_load_pendings.

[PATCH 1/2] c++/modules: Find non-exported reachable decls when instantiating friend classes [PR119863]

2025-04-21 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? And 15 (I guess after the release has been made)? -- >8 -- In r15-9029-geb26b667518c95, we started checking for conflicting declarations with any reachable decl attached to the same originating module. This exposed the issue in the

RFC: Add TARGET_STORE_BY_PIECES_ICODE

2025-04-21 Thread H.J. Lu
On Mon, Apr 21, 2025 at 6:34 PM Jan Hubicka wrote: ... > We originally put CLEAR_RATIO < MOVE_RATIO based on observation that > mov $0, mem > is longer in encoding than > mov mem, mem > and there was a plan to implement optimization to avoid long immediates > in moves, but it did not materiali

Improve vectorizer costs of min, max, abs, absu and const_expr on x86

2025-04-21 Thread Jan Hubicka
Hi, this patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR, MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_EXPR and ABSU_EXPR but it was only correct for FP variant (where it corresponds to andss clearing sign bit). Integer abs/absu is open coded as conditional move

Re: [PATCH] gimple: Canonical order for invariants [PR118902]

2025-04-21 Thread Andrew Pinski
On Mon, Apr 21, 2025 at 1:42 AM Richard Biener wrote: > > On Thu, Apr 17, 2025 at 7:37 PM Andrew Pinski > wrote: > > > > So unlike constants, address invariants are currently put first if > > used with a SSA NAME. > > It would be better if address invariants are consistent with constants > > and

RFC v2: Add TARGET_STORE_BY_PIECES_ICODE

2025-04-21 Thread H.J. Lu
On Mon, Apr 21, 2025 at 9:38 PM H.J. Lu wrote: > > On Mon, Apr 21, 2025 at 6:34 PM Jan Hubicka wrote: > ... > > We originally put CLEAR_RATIO < MOVE_RATIO based on observation that > > mov $0, mem > > is longer in encoding than > > mov mem, mem > > and there was a plan to implement optimizati