[COMMITTED] PR tree-optimization/120661 - Snap subrange boundries to bitmask constraints.

2025-06-17 Thread Andrew MacLeod
d be good to sit down and see if it needs reworking. I'm pretty sure there is some work being done over and over that we can improve on, but this is good enough for now. Bootstraps on x86_64-pc-linux-gnu with no regressions. Pushed. Andrew From 9808af57ef1d4cbafb40b2446fd6808cbf20b36d Mon Sep

Re: [RFC] RISC-V: Change the default branch cost.

2025-06-17 Thread Andrew Waterman
On Tue, Jun 17, 2025 at 5:43 AM Jeff Law wrote: > > > > On 6/16/25 10:08 PM, Dongyan Chen wrote: > > Hi, I've come across a question regarding the branch cost of gcc. In the > > link > > https://gcc.godbolt.org/z/hnddevd5h, gcc fails to recognize the optimization > > branch judgment, while llvm d

Re: [PATCH] crc: Fix up ICE from optimize_crc_loop [PR120677]

2025-06-16 Thread Andrew Pinski
x86_64-linux and i686-linux, ok for trunk/15.2? > LGTM but I can't approve it. Thanks, Andrew > 2025-06-17 Jakub Jelinek > > PR tree-optimization/120677 > * gimple-crc-optimization.cc (crc_optimization::optimize_crc_loop): > Insert before gs

Re: [r16-1429 Regression] FAIL: g++.target/i386/vect-pragma-target-2.C -std=gnu++98 (test for excess errors) on Linux/x86_64

2025-06-12 Thread Andrew Pinski
On Wed, Jun 11, 2025, 10:17 PM haochen.jiang wrote: > On Linux/x86_64, > > dcb9af06212e8bb36e84a1b8498c625c29abeb6f is the first bad commit > commit dcb9af06212e8bb36e84a1b8498c625c29abeb6f > Author: Gwenole Beauchesne > Date: Mon Jun 2 14:44:55 2025 -0700 > > c/c++: Handle '#pragma GCC ta

Re: Gimple lowering question

2025-06-11 Thread Andrew Pinski
On Wed, Jun 11, 2025, 9:24 AM Andrew MacLeod wrote: > > On 6/11/25 11:02, Andrew MacLeod wrote: > > > > On 6/10/25 17:05, Richard Biener wrote: > >> > >> > >>> Am 10.06.2025 um 22:18 schrieb Andrew MacLeod : > >>> > >>>

Re: Gimple lowering question

2025-06-11 Thread Andrew MacLeod
On 6/11/25 11:02, Andrew MacLeod wrote: On 6/10/25 17:05, Richard Biener wrote: Am 10.06.2025 um 22:18 schrieb Andrew MacLeod :  I had a question asked of me, and now I'm passing the buck.     extern void *memcpy(void *, const void *, unsigned int);     extern int memcmp(const

Re: Gimple lowering question

2025-06-11 Thread Andrew MacLeod
On 6/10/25 17:05, Richard Biener wrote: Am 10.06.2025 um 22:18 schrieb Andrew MacLeod :  I had a question asked of me, and now I'm passing the buck. extern void *memcpy(void *, const void *, unsigned int); extern int memcmp(const void *, const void *, unsigned int); ty

Gimple lowering question

2025-06-10 Thread Andrew MacLeod
d in the object file. This is a reduced testcase to demonstrate a much larger problem. I don't see this happening on my x86 box. The memcpy's are not lowered to MEMs there under any circumstances I can find. This is true for at least gcc13 through trunk. It was not true back in the heyday of gcc8. Andrew

Re: [PATCH] expand, ranger: Use ranger during expansion [PR120434]

2025-06-10 Thread Andrew MacLeod
On 6/10/25 13:52, Jakub Jelinek wrote: On Tue, Jun 10, 2025 at 10:51:25AM -0400, Andrew MacLeod wrote: Edge range should be fine, and really that assert doesn't need to be there. Where the issue could arise is in gimple-range-fold.cc in fold_using_range::range_of_range_op() where we

Re: [PATCH] forwprop: Change optimize_agr_copyprop into forward walk instead of backwards

2025-06-10 Thread Andrew Pinski
On Tue, Jun 10, 2025 at 3:47 AM Richard Biener wrote: > > On Tue, Jun 10, 2025 at 2:02 AM Andrew Pinski wrote: > > > > On Mon, Jun 9, 2025 at 2:49 AM Richard Biener > > wrote: > > > > > > On Sun, Jun 8, 2025 at 7:52 PM Andrew Pinski > > >

[PATCH] PR tree-optimization/119039 - Simplify switches utilizing subranges.

2025-06-10 Thread Andrew MacLeod
n the set focuses on improvements resulting from better utilizing the bitmask when it is present. Have I missed anything tricky about switches? Bootstraps on x86_64-pc-linux-gnu with no regressions.  OK for trunk? Andrew From f3709725b4656a3b75334c89d14d0f1da40e4be5 Mon Sep 17 00:00:00 2001

[COMMITTED] Check if constant is a member before returning it.

2025-06-10 Thread Andrew MacLeod
gressions.  Pushed Andrew From ea2cfcbf652c9531aae2af6352c9519d36795cf1 Mon Sep 17 00:00:00 2001 From: Andrew MacLeod Date: Tue, 10 Jun 2025 12:11:18 -0400 Subject: [PATCH] Check if constant is a member before returning it. set_range_from_bitmask checks the new bitmask, and if it is a constant, simpl

Re: [PATCH] doc: allow extend.texi to be processed by makeinfo 4.13

2025-06-10 Thread Andrew Pinski
On Thu, Jun 5, 2025 at 11:50 PM Jan Beulich wrote: > > As per documentation, even 4.7 ought to suffice. At least 4.13 objects > to there being nothing ahead of the first comma in @xref{}. > --- > The text inserted is merely a guess; I'm open to better suggestions. > > Noticed with gcc15, so may wa

Re: [PATCH] expand, ranger: Use ranger during expansion [PR120434]

2025-06-10 Thread Andrew MacLeod
On 6/10/25 10:07, Jakub Jelinek wrote: On Tue, Jun 10, 2025 at 09:59:33AM -0400, Andrew MacLeod wrote: Yes, there are places, particularly fold_using_range in gimple-range-fold.cc, which expect there to be 2 edges to a GCOND stmt. It always expects 2 successors. There are not many places

Re: [PATCH] expand, ranger: Use ranger during expansion [PR120434]

2025-06-10 Thread Andrew MacLeod
always expects 2 successors. There are not many places like that; many simply look at the specified edge of a gcond, but the few there are could also be adjusted pretty easily to check whether the edge is there before doing anything. Andrew

Re: [PATCH] expand, ranger: Use ranger during expansion [PR120434]

2025-06-10 Thread Andrew MacLeod
ory the program is the "same" so the previously calculated values in the cache should still be ok. New values would be calculated using the new edge configuration. Most of the rest of ranger just walks what is there now. As long as the use/def chains lead somewhere useful, the blocks those statements are in still exist, and the rule of not changing the meaning/value of an existing ssa-name is followed, it should be fine. I'm sure it is possible to get into trouble, but as long as it is controlled it should manage. Andrew

Re: [PATCH] expand, ranger: Use ranger during expansion [PR120434]

2025-06-10 Thread Andrew MacLeod
k the RTL on the edges either). It only walks SSA use->def chains, I don't think it ever walks "all stmts" of a block (or edge as you say). Yes, Ranger just walks the use-def chains and never looks at anything queued on an edge or elsewhere. Andrew

Re: [Patch] [+wwwdocs] gcn: Add experimental MI300 (gfx942) support

2025-06-10 Thread Andrew Stubbs
mark is probably not very readable either, so we can live with it. This is fine for now. Like you said, the whole cache/atomic thing is a mess that could use a rework. OK. Andrew

Re: [PATCH] forwprop: Change optimize_agr_copyprop into forward walk instead of backwards

2025-06-09 Thread Andrew Pinski
On Mon, Jun 9, 2025 at 2:49 AM Richard Biener wrote: > > On Sun, Jun 8, 2025 at 7:52 PM Andrew Pinski wrote: > > > > While thinking about how to implement the rest of the copy prop and makes > > sure not > > to introduce some compile time problems, optimize_agr_c

[PATCH] forwprop: Change proping memset into memcpy into a forwprop rather than a backwalk

2025-06-08 Thread Andrew Pinski
/ChangeLog: * gcc.dg/pr118946-1.c: New test. Signed-off-by: Andrew Pinski --- gcc/testsuite/gcc.dg/pr118946-1.c | 15 ++ gcc/tree-ssa-forwprop.cc | 278 -- 2 files changed, 167 insertions(+), 126 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/pr118946-1.c

[PUSHED] cselim: Move else_vdef definition to the usage

2025-06-08 Thread Andrew Pinski
: * tree-ssa-phiopt.cc (cond_if_else_store_replacement): Move definition of else_vdef to right before the usage. Reformat slightly. Signed-off-by: Andrew Pinski --- gcc/tree-ssa-phiopt.cc | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/gcc/tree

Re: [PATCH 2/2] phi-opt: Do limited form of cselim from phiopt [PR120533]

2025-06-08 Thread Andrew Pinski
On Sun, Jun 8, 2025 at 2:03 AM Richard Biener wrote: > > On Sat, Jun 7, 2025 at 12:32 AM Andrew Pinski > wrote: > > > > So currently cselim is limited to targets which have conditional move > > and also happens later in the pipeline. This adds the limited form of >

[PUSHED] cselim: Use get_virtual_phi instead of a loop in cond_if_else_store_replacement

2025-06-08 Thread Andrew Pinski
e-ssa-phiopt.cc (cond_if_else_store_replacement): Use get_virtual_phi instead of inlining it. Signed-off-by: Andrew Pinski --- gcc/tree-ssa-phiopt.cc | 9 + 1 file changed, 1 insertion(+), 8 deletions(-) diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc index 2e4f9d

[PATCH] forwprop: Change optimize_agr_copyprop into forward walk instead of backwards

2025-06-08 Thread Andrew Pinski
-forwprop.cc (optimize_agr_copyprop): Change into a forward looking (looking at vdef's uses) instead of a back looking (vuse's def). Signed-off-by: Andrew Pinski --- gcc/tree-ssa-forwprop.cc | 121 +++ 1 file changed, 60 inserti

Re: [PATCH] Improve copy prop for aggregates and combine with zeroing case

2025-06-08 Thread Andrew Pinski
On Sat, Jun 7, 2025 at 12:34 PM Andrew Pinski wrote: > > On Fri, Jun 6, 2025 at 11:50 AM Andrew Pinski > wrote: > > > > This improves copy prop for aggregates by working over statements that > > don't modify the access > > just like how it is done for cop

[PATCH] math-opt: Remove special case of COND_EXPR

2025-06-07 Thread Andrew Pinski
/120477 gcc/ChangeLog: * tree-ssa-math-opts.cc (maybe_optimize_guarding_check): Remove special case for COND_EXPR. (arith_overflow_check_p): Likewise. (match_arith_overflow): Likewise, changing into a gcc_unreachable. Signed-off-by: Andrew Pinski --- gcc/tree-ssa

Re: [PATCH] Improve copy prop for aggregates and combine with zeroing case

2025-06-07 Thread Andrew Pinski
On Fri, Jun 6, 2025 at 11:50 AM Andrew Pinski wrote: > > This improves copy prop for aggregates by working over statements that don't > modify the access > just like how it is done for copying zeros. > To speed up things, we should only have one loop back on the vuse instead

Re: [PATCH] expand: Improve store_field for `{}` stores of non mode size [PR110459]

2025-06-07 Thread Andrew Pinski
On Fri, Jun 6, 2025 at 12:02 PM Andrew Pinski wrote: > > On Thu, Jun 5, 2025 at 11:39 PM Richard Biener > wrote: > > > > On Fri, Jun 6, 2025 at 12:14 AM Andrew Pinski > > wrote: > > > > > > Currently we expand `{}` and store zeros to the stack and t

[PATCH v2] expand: Improve expand_constructor for BLKmode mode and zeros constructors [PR110459]

2025-06-07 Thread Andrew Pinski
BITS_PER_WORD, just return the constant 0. gcc/testsuite/ChangeLog: * g++.target/aarch64/array-return-1.C: New test. * g++.target/i386/array-return-1.C: New test. Signed-off-by: Andrew Pinski --- gcc/expr.cc | 13 +++ .../g++.target/aarch64

Re: [PATCH] expand: Use less costly from sign and zero extensions for values where value range says they don't have MSB set [PR120434]

2025-06-06 Thread Andrew Pinski
sive-optimizations (should be cleaned up to say kind of expensiveness, e.g compile time/compile memory usage). Thanks, Andrew > + && SCALAR_INT_MODE_P (mode) > + && (GET_MODE_SIZE (as_a (mode)) > + > GET_MODE_SIZE (as_

[PUSHED] Fix index of some warnings [PR120572]

2025-06-06 Thread Andrew Pinski
(Wmusttail-local-addr, Wno-maybe-musttail-local-addr): Fix opindex. * common.opt.urls: Regenerate. Signed-off-by: Andrew Pinski --- gcc/common.opt.urls | 2 +- gcc/doc/invoke.texi | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/common.opt.urls b/gcc

[PATCH 2/2] phi-opt: Do limited form of cselim from phiopt [PR120533]

2025-06-06 Thread Andrew Pinski
: Likewise. * gcc.dg/tree-ssa/phiprop-2.c: Move the check for MIN_EXPR to phiopt1. Signed-off-by: Andrew Pinski --- gcc/testsuite/gcc.dg/tree-ssa/phiprop-2.c| 5 +- gcc/testsuite/gcc.dg/tree-ssa/pr35286.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c | 2 +- gcc/testsuite/gcc.dg

[PATCH 1/2] cselim: change how to detect no load/stores after store in single_trailing_store_in_bb

2025-06-06 Thread Andrew Pinski
(single_trailing_store_in_bb): Add vphi argument. Check for single use of the vdef of the store instead of a loop and check vdef's single use statement is the same as vphi. (cond_if_else_store_replacement): Update call to single_trailing_store_in_bb. Signed-off-by: Andrew Pinski --- gcc/tre

Re: [PATCH] expand: Improve store_field for `{}` stores of non mode size [PR110459]

2025-06-06 Thread Andrew Pinski
On Thu, Jun 5, 2025 at 11:39 PM Richard Biener wrote: > > On Fri, Jun 6, 2025 at 12:14 AM Andrew Pinski > wrote: > > > > Currently we expand `{}` and store zeros to the stack and then do a full > > mode load back. This is a waste, instead we should just use the ze

[PATCH] Improve copy prop for aggregates and combine with zeroing case

2025-06-06 Thread Andrew Pinski
geLog: * gcc.dg/tree-ssa/copy-prop-arg-1.c: New test. * gcc.dg/tree-ssa/copy-prop-arg-2.c: New test. Signed-off-by: Andrew Pinski --- .../gcc.dg/tree-ssa/copy-prop-arg-1.c | 37 .../gcc.dg/tree-ssa/copy-prop-arg-2.c | 35 gcc/tree-ssa-forwpr

Re: [PATCH] expand, ranger: Use ranger during expansion [PR120434]

2025-06-06 Thread Andrew MacLeod
On 6/6/25 11:07, Jakub Jelinek wrote: On Fri, Jun 06, 2025 at 10:54:55AM -0400, Andrew MacLeod wrote: I don't remember details about the order of things...  Is there any chance that you might query an SSA_NAME whose DEF was in  a block which has been converted to RTL?   Ranger will quer

Re: [PATCH] expand, ranger: Use ranger during expansion [PR120434]

2025-06-06 Thread Andrew MacLeod
t_range_query (fun)->create_relation_oracle (false);   <>   get_range_query (fun)->destroy_relation_oracle ();   fun->x_range_query = NULL; As long as all queries come in dominator order, it should work just fine as an option. Andrew On 6/6/25 09:33, Jakub Jelinek wrote: Hi! As the foll

Re: [AutoFDO] Profile merging for clone test

2025-06-05 Thread Andrew Pinski
On Thu, Jun 5, 2025 at 11:01 PM Kugan Vivekanandarajah wrote: > > Hi Andrew, > > > On 6 Jun 2025, at 8:18 am, Andrew Pinski wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Wed, Jun 4, 2025 at 12:02 AM Kugan Vive

[PATCH] cselim: Update the vop manually for cond_if_else_store replacement

2025-06-05 Thread Andrew Pinski
(cond_if_else_store_replacement_1): Add vphi argument. Manually update the vphi and new_stmt vdef/lhs. (cond_if_else_store_replacement): Update call to cond_if_else_store_replacement_1. Signed-off-by: Andrew Pinski --- gcc/tree-ssa-phiopt.cc | 15 --- 1 file changed, 12

Re: [AutoFDO] Profile merging for clone test

2025-06-05 Thread Andrew Pinski
angeLog: > > * gcc.dg/tree-prof/clone-merge-1.c: New test. This new testcase fails if you don't have autofdo set up. I think it needs: /* { dg-require-profiling "-fauto-profile" } */ Thanks, Andrew Pinski > > Is this OK? > > Thanks, > Kugan >

[PATCH] expand: Improve store_field for `{}` stores of non mode size [PR110459]

2025-06-05 Thread Andrew Pinski
gcc/ChangeLog: * expr.cc (store_field): For `{}` exp where bitsize is known to be less than BITS_PER_WORD, use zero cst. gcc/testsuite/ChangeLog: * g++.target/aarch64/array-return-1.C: New test. * g++.target/i386/array-return-1.C: New test. Signed-off-by: Andrew

Re: [PATCH] ranger: Add support for float <-> int casts [PR120231]

2025-06-05 Thread Andrew MacLeod
fied, while the latter PASSed both versions. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK by me. Andrew

Re: [PATCH 2/4] Use ranger for table based CTZ detection

2025-06-05 Thread Andrew MacLeod
r a new one, or the existing one, but are guaranteed a context ranger. Does that seem reasonable? Andrew

Re: [Patch] gcn: Update --with-arch= for newer archs

2025-06-05 Thread Andrew Stubbs
pdate - in install.texi: https://gcc.gnu.org/install/configure.html#with-multilib-list [We had only updated: https://gcc.gnu.org/install/specific.html#amdgcn-x-amdhsa ] OK for mainline - and for backporting to GCC 15? OK Andrew

[PATCH] aarch64: Add testcase for vld2 which was fixed by r16-1113 [PR89606]

2025-06-04 Thread Andrew Pinski
add a testcase. Tested for aarch64-linux-gnu. PR tree-optimization/89606 gcc/testsuite/ChangeLog: * gcc.target/aarch64/vld2-1.c: New test. Signed-off-by: Andrew Pinski --- gcc/testsuite/gcc.target/aarch64/vld2-1.c | 45 +++ 1 file changed, 45 insertions(

Re: [PATCH] widening_mul: Make better use of overflowing operations in codegen of min/max(a, add/sub(a, b))

2025-06-04 Thread Andrew Pinski
On Wed, Jun 4, 2025 at 6:27 AM Richard Biener wrote: > > On Thu, May 29, 2025 at 10:04 AM wrote: > > > > From: Dhruv Chawla > > > > This patch folds the following patterns: > > - max (a, add (a, b)) -> [sum, ovf] = addo (a, b); !ovf ? sum : a > > - max (a, sub (a, b)) -> [sum, ovf] = subo (a, b)
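
The first pattern quoted above, max (a, add (a, b)) -> [sum, ovf] = addo (a, b); !ovf ? sum : a, can be checked with a small C sketch. This is only an illustration under assumptions: the operands are taken to be unsigned 32-bit (the quoted snippet does not state the types), the function names are made up, and the GCC/Clang builtin __builtin_add_overflow stands in for "addo". It is not code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Plain form: the unsigned addition may wrap around, which is well
   defined in C; the larger of a and the (possibly wrapped) sum wins.  */
uint32_t max_plain(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;
    return sum > a ? sum : a;
}

/* Overflow-flag form: [sum, ovf] = addo (a, b); !ovf ? sum : a.
   Without overflow, sum >= a, so sum is the maximum.  With overflow,
   the wrapped sum is smaller than a, so a is the maximum.  */
uint32_t max_with_overflow_flag(uint32_t a, uint32_t b)
{
    uint32_t sum;
    bool ovf = __builtin_add_overflow(a, b, &sum);
    return !ovf ? sum : a;
}
```

Both functions should agree for every pair of inputs, which is what makes the rewrite valid for wrapping unsigned arithmetic; the sub case is left out here because the quoted snippet is cut off before its replacement is shown.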

Re: [PATCH] ranger: Add support for float <-> float casts [PR120231]

2025-06-04 Thread Andrew MacLeod
Fine with me.  I don't think Aldy got to many of the cast conversions. Andrew On 6/3/25 03:31, Jakub Jelinek wrote: Hi! I've noticed we don't even support say float -> double and other scalar floating point to scalar floating point conversions in the ranger, we just end u

Re: [PATCH] ranger: Some parameter formatting fixes

2025-06-04 Thread Andrew MacLeod
rhaps the functions changed name or something. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK.  I suspect it was from cut'n'paste when floats were first created...  but regardless, should be fixed. Andrew

Re: [PATCH] switch-conversion: Mark CSWTCH as mergeable [PR120451]

2025-06-03 Thread Andrew Pinski
On Tue, Jun 3, 2025 at 10:43 PM H.J. Lu wrote: > > On Tue, Jun 3, 2025 at 10:51 AM Andrew Pinski > wrote: > > > > When we have a smallish CSWTCH, it could be placed in the rodata.cst16 > > section so it can be merged with other constants across TUs. > > > >
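
For readers unfamiliar with CSWTCH: switch conversion can replace a switch whose cases only yield constants with a load from a compiler-generated constant array. A rough hand-written sketch of that idea follows. The function and array names are hypothetical, and the table form is an approximation of what the pass produces, not its actual output; whether a given table ends up in a mergeable section depends on its size and the target.

```c
/* A small switch in which every case just yields a constant.  */
int weight_switch(unsigned int kind)
{
    switch (kind) {
    case 0: return 3;
    case 1: return 5;
    case 2: return 8;
    case 3: return 13;
    default: return 0;
    }
}

/* Hand-written table form: a bounds check plus an array load.
   The array name is made up; the compiler uses its own naming.  */
static const int weight_lookup[4] = { 3, 5, 8, 13 };

int weight_table(unsigned int kind)
{
    return kind < 4 ? weight_lookup[kind] : 0;
}
```

Because the table is read-only constant data, identical tables generated in different translation units are candidates for merging once the decl is marked as mergeable, which is the point of the patch under discussion.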

[PATCH] opt: Detect the wrong case of flags option

2025-06-03 Thread Andrew Pinski
. gcc/ChangeLog: * opt-functions.awk (opt_args): Print an error if there is no match for NAME but there is a match for different case name. Signed-off-by: Andrew Pinski --- gcc/opt-functions.awk | 10 ++ 1 file changed, 10 insertions(+) diff --git a/gcc/opt

Re: [PATCH] c: Enable -Wjump-misses-init for -Wc++-compat

2025-06-03 Thread Andrew Pinski
On Tue, Jun 3, 2025 at 11:18 AM Martin Uecker wrote: > > Am Dienstag, dem 03.06.2025 um 10:56 -0700 schrieb Andrew Pinski: > > On Tue, Jun 3, 2025 at 10:45 AM Martin Uecker wrote: > > > > > > > > > This version only contains the fix for -Wc++-compat. >

Re: [PATCH] c: Enable -Wjump-misses-init for -Wc++-compat

2025-06-03 Thread Andrew Pinski
s_init) Warning LangEnabledby(C ObjC,Wc++-compat) 1 | #error incorrect case of 'LangEnabledBy' during parsing of C ObjC Var(warn_jump_misses_init) Warning LangEnabledby(C ObjC,Wc++-compat) | ^ ``` Which should be a good hint of what is going wrong. Thanks, Andrew > >

Re: [PATCH] c: Enable -Wjump-misses-init for -Wc++-compat

2025-06-03 Thread Andrew Pinski
and should be backported to the open branches too as it was working with GCC 4.7.0 and was only broken with r0-116778-gf2bc201f53e2b8 which introduced the typo. Thanks, Andrew Pinski > > Bootstrapped and regression tested for x86_64. > > Martin > > > c: Enable -Wjump-misses-init

Re: [PATCH v3 2/2]AArch64: propose -mmax-vectorization as an option to override vector costing

2025-06-03 Thread Andrew Pinski
supporting --param in the attributes/pragmas, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116092 . It might be easy to add the support there too. Note this is not to stop adding an -m option which is considered more stable than the --param option but rather to let you know there is a bug about allowing

[PATCH] switch-conversion: Mark CSWTCH as mergeable [PR120451]

2025-06-02 Thread Andrew Pinski
decls. PR tree-optimization/120451 gcc/ChangeLog: * tree-switch-conversion.cc (switch_conversion::build_one_array): Mark the newly created decl as mergeable. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/cswtch-6.c: New test. Signed-off-by: Andrew Pinski --- gcc

[PATCH] phiprop: Add testcase for already fixed case [PR116824]

2025-06-02 Thread Andrew Pinski
-by: Andrew Pinski --- gcc/testsuite/gcc.dg/tree-ssa/phiprop-2.c | 28 +++ 1 file changed, 28 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phiprop-2.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phiprop-2.c b/gcc/testsuite/gcc.dg/tree-ssa/phiprop-2.c new file

[PATCH] c/c++: Handle '#pragma GCC target optimize' early [PR48026]

2025-06-02 Thread Andrew Pinski
get/i386/vect-pragma-target-2.C: New test. * gcc.target/i386/vect-pragma-target-1.c: New test. * gcc.target/i386/vect-pragma-target-2.c: New test. Signed-off-by: Gwenole Beauchesne Co-authored-by: Andrew Pinski --- gcc/c-family/c-pragma.cc | 4 +- ...

Re: [PATCH] c++, coroutines: Lookup coroutine_handle template [PR120495].

2025-06-02 Thread Andrew Pinski
claration. Yes and this was mentioned while I was fixing PR 115605 (https://inbox.sourceware.org/gcc-patches/5d1efe6a-d2f2-4b37-aa92-ce3a73739...@redhat.com/). I didn't get some time to look into changing that yet. Iain, if you look into fixing lookup_template_class, check the testcases for PR

Re: [Patch] libgomp: Add OpenMP's omp_target_memset/omp_target_memset_async [PR120444]

2025-06-02 Thread Andrew Stubbs
On 02/06/2025 15:40, Tobias Burnus wrote: Hi Andrew, Andrew Stubbs wrote: The hsa_memory_copy API is known to be slow, so for smaller data sizes it's probably better to have one hsa_memory_copy replace the whole memset than use three API calls, even with setting up some host-side memo

Re: [Patch] libgomp: Add OpenMP's omp_target_memset/omp_target_memset_async [PR120444]

2025-06-02 Thread Andrew Stubbs
y, e.g. via requires unified_shared_memory/self_maps). For nvptx, cuMemsetD8 is used and for AMD GPUs hsa_amd_memory_fill. However, the latter only supports 4byte aligned data, working in multiples of 4byte. @Sandra: Any .texi comments? (Or generic comments.) @Thomas, Jakub, anyone: Any comment? @Andrew, an

[PATCH v2] gimple-fold: Implement simple copy propagation for aggregates [PR14295]

2025-06-01 Thread Andrew Pinski
prop1. * gcc.dg/tree-ssa/pr57361-1.c: New test. Signed-off-by: Andrew Pinski --- gcc/testsuite/g++.dg/opt/pr66119.C| 2 +- .../execute/builtins/pr22237-1-lib.c | 27 ++ .../execute/builtins/pr22237-1.c | 57 gcc/testsuite/gcc.dg/tree-ssa/200

Re: [PATCH] aarch64:sve: Use create_tmp_reg_or_ssa_name instead of create_tmp_var in the folder

2025-06-01 Thread Andrew Pinski
On Sun, Jun 1, 2025 at 3:54 AM Richard Biener wrote: > > On Sat, May 31, 2025 at 8:41 PM Andrew Pinski > wrote: > > > > Currently gimple_folder::convert_and_fold calls create_tmp_var; that means > > while in ssa form, > > the pass which calls fold_stmt will a

[PATCH v2] aarch64:sve: Use make_ssa_name instead of create_tmp_var in the folder

2025-06-01 Thread Andrew Pinski
ad of create_tmp_var for the temporary. Add comment about callback argument. Signed-off-by: Andrew Pinski --- gcc/config/aarch64/aarch64-sve-builtins.cc | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch6

[PATCH] forwprop: Manually rename the virtual mem op for complex and vector loads prop

2025-05-31 Thread Andrew Pinski
definition is before the first use. (pass_forwprop::execute): Likewise for complex loads. (pass_data_forwprop): Remove TODO_update_ssa. Signed-off-by: Andrew Pinski --- gcc/tree-ssa-forwprop.cc | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/gcc/tree-ssa

[PATCH] aarch64:sve: Use create_tmp_reg_or_ssa_name instead of create_tmp_var in the folder

2025-05-31 Thread Andrew Pinski
geLog: * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_and_fold): Use create_tmp_reg_or_ssa_name instead of create_tmp_var for the temporary. Signed-off-by: Andrew Pinski --- gcc/config/aarch64/aarch64-sve-builtins.cc | 2 +- 1 file changed, 1 insertion(

[PATCH] DCE: Only set TODO_update_ssa when cfg has changed

2025-05-31 Thread Andrew Pinski
-ssa-dce.cc (perform_tree_ssa_dce): Set TODO_update_ssa only when cfg has changed Signed-off-by: Andrew Pinski --- gcc/tree-ssa-dce.cc | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc index ba9cd6536ae..f5e67c4409a 100644

[PATCH] CCP: Manually rename the virtual mem op when inserting clobbers

2025-05-31 Thread Andrew Pinski
cp.cc (insert_clobber_before_stack_restore): Update the virtual op on the inserted clobber and the stack restore function. (do_ssa_ccp): Don't add TODO_update_ssa to the todo. Signed-off-by: Andrew Pinski --- gcc/tree-ssa-ccp.cc | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff -

[PATCH] Have TODO_verify_* not set by any pass

2025-05-30 Thread Andrew Pinski
s_data_early_vrp): Remove TODO_verify_all. (pass_data_fast_vrp): Likewise. Signed-off-by: Andrew Pinski --- gcc/function.h| 1 - gcc/gimple-harden-conditionals.cc | 6 ++ gcc/gimple-harden-control-flow.cc | 3 +-- gcc/ipa-strub.cc

[PUSHED] Fix typo in comment in execute_all_ipa_transforms.

2025-05-30 Thread Andrew Pinski
small typo, missing n at the end of function. Pushed as obvious after a bootstrap/test. gcc/ChangeLog: * passes.cc (execute_all_ipa_transforms): Fix typo in comment. Signed-off-by: Andrew Pinski --- gcc/passes.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff

Re: [PATCH] widening_mul: Make better use of overflowing operations in codegen of min/max(a, add/sub(a, b))

2025-05-30 Thread Andrew Pinski
ild_minmax_replacement_statements (stmt, ovf, new_lhs,
> + type, use_stmt));
> + }
See above about reusing/removing this statement.
> else
> {
> gimple_assign_set_rhs1 (use_stmt, ovf);
> @@ -4854,11 +4978,16 @@ match_arith_overflow (gimple_stmt_iterator *gsi, gimple *stmt,
> }
> else
> {
> - gcc_checking_assert (gimple_assign_rhs_code (use_stmt)
> - == COND_EXPR);
> - tree cond = build2 (ovf_use == 1 ? NE_EXPR : EQ_EXPR,
> - boolean_type_node, ovf,
> - build_int_cst (type, 0));
> + tree_code rhs_code = gimple_assign_rhs_code (use_stmt);
> + gcc_checking_assert (rhs_code == COND_EXPR || rhs_code == MAX_EXPR
> + || rhs_code == MIN_EXPR);
> + tree cond = NULL_TREE;
> + if (rhs_code != COND_EXPR)
> + cond = build_minmax_replacement_statements (stmt, ovf, new_lhs,
> + type, use_stmt);
> + else
> + cond = build2 (ovf_use == 1 ? NE_EXPR : EQ_EXPR,
> + boolean_type_node, ovf, build_int_cst (type, 0));
Note COND_EXPR no longer has the possibility of a condition part of it (since r13-707-g68e0063397ba82) so this code will be removed soon (PR120477 records this dead code). So I think my mention about reusing the lhs and replacing statements applies here too.
Also are you sure this produces the correct resolve for MIN/MAX? Because I think we would produce:
+ _7 = .(ADD|SUB)_OVERFLOW (a, b);
+ _8 = REALPART_EXPR <_7>;
+ _9 = IMAGPART_EXPR <_7>;
+ _10 = _9 != 0; (or _9 == 0)
+ _11 = _10 ? _8 : a;
_res = MIN/MAX(_11, _8)
Which will give the same value but I suspect you wanted to remove the MIN/MAX here too.
Thanks, Andrew
> gimple_assign_set_rhs1 (use_stmt, cond);
> }
> }
> --
> 2.44.0
>

[PATCH] scc_copy: conditional return TODO_cleanup_cfg.

2025-05-29 Thread Andrew Pinski
::replace_scc_by_value): Return true if something was done. (scc_copy_prop::propagate): Return true if something was changed. (pass_sccopy::execute): Return TODO_cleanup_cfg if a prop happened. Signed-off-by: Andrew Pinski --- gcc/gimple-ssa-sccopy.cc | 20 1 file

Re: [PATCH 1/2] forwprop: Change test in loop of optimize_memcpy_to_memset

2025-05-29 Thread Andrew Pinski
On Tue, May 27, 2025 at 5:14 AM Richard Biener wrote: > > On Tue, May 27, 2025 at 5:02 AM Andrew Pinski > wrote: > > > > This was noticed in the review of copy propagation for aggregates > > patch, instead of checking for a NULL or a non-ssa name of vuse, > >

Re: [PATCH] gimple-fold: Implement simple copy propagation for aggregates [PR14295]

2025-05-28 Thread Andrew Pinski
On Mon, May 26, 2025 at 1:40 PM Andrew Pinski wrote: > > Note this is redundant store removal - I'm not sure operand_equal_p > > is good enough to catch all cases of effective type changes done? > > Esp. as inferring the old effective type from the read side (src2) > >

Re: [PATCH] Improve copy prop for aggregates and combine with zeroing case

2025-05-28 Thread Andrew Pinski
On Fri, May 23, 2025 at 10:12 PM Andrew Pinski wrote: > > This improves copy prop for aggregates by working over statements that don't > modify the access > just like how it is done for copying zeros. > To speed up things, we should only have one loop back on the vuse in

Re: [PATCH] gimple-fold: Implement simple copy propagation for aggregates [PR14295]

2025-05-28 Thread Andrew Pinski
On Mon, May 26, 2025 at 1:40 PM Andrew Pinski wrote: > > On Mon, May 26, 2025 at 5:36 AM Richard Biener > wrote: > > > > On Sun, May 18, 2025 at 10:58 PM Andrew Pinski > > wrote: > > > > > > This implements a simple copy propagation for aggregates

[PATCH 1/2] forwprop: Change test in loop of optimize_memcpy_to_memset

2025-05-26 Thread Andrew Pinski
(optimize_memcpy_to_memset): Change check from NULL/non-ssa name to default name. Signed-off-by: Andrew Pinski --- gcc/tree-ssa-forwprop.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc index 4c048a9a298

[PATCH 2/2] forwprop: Add stats for memcpy->memset

2025-05-26 Thread Andrew Pinski
hen the statement changed. Signed-off-by: Andrew Pinski --- gcc/tree-ssa-forwprop.cc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc index e457a69ed48..81ea7d4195e 100644 --- a/gcc/tree-ssa-forwprop.cc +++ b/gcc/tree-ssa-forwprop.cc @@ -132

Re: [PATCH] gimple-fold: Implement simple copy propagation for aggregates [PR14295]

2025-05-26 Thread Andrew Pinski
On Mon, May 26, 2025 at 1:40 PM Andrew Pinski wrote: > > On Mon, May 26, 2025 at 5:36 AM Richard Biener > wrote: > > > > On Sun, May 18, 2025 at 10:58 PM Andrew Pinski > > wrote: > > > > > > This implements a simple copy propagation for aggregates

Re: [PATCH] gimple-fold: Implement simple copy propagation for aggregates [PR14295]

2025-05-26 Thread Andrew Pinski
On Mon, May 26, 2025 at 5:36 AM Richard Biener wrote: > > On Sun, May 18, 2025 at 10:58 PM Andrew Pinski > wrote: > > > > This implements a simple copy propagation for aggregates in the similar > > fashion as we already do for copy prop of zeroing. > > >

Re: [PATCH] testsuite: Fix pr101145inf*.c testcases [PR117494]

2025-05-26 Thread Andrew Pinski
On Mon, May 26, 2025 at 4:57 AM Christophe Lyon wrote: > > ,, > > On Mon, 26 May 2025 at 12:54, Andrew Pinski (QUIC) > wrote: > > > > > -Original Message- > > > From: Christophe Lyon > > > Sent: Monday, May 26, 2025 3:09 AM >

RE: [PATCH] testsuite: Fix pr101145inf*.c testcases [PR117494]

2025-05-26 Thread Andrew Pinski (QUIC)
> -Original Message- > From: Christophe Lyon > Sent: Monday, May 26, 2025 3:09 AM > To: Andrew Pinski (QUIC) > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] testsuite: Fix pr101145inf*.c testcases > [PR117494] > > Hi Andrew, > > On Sun, 17 Nov 2024

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-25 Thread Andrew Pinski
GS=u,k > >> + shift > >> +fi > >> + > >> +if [ "$use_brbe" = true ] ; then > >> + if grep -q hypervisor /proc/cpuinfo ; then > >> +echo >&2 "Warning: branch profiling may not be functional in VMs" > >> +

Re: [PATCH] match: Undo maybe_push_res_to_seq in some cases [PR120331]

2025-05-23 Thread Andrew Pinski
On Fri, May 23, 2025 at 2:39 AM Richard Biener wrote: > > On Thu, May 22, 2025 at 3:11 AM Jeff Law wrote: > > > > > > > > On 5/18/25 10:38 AM, Andrew Pinski wrote: > > > While working on improving forwprop and removal of > > > forward_propagate_in

[PATCH] Improve copy prop for aggregates and combine with zeroing case

2025-05-23 Thread Andrew Pinski
test. * gcc.dg/tree-ssa/copy-prop-arg-2.c: New test. * gcc.dg/tree-ssa/copy-prop-arg-3.c: New test. Signed-off-by: Andrew Pinski --- .../gcc.dg/tree-ssa/copy-prop-arg-1.c | 37 + .../gcc.dg/tree-ssa/copy-prop-arg-2.c | 35 .../gcc.dg/tree-ssa/copy-prop-arg-3.c

Re: [PATCH 2/2] VR-VALUES: Rewrite test_for_singularity using range_op_handler

2025-05-23 Thread Andrew MacLeod
On 9/29/23 16:17, Jeff Law wrote: On 9/5/23 01:12, Andrew Pinski wrote: On Mon, Sep 4, 2023 at 11:06 PM Jeff Law via Gcc-patches wrote: On 9/1/23 11:30, Andrew Pinski via Gcc-patches wrote: So it turns out there was a simpler way of starting to improve VRP to start to fix PR 110131

Re: [PATCH] combine: gen_lowpart_no_emit vs CLOBBER [PR120090]

2025-05-21 Thread Andrew Pinski
On Wed, May 21, 2025 at 3:21 PM Jeff Law wrote: > > > > On 5/5/25 3:27 PM, Andrew Pinski wrote: > > The problem here is simplify-rtx.cc expects gen_lowpart_no_emit > > to return NULL on failure but combine's hook was returning CLOBBER. > > After r16-160-ge6f89d

Re: [PATCH] testsuite: aarch64: arm: Fix -mcpu=unset support in shared effective targets

2025-05-21 Thread Andrew Pinski
On Tue, May 20, 2025, 1:47 PM Christophe Lyon wrote: > Many tests became unsupported on aarch64 when -mcpu=unset was added to > several arm_* effective targets, because this flag is only supported > on arm. > > Since these effective targets are used on arm and aarch64, the patch > adds -mcpu=unse

[PATCH 2/2] aarch64: Improve rtx_cost for constants in COMPARE [PR120372]

2025-05-20 Thread Andrew Pinski
be handled by the cmp instruction. gcc/testsuite/ChangeLog: * gcc.target/aarch64/imm_choice_comparison-2.c: New test. Signed-off-by: Andrew Pinski --- gcc/config/aarch64/aarch64.cc | 7 ++ .../aarch64/imm_choice_comparison-2.c | 90 +++ 2 files

[PATCH 1/2] expand: Use rtx_cost directly instead of gen_move_insn for canonicalize_comparison.

2025-05-20 Thread Andrew Pinski
choice if dump is enabled. Signed-off-by: Andrew Pinski --- gcc/expmed.cc | 23 +++ 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/gcc/expmed.cc b/gcc/expmed.cc index 72dbafe5d9f..d5da199d033 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -6408,18 +6408,25

[PATCH] middle-end: Fix complex lowering of cabs with no LHS [PR120369]

2025-05-20 Thread Andrew Pinski
. Signed-off-by: Andrew Pinski --- gcc/testsuite/gcc.dg/torture/pr120369-1.c | 9 + gcc/tree-complex.cc | 4 2 files changed, 13 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/torture/pr120369-1.c diff --git a/gcc/testsuite/gcc.dg/torture/pr120369-1.c b/gcc

Re: [r16-372 Regression] FAIL: gfortran.dg/specifics_1.f90 -O3 -g execution test on Linux/x86_64

2025-05-18 Thread Andrew Pinski
On Sun, May 18, 2025 at 11:19 PM haochen.jiang wrote: > > On Linux/x86_64, > > 064cac730f88dc71c6da578f9ae5b8e092ab6cd4 is the first bad commit > commit 064cac730f88dc71c6da578f9ae5b8e092ab6cd4 > Author: Jan Hubicka > Date: Sun May 4 10:52:35 2025 +0200 > > Improve maybe_hot handling in inl

Re: AArch64: Enable early scheduling for -O3 and higher (PR118351)

2025-05-18 Thread Andrew Pinski
d higher. > > Is this something you may want to add to the release notes? It is there already: The first scheduling pass (-fschedule-insns) is no longer enabled by default at -O2 for AArch64 targets. The pass is still enabled by default at -O3 and -Ofast. Thanks, Andrew Pinski > > Gerald

[PATCH] gimple-fold: Implement simple copy propagation for aggregates [PR14295]

2025-05-18 Thread Andrew Pinski
. * gcc.c-torture/execute/builtins/pr22237-1-lib.c: New test. * gcc.c-torture/execute/builtins/pr22237-1.c: New test. * gcc.dg/tree-ssa/pr57361.c: Disable forwprop1. * gcc.dg/tree-ssa/pr57361-1.c: New test. Signed-off-by: Andrew Pinski --- gcc/testsuite/g++.dg/opt

[PATCH] match: Undo maybe_push_res_to_seq in some cases [PR120331]

2025-05-18 Thread Andrew Pinski
331 * gimple-match-exports.cc (maybe_undo_push): New function. (gimple_simplify): Call maybe_undo_push if resimplify was successful. Signed-off-by: Andrew Pinski --- gcc/gimple-match-exports.cc | 27 ++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --gi

[PATCH] match: Remove valueize_condition argument from gimple_extract template

2025-05-18 Thread Andrew Pinski
match-exports.o-warn): Remove. * gimple-match-exports.cc (gimple_extract): Remove valueize_condition argument. (gimple_extract_op): Update call to gimple_extract. (gimple_simplify): Likewise. Also remove valueize_condition lambda. Signed-off-by: Andrew Pinski ---

[PATCH] phiopt: Use mark_lhs_in_seq_for_dce instead of doing it inline

2025-05-17 Thread Andrew Pinski
Make non-static. * gimple-fold.h (mark_lhs_in_seq_for_dce): Declare. * tree-ssa-phiopt.cc (match_simplify_replacement): Use mark_lhs_in_seq_for_dce instead of manually looping. Signed-off-by: Andrew Pinski --- gcc/gimple-fold.cc | 2 +- gcc/gimple-fold.h

Re: Proposal: File-backed allocations support for ASan reducing dependency on system memory.

2025-05-17 Thread Andrew Pinski
l non-trivial changes, functionality improvements, etc. should go through the upstream tree first and then be merged back to the GCC tree. Thanks, Andrew Pinski > > Thanks, > Archit

Re: [PATCH] gimplify: Add -Wuse-before-shadow [PR92386]

2025-05-16 Thread Andrew Pinski
ust C if this is a > concern?). Maybe then the gimplifier is not the right place to do this then. -Wshadow is handled in warn_if_shadowing inside c-decl.cc which is called from pushdecl. Maybe you could do something inside there where you search the current statement list (cur_stmt_list) for previou

Re: [PATCH v2 1/2] tree-simplify: unify simple_comparison ops in vec_cond for bit and/or [PR119196]

2025-05-16 Thread Andrew Pinski
On Fri, May 16, 2025 at 9:49 AM Andrew Pinski wrote: > > On Fri, May 16, 2025 at 9:32 AM Icen Zeyada wrote: > > > > Merge simple_comparison patterns under a single vec_cond_expr for bit_and > > and bit_ior in the simplify pass. > > > > Ensure that when both

Re: [PATCH v2 1/2] tree-simplify: unify simple_comparison ops in vec_cond for bit and/or [PR119196]

2025-05-16 Thread Andrew Pinski
be unused? Other than that it might make sense to extend `(a?0:-1) lop (b?0:-1)` too. I am not sure but xor might show up; though I don't think it is as important as &/| as you handle today. Thanks, Andrew > + > (for cnd (cond vec_cond) > /* (a != b) ? (a - b) : 0 -> (a - b) */ > (simplify > -- > 2.43.0 >
