Re: [PATCH] i386: Fix vect-pragma-target-[12].c testcase for -march=XYZ [PR120643]

2025-07-03 Thread Richard Biener
On Thu, Jul 3, 2025 at 9:34 PM Andrew Pinski wrote: > > These 2 testcases were originally designed for the default -march= of > x86_64 so if you pass -march=native (on a target with AVX512 enabled), > they will fail. To fix this, we add `-mno-sse3 -mprefer-vector-width=512` Did you mean to use -m

Re: [PATCH] fold: Change comparison of error_mark_node to use error_operand_p in tree_expr_nonnegative_warnv_p [PR118948]

2025-07-03 Thread Richard Biener
On Thu, Jul 3, 2025 at 9:08 PM Andrew Pinski wrote: > > This is an obvious fix for this small regression. Basically after > r15-328-g5726de79e2154a, > there is a call to tree_expr_nonnegative_warnv_p where the type of the > expression is now > error_mark_node. Though there was only a check if th

Re: [PATCH] Update alignment for argument on stack

2025-07-03 Thread Richard Sandiford
"H.J. Lu" writes: > On Thu, Jul 3, 2025 at 11:02 PM Richard Sandiford > wrote: >> >> "H.J. Lu" writes: >> > Since a backend may ignore user type alignment for arguments passed on >> > stack, update alignment for arguments passed on stack when copying MEM's >> > memory attributes. >> > >> > gcc/

Re: [PATCH] asf: Fix calling of emit_move_insn on registers of different modes [PR119884]

2025-07-03 Thread Richard Sandiford
Konstantinos Eleftheriou writes: > On Wed, May 7, 2025 at 11:29 AM Richard Sandiford > wrote: >> But I thought the code was allowing multiple stores to be forwarded to >> a single (wider) load. E.g. 4 individual byte stores at address X, X+1, >> X+2 and X+3 could be forwarded to a 4-byte load at

[SNAPv3] libstdc++: Add NTTP bind_front, _back (P2714) [PR119744]

2025-07-03 Thread Nathan Myers
This is a snapshot of work on P2714 "Bind front and back to NTTP callables", posted for reference. Questions: 1. Jonathan asks if __type_forward_like_t does the same job as __like_t in bits/move.h. 2. Could the "if constexpr" statements be better expressed as requires clauses via the A=>B == !A||B

Re: [PATCH 2/2] RISC-V: prefetch: fix LRA ICE [PR118241]

2025-07-03 Thread Jeff Law
On 7/3/25 5:19 PM, Vineet Gupta wrote: Provide a fallback alternative register constraint for LRA in the light of the tightened "Q" constraint. Cures the following ICE ... | gcc/testsuite/gcc.target/riscv/pr118241-b.cc:31:19: error: unable to generate reloads for: | 31 | void m() { a.l(); }

Re: [PATCH 1/2] RISC-V: prefetch: const offset needs to have 5 bits zero, not 4

2025-07-03 Thread Jeff Law
On 7/3/25 5:19 PM, Vineet Gupta wrote: Spotted this by chance as I saw a similar fixup in comments. From comments, I think this is needed, but I've not hit any issues due to this. gcc/ChangeLog: * config/riscv/predicates.md (prefetch_operand): mask 5 bits. Signed-off-by: Vineet Gup

Re: [PATCH] LoongArch: Prevent subreg of subreg in CRC

2025-07-03 Thread Lulu Cheng
On 2025/7/4 11:14 AM, Xi Ruoyao wrote: On Fri, 2025-07-04 at 09:47 +0800, Lulu Cheng wrote: On 2025/7/2 3:31 PM, Xi Ruoyao wrote: The register_operand predicate can match subreg, then we'd have a subreg of subreg and it's invalid. Use lowpart_subreg to avoid the nested subreg. gcc/ChangeLog:

Re: [PATCH] LoongArch: Prevent subreg of subreg in CRC

2025-07-03 Thread Lulu Cheng
On 2025/7/4 11:25 AM, Xi Ruoyao wrote: On Fri, 2025-07-04 at 11:14 +0800, Xi Ruoyao wrote: On Fri, 2025-07-04 at 09:47 +0800, Lulu Cheng wrote: On 2025/7/2 3:31 PM, Xi Ruoyao wrote: The register_operand predicate can match subreg, then we'd have a subreg of subreg and it's invalid. Use lowpart_subreg

Re: [PATCH] LoongArch: Prevent subreg of subreg in CRC

2025-07-03 Thread Xi Ruoyao
On Fri, 2025-07-04 at 11:14 +0800, Xi Ruoyao wrote: > On Fri, 2025-07-04 at 09:47 +0800, Lulu Cheng wrote: > > > > On 2025/7/2 3:31 PM, Xi Ruoyao wrote: > > > The register_operand predicate can match subreg, then we'd have a > > > subreg > > > of subreg and it's invalid. Use lowpart_subreg to avoid th

Re: [PATCH] LoongArch: Prevent subreg of subreg in CRC

2025-07-03 Thread Xi Ruoyao
On Fri, 2025-07-04 at 09:47 +0800, Lulu Cheng wrote: > > On 2025/7/2 3:31 PM, Xi Ruoyao wrote: > > The register_operand predicate can match subreg, then we'd have a subreg > > of subreg and it's invalid. Use lowpart_subreg to avoid the nested > > subreg. > > > > gcc/ChangeLog: > > > > * config

Re: [PATCH] mmix: Define MAX_FIXED_MODE_SIZE [PR120935]

2025-07-03 Thread Hans-Peter Nilsson
On Thu, 3 Jul 2025, Pietro Monteiro wrote: > Use TImode instead of the default DImode. Fixes ICE when building libstdc++. I'll have to look into this. There might be a delay. Thanks for the patch though! brgds, H-P > > Additionally, this change fixes: > > c-c++-common/pr111309-1.c -Wc++-com

[PATCH] LoongArch: testsuite: Adapt bstrpick_alsl_paired.c for GCC 16 change

2025-07-03 Thread Xi Ruoyao
In GCC 16 the compiler is smarter and it optimizes away the unneeded zero-extension during the expand pass. Thus we can no longer match and_alsl_reversed. Drop the scan-rtl-dump for and_alsl_reversed and add scan-assembler-not against bstrpick.d to detect the unneeded zero-extension in case it re

Re: [PATCH] LoongArch: Prevent subreg of subreg in CRC

2025-07-03 Thread Lulu Cheng
On 2025/7/2 3:31 PM, Xi Ruoyao wrote: The register_operand predicate can match subreg, then we'd have a subreg of subreg and it's invalid. Use lowpart_subreg to avoid the nested subreg. gcc/ChangeLog: * config/loongarch/loongarch.md (crc_combine): Avoid nested subreg. gcc/tests

[PATCH v3 3/3] RISC-V: Add test for vec_duplicate + vsadd.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-07-03 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vsadd.vv combine to vsadd.vx, with the GR2VR cost is 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check. * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto. * gc

[PATCH v3 2/3] RISC-V: Add test for vec_duplicate + vsadd.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-07-03 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vsadd.vv combine to vsadd.vx, with the GR2VR cost is 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add asm check. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto.

[PATCH v3 1/3] RISC-V: Combine vec_duplicate + vsadd.vv to vsadd.vx on GR2VR cost

2025-07-03 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vsadd.vv to the vsadd.vx. For example, see the code below. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if th

[PATCH v3 0/3] RISC-V: Combine vec_duplicate + vsadd.vv to vsadd.vx on GR2VR cost

2025-07-03 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vsadd.vv into vsadd.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 2, 15 in test. There will be two cases for the combine: Case 0:

[PATCH] mmix: Define MAX_FIXED_MODE_SIZE [PR120935]

2025-07-03 Thread Pietro Monteiro
Use TImode instead of the default DImode. Fixes ICE when building libstdc++. Additionally, this change fixes: c-c++-common/pr111309-1.c -Wc++-compat (test for excess errors) c-c++-common/pr111309-1.c -Wc++-compat execution test gcc.dg/pr105094.c (test for excess errors) gcc.dg/torture/pr11648

[PATCH 1/2] RISC-V: prefetch: const offset needs to have 5 bits zero, not 4

2025-07-03 Thread Vineet Gupta
Spotted this by chance as I saw a similar fixup in comments. From comments, I think this is needed, but I've not hit any issues due to this. gcc/ChangeLog: * config/riscv/predicates.md (prefetch_operand): mask 5 bits. Signed-off-by: Vineet Gupta --- gcc/config/riscv/predicates.md | 4

[PATCH 2/2] RISC-V: prefetch: fix LRA ICE [PR118241]

2025-07-03 Thread Vineet Gupta
Provide a fallback alternative register constraint for LRA in the light of the tightened "Q" constraint. Cures the following ICE ... | gcc/testsuite/gcc.target/riscv/pr118241-b.cc:31:19: error: unable to generate reloads for: | 31 | void m() { a.l(); } | | ^ |(insn 26 25 27

[pushed] c++: trivial lambda pruning [PR120716]

2025-07-03 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- In this testcase there is nothing in the lambda except a static_assert which mentions a variable from the enclosing scope but does not odr-use it, so we want prune_lambda_captures to remove its capture. Since the lambda is so empty, there's

[pushed] c++: ICE with 'this' in lambda signature [PR120748]

2025-07-03 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- This testcase was crashing from infinite recursion in the diagnostic machinery, trying to print the lambda signature, which referred to the __this capture field in the lambda, which wanted to print the lambda again. But we don't want the si

Re: [SNAP] libstdc++: Add NTTP bind_front, _back (P2714) [PR119744]

2025-07-03 Thread Jonathan Wakely
On Thu, 3 Jul 2025 at 23:14, Nathan Myers wrote: > > This is a snapshot of work on P2714 "Bind front and back to NTTP > callables", posted for reference. Not tested. > > libstdc++-v3/ChangeLog: > PR libstdc++/119744 > * include/bits/version.def: Redefine __cpp_lib_bind_front etc. >

[SNAPv2] libstdc++: Add NTTP bind_front, _back (P2714) [PR119744]

2025-07-03 Thread Nathan Myers
This is a snapshot of work on P2714 "Bind front and back to NTTP callables", posted for reference. Not tested. libstdc++-v3/ChangeLog: PR libstdc++/119744 * include/bits/version.def: Redefine __cpp_lib_bind_front etc. * include/bits/version.h: Ditto. * include/std/f

Re: [PATCH] Update alignment for argument on stack

2025-07-03 Thread H.J. Lu
On Thu, Jul 3, 2025 at 11:02 PM Richard Sandiford wrote: > > "H.J. Lu" writes: > > Since a backend may ignore user type alignment for arguments passed on > > stack, update alignment for arguments passed on stack when copying MEM's > > memory attributes. > > > > gcc/ > > > > PR target/120839 > > *

Re: [SNAP] libstdc++: Add NTTP bind_front, _back (P2714) [PR119744]

2025-07-03 Thread Jonathan Wakely
On Thu, 3 Jul 2025 at 23:14, Nathan Myers wrote: > > This is a snapshot of work on P2714 "Bind front and back to NTTP > callables", posted for reference. Not tested. > > libstdc++-v3/ChangeLog: > PR libstdc++/119744 > * include/bits/version.def: Redefine __cpp_lib_bind_front etc. >

[SNAP] libstdc++: Add NTTP bind_front, _back (P2714) [PR119744]

2025-07-03 Thread Nathan Myers
This is a snapshot of work on P2714 "Bind front and back to NTTP callables", posted for reference. Not tested. libstdc++-v3/ChangeLog: PR libstdc++/119744 * include/bits/version.def: Redefine __cpp_lib_bind_front etc. * include/bits/version.h: Ditto. * include/std/f

[PATCH v1 2/2] RISC-V: Implement TARGET_ARG_EXTENDED_ON_STACK.

2025-07-03 Thread Palmer Dabbelt
When we split an argument between the stack and registers we might end up with a misaligned access, so use this newly implemented hook to instead bias the codegen towards the registers rather than the stack. PR/82106 gcc/ChangeLog: * config/riscv/riscv.cc (struct riscv_arg_info

[PATCH v1 1/2] Add TARGET_ARG_EXTENDED_ON_STACK

2025-07-03 Thread Palmer Dabbelt
We currently handle arguments that are split between the stack and registers by storing the registers to the stack and then treating the argument as if it was entirely passed on the stack. Allow targets to override this behavior and instead treat the argument as if it was passed entirely in regist

[PATCH v1 0/2] Allow targets to avoid materializing split parameters via stack extension [PR/82106]

2025-07-03 Thread Palmer Dabbelt
This is really Jim's code, but it's been sitting around in Bugzilla for a while so I've picked it up. All I really did here is add a target hook and mangle some comments, but I think I understand enough about what's going on to try and get things moving forward. So I'm writing up a pretty big cov

[committed] c++: Fix a pasto in the PR120471 fix [PR120940]

2025-07-03 Thread Jakub Jelinek
Hi! No idea how this slipped in, I'm terribly sorry. Strangely nothing in the testsuite has caught this, so I've added a new test for that. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk and for 15.2, 14.4, 13.4 and 12.5. 2025-07-03 Jakub Jelinek PR c++/120

Re: [PATCH] fortran: Add the preliminary code of MOVE_ALLOC arguments

2025-07-03 Thread Steve Kargl
On Thu, Jul 03, 2025 at 10:12:52PM +0200, Mikael Morin wrote: > From: Mikael Morin > > Regression-tested on aarch64-unknown-linux-gnu. > OK for master? > Yes. Almost looks obvious once someone finds and fixes the issue. Thanks for the patch. -- Steve

[PATCH] fortran: Add the preliminary code of MOVE_ALLOC arguments

2025-07-03 Thread Mikael Morin
From: Mikael Morin Regression-tested on aarch64-unknown-linux-gnu. OK for master? -- >8 -- Add the preliminary code produced for the evaluation of the FROM and TO arguments of the MOVE_ALLOC intrinsic before using their values. Before this change, the preliminary code was ignored and dropped, l

[PATCH] i386: Fix vect-pragma-target-[12].c testcase for -march=XYZ [PR120643]

2025-07-03 Thread Andrew Pinski
These 2 testcases were originally designed for the default -march= of x86_64 so if you pass -march=native (on a target with AVX512 enabled), they will fail. To fix this, we add `-mno-sse3 -mprefer-vector-width=512` to the options to force a specific arch to the testcase. Tested on a skylake-avx512

Re: [PATCH 1/1] contrib: add vmtest-tool to test BPF programs

2025-07-03 Thread Piyush Raj
On 02/07/25 21:44, Jose E. Marchesi wrote: One can also pass a precompiled BPF object with the desired optimization options to the script to check it with the verifier. Yes, but AFAIK building objects that can be actually loaded in the kernel and verified requires in practice including kernel h

[PATCH] fold: Change comparison of error_mark_node to use error_operand_p in tree_expr_nonnegative_warnv_p [PR118948]

2025-07-03 Thread Andrew Pinski
This is an obvious fix for this small regression. Basically after r15-328-g5726de79e2154a, there is a call to tree_expr_nonnegative_warnv_p where the type of the expression is now error_mark_node. Though there was only a check if the expression was error_mark_node. Bootstrapped and tested on x8

Re: [PATCH] libstdc++: Members missing in std::numeric_limits

2025-07-03 Thread Mateusz Zych
Hello! I've prepared a patch, which adds all members missing from std::numeric_limits<> specializations for integer-class types. Jonathan, please let me know whether you like these changes and do not see any bugs or issues with them. From my side, I just want to say that: - Since all std::num

[PATCH] Add myself as an aarch64 port reviewer

2025-07-03 Thread Andrew Pinski
As mentioned in https://inbox.sourceware.org/gcc/ea828262-8f8f-4362-9ca8-312f7c20e...@nvidia.com/T/#m6e7e8e11656189598c759157d5d49cbd0ac9ba7c. Adding myself as an aarch64 port reviewer. ChangeLog: * MAINTAINERS: Add myself as an aarch64 port reviewer. Signed-off-by: Andrew Pinski ---

Re: [PATCH] c-family: Tweak ptr +- (expr +- cst) FE optimization [PR120837]

2025-07-03 Thread Jakub Jelinek
On Thu, Jul 03, 2025 at 08:12:22PM +0200, Richard Biener wrote: > > So, instead at least for now, the following patch keeps doing the > > optimization, just doesn't perform it in pointer arithmetics. > > pointer_int_sum itself actually adds the multiplication by size_exp, > > so ptr + expr is turne

Re: [PATCH] c-family: Tweak ptr +- (expr +- cst) FE optimization [PR120837]

2025-07-03 Thread Richard Biener
> Am 03.07.2025 um 16:11 schrieb Jakub Jelinek : > > Hi! > > The following testcase is miscompiled with -fsanitize=undefined but we > introduce UB into the IL even without that flag. > > The optimization ptr +- (expr +- cst) when expr/cst have undefined > overflow into (ptr +- cst) +- expr i

[Ada] Remove left-overs of front-end exception mechanism

2025-07-03 Thread Eric Botcazou
Tested on x86-64/Linux, applied on the mainline. 2025-07-03 Eric Botcazou * gcc-interface/Makefile.in (gnatlib-sjlj): Delete. (gnatlib-zcx): Do not modify Frontend_Exceptions constant. * libgnat/system-linux-loongarch.ads (Frontend_Exceptions): Delete. -- Eric Botcaz

Re: [PATCH] middle-end: Fix complex lowering of cabs with no LHS [PR120369]

2025-07-03 Thread Andrew Pinski
On Tue, May 20, 2025 at 6:44 PM Andrew Pinski wrote: > > This was introduced by r15-1797-gd8fe4f05ef448e . I had missed that > the LHS of the cabs call could be NULL. This seems to only happen at -O0, > I tried to produce one that happens at -O1 but needed many different > options to prevent the r

[to-be-committed][RISC-V] Add basic instrumentation to fusion detection

2025-07-03 Thread Jeff Law
This is primarily Shreya's work from a few months back. I just fixed the formatting, cobbled together the cover letter/ChangeLog. We were looking to evaluate some changes from Artemiy that improve GCC's ability to discover fusible instruction pairs. There was no good way to get any static d

Re: [PATCH v4 1/6] c-family: add btf_type_tag and btf_decl_tag attributes

2025-07-03 Thread David Faust
On 7/2/25 00:42, Richard Biener wrote: > On Tue, Jun 10, 2025 at 11:40 PM David Faust wrote: >> >> Add two new c-family attributes, "btf_type_tag" and "btf_decl_tag" >> along with a simple shared handler for them. >> >> gcc/c-family/ >> * c-attribs.cc (c_common_attribute_table): Add btf

Re: [PATCH] libquadmath: add quad support for trig-pi functions

2025-07-03 Thread Steve Kargl
On Thu, Jul 03, 2025 at 02:43:43PM +0200, Michael Matz wrote: > Hello, > > On Thu, 3 Jul 2025, Yuao Ma wrote: > > > This patch adds the required function for Fortran trigonometric functions to > > work with glibc versions prior to 2.26. It's based on glibc source commit > > 632d895f3e5d98162f77b9

Re: [PATCH] x86-64: Add --enable-x86-64-mfentry

2025-07-03 Thread Uros Bizjak
On Thu, Jul 3, 2025 at 12:01 PM H.J. Lu wrote: > > When profiling is enabled with shrink wrapping, the mcount call may not > be placed at the function entry after > > pushq %rbp > movq %rsp,%rbp > > As the result, the profile data may be skewed which makes PGO less > effective. > > Add --enable-x8

Re: [PATCH] libquadmath: add quad support for trig-pi functions

2025-07-03 Thread Yuao Ma
Hi all, On 7/3/2025 9:21 PM, Jakub Jelinek wrote: On Thu, Jul 03, 2025 at 02:43:43PM +0200, Michael Matz wrote: Hello, On Thu, 3 Jul 2025, Yuao Ma wrote: This patch adds the required function for Fortran trigonometric functions to work with glibc versions prior to 2.26. It's based on glibc s

Re: [PATCH v6 1/3][Middle-end] Provide more contexts for -Warray-bounds, -Wstringop-*warning messages due to code movements from compiler transformation (Part 1) [PR109071,PR85788,PR88771,PR106762,PR1

2025-07-03 Thread Qing Zhao
Another update on this: > On Jun 30, 2025, at 11:51, Qing Zhao wrote: >> >>> For each single predecessor block, locate the conditional statement >>> in the end of the block. determine whether the STMT is on the taken >>> path of the condition. Add these two information to each event of >>>

Re: [PATCH] Add string_slice class.

2025-07-03 Thread Richard Sandiford
Alfie Richards writes: > +/* string_slice inherits from array_slice, specifically to refer to a > substring > + of a character array. > + It includes some string like helpers. */ > +class string_slice : public array_slice > +{ > +public: > + string_slice () : array_slice () {} > + string_s

Re: [PATCH] s390: More vec-perm-const cases.

2025-07-03 Thread Andreas Krebbel
On 7/3/25 4:24 PM, Juergen Christ wrote: On 6/27/25 8:09 PM, Juergen Christ wrote: s390 missed constant vector permutation cases based on the vector pack instruction or changing the size of the vector elements during vector merge. This enables some more patterns that do not need to load a con

Re: [PATCH] Update alignment for argument on stack

2025-07-03 Thread Richard Sandiford
"H.J. Lu" writes: > Since a backend may ignore user type alignment for arguments passed on > stack, update alignment for arguments passed on stack when copying MEM's > memory attributes. > > gcc/ > > PR target/120839 > * emit-rtl.cc (set_mem_attrs): Update alignment for argument on > stack. > > gc

Re: [PATCH] c++: -fno-delete-null-pointer-checks constexpr addr comparison [PR71962]

2025-07-03 Thread Patrick Palka
On Thu, 3 Jul 2025, Jason Merrill wrote: > On 7/2/25 7:58 PM, Patrick Palka wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK > > for trunk? > > > > -- >8 -- > > > > Here the flag -fno-delete-null-pointer-checks causes the trivial address > > comparison in > > > >

[PATCH v4 2/2] gimple-fold: extend vector simplification to match scalar bitwise optimizations [PR119196]

2025-07-03 Thread Icen Zeyada
Generalize existing scalar gimple_fold rules to apply the same bitwise comparison simplifications to vector types. Previously, an expression like (x < y) && (x > y) would fold to `false` if x and y are scalars, but equivalent vector comparisons were left untouched. T
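A minimal sketch of the identity behind this fold, written as ordinary C++ rather than GIMPLE: for every lane, `(x < y) && (x > y)` is false, so the element-wise result is an all-false mask. The helper names `lt_and_gt` and `all_false` are hypothetical and do not appear in the patch; this only illustrates why the vector form can be simplified the same way as the scalar one.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not GCC code): element-wise (x < y) && (x > y),
// i.e. what an unsimplified vector comparison pair would compute per lane.
template <std::size_t N>
std::array<bool, N> lt_and_gt(const std::array<int, N>& x,
                              const std::array<int, N>& y) {
    std::array<bool, N> out{};
    for (std::size_t i = 0; i < N; ++i)
        out[i] = (x[i] < y[i]) && (x[i] > y[i]);
    return out;
}

// Hypothetical helper: true when every lane of the mask is false,
// which is the constant the fold would substitute.
template <std::size_t N>
bool all_false(const std::array<bool, N>& mask) {
    for (bool lane : mask)
        if (lane) return false;
    return true;
}
```

Calling `all_false(lt_and_gt(x, y))` on any two equally sized integer arrays returns true, whatever the lane values are.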

[PATCH v4 0/2] tree-optimization: extend scalar comparison folding to vectors [PR119196]

2025-07-03 Thread Icen Zeyada
New in V4: Check whether the vector is of boolean type in specific comparisons. If it is, determine whether the operation can be expanded using the selected expression. If so, proceed with the optimization; otherwise, skip the optimization. --

[PATCH v4 1/2] tree-simplify: unify simple_comparison ops in vec_cond for bit and/or/xor [PR119196]

2025-07-03 Thread Icen Zeyada
Merge simple_comparison patterns under a single vec_cond_expr for bit_and, bit_ior, and bit_xor in the simplify pass. Ensure that when both operands of a bit_and, bit_or, or bit_xor are simple_comparison results, they reside within the same vec_cond_expr rather than separate ones. This prepares t

Re: [PATCH] tree-optimization/120927 - 510.parest_r segfault with masked epilog

2025-07-03 Thread Richard Sandiford
Richard Biener writes: > The following fixes bad alignment computation for epilog vectorization > when as in this case for 510.parest_r and masked epilog vectorization > with AVX512 we end up choosing AVX to vectorize the main loop and > masked AVX512 (sic!) to vectorize the epilog. In that case a

Re: [PATCH] s390: More vec-perm-const cases.

2025-07-03 Thread Juergen Christ
> On 6/27/25 8:09 PM, Juergen Christ wrote: > > s390 missed constant vector permutation cases based on the vector pack > > instruction or changing the size of the vector elements during vector > > merge. This enables some more patterns that do not need to load a > > constant vector for permutation

Re: [PATCH] libstdc++: Update LWG 4166 changes to concat_view::end() [PR120934]

2025-07-03 Thread Jonathan Wakely
On Thu, 3 Jul 2025 at 15:19, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15? yes for both, thanks. > > -- >8 -- > > In r15-4555-gf191c830154565 we proactively implemented the initial > proposed resolution for LWG 4166 which was later revealed to be > insuf

[PATCH] libstdc++: Update LWG 4166 changes to concat_view::end() [PR120934]

2025-07-03 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15? -- >8 -- In r15-4555-gf191c830154565 we proactively implemented the initial proposed resolution for LWG 4166 which was later revealed to be insufficient, since we must also require equality_comparable of the underlying iterators befor

Re: [PATCH v2 4/5] libstdc++: Implement mdspan and tests.

2025-07-03 Thread Luc Grosheintz
Thank you for the nice review! I've locally implemented everything and I'll send a v3 later today or tomorrow; after squashing the commits correctly; and retesting everything. Meanwhile a couple of comments below. On 7/1/25 16:42, Tomasz Kaminski wrote: On Fri, Jun 27, 2025 at 11:37 AM Luc Gros

[PATCH] c-family: Tweak ptr +- (expr +- cst) FE optimization [PR120837]

2025-07-03 Thread Jakub Jelinek
Hi! The following testcase is miscompiled with -fsanitize=undefined but we introduce UB into the IL even without that flag. The optimization ptr +- (expr +- cst) when expr/cst have undefined overflow into (ptr +- cst) +- expr is sometimes simply not valid, without careful analysis on what ptr poi

Re: [PATCH] libquadmath: add quad support for trig-pi functions

2025-07-03 Thread Andreas Schwab
On Jul 03 2025, Michael Matz wrote: > Yes. And then the above is multiplied by PI, passed to cos/sin and that > one then tries to figure out the multiple of PI (i.e. the 'x' above) again > via range reduction (not a _terribly_ slow one anymore in a good > implementation, because of the limited

Re: [PATCH] libquadmath: add quad support for trig-pi functions

2025-07-03 Thread Joseph Myers
On Thu, 3 Jul 2025, Michael Matz wrote: > Yes. And then the above is multiplied by PI, passed to cos/sin and that > one then tries to figure out the multiple of PI (i.e. the 'x' above) again > via range reduction (not a _terribly_ slow one anymore in a good > implementation, because of the lim

[PATCH] tree-optimization/120927 - 510.parest_r segfault with masked epilog

2025-07-03 Thread Richard Biener
The following fixes bad alignment computation for epilog vectorization when as in this case for 510.parest_r and masked epilog vectorization with AVX512 we end up choosing AVX to vectorize the main loop and masked AVX512 (sic!) to vectorize the epilog. In that case alignment analysis for the epilog

Re: [PATCH] libquadmath: add quad support for trig-pi functions

2025-07-03 Thread Michael Matz
Hello, On Thu, 3 Jul 2025, Joseph Myers wrote: > > > Isn't the whole raison d'etre for the trig-pi functions that the internal > > > argument reduction against multiples of pi becomes trivial and hence (a) > > > performant, and (b) doesn't introduce rounding artifacts? Expressing the > > > tr

Re: [PATCH] libquadmath: add quad support for trig-pi functions

2025-07-03 Thread Joseph Myers
On Thu, 3 Jul 2025, Jakub Jelinek wrote: > > Isn't the whole raison d'etre for the trig-pi functions that the internal > > argument reduction against multiples of pi becomes trivial and hence (a) > > performant, and (b) doesn't introduce rounding artifacts? Expressing the > > trig-pi functions
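The point quoted above, that argument reduction for the trig-pi functions is trivial, can be illustrated with a rough sketch. This is not the libquadmath or glibc code and it uses `double` rather than quad precision; `sinpi_sketch` and `kPi` are hypothetical names. The reduction is done on x itself (period 2), where `fmod` and the subtractions below lose no bits, and pi is multiplied in only at the very end.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical constant; M_PI is not guaranteed by the C++ standard.
constexpr double kPi = 3.14159265358979323846;

// Sketch of sin(pi * x) with the range reduction done before
// multiplying by pi. Assumes x is finite.
double sinpi_sketch(double x) {
    // sin(pi*x) has period 2 in x; fmod is exact, r lies in (-2, 2).
    double r = std::fmod(x, 2.0);
    // Shift into [-1, 1]; these subtractions are exact for such r.
    if (r > 1.0) r -= 2.0;
    else if (r < -1.0) r += 2.0;
    // Reflect into [-0.5, 0.5] using sin(pi*(1 - r)) == sin(pi*r).
    if (r > 0.5) r = 1.0 - r;
    else if (r < -0.5) r = -1.0 - r;
    // Only now multiply by pi; the argument is at most pi/2 in magnitude.
    return std::sin(kPi * r);
}
```

With this ordering, integer inputs such as 1.0 or 1e15 reduce to exactly zero before the final `std::sin` call, whereas computing `std::sin(kPi * x)` directly would first round the product and then reduce it against an inexact pi.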

Re: [PATCH] c++: -fno-delete-null-pointer-checks constexpr addr comparison [PR71962]

2025-07-03 Thread Jason Merrill
On 7/2/25 7:58 PM, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- Here the flag -fno-delete-null-pointer-checks causes the trivial address comparison in inline int a, b; static_assert(&a != &b); to be rejected as non-constan

[PATCH v2 1/1] contrib: add bpf-vmtest-tool to test BPF programs

2025-07-03 Thread Piyush Raj
This patch adds the bpf-vmtest-tool subdirectory under contrib which tests BPF programs under a live kernel using a QEMU VM. It automatically builds the specified kernel version with eBPF support enabled and stores it under "~/.bpf-vmtest-tool", which is reused for future invocations. It can also

[PATCH v2 0/1] contrib: add bpf-vmtest-tool to test BPF programs

2025-07-03 Thread Piyush Raj
This patch adds an initial version of the bpf-vmtest-tool script to test BPF programs on a live kernel. For now, the tool is standalone, but it is intended to be integrated with the DejaGnu testsuite to run BPF testcases in future patches. Current Limitations: - Only x86_64 is supported. Support for addit

Re: [PATCH] libquadmath: add quad support for trig-pi functions

2025-07-03 Thread Jakub Jelinek
On Thu, Jul 03, 2025 at 02:43:43PM +0200, Michael Matz wrote: > Hello, > > On Thu, 3 Jul 2025, Yuao Ma wrote: > > > This patch adds the required function for Fortran trigonometric functions to > > work with glibc versions prior to 2.26. It's based on glibc source commit > > 632d895f3e5d98162f77b9

[PUSHED] OpenMP: Add omp_get_initial_device/omp_get_num_devices builtins: Fix test cases

2025-07-03 Thread Thomas Schwinge
With this fix-up for commit 387209938d2c476a67966c6ddbdbf817626f24a2 "OpenMP: Add omp_get_initial_device/omp_get_num_devices builtins", we progress: PASS: c-c++-common/gomp/omp_get_num_devices_initial_device.c (test for excess errors) PASS: c-c++-common/gomp/omp_get_num_devices_initial_

Re: [PATCH] testsuite: Skip check-function-bodies sometimes

2025-07-03 Thread Jakub Jelinek
On Thu, Jul 03, 2025 at 02:55:37PM +0200, Stefan Schulze Frielinghaus wrote: > Ok for mainline? ChangeLog is missing. And I think I'd appreciate another pair of eyes, Rainer/Mike, what do you think about this? > If a check-function-bodies test is compiled using -fstack-protector*, > -fhardened, -

Re: [PATCH] testsuite: Restore dg-do run on pr116906 and pr78185 tests

2025-07-03 Thread Christophe Lyon
ping^2 ? On Wed, 18 Jun 2025 at 12:11, Christophe Lyon wrote: > > ping? > > On Mon, 26 May 2025 at 17:26, Christophe Lyon > wrote: > > > > On Mon, 26 May 2025 at 17:14, Christophe Lyon > > wrote: > > > > > > Commit r15-7152-g57b706d141b87c removed > > > /* { dg-do run { target*-*-linux* *-*-gnu

[PATCH] testsuite: Skip check-function-bodies sometimes

2025-07-03 Thread Stefan Schulze Frielinghaus
From: Stefan Schulze Frielinghaus My understanding is that during check_compile compiler_flags contains all the options passed to gcc and current_compiler_flags contains options passed via dg-options and dg-additional-options. I did a couple of experiments and printf-style debugging which endorsed

Re: [PATCH] libquadmath: add quad support for trig-pi functions

2025-07-03 Thread Michael Matz
Hello, On Thu, 3 Jul 2025, Yuao Ma wrote: > This patch adds the required function for Fortran trigonometric functions to > work with glibc versions prior to 2.26. It's based on glibc source commit > 632d895f3e5d98162f77b9c3c1da4ec19968b671. > > I've built it successfully on my end. Documentation

[COMMITTED] testsuite: Fix gcc.dg/ipa/pr120295.c on Solaris

2025-07-03 Thread Rainer Orth
gcc.dg/ipa/pr120295.c FAILs on Solaris: FAIL: gcc.dg/ipa/pr120295.c (test for excess errors) Excess errors: ld: warning: symbol 'glob' has differing types: (file /var/tmp//ccsDR59c.o type=OBJT; file /lib/libc.so type=FUNC); /var/tmp//ccsDR59c.o definition taken Fixed by renaming

Re: [PATCH] s390: More vec-perm-const cases.

2025-07-03 Thread Andreas Krebbel
On 6/27/25 8:09 PM, Juergen Christ wrote: s390 missed constant vector permutation cases based on the vector pack instruction or changing the size of the vector elements during vector merge. This enables some more patterns that do not need to load a constant vector for permutation. Bootstrapped

Re: [PATCH v9 0/9] AArch64: CMPBR support

2025-07-03 Thread Richard Sandiford
Karl Meakin writes: > This patch series adds support for the CMPBR extension. It includes the > new `+cmpbr` option and rules to generate the new instructions when > lowering conditional branches. Thanks for the update, LGTM. I've pushed the series to trunk. Richard > Changelog: > * v9: > -

Re: [PATCH v2 1/5] libstdc++: Check prerequisites of layout_*::operator().

2025-07-03 Thread Luc Grosheintz
On 7/3/25 12:45, Jonathan Wakely wrote: On Thu, 3 Jul 2025 at 11:12, Tomasz Kaminski wrote: On Thu, Jul 3, 2025 at 12:08 PM Luc Grosheintz wrote: The reasoning for this approach was: 1. The mapping::operator() and mdspan::operator[] have the same precondition; and mdspan::operator[

Re: [PATCH v3] tree-optimization/120780: Support object size for containing objects

2025-07-03 Thread Siddhesh Poyarekar
On 2025-07-03 03:13, Jakub Jelinek wrote: On Thu, Jul 03, 2025 at 08:33:45AM +0200, Richard Biener wrote: On Wed, Jul 2, 2025 at 11:32 PM Siddhesh Poyarekar wrote: MEM_REF cast of a subobject to its containing object has negative offsets, which objsz sees as an invalid access. Support this u

[PATCH v1 1/1] libiberty: add common methods for type-sensitive doubly linked lists

2025-07-03 Thread Matthieu Longo
The implementation of those methods relies on duck typing at compile time. The structure corresponding to the node of a doubly linked list needs to define attributes 'prev' and 'next', which are pointers to the node type. The structure wrapping the nodes and other metadata (first, last, size) n
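The compile-time duck typing described in this snippet can be sketched roughly as follows. This is only an illustration in C++ templates, not the libiberty interface from the patch: the names IntNode, IntList, list_push_back and list_remove are hypothetical, and the only things taken from the description are the member names 'prev', 'next', 'first', 'last' and 'size'.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type: only the 'prev' and 'next' members matter.
struct IntNode {
  int value;
  IntNode *prev;
  IntNode *next;
};

// Hypothetical wrapper type holding the list metadata.
struct IntList {
  IntNode *first;
  IntNode *last;
  std::size_t size;
};

// Works for any List/Node pair exposing the expected member names;
// a type lacking them fails to compile when the template is instantiated.
template <typename List, typename Node>
void list_push_back(List &list, Node *node) {
  node->prev = list.last;
  node->next = nullptr;
  if (list.last)
    list.last->next = node;
  else
    list.first = node;
  list.last = node;
  ++list.size;
}

// Unlinks 'node', which must currently be a member of 'list'.
template <typename List, typename Node>
void list_remove(List &list, Node *node) {
  if (node->prev)
    node->prev->next = node->next;
  else
    list.first = node->next;
  if (node->next)
    node->next->prev = node->prev;
  else
    list.last = node->prev;
  node->prev = nullptr;
  node->next = nullptr;
  --list.size;
}
```

Because the helpers depend only on member names, any other node and wrapper pair with the same members can reuse them unchanged.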

[PATCH v1 0/1] libiberty: add common methods for type-sensitive doubly linked lists

2025-07-03 Thread Matthieu Longo
This patch was originally part of [1]. Merging it into GCC is a prerequisite for merging it into binutils. The implementation of those methods relies on duck typing at compile time. The structure corresponding to the node of a doubly linked list needs to define attributes 'prev' and 'next' which

Re: [PATCH v2 1/5] libstdc++: Check prerequisites of layout_*::operator().

2025-07-03 Thread Jonathan Wakely
On Thu, 3 Jul 2025 at 11:12, Tomasz Kaminski wrote: > > > > On Thu, Jul 3, 2025 at 12:08 PM Luc Grosheintz > wrote: >> >> >> >> On 7/1/25 22:56, Jonathan Wakely wrote: >> > On Tue, 1 Jul 2025 at 11:32, Tomasz Kaminski wrote: >> >> >> >> Hi, >> >> More of the review will be later, but I have not

[PATCH v1 3/3] libstdc++: Implement aligned_accessor from mdspan.

2025-07-03 Thread Luc Grosheintz
This commit completes the implementation of P2897R7 by implementing and testing the template class aligned_accessor. libstdc++-v3/ChangeLog: * include/bits/version.def (aligned_accessor): Add. * include/bits/version.h: Regenerate. * include/std/mdspan (aligned_accessor): N

[PATCH v1 1/3] libstdc++: Implement is_sufficiently_aligned.

2025-07-03 Thread Luc Grosheintz
This commit implements and tests the function is_sufficiently_aligned from P2897R7. libstdc++-v3/ChangeLog: * include/bits/align.h (is_sufficiently_aligned): New function. * include/bits/version.def (is_sufficiently_aligned): Add. * include/bits/version.h: Regenerate.
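For readers unfamiliar with the function, the check it performs can be sketched as below. This is a minimal illustration, not the libstdc++ code from the patch: sketch_is_sufficiently_aligned is a hypothetical name, and the sketch assumes an address that converts to std::uintptr_t and a power-of-two alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the alignment test: true when the address
// held by 'ptr' is a multiple of 'Alignment'.
template <std::size_t Alignment, typename T>
bool sketch_is_sufficiently_aligned(T *ptr) {
  static_assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0,
                "Alignment must be a power of two");
  // For a power of two, the low bits of an aligned address are all zero.
  return (reinterpret_cast<std::uintptr_t>(ptr) & (Alignment - 1)) == 0;
}
```

With a buffer declared as alignas(16) unsigned char buf[32], the sketch returns true for buf and false for buf + 1 when instantiated with an alignment of 16.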

[PATCH v1 0/3] Implement aligned_accessor [P2897R7].

2025-07-03 Thread Luc Grosheintz
This patch series implements the aligned_accessor paper P2897R7 in three parts: - Implement `is_sufficiently_aligned` which is part of . - Prepare the accessor tests for reuse. - Implement aligned_accessor. A couple of remarks: - The paper P2897R7 and spec N5008 don't specify that the al

[PATCH v1 2/3] libstdc++: Prepare test code for default_accessor for reuse.

2025-07-03 Thread Luc Grosheintz
All test code of default_accessor can be reused. This commit moves the reusable code into a file generic.cc and prepares the tests for reuse with aligned_accessor. The AllocatorTrait creates a unified interface for creating both default_accessor and aligned_accessor typenames. libstdc++-v3/Chang

Re: [PATCH] x86: Emit label only for __mcount_loc section

2025-07-03 Thread H.J. Lu
On Thu, Jul 3, 2025 at 6:07 PM Uros Bizjak wrote: > > On Thu, Jul 3, 2025 at 11:54 AM H.J. Lu wrote: > > > > commit ecc81e33123d7ac9c11742161e128858d844b99d (HEAD) > > Author: Andi Kleen > > Date: Fri Sep 26 04:06:40 2014 + > > > > Add direct support for Linux kernel __fentry__ patchin

[PATCH] x86-64: Remove redundant TLS calls

2025-07-03 Thread H.J. Lu
For TLS calls: 1. UNSPEC_TLS_GD: (parallel [ (set (reg:DI 0 ax) (call:DI (mem:QI (symbol_ref:DI ("__tls_get_addr"))) (const_int 0 [0]))) (unspec:DI [(symbol_ref:DI ("e") [flags 0x50]) (reg/f:DI 7 sp)] UNSPEC_TLS_GD) (clobber (reg:DI 5 di))]

Re: [PATCH v2 1/5] libstdc++: Check prerequisites of layout_*::operator().

2025-07-03 Thread Tomasz Kaminski
On Thu, Jul 3, 2025 at 12:08 PM Luc Grosheintz wrote: > > > On 7/1/25 22:56, Jonathan Wakely wrote: > > On Tue, 1 Jul 2025 at 11:32, Tomasz Kaminski > wrote: > >> > >> Hi, > >> More of the review will be later, but I have noticed that you have > added preconditions checks > >> to the layouts, an

Re: [PATCH v2 1/5] libstdc++: Check prerequisites of layout_*::operator().

2025-07-03 Thread Luc Grosheintz
On 7/1/25 22:56, Jonathan Wakely wrote: On Tue, 1 Jul 2025 at 11:32, Tomasz Kaminski wrote: Hi, More of the review will be later, but I have noticed that you have added preconditions checks to the layouts, and then avoid checking them inside the operator[] of the mdspan. This is general s

Re: [PATCH] x86: Emit label only for __mcount_loc section

2025-07-03 Thread Uros Bizjak
On Thu, Jul 3, 2025 at 11:54 AM H.J. Lu wrote: > > commit ecc81e33123d7ac9c11742161e128858d844b99d (HEAD) > Author: Andi Kleen > Date: Fri Sep 26 04:06:40 2014 + > > Add direct support for Linux kernel __fentry__ patching > > emitted a label, 1, for __mcount_loc section: > > 1: call mco

Fix overflow in ipa-cp heuristics

2025-07-03 Thread Jan Hubicka
Hi, ipa-cp converts sreal times to int, while the point of sreal is to accommodate very large values that can happen for loops with a large number of iterations and also when the profile is inconsistent. This happens with afdo in the testsuite, where the loop preheader is estimated to have 0 executions while loop body
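The general hazard can be illustrated as follows. The patch itself is not shown in this snippet, so this is not its fix and does not use GCC's sreal class: it is only a sketch with double standing in for a wide-range value type, and saturating_to_int is a hypothetical helper showing one common way to keep a narrowing conversion well defined.

```cpp
#include <cassert>
#include <climits>
#include <cmath>

// Hypothetical helper: convert a wide-range value to int, clamping to the
// representable range instead of invoking an out-of-range conversion.
int saturating_to_int(double v) {
  if (std::isnan(v))
    return 0;
  if (v >= static_cast<double>(INT_MAX))
    return INT_MAX;
  if (v <= static_cast<double>(INT_MIN))
    return INT_MIN;
  return static_cast<int>(v);  // in range: truncates toward zero
}
```

A plain static_cast<int> of a value such as 1e30 has undefined behaviour in C++, which is why the range is checked before converting.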

Enable ipa-cp cloning for cold wrappers of hot functions

2025-07-03 Thread Jan Hubicka
Hi, ipa-cp cloning disables itself for all functions not passing opt_for_fn (node->decl, optimize_size), which disables it for cold wrappers of hot functions where we want to propagate. Since we later want the time saved to be considered hot, we do not need to make this early test. The patch also f

[PATCH] x86-64: Add --enable-x86-64-mfentry

2025-07-03 Thread H.J. Lu
When profiling is enabled with shrink wrapping, the mcount call may not be placed at the function entry after pushq %rbp movq %rsp,%rbp. As a result, the profile data may be skewed, which makes PGO less effective. Add --enable-x86-64-mfentry to enable -mfentry by default to use __fentry__, added

Re: [PATCH v8 0/9] AArch64: CMPBR support

2025-07-03 Thread Karl Meakin
On 02/07/2025 18:45, Karl Meakin wrote: This patch series adds support for the CMPBR extension. It includes the new `+cmpbr` option and rules to generate the new instructions when lowering conditional branches. Changelog: * v8: - Support far branches for the `CBB` and `CBH` instructions, an

[PATCH] x86: Emit label only for __mcount_loc section

2025-07-03 Thread H.J. Lu
commit ecc81e33123d7ac9c11742161e128858d844b99d (HEAD) Author: Andi Kleen Date: Fri Sep 26 04:06:40 2014 + Add direct support for Linux kernel __fentry__ patching emitted a label, 1, for __mcount_loc section: 1: call mcount .section __mcount_loc, "a",@progbits .quad 1b .previous If _

Re: [Fortran, Patch, PR120843, v3] Fix reject valid, because of inconformable coranks

2025-07-03 Thread Andre Vehreschild
Hi Jerry, thanks for the review and the ok. Committed as gcc-16-1967-g15413e05eb9. And special thanks for the kind words in the private mail you sent me. It's very much appreciated that you even applied a translator to translate it to German. Thank you very much. I have set myself a reminder to

[committed] libstdc++: Fix regression in std::uninitialized_fill for C++98 [PR120931]

2025-07-03 Thread Jonathan Wakely
A typo in r15-4473-g3abe751ea86e34 made it ill-formed to use std::uninitialized_fill with iterators that aren't pointers (or pointers wrapped in our __normal_iterator) if the value type is a narrow character type. libstdc++-v3/ChangeLog: PR libstdc++/120931 * include/bits/stl_unin
