[PATCH] Avoid ICE in except.cc on targets that don't support exceptions.

2024-05-22 Thread Roger Sayle
A number of testcases currently fail on nvptx with the ICE: during RTL pass: final openmp-simd-2.c: In function 'foo': openmp-simd-2.c:28:1: internal compiler error: in get_personality_function, at expr.cc:14037 28 | } | ^ 0x98a38f get_personality_function(tree_node*) /home/roger

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-22 Thread Richard Biener
On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote: > > On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: > > Thanks for the comments and suggestions. > > > > > On May 15, 2024, at 10:00, David Malcolm > > > wrote: > > > > > > On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote: > > > > O

Re: [PATCH] Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX.

2024-05-22 Thread Richard Biener
On Wed, May 22, 2024 at 3:58 AM liuhongt wrote: > > According to IEEE standard, for conversions from floating point to > integer. When a NaN or infinite operand cannot be represented in the > destination format and this cannot otherwise be indicated, the invalid > operation exception shall be sign

Re: [PATCH] Avoid ICE in except.cc on targets that don't support exceptions.

2024-05-22 Thread Richard Biener
On Wed, May 22, 2024 at 9:21 AM Roger Sayle wrote: > > > A number of testcases currently fail on nvptx with the ICE: > > during RTL pass: final > openmp-simd-2.c: In function 'foo': > openmp-simd-2.c:28:1: internal compiler error: in get_personality_function, > at expr.cc:14037 >28 | } >

Re: [PATCH] Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX.

2024-05-22 Thread Jakub Jelinek
On Wed, May 22, 2024 at 09:46:41AM +0200, Richard Biener wrote: > On Wed, May 22, 2024 at 3:58 AM liuhongt wrote: > > > > According to IEEE standard, for conversions from floating point to > > integer. When a NaN or infinite operand cannot be represented in the > > destination format and this cann

[PATCH] MATCH: Look through VIEW_CONVERT when folding VEC_PERM_EXPRs.

2024-05-22 Thread Manolis Tsamis
The match.pd patterns to merge two vector permutes into one fail when a potentially no-op view convert expressions is between the two permutes. This change lifts this restriction. gcc/ChangeLog: * match.pd: Allow no-op view_convert between permutes. gcc/testsuite/ChangeLog: * gc

[PATCH] web/115183 - fix typo in C++ docs

2024-05-22 Thread Richard Biener
The following fixes a reported typo. Pushed. * doc/invoke.texi (C++ Modules): Fix typo. --- gcc/doc/invoke.texi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 218901c0b20..0625a5ede6f 100644 --- a/gcc/doc/invoke.texi ++

[PATCH v2 1/8] [APX NF]: Support APX NF add

2024-05-22 Thread Kong, Lingling
> I wonder if we can use "define_subst" to conditionally add flags clobber > for !TARGET_APX_NF targets. Even the example for "Define Subst" uses the insn > w/ and w/o the clobber, so I think it is worth considering this approach. > > Uros. Good Suggestion, I defined new subst for no flags, and B

[PATCH v2 2/8] [APX NF] Support APX NF for {sub/and/or/xor/neg}

2024-05-22 Thread Kong, Lingling
gcc/ChangeLog: * config/i386/i386.md (nf_and_applied): New subst_attr. (nf_x64_and_applied): Ditto. (*sub_1_nf): New define_insn. (*anddi_1_nf): Ditto. (*and_1_nf): Ditto. (*qi_1_nf): Ditto. (*

Re: [PATCH v2 1/8] [APX NF]: Support APX NF add

2024-05-22 Thread Uros Bizjak
On Wed, May 22, 2024 at 10:29 AM Kong, Lingling wrote: > > > I wonder if we can use "define_subst" to conditionally add flags clobber > > for !TARGET_APX_NF targets. Even the example for "Define Subst" uses the > > insn > > w/ and w/o the clobber, so I think it is worth considering this approach.

[PATCH v2 3/8] [APX NF] Support APX NF for left shift insns

2024-05-22 Thread Kong, Lingling
gcc/ChangeLog: * config/i386/i386.md (*ashl3_1_nf): New. (*ashlhi3_1_nf): Ditto. (*ashlqi3_1_nf): Ditto. * config/i386/sse.md: New define_split. --- gcc/config/i386/i386.md | 80 +++-- gcc/config/i386/sse.md | 13 +++ 2 file

[PATCH v2 4/8] [APX NF] Support APX NF for right shift insns

2024-05-22 Thread Kong, Lingling
gcc/ChangeLog: * config/i386/i386.md (*ashr3_1_nf): New. (*lshr3_1_nf): Ditto. (*lshrqi3_1_nf): Ditto. (*lshrhi3_1_nf): Ditto. --- gcc/config/i386/i386.md | 82 +++-- 1 file changed, 46 insertions(+), 36 deletions(-) diff --git

[PATCH v2 5/8] [APX NF] Support APX NF for rotate insns

2024-05-22 Thread Kong, Lingling
gcc/ChangeLog: * config/i386/i386.md (ashr3_cvt_nf): New define_insn. (*3_1_nf): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-nf.c: Add NF test for rotate insns. --- gcc/config/i386/i386.md| 53 -- gcc/testsuite/gcc.target

[PATCH v2 6/8] [APX NF] Support APX NF for shld/shrd

2024-05-22 Thread Kong, Lingling
gcc/ChangeLog: * config/i386/i386.md (x86_64_shld_nf): New define_insn. (x86_64_shld_ndd_nf): Ditto. (x86_64_shld_1_nf): Ditto. (x86_64_shld_ndd_1_nf): Ditto. (*x86_64_shld_shrd_1_nozext_nf): Ditto. (x86_shld_nf): Ditto. (x86_shld_ndd_nf): Di

[PATCH v2 7/8] [APX NF] Support APX NF for mul/div

2024-05-22 Thread Kong, Lingling
gcc/ChangeLog: * config/i386/i386.md (*mul3_1_nf): New define_insn. (*mulqi3_1_nf): Ditto. (*divmod4_noext_nf): Ditto. (divmodhiqi3_nf): Ditto. --- gcc/config/i386/i386.md | 47 ++--- 1 file changed, 30 insertions(+), 17 deletion

[PATCH v2 8/8] [APX NF] Support APX NF for lzcnt/tzcnt/popcnt

2024-05-22 Thread Kong, Lingling
gcc/ChangeLog: * config/i386/i386.md (clz2_lzcnt_nf): New define_insn. (*clz2_lzcnt_falsedep_nf): Ditto. (__nf): Ditto. (*__falsedep_nf): Ditto. (_hi_nf): Ditto. (popcount2_nf): Ditto. (*popcount2_falsedep_nf): Ditto. (popcounthi2_nf)

RE: [PATCH v2 2/8] [APX NF] Support APX NF for {sub/and/or/xor/neg}

2024-05-22 Thread Kong, Lingling
Cc Uros. From: Kong, Lingling Sent: Wednesday, May 22, 2024 4:35 PM To: gcc-patches@gcc.gnu.org Cc: Liu, Hongtao ; Kong, Lingling Subject: [PATCH v2 2/8] [APX NF] Support APX NF for {sub/and/or/xor/neg} gcc/ChangeLog: * config/i386/i386.md (nf_and_applied): New subst_attr.

[committed] libstdc++: Ensure std::variant relops convert to bool [PR115145]

2024-05-22 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8 -- Ensure that the result of comparing the variant alternatives is converted to bool immediately rather than copied. libstdc++-v3/ChangeLog: PR libstdc++/115145 * include/std/variant (operator==, operator!=, operator<) (operato

Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread YunQiang Su
On Sat, Feb 10, 2024 at 17:41, Jakub Jelinek wrote: > > Hi! > > In the previous patch I haven't touched the gcc diagnostic routines, > using HOST_SIZE_T_PRINT* for those is obviously undesirable because we > want the strings to be translatable. We already have %w[diox] for > HOST_WIDE_INT arguments, this patch

Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread Jakub Jelinek
On Wed, May 22, 2024 at 05:05:30PM +0800, YunQiang Su wrote: > > --- gcc/gcc.cc.jj 2024-02-09 14:54:09.141489744 +0100 > > +++ gcc/gcc.cc 2024-02-09 22:04:37.655678742 +0100 > > @@ -2410,8 +2410,7 @@ read_specs (const char *filename, bool m > > if (*p1++ != '<' || p[-2] != '>')

Re: [PATCH] libstdc++: Implement std::formatter without [PR115099]

2024-05-22 Thread Jonathan Wakely
Pushed to trunk. Backport to gcc-14 to follow. On Fri, 17 May 2024 at 14:45, Jonathan Wakely wrote: > > Does anybody see any issue with the drive-by fixes to constrain > std::formatter to only work for pointers and integers (since > we don't know how to format pthread_t if it's an arbitrary stru

Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread YunQiang Su
On Wed, May 22, 2024 at 17:14, Jakub Jelinek wrote: > > On Wed, May 22, 2024 at 05:05:30PM +0800, YunQiang Su wrote: > > > --- gcc/gcc.cc.jj 2024-02-09 14:54:09.141489744 +0100 > > > +++ gcc/gcc.cc 2024-02-09 22:04:37.655678742 +0100 > > > @@ -2410,8 +2410,7 @@ read_specs (const char *filename, bool m > >

Re: [PATCH] Fix mixed input kind permute optimization

2024-05-22 Thread Richard Sandiford
Richard Sandiford writes: > Richard Biener writes: >> When change_vec_perm_layout runs into a permute combining two >> nodes where one is invariant and one internal the partition of >> one input can be -1 but the other might not be. The following >> supports this case by simply ignoring inputs w

RE: [PATCH 2/4]AArch64: add new tuning param and attribute for enabling conditional early clobber

2024-05-22 Thread Tamar Christina
> > Sorry for the bike-shedding, but how about something like "avoid_pred_rmw"? > (I'm open to other suggestions.) Just looking for something that describes > either the architecture or the end result that we want to achieve. > And preferable something fairly short :) > > avoid_* would be consis

[PATCH 4/4]AArch64: enable new predicate tuning for Neoverse cores.

2024-05-22 Thread Tamar Christina
Hi All, This enables the new tuning flag for Neoverse V1, Neoverse V2 and Neoverse N2. It is kept off for generic codegen. Note the reason for the +sve even though they are in aarch64-sve.exp is if the testsuite is run with a forced SVE off option, e.g. -march=armv8-a+nosve then the intrinsics en

[PATCH 3/4]AArch64: add new alternative with early clobber to patterns

2024-05-22 Thread Tamar Christina
Hi All, This patch adds new alternatives to the patterns which are affected. The new alternatives with the conditional early clobbers are added before the normal ones in order for LRA to prefer them in the event that we have enough free registers to accommodate them. In case register pressure is

Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread Jakub Jelinek
On Wed, May 22, 2024 at 05:23:33PM +0800, YunQiang Su wrote: > On Wed, May 22, 2024 at 17:14, Jakub Jelinek wrote: > > > > On Wed, May 22, 2024 at 05:05:30PM +0800, YunQiang Su wrote: > > > > --- gcc/gcc.cc.jj 2024-02-09 14:54:09.141489744 +0100 > > > > +++ gcc/gcc.cc 2024-02-09 22:04:37.655678742 +0100 >

Re: [PATCH] Fix mixed input kind permute optimization

2024-05-22 Thread Richard Biener
On Wed, 22 May 2024, Richard Sandiford wrote: > Richard Sandiford writes: > > Richard Biener writes: > >> When change_vec_perm_layout runs into a permute combining two > >> nodes where one is invariant and one internal the partition of > >> one input can be -1 but the other might not be. The fo

[PATCH] tree-optimization/115144 - improve sinking destination choice

2024-05-22 Thread Richard Biener
When sinking code closer to its uses we already try to minimize the distance we move by inserting at the start of the basic-block. The following makes sure to sink closest to the control dependence check of the region we want to sink to as well as make sure to ignore control dependences that are o

Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread YunQiang Su
On Wed, May 22, 2024 at 17:33, Jakub Jelinek wrote: > > On Wed, May 22, 2024 at 05:23:33PM +0800, YunQiang Su wrote: > > On Wed, May 22, 2024 at 17:14, Jakub Jelinek wrote: > > > > > > On Wed, May 22, 2024 at 05:05:30PM +0800, YunQiang Su wrote: > > > > > --- gcc/gcc.cc.jj 2024-02-09 14:54:09.141489744 +0100 > > > > > +

[PATCH] RISC-V: Add Zfbfmin extension

2024-05-22 Thread Xiao Zeng
1 In the previous patch, the libcall for BF16 was implemented: 2 Riscv provides Zfbfmin extension, which completes the "Scalar BF16 Converts":

Re: [PATCH 3/4]AArch64: add new alternative with early clobber to patterns

2024-05-22 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This patch adds new alternatives to the patterns which are affected. The new > alternatives with the conditional early clobbers are added before the normal > ones in order for LRA to prefer them in the event that we have enough free > registers to accommodate

[PATCH v2 1/2] driver: Use -as/ld/objcopy as final fallback instead of native ones for cross

2024-05-22 Thread YunQiang Su
If `find_a_program` cannot find `as/ld/objcopy` and we are a cross toolchain, the final fallback is `as/ld` of system. In fact, we can have a try with -as/ld/objcopy before fallback to native as/ld/objcopy. This patch is derived from Debian's patch: gcc-search-prefixed-as-ld.diff gcc

[PATCH v2 2/2] driver: Search -as/ld/objcopy before non-triple ones

2024-05-22 Thread YunQiang Su
When looking for as/ld/objcopy, `find_a_program/file_at_path` only try to find the raw name, but won't find the one with - prefix. This patch is derived from Debian's patch: gcc-search-prefixed-as-ld.diff gcc * gcc.cc(for_each_path): Add more space for -. (file_at_path): S

Re: [Patch, aarch64, middle-end] v3: Move pair_fusion pass from aarch64 to middle-end

2024-05-22 Thread Alex Coplan
Hi Ajit, You need to remove the header dependencies that are no longer required for aarch64-ldp-fusion.o in t-aarch64 (not forgetting to update the ChangeLog). A few other minor nits below. LGTM with those changes, but you'll need Richard S to approve. Thanks a lot for doing this. On 22/05/202

Re: [committed][wwwdocs] gcc-12/changes.html: Document RISC-V changes

2024-05-22 Thread Gerald Pfeifer
On Fri, 17 May 2024, Palmer Dabbelt wrote: > Ya, I guess it's kind of an odd phrasing. Maybe it should be something like Yes, this would have helped me understand. Thank you. >The vector and scalar crypto extensions are now accepted in ISA strings >via the -march argument. Note that ena

Re: [PATCH v2] testsuite: Verify r0-r3 are extended with CMSE

2024-05-22 Thread Richard Earnshaw (lists)
On 06/05/2024 12:50, Torbjorn SVENSSON wrote: > Hi, > > Forgot to mention when I sent the patch that I would like to commit it to the > following branches: > > - releases/gcc-11 > - releases/gcc-12 > - releases/gcc-13 > - releases/gcc-14 > - trunk > Well you can [commit it to the release branc

Re: [PATCH 4/4] Testsuite updates

2024-05-22 Thread Richard Biener
On Tue, 21 May 2024, Richard Biener wrote: > The gcc.dg/vect/slp-12a.c case is interesting as we currently split > the 8 store group into lanes 0-5 which we SLP with an unroll factor > of two (on x86-64 with SSE) and the remaining two lanes are using > interleaving vectorization with a final unrol

RE: [PATCH 3/4]AArch64: add new alternative with early clobber to patterns

2024-05-22 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, May 22, 2024 10:48 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 3/4]AArch64: add new alternative with early clobber to > p

Re: [PATCH v1 2/6] Extract ix86 dllimport implementation to mingw

2024-05-22 Thread Richard Sandiford
Evgeny Karpov writes: > This patch extracts the ix86 implementation for expanding a SYMBOL > into its corresponding dllimport, far-address, or refptr symbol. > It will be reused in the aarch64-w64-mingw32 target. > The implementation is copied as is from i386/i386.cc with > minor changes to follow

Re: [PATCH v2] testsuite: Verify r0-r3 are extended with CMSE

2024-05-22 Thread Torbjorn SVENSSON
Hello Richard, Thanks for the reply. From my point of view, at least the -fshort-enums part should be on all branches. Just to be clean, maybe it's easier to backport the entire patch? Unless you have an objection, I would like to go ahead and just backport it to all branches. Kind regards

Re: [PATCH 3/4]AArch64: add new alternative with early clobber to patterns

2024-05-22 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Wednesday, May 22, 2024 10:48 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.org >> Subject: Re: [PATCH 3/4]AArch64: add new alte

[PATCH] LoongArch: Guard REGNO with REG_P in loongarch_expand_conditional_move [PR115169]

2024-05-22 Thread Xi Ruoyao
gcc/ChangeLog: PR target/115169 * config/loongarch/loongarch.cc (loongarch_expand_conditional_move): Guard REGNO with REG_P. --- Bootstrapped with --enable-checking=all. Ok for trunk and 14? gcc/config/loongarch/loongarch.cc | 17 - 1 file changed, 12 in

Re: [PATCH 4/4] Testsuite updates

2024-05-22 Thread Richard Sandiford
Richard Biener writes: > On Tue, 21 May 2024, Richard Biener wrote: > >> The gcc.dg/vect/slp-12a.c case is interesting as we currently split >> the 8 store group into lanes 0-5 which we SLP with an unroll factor >> of two (on x86-64 with SSE) and the remaining two lanes are using >> interleaving v

Re: [PATCH v1 3/6] Rename functions for reuse in AArch64

2024-05-22 Thread Richard Sandiford
Evgeny Karpov writes: > This patch renames functions related to dllimport/dllexport > and selectany functionality. These functions will be reused > in the aarch64-w64-mingw32 target. > > gcc/ChangeLog: > > * config/i386/cygming.h (mingw_pe_record_stub): > Rename functions in mingw fold

[PATCH] [tree-optimization/110279] fix testcase pr110279-1.c

2024-05-22 Thread Di Zhao OS
The test case is for targets that support FMA. Previously the "target" selector is missed in dg-final command. Tested on x86_64-pc-linux-gnu. Thanks Di Zhao gcc/testsuite/ChangeLog: * gcc.dg/pr110279-1.c: add target selector. --- gcc/testsuite/gcc.dg/pr110279-1.c | 2 +- 1 file change

Re: [PATCH v2] testsuite: Verify r0-r3 are extended with CMSE

2024-05-22 Thread Richard Earnshaw (lists)
On 22/05/2024 12:14, Torbjorn SVENSSON wrote: > Hello Richard, > > Thanks for the reply. > > From my point of view, at least the -fshort-enums part should be on all > branches. Just to be clean, maybe it's easier to backport the entire patch? Yes, that's a fair point. I was only thinking about

Re: [PATCH v1 4/6] aarch64: Add selectany attribute handling

2024-05-22 Thread Richard Sandiford
Evgeny Karpov writes: > This patch extends the aarch64 attributes list with the selectany > attribute for the aarch64-w64-mingw32 target and reuses the mingw > implementation to handle it. > > * config/aarch64/aarch64.cc: > Extend the aarch64 attributes list. > * config/aarch64/c

Re: [PATCH v1 5/6] Adjust DLL import/export implementation for AArch64

2024-05-22 Thread Richard Sandiford
Evgeny Karpov writes: > The DLL import/export mingw implementation, originally from ix86, requires > minor adjustments to be compatible with AArch64. > > gcc/ChangeLog: > > * config/mingw/mingw32.h (defined): Use the correct DllMainCRTStartup > entry function. > * config/mingw/wi

Re: [PATCH] aarch64: Fold vget_high_* intrinsics to BIT_FIELD_REF [PR102171]

2024-05-22 Thread Richard Sandiford
Pengxuan Zheng writes: > This patch is a follow-up of r15-697-ga2e4fe5a53cf75 to also fold vget_high_* > intrinsics to BIT_FIELD_REF and remove the vget_high_* definitions from > arm_neon.h to use the new intrinsics framework. > > PR target/102171 > > gcc/ChangeLog: > > * config/aarch6

RISC-V: Fix round_32.c test on RV32

2024-05-22 Thread Jivan Hakobyan
After 8367c996e55b2 commit several checks on round_32.c test started to fail. The reason is that we prevent rounding DF->SI->DF on RV32 and instead of a conversion sequence we get calls to appropriate library functions. gcc/testsuite/ChangeLog: * testsuite/gcc.target/riscv/round_32.c: F

[PATCH 1/2][v2] Avoid splitting store dataref groups during SLP discovery

2024-05-22 Thread Richard Biener
The following avoids splitting store dataref groups during SLP discovery but instead forces (eventually single-lane) consecutive lane SLP discovery for all lanes of the group, creating VEC_PERM SLP nodes merging them so the store will always cover the whole group. With this for example int x[1024

[PATCH 2/2][v2] RISC-V: Testsuite updates

2024-05-22 Thread Richard Biener
The gcc.dg/vect/slp-12a.c case is interesting as we currently split the 8 store group into lanes 0-5 which we SLP with an unroll factor of two (on x86-64 with SSE) and the remaining two lanes are using interleaving vectorization with a final unroll factor of four. Thus we're using hybrid SLP withi

[PING] Re: [PATCH 1/2] ivopts: Revert computation of address cost complexity

2024-05-22 Thread Aleksandar Rakic
Hi! I'd like to ping the following patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647966.html a patch for the computation of the complexity for the unsupported addressing modes in ivopts This patch should be a fix for the bug which is described on the following link: https:/

Re: [PATCH] c++: canonicity of fn types w/ complex eh specs [PR115159]

2024-05-22 Thread Patrick Palka
On Tue, 21 May 2024, Jason Merrill wrote: > On 5/21/24 21:55, Patrick Palka wrote: > > On Tue, 21 May 2024, Jason Merrill wrote: > > > > > On 5/21/24 17:27, Patrick Palka wrote: > > > > On Tue, 21 May 2024, Jason Merrill wrote: > > > > > > > > > On 5/21/24 15:36, Patrick Palka wrote: > > > > > >

Re: [PATCH v2] Match: Extract integer_types_ternary_match helper to avoid code dup [NFC]

2024-05-22 Thread Richard Biener
On Mon, May 20, 2024 at 1:00 PM wrote: > > From: Pan Li > > There are sorts of match pattern for SAT related cases, there will be > some duplicated code to check the dest, op_0, op_1 are same tree types. > Aka ternary tree type matches. Thus, extract one helper function to > do this and avoid m

Re: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-22 Thread Richard Biener
On Wed, May 22, 2024 at 3:17 AM wrote: > > From: Pan Li > > This patch would like to support the __builtin_add_overflow branch form for > unsigned SAT_ADD. For example as below: > > uint64_t > sat_add (uint64_t x, uint64_t y) > { > uint64_t ret; > return __builtin_add_overflow (x, y, &ret) ?

Re: [PATCH v1 1/2] Match: Support __builtin_add_overflow for branchless unsigned SAT_ADD

2024-05-22 Thread Richard Biener
On Sun, May 19, 2024 at 8:37 AM wrote: > > From: Pan Li > > This patch would like to support the branchless form for unsigned > SAT_ADD when leverage __builtin_add_overflow. For example as below: > > uint64_t sat_add_u(uint64_t x, uint64_t y) > { > uint64_t ret; > uint64_t overflow = __built
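
Editor's sketch (not part of the thread): the quoted snippets in the two SAT_ADD entries above are cut off, so the following is only a minimal illustration of the two idioms being discussed, assuming the usual definition of unsigned saturating addition (clamp to the maximum value on wrap-around). The function names `sat_add_branch` and `sat_add_branchless` are hypothetical and are not taken from the patches.

```c
#include <stdint.h>

/* Branch form: pick the all-ones value when the addition wraps. */
uint64_t sat_add_branch(uint64_t x, uint64_t y)
{
    uint64_t ret;
    return __builtin_add_overflow(x, y, &ret) ? UINT64_MAX : ret;
}

/* Branchless form: widen the overflow flag (0 or 1) into a mask
   (0 or all ones) and OR it into the wrapped sum. */
uint64_t sat_add_branchless(uint64_t x, uint64_t y)
{
    uint64_t ret;
    uint64_t overflow = __builtin_add_overflow(x, y, &ret);
    return (uint64_t)(-overflow) | ret;
}
```

Both functions should return the same value for every input; whether GCC recognizes either shape as .SAT_ADD depends on the match.pd patterns under review in these threads.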

Re: [PATCH v1 1/2] Match: Support branch form for unsigned SAT_ADD

2024-05-22 Thread Richard Biener
On Mon, May 20, 2024 at 1:50 PM Tamar Christina wrote: > > Hi Pan, > > > -Original Message- > > From: pan2...@intel.com > > Sent: Monday, May 20, 2024 12:01 PM > > To: gcc-patches@gcc.gnu.org > > Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Tamar Christina > > ; richard.guent...@gmail.

Re: [PATCH] c++: canonicity of fn types w/ complex eh specs [PR115159]

2024-05-22 Thread Jason Merrill
On 5/22/24 09:01, Patrick Palka wrote: On Tue, 21 May 2024, Jason Merrill wrote: On 5/21/24 21:55, Patrick Palka wrote: On Tue, 21 May 2024, Jason Merrill wrote: On 5/21/24 17:27, Patrick Palka wrote: On Tue, 21 May 2024, Jason Merrill wrote: On 5/21/24 15:36, Patrick Palka wrote: Bootst

Re: [PATCH 3/4] Avoid splitting store dataref groups during SLP discovery

2024-05-22 Thread Richard Biener
On Tue, 21 May 2024, Richard Sandiford wrote: > Richard Biener writes: > > The following avoids splitting store dataref groups during SLP > > discovery but instead forces (eventually single-lane) consecutive > > lane SLP discovery for all lanes of the group, creating VEC_PERM > > SLP nodes mergin

[PATCH RFC] c++: add module extensions

2024-05-22 Thread Jason Merrill
Tested x86_64-pc-linux-gnu. Any thoughts about the mkdeps output? -- 8< -- There is a trend in the broader C++ community to use a different extension for module interface units, even though they are compiled in the same way as other source files. Let's also support these extensions. .ixx is th

RE: [PATCH v2] Match: Extract integer_types_ternary_match helper to avoid code dup [NFC]

2024-05-22 Thread Li, Pan2
Thanks Richard for comments. > I think it's more useful to add an overload to types_match with three > arguments and then use > (if (INTEGRAL_TYPE_P (type) > && types_match (type, TREE_TYPE (@0), TREE_TYPE (@1)) Sure thing, will try to add overloaded types_match here. Pan -Original M

Re: [PATCH] rs6000: Don't pass -many to the assembler [PR112868]

2024-05-22 Thread Peter Bergner
On 5/21/24 8:27 AM, jeevitha wrote: > The following patch has been bootstrapped and regtested with default > configuration > [--enable-checking=yes] and with --enable-checking=release on > powerpc64le-linux. > > This patch removes passing the -many assembler option for release builds. Now, > GCC

Re: [PATCH v1 2/6] Extract ix86 dllimport implementation to mingw

2024-05-22 Thread Evgeny Karpov
Wednesday, May 22, 2024 1:06 PM Richard Sandiford wrote: > This looks good to me apart from a couple of very minor comments below, but > please get approval from the x86 maintainers as well. In particular, they > might > prefer to handle ix86_legitimize_pe_coff_symbol in some other way. Thanks

RE: [PATCH v1 1/2] Match: Support __builtin_add_overflow for branchless unsigned SAT_ADD

2024-05-22 Thread Li, Pan2
Thanks Richard for comments, will merge the rest form of .SAT_ADD in one middle end patch for fully picture, as well as comments addressing. Pan -Original Message- From: Richard Biener Sent: Wednesday, May 22, 2024 9:16 PM To: Li, Pan2 Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai

Re: [PATCH] rs6000: Don't pass -many to the assembler [PR112868]

2024-05-22 Thread Segher Boessenkool
Hi! On Wed, May 22, 2024 at 09:29:13AM -0500, Peter Bergner wrote: > On 5/21/24 8:27 AM, jeevitha wrote: > > The following patch has been bootstrapped and regtested with default > > configuration > > [--enable-checking=yes] and with --enable-checking=release on > > powerpc64le-linux. > > > > Th

Re: [PATCH] [tree-optimization/110279] fix testcase pr110279-1.c

2024-05-22 Thread Jeff Law
On 5/22/24 5:46 AM, Di Zhao OS wrote: The test case is for targets that support FMA. Previously the "target" selector is missed in dg-final command. Tested on x86_64-pc-linux-gnu. Thanks Di Zhao gcc/testsuite/ChangeLog: * gcc.dg/pr110279-1.c: add target selector. Rather than list

Re: [PATCH 4/4] Testsuite updates

2024-05-22 Thread Jeff Law
On 5/22/24 4:58 AM, Richard Biener wrote: RISC-V CI didn't trigger (not sure what magic is required). Both ARM and AARCH64 show that the "Vectorizing stmts using SLP" are a bit fragile because we sometimes cancel SLP because we want to use load/store-lanes. The RISC-V tag on the subject li

[x86_64 PATCH] Correct insn_cost of movabsq.

2024-05-22 Thread Roger Sayle
This single line patch fixes a strange quirk/glitch in i386's rtx_costs, which considers an instruction loading a 64-bit constant to be significantly cheaper than loading a 32-bit (or smaller) constant. Consider the two functions: unsigned long long foo() { return 0x0123456789abcdefULL; } unsigned

Re: [PATCH] Fix PR rtl-optimization/115038

2024-05-22 Thread Jeff Law
On 5/20/24 1:13 AM, Eric Botcazou wrote: Hi, this is a regression present on mainline and 14 branch under the form of an ICE in seh_cfa_offset from config/i386/winnt.cc on the attached C++ testcase compiled with -O2 -fno-omit-frame-pointer. The problem directly comes from the -ffold-mem-offs

Re: [x86_64 PATCH] Correct insn_cost of movabsq.

2024-05-22 Thread Uros Bizjak
On Wed, May 22, 2024 at 5:15 PM Roger Sayle wrote: > > This single line patch fixes a strange quirk/glitch in i386's rtx_costs, > which considers an instruction loading a 64-bit constant to be significantly > cheaper than loading a 32-bit (or smaller) constant. > > Consider the two functions: > un

Re: [PATCH] Fix auto deduction for template specialization scopes [114915].

2024-05-22 Thread Jason Merrill
Thanks for the patch! Please review https://gcc.gnu.org/contribute.html for more details of the format patches should have. In particular, you don't seem to have a copyright assignment on file with the FSF, so you'll need to either do that or certify that the contribution is under the DCO.

Re: [PATCH] Fix auto deduction for template specialization scopes [114915].

2024-05-22 Thread Patrick Palka
On Wed, 22 May 2024, Jason Merrill wrote: > Thanks for the patch! > > Please review https://gcc.gnu.org/contribute.html for more details of the > format patches should have. In particular, you don't seem to have a copyright > assignment on file with the FSF, so you'll need to either do that or c

Re: [PATCH] Fix auto deduction for template specialization scopes [114915].

2024-05-22 Thread Jason Merrill
On 5/22/24 12:48, Patrick Palka wrote: On Wed, 22 May 2024, Jason Merrill wrote: Thanks for the patch! Please review https://gcc.gnu.org/contribute.html for more details of the format patches should have. In particular, you don't seem to have a copyright assignment on file with the FSF, so yo

Re: [x86_64 PATCH] Correct insn_cost of movabsq.

2024-05-22 Thread Richard Biener
> Am 22.05.2024 um 17:30 schrieb Uros Bizjak : > > On Wed, May 22, 2024 at 5:15 PM Roger Sayle > wrote: >> >> This single line patch fixes a strange quirk/glitch in i386's rtx_costs, >> which considers an instruction loading a 64-bit constant to be significantly >> cheaper than loading a 32

Re: RISC-V: Fix round_32.c test on RV32

2024-05-22 Thread Jeff Law
On 5/22/24 6:47 AM, Jivan Hakobyan wrote: After 8367c996e55b2 commit several checks on round_32.c test started to fail. The reason is that we prevent rounding DF->SI->DF on RV32 and instead of a conversion sequence we get calls to appropriate library functions. gcc/testsuite/ChangeLog:

Re: RISC-V: Fix round_32.c test on RV32

2024-05-22 Thread Palmer Dabbelt
On Wed, 22 May 2024 11:01:16 PDT (-0700), jeffreya...@gmail.com wrote: On 5/22/24 6:47 AM, Jivan Hakobyan wrote: After 8367c996e55b2 commit several checks on round_32.c test started to fail. The reason is that we prevent rounding DF->SI->DF on RV32 and instead of a conversion sequence we get

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-22 Thread Qing Zhao
> On May 22, 2024, at 03:38, Richard Biener wrote: > > On Tue, May 21, 2024 at 11:36 PM David Malcolm wrote: >> >> On Tue, 2024-05-21 at 15:13 +, Qing Zhao wrote: >>> Thanks for the comments and suggestions. >>> On May 15, 2024, at 10:00, David Malcolm wrote: On Tue

Re: RISC-V: Fix round_32.c test on RV32

2024-05-22 Thread Jeff Law
On 5/22/24 12:15 PM, Palmer Dabbelt wrote: On Wed, 22 May 2024 11:01:16 PDT (-0700), jeffreya...@gmail.com wrote: On 5/22/24 6:47 AM, Jivan Hakobyan wrote: After 8367c996e55b2 commit several checks on round_32.c test started to fail. The reason is that we prevent rounding DF->SI->DF on RV3

Re: RISC-V: Fix round_32.c test on RV32

2024-05-22 Thread Palmer Dabbelt
On Wed, 22 May 2024 12:02:26 PDT (-0700), jeffreya...@gmail.com wrote: On 5/22/24 12:15 PM, Palmer Dabbelt wrote: On Wed, 22 May 2024 11:01:16 PDT (-0700), jeffreya...@gmail.com wrote: On 5/22/24 6:47 AM, Jivan Hakobyan wrote: After 8367c996e55b2 commit several checks on round_32.c test st

Re: [PATCH v2] testsuite: Verify r0-r3 are extended with CMSE

2024-05-22 Thread Torbjorn SVENSSON
Hi, I've now pushed the below change to the following branches with the corresponding commit id. trunk: 9ddad76e98ac8f257f90b3814ed3c6ba78d0f3c7 releases/gcc-14: da3a6b0dda45bc676bb985d7940853b50803e11a releases/gcc-13: 75d394c20b0ad85dfe8511324d61d13e453c9285 releases/gcc-12: d9c89402b54be4c1

Re: [PATCH] aarch64: Fold vget_high_* intrinsics to BIT_FIELD_REF [PR102171]

2024-05-22 Thread Andrew Pinski
On Wed, May 22, 2024 at 5:28 AM Richard Sandiford wrote: > > Pengxuan Zheng writes: > > This patch is a follow-up of r15-697-ga2e4fe5a53cf75 to also fold > > vget_high_* > > intrinsics to BIT_FIELD_REF and remove the vget_high_* definitions from > > arm_neon.h to use the new intrinsics framework

Re: [PATCH v4] c++: fix constrained auto deduction in templ spec scopes [PR114915]

2024-05-22 Thread Jason Merrill
OK, on the right patch this time I hope. Looks like you still need either FSF copyright assignment or DCO certification per https://gcc.gnu.org/contribute.html#legal On 5/15/24 13:27, Seyed Sajad Kahani wrote: This patch resolves PR114915 by replacing the logic that fills in the missing level

Re: [PATCH-1v2, rs6000] Implement optab_isinf for SFDF and IEEE128

2024-05-22 Thread Peter Bergner
On 5/19/24 10:28 PM, HAO CHEN GUI wrote: > +(define_expand "isinf2" > + [(use (match_operand:SI 0 "gpc_reg_operand")) > + (use (match_operand:SFDF 1 "gpc_reg_operand"))] > + "TARGET_HARD_FLOAT && TARGET_P9_VECTOR" > +{ > + emit_insn (gen_xststdcp (operands[0], operands[1], GEN_INT (0x30))); >

[committed] libstdc++: Guard use of sized deallocation [PR114940]

2024-05-22 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. Backport needed too. -- >8 -- Clang does not enable -fsized-deallocation by default, which means it can't compile our and headers. Make the __cpp_lib_generator macro depend on the compiler-defined __cpp_sized_deallocation macro, and change to use unsized
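The guard described in this message can be sketched roughly as follows. This is only an illustration of testing the compiler-defined `__cpp_sized_deallocation` macro, not the code from the patch; `release` and `roundtrip` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper (not from the patch): free a raw buffer, passing the
// size to operator delete only when the compiler advertises sized
// deallocation through the __cpp_sized_deallocation feature-test macro.
inline void release(void* p, std::size_t n) noexcept
{
#if defined(__cpp_sized_deallocation)
  ::operator delete(p, n);   // sized form, available when the macro is defined
#else
  (void) n;                  // size is unused on the fallback path
  ::operator delete(p);      // unsized form, always available
#endif
}

// Hypothetical usage: allocate n bytes and release them again.
inline bool roundtrip(std::size_t n)
{
  assert(n > 0);
  void* p = ::operator new(n);
  release(p, n);
  return true;
}
```

Either branch compiles with a compiler that leaves -fsized-deallocation off, because the sized overload is only named when the macro is present.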

[committed] libstdc++: Add [[nodiscard]] to some std::locale functions

2024-05-22 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8 -- libstdc++-v3/ChangeLog: * include/bits/locale_classes.h (locale::combine) (locale::name, locale::operator==, locale::operator!=) (locale::operator(), locale::classic): Add nodiscard attribute. * include/bits/l

[committed] libstdc++: Fix effects of combining locales [PR108323]

2024-05-22 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8 -- This fixes a bug in locale::combine where we fail to meet the standard's requirement that the result is unnamed. It also implements two library issues related to the names of combined locales (2295 and 3676). libstdc++-v3/ChangeLog: PR libs

libstdc++: the specialization atomic_ref<bool> should use the primary template

2024-05-22 Thread Lebrun-Grandie, Damien
See patch attached to this email. Best, Damien 0001-libstdc-the-specialization-atomic_ref-bool-should-us.patch Description: 0001-libstdc-the-specialization-atomic_ref-bool-should-us.patch

[PATCH v4] Match: Add overloaded types_match to avoid code dup [NFC]

2024-05-22 Thread pan2 . li
From: Pan Li There are sorts of match pattern for SAT related cases, there will be some duplicated code to check the dest, op_0, op_1 are same tree types. Aka ternary tree type matches. Thus, add overloaded types_match func do this and avoid match code duplication. The below test suites are p
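The idea of a three-operand overload can be sketched as below. This is a stand-alone illustration, not GCC code: `Type` is a hypothetical stand-in for GCC's tree type nodes, and the two-operand function only imitates the role of the existing `types_match`.

```cpp
#include <cassert>

// Hypothetical stand-in for a type node; GCC itself works on `tree`.
struct Type { int id; };

// Binary check: two types match when they denote the same type.
inline bool types_match(const Type* a, const Type* b)
{
  assert(a && b);
  return a->id == b->id;
}

// Ternary overload: dest, op_0 and op_1 must all have the same type.
// It is written in terms of the binary check, so callers no longer
// need to spell out the pairwise comparisons themselves.
inline bool types_match(const Type* a, const Type* b, const Type* c)
{
  return types_match(a, b) && types_match(a, c);
}
```

A pattern that previously repeated two pairwise comparisons could then call the three-operand form once.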

Re: [PATCH 13/13] rs6000, remove vector set and vector init built-ins.

2024-05-22 Thread Carl Love
Kewen: On 5/13/24 22:44, Kewen.Lin wrote: >> perform the same operation as setting a specific element in the vector in >> C code. For example: >> >> src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index); >> src_v4si[index] = int_val; >> >> The built-in actually generates more instructi
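The plain element assignment quoted above can be tried with the generic GCC/Clang vector extension. The sketch below is a minimal illustration, assuming a compiler that supports `vector_size` and vector subscripting; the typedef and the helpers `set_lane` and `get_lane` are hypothetical and not taken from the rs6000 headers.

```cpp
#include <cassert>

// Four 32-bit ints in one 16-byte vector (GCC/Clang vector extension).
typedef int v4si __attribute__((vector_size(16)));

// Hypothetical helper: set one element by subscript, as in
// `src_v4si[index] = int_val;` from the quoted message.
inline v4si set_lane(v4si v, int value, unsigned index)
{
  assert(index < 4);   // stay inside the four lanes
  v[index] = value;
  return v;
}

// Hypothetical helper: read one element back by subscript.
inline int get_lane(v4si v, unsigned index)
{
  assert(index < 4);
  return v[index];
}
```

For example, starting from `v4si v = {1, 2, 3, 4};`, calling `set_lane(v, 42, 2)` returns a vector whose lane 2 holds 42 while the other lanes are unchanged.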

[PATCH] aarch64: testsuite: Explicitly add -mlittle-endian to vget_low_2.c

2024-05-22 Thread Pengxuan Zheng
vget_low_2.c is a test case for little-endian, but we missed the -mlittle-endian flag in r15-697-ga2e4fe5a53cf75. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vget_low_2.c: Add -mlittle-endian. Signed-off-by: Pengxuan Zheng --- gcc/testsuite/gcc.target/aarch64/vget_low_2.c | 2 +- 1 f

[PATCH] missing reuire target has_arch_ppc64 for pr106550.c

2024-05-22 Thread Jiufu Guo
Hi, Case pr106550.c is testing constant building for 64bit register. So, this case requires target of has_arch_ppc64. Bootstrap and regtest pass on ppc64{,le}. Is this ok for trunk? BR, Jeff(Jiufu) Guo --- gcc/testsuite/gcc.target/powerpc/pr106550.c | 1 + 1 file changed, 1 insertion(+) diff

Re: [V2 PATCH] Don't reduce estimated unrolled size for innermost loop at cunrolli.

2024-05-22 Thread Hongtao Liu
On Wed, May 22, 2024 at 1:07 PM liuhongt wrote: > > >> Hard to find a default value satisfying all testcases. > >> some require loop unroll with 7 insns increment, some don't want loop > >> unroll w/ 5 insn increment. > >> The original 2/3 reduction happened to meet all those testcases(or the > >>

Re: [PATCH 13/13] rs6000, remove vector set and vector init built-ins.

2024-05-22 Thread Kewen.Lin
Hi Carl, on 2024/5/23 08:29, Carl Love wrote: > Kewen: > > On 5/13/24 22:44, Kewen.Lin wrote: >>> perform the same operation as setting a specific element in the vector in >>> C code. For example: >>> >>> src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index); >>> src_v4si[index] = int

Re: [PATCH] missing reuire target has_arch_ppc64 for pr106550.c

2024-05-22 Thread Kewen.Lin
Hi Jeff, subject typo: s/reuire/require/ on 2024/5/23 09:11, Jiufu Guo wrote: > Hi, > > Case pr106550.c is testing constant building for 64bit > register. So, this case requires target of has_arch_ppc64. > Nit: Maybe add more comments saying it fails with -m32 without having the expected rldim

[PATCH] .gitattributes: disable crlf translation

2024-05-22 Thread Peter Damianov
By default, git has the "autocrlf" """feature""" enabled. This causes the files to have CRLF line endings when checked out on windows, which in the case of configure, causes confusing errors like: ./gcc/configure: line 14: $'\r': command not found ./gcc/configure: line 29: syntax error near unexpe

Re: [PATCH] AARCH64: Add Qualcomnm oryon-1 core

2024-05-22 Thread Andrew Pinski
On Tue, May 14, 2024 at 10:27 AM Kyrill Tkachov wrote: > > Hi Andrew, > > On Fri, May 3, 2024 at 8:50 PM Andrew Pinski wrote: >> >> This patch adds Qualcomm's new oryon-1 core; this is enough >> to recognize the core and later on will add the tuning structure. >> >> gcc/ChangeLog: >> >> *

RE: [PATCH v2] Match: Support __builtin_add_overflow branch form for unsigned SAT_ADD

2024-05-22 Thread Li, Pan2
Thanks Richard for reviewing. > I'm not convinced we should match this during early if-conversion, should we? > The middle-end doesn't really know .SAT_ADD but some handling of > .ADD_OVERFLOW is present. I tried to do the branch (aka cond) match in widen-mult pass similar as previous branchless

[PATCH] Avoid vector -Wfree-nonheap-object warnings

2024-05-22 Thread François Dumont
As explained in this email: https://gcc.gnu.org/pipermail/libstdc++/2024-April/058552.html I experimented -Wfree-nonheap-object because of my enhancements on algos. So here is a patch to extend the usage of the _Guard type to other parts of vector. libstdc++: Use RAII to replace try/catc
