Re: [PATCH] Fix mismatch between constraint and predicate for ashl3_doubleword.

2024-07-26 Thread Hongtao Liu
On Fri, Jul 26, 2024 at 2:59 PM liuhongt wrote: > > (insn 98 94 387 2 (parallel [ > (set (reg:TI 337 [ _32 ]) > (ashift:TI (reg:TI 329) > (reg:QI 521))) > (clobber (reg:CC 17 flags)) > ]) "test.c":11:13 953 {ashlti3_doubleword} >

Re: [PATCH v2 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-26 Thread Kyrylo Tkachov
Hi Claudio, > On 25 Jul 2024, at 16:25, Claudio Bantaloukas > wrote: > > This introduces the relevant flags to enable access to the fpmr register and > fp8 intrinsics, which will be added subsequently. > > gcc/ChangeLog: > >

[PATCH v2] i386: Fix AVX512 intrin macro typo

2024-07-26 Thread Haochen Jiang
Hi all, I have added related testcases into the patch. Ok for trunk and backport to GCC 14, GCC 13 and GCC 12? Thx, Haochen --- Changes in v2: Add related testcases --- There are several typos in the AVX512 intrinsic macro definitions. Correct them to fix errors when compiling with -O0. gcc/ChangeLog

Re: [PATCH v2 3/3] aarch64: Add fpm register helper functions.

2024-07-26 Thread Kyrylo Tkachov
Hi Claudio, > On 25 Jul 2024, at 16:25, Claudio Bantaloukas > wrote: > > The ACLE declares several helper types and functions to > facilitate construction of `fpm` arguments. > > gcc/ChangeLog: > >* config/aarch64/arm_ac

[PATCH v5 0/3] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets

2024-07-26 Thread Manolis Tsamis
noce_convert_multiple_sets has been introduced and extended over time to handle if conversion for blocks with multiple sets. Currently this is focused on register moves and rejects any sort of arithmetic operations. This series is an extension to allow more sequences to take part in if conversio

[PATCH v5 2/3] [RFC] ifcvt: Allow more operations in multiple set if conversion

2024-07-26 Thread Manolis Tsamis
Currently the operations allowed for if conversion of a basic block with multiple sets are few, namely REG, SUBREG and CONST_INT (as controlled by bb_ok_for_noce_convert_multiple_sets). This commit allows more operations (arithmetic, compare, etc) to participate in if conversion. The target's prof
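For readers unfamiliar with the transformation, the following is a minimal C sketch of the kind of block this series is aimed at: one condition guarding several assignments, some of them arithmetic. The function name and the particular operations are illustrative assumptions, not taken from the patch or its testcases, and whether such a block is actually if-converted depends on the target and its cost model.

```c
/* Hypothetical example, not from the patch: a basic block with multiple
   sets guarded by a single condition.  According to the description
   above, only REG, SUBREG and CONST_INT sources were accepted so far,
   so the arithmetic here would previously block the conversion.  */
long
multi_set_example (long a, long b, long c)
{
  long x = a;
  long y = b;

  if (c > 0)
    {
      x = a + b;  /* arithmetic set */
      y = b - a;  /* another arithmetic set */
    }

  /* After if-conversion the branch could become conditional moves.  */
  return x ^ y;
}
```

Comparing the generated assembly with and without the series applied would show whether the branch was replaced; the sketch itself only fixes the source-level shape.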

[PATCH v5 3/3] [RFC] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets

2024-07-26 Thread Manolis Tsamis
The existing implementation of need_cmov_or_rewire and noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG. This commit enhances them so they can handle/rewire arbitrary set statements. To do that a new helper struct noce_multiple_sets_info is introduced which is used by noce_

[PATCH v5 1/3] [RFC] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2024-07-26 Thread Manolis Tsamis
This is an extension of what was done in PR106590. Currently if a sequence generated in noce_convert_multiple_sets clobbers the condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards (sequences that emit the comparison itself). Since this applies only from the next iteration it ass

Re: [PING] [PATCH v4 1/3] [RFC] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2024-07-26 Thread Manolis Tsamis
On Thu, Jul 11, 2024 at 1:03 AM Jeff Law wrote: > > > > On 6/3/24 5:34 AM, Manolis Tsamis wrote: > > This is an extension of what was done in PR106590. > > > > Currently if a sequence generated in noce_convert_multiple_sets clobbers the > > condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is us

Re: [PATCH v4 1/3] [RFC] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2024-07-26 Thread Manolis Tsamis
On Wed, Jun 5, 2024 at 2:00 PM Richard Sandiford wrote: > > Sorry for the slow review. > > Manolis Tsamis writes: > > This is an extension of what was done in PR106590. > > > > Currently if a sequence generated in noce_convert_multiple_sets clobbers the > > condition rtx (cc_cmp or rev_cc_cmp) th

[PATCH] i386: Mark target option with optimization when enabled with opt level [PR116065]

2024-07-26 Thread Hongyu Wang
Hi, When munroll-only-small-loops was introduced, the option was marked as Target Save and added to the -O2 defaults, which makes attribute(optimize) reset target options and causes an error when the command line has O1 and the function attribute has O2 and other target options. Mark this option as Optimization to fix this.

[PATCH v2] i386: Add non-optimize prefetchi intrins

2024-07-26 Thread Haochen Jiang
Hi all, I added a related -O0 testcase in this patch. Ok for trunk and backport to GCC 14 and GCC 13? Thx, Haochen --- Changes in v2: Add testcases. --- Under -O0, with the "newly" introduced intrinsics, the variable will be transformed into a mem instead of the original symbol_ref. The compiler will th

Re: [PATCHv2, expand] Add const0 move checking for CLEAR_BY_PIECES optabs

2024-07-26 Thread HAO CHEN GUI
Hi Jeff, On 2024/7/24 5:57, Jeff Law wrote: > > > On 7/21/24 7:58 PM, HAO CHEN GUI wrote: >> Hi, >> This patch adds const0 move checking for CLEAR_BY_PIECES. The original >> vec_duplicate handles duplicates of non-constant inputs. But 0 is a >> constant. So even a platform doesn't support vec_dup

[PATCH v1] Match: Support .SAT_SUB with IMM op for form 1-4

2024-07-26 Thread pan2 . li
From: Pan Li This patch adds support for .SAT_SUB when one of the operands is an immediate (IMM), i.e. forms 1-4 below. Form 1: #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_1 (T y) \ { \ return IM
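The macro in the message is cut off, so the following is not a reconstruction of form 1. It is only a generic sketch of what unsigned saturating subtraction with an immediate operand computes. The helper name, the constant 10, the uint8_t type and the choice of the immediate as the minuend are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical immediate, chosen only for this sketch.  */
#define SKETCH_IMM 10u

/* Hypothetical helper, not from the patch: computes SKETCH_IMM - y,
   clamped to 0 instead of wrapping around when y is larger.  */
uint8_t
sat_u_sub_imm_sketch (uint8_t y)
{
  return (uint8_t) SKETCH_IMM >= y ? (uint8_t) (SKETCH_IMM - y) : 0;
}
```

With these assumptions, an input of 3 yields 7, while any input of 10 or more yields 0 rather than a wrapped value.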

[PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Tamar Christina
Hi All, This updates the cost for Neoverse V2 to reflect the updated Software Optimization Guide. It also makes Cortex-X3 use the Neoverse V2 cost model. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch

[PATCH 4/8]AArch64: Add Neoverse N3 and Cortex-A725 core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This adds a cost model and core definition for Neoverse N3 and Cortex-A725. It also makes Cortex-A725 use the Neoverse N3 cost model. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def

[PATCH 6/8]AArch64: Update Neoverse N2 cost model to release costs

2024-07-26 Thread Tamar Christina
Hi All, This updates the cost for Neoverse N2 to reflect the updated Software Optimization Guide. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/tuning_models/neoversen2.h: Update costs. --- diff --git a/gc

[PATCH 2/8]AArch64: Add Neoverse V3 core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This adds a cost model and core definition for Neoverse V3. It also makes Cortex-X4 use the Neoverse V3 cost model. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (cortex-x4): Upda

[PATCH 8/8]AArch64: take gather/scatter decode overhead into account

2024-07-26 Thread Tamar Christina
Hi All, Gathers and scatters are not usually beneficial when the loop count is small. This is because there is a cost not only to executing them within the loop but also to entering loops that contain them. As such this patch models this overhead. For generic tuning we however still prefe

[PATCH 7/8]AArch64: Add Cortex-X925 core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This adds a cost model and core definition for Cortex-X925. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (cortex-x925): New. * config/aarch64/aarch64-tune.md: Regenerate.

[PATCH 3/8]AArch64: Add Neoverse V3AE core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This adds a cost model and core definition for Neoverse V3AE. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (neoverse-v3ae): New. * config/aarch64/aarch64-tune.md: Regenera

Re: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > For historical reasons AArch64 has TI mode vector types but does not consider > TImode a vector mode. > > What's happening in the PR is that get_vectype_for_scalar_type is returning > vector(1) TImode for a TImode scalar. This then fails when we call > target

[PATCH 5/8]AArch64: Update Generic Armv9-a cost model to release costs

2024-07-26 Thread Tamar Christina
Hi All, this updates the costs for generic-armv9-a based on the updated costs for Neoverse V2 and Neoverse N2. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/tuning_models/generic_armv9_a.h: Update costs. ---

Re: [PATCH 2/5] aarch64: sve: Rename aarch64_bic to standard pattern, andn

2024-07-26 Thread Richard Sandiford
Kyrylo Tkachov writes: >> On 25 Jul 2024, at 04:14, Andrew Pinski wrote: >> >> Now there is an optab for bic, andn since r15-1890-gf379596e0ba99d. >> This moves aarch64_bic for sve over to use it instead. >> >> Note unlike the

RE: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, July 26, 2024 10:24 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: check for vector mode in get_mask_mode > [PR116074]

[PATCH]middle-end: check for vector mode before in get_mask_mode [PR116074]

2024-07-26 Thread Tamar Christina
Hi All, For historical reasons AArch64 has TI mode vector types but does not consider TImode a vector mode. What's happening in the PR is that get_vectype_for_scalar_type is returning vector(1) TImode for a TImode scalar. This then fails when we call targetm.vectorize.get_mask_mode (vecmode).exi

Re: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Richard Biener
> On 26.07.2024 at 11:29, Tamar Christina wrote: > > >> >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, July 26, 2024 10:24 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.org >> Subject

Re: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, July 26, 2024 10:24 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.org >> Subject: Re: [PATCH]AArch64: check for vector m

Re: [PATCH]middle-end: check for vector mode before in get_mask_mode [PR116074]

2024-07-26 Thread Richard Biener
> On 26.07.2024 at 11:40, Tamar Christina wrote: > > Hi All, > > For historical reasons AArch64 has TI mode vector types but does not consider > TImode a vector mode. > > What's happening in the PR is that get_vectype_for_scalar_type is returning > vector(1) TImode for a TImode scalar. Th

RE: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, July 26, 2024 10:43 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: check for vector mode in get_mask_mode > [PR116074]

RE: [PATCH 3/8]AArch64: Add Neoverse V3AE core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This is a new version with the confirmed correct part number. An updated TRM is being published. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (neoverse-v3ae): New. * confi

Re: [PATCH v2 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-26 Thread Claudio Bantaloukas
On 26/07/2024 08:15, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 25 Jul 2024, at 16:25, Claudio Bantaloukas >> wrote: >> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index e0a641213ae..f293d49c61a 100644 >> --- a/gcc/doc/invoke.texi >> +++ b/gcc/doc/invoke.texi >> @@ -21843,6 +

Re: [PATCH v2 3/3] aarch64: Add fpm register helper functions.

2024-07-26 Thread Claudio Bantaloukas
On 26/07/2024 09:13, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 25 Jul 2024, at 16:25, Claudio Bantaloukas >> wrote: >> >> The ACLE declares several helper types and functions to >> facilitate construction of `fpm` arguments. >

[RESEND PATCH v5 1/3] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2024-07-26 Thread Manolis Tsamis
This is an extension of what was done in PR106590. Currently if a sequence generated in noce_convert_multiple_sets clobbers the condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards (sequences that emit the comparison itself). Since this applies only from the next iteration it ass

[RESEND PATCH v5 0/3] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets

2024-07-26 Thread Manolis Tsamis
noce_convert_multiple_sets has been introduced and extended over time to handle if conversion for blocks with multiple sets. Currently this is focused on register moves and rejects any sort of arithmetic operations. This series is an extension to allow more sequences to take part in if conversio

[RESEND PATCH v5 2/3] ifcvt: Allow more operations in multiple set if conversion

2024-07-26 Thread Manolis Tsamis
Currently the operations allowed for if conversion of a basic block with multiple sets are few, namely REG, SUBREG and CONST_INT (as controlled by bb_ok_for_noce_convert_multiple_sets). This commit allows more operations (arithmetic, compare, etc) to participate in if conversion. The target's prof

[RESEND PATCH v5 3/3] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets

2024-07-26 Thread Manolis Tsamis
The existing implementation of need_cmov_or_rewire and noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG. This commit enhances them so they can handle/rewire arbitrary set statements. To do that a new helper struct noce_multiple_sets_info is introduced which is used by noce_

Re: [RESEND PATCH v5 1/3] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2024-07-26 Thread Sam James
Manolis Tsamis writes: > This is an extension of what was done in PR106590. FWIW, I think that if a bug is worth mentioning in the commit message, it's worth tagging so the hooks pick it up (as you get a nice reverse-mapping then if anyone is looking at it and wondering if a follow-up occurred).

Re: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, July 26, 2024 10:43 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.org >> Subject: Re: [PATCH]AArch64: check for vector m

[PATCH] RISC-V: Work around bare apostrophe in error string.

2024-07-26 Thread Robin Dapp
Hi, an unquoted apostrophe slipped through when testing the recent V/M extension patch. This, again, re-words the message to "Currently the 'V' implementation requires the 'M' extension". Going to commit as obvious after testing. Regards Robin gcc/ChangeLog: * config/riscv/riscv.cc (

Re: [PATCH v1 2/2] PR116019: Improve tail call error message

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 12:55 AM Andi Kleen wrote: > > From: Andi Kleen > > The "tail call must be the same type" message is common on some > targets with C++, or without optimization. It is generated > when gcc believes there is an access of the return value > after the call. However usually it

Re: [PATCH v1 1/2] PR116080: Fix tail call dejagnu checks

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 12:55 AM Andi Kleen wrote: > > From: Andi Kleen > > - Run the target_effective tail_call checks without optimization to > match the actual test cases. > - Add an extra check for external tail calls to handle targets like > powerpc that cannot tail call between different ob

Re: [PATCH v2] i386: Fix AVX512 intrin macro typo

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 10:14 AM Haochen Jiang wrote: > > Hi all, > > I have added related testcases into the patch. > > Ok for trunk and backport to GCC 14, GCC 13 and GCC 12? Hmm, it might be OK for 14.2 still, even without a new RC. But please wait until after 14.2 is released unless Jakub al

Re: [PATCH] i386: Mark target option with optimization when enabled with opt level [PR116065]

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 10:50 AM Hongyu Wang wrote: > > Hi, > > When introducing munroll-only-small-loops, the option was marked as > Target Save and added to -O2 default which makes attribute(optimize) > reset target option and causes error when cmdline has O1 and > function attribute has O2 an

Re: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 1:15 PM Richard Sandiford wrote: > > Tamar Christina writes: > >> -Original Message- > >> From: Richard Sandiford > >> Sent: Friday, July 26, 2024 10:43 AM > >> To: Tamar Christina > >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > >> ; Marcus Shawcroft >

[PATCH 1/2] ipa: Treat static constructors and destructors as non-local (PR 115815)

2024-07-26 Thread Martin Jambor
Hi, in PR 115815, IPA-SRA thought it had control over all invocations of a (recursive) static destructor but it did not see the implied invocation which led to the original being left behind and the clean-up code encountering uses of SSAs that definitely should have been dead. Fixed by teaching c

[PATCH 2/2] ipa: Move pass_ipa_cdtor_merge before pass_ipa_cp and pass_ipa_sra

2024-07-26 Thread Martin Jambor
Hi, when looking at PR 115815 we realized that it would make sense to make calls to functions originally declared static constructors and destructors created by pass_ipa_cdtor_merge visible to IPA-SRA. This patch does that. Bootstrapped and tested on x86_64-linux. OK for master? Thanks, Marti

Re: [PATCH v2] i386: Fix AVX512 intrin macro typo

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 04:10:48PM +0800, Haochen Jiang wrote: > * config/i386/avx512dqintrin.h > (_mm_mask_fpclass_ss_mask): Correct operand order. > (_mm_mask_fpclass_sd_mask): Ditto. > (_mm_reduce_round_sd): Use -1 as mask since it is non-mask. > (_mm_reduce_round_s

Re: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Kyrylo Tkachov
Hi Tamar, > On 26 Jul 2024, at 11:19, Tamar Christina wrote: > > Hi All, > > This updates the cost for Neoverse V2 to reflect the updated > Software Optimization Guide. > > It also makes Cortex-X3 use the Neoverse V2 cost model.

Re: [PATCH 2/8]AArch64: Add Neoverse V3 core definition and cost model

2024-07-26 Thread Kyrylo Tkachov
Hi Tamar, > On 26 Jul 2024, at 11:20, Tamar Christina wrote: > > Hi All, > > This adds a cost model and core definition for Neoverse V3. > > It also makes Cortex-X4 use the Neoverse V3 cost model. > > Bootstrapped Regtested on a

RE: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, July 26, 2024 1:10 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH 1/8]AArch64: Update Neoverse V2 cost model

Re: [PATCH 3/8]AArch64: Add Neoverse V3AE core definition and cost model

2024-07-26 Thread Kyrylo Tkachov
> On 26 Jul 2024, at 12:26, Tamar Christina wrote: > > Hi All, > > This is a new version with the confirmed correct part number. > > An updated TRM is being published. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no

Re: [PATCH 4/8]AArch64: Add Neoverse N3 and Cortex-A725 core definition and cost model

2024-07-26 Thread Kyrylo Tkachov
> On 26 Jul 2024, at 11:20, Tamar Christina wrote: > > Hi All, > > This adds a cost model and core definition for Neoverse N3 and Cortex-A725. > > It also makes Cortex-A725 use the Neoverse N3 cost model. > > Bootstrapped Regt

Re: [PATCH 6/8]AArch64: Update Neoverse N2 cost model to release costs

2024-07-26 Thread Kyrylo Tkachov
> On 26 Jul 2024, at 11:21, Tamar Christina wrote: > > Hi All, > > This updates the cost for Neoverse N2 to reflect the updated > Software Optimization Guide. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Re: [PATCH 7/8]AArch64: Add Cortex-X925 core definition and cost model

2024-07-26 Thread Kyrylo Tkachov
> On 26 Jul 2024, at 11:21, Tamar Christina wrote: > > Hi All, > > This adds a cost model and core definition for Cortex-X925. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? Ok. Thanks,

Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches

2024-07-26 Thread Richard Biener
On Thu, 18 Jul 2024, Filip Kastl wrote: > On Thu 2024-07-18 12:07:42, Richard Biener wrote: > > On Wed, 17 Jul 2024, Filip Kastl wrote: > > > > > + } > > > > > + > > > > > + vec v; > > > > > + v.create (1); > > > > > + v.quick_push (m_final_bb); > > > > > + iterate_fix_domi

Re: [PATCH 5/8]AArch64: Update Generic Armv9-a cost model to release costs

2024-07-26 Thread Kyrylo Tkachov
Hi Tamar, > On 26 Jul 2024, at 11:21, Tamar Christina wrote: > > Hi All, > > this updates the costs for generic-armv9-a based on the updated costs for > Neoverse V2 and Neoverse N2. > > Bootstrapped Regtested on aarch64-none-linux-

Re: [RFC][PATCH 1/5] vect: Fix single_imm_use in tree_vect_patterns

2024-07-26 Thread Richard Biener
On Sun, Jul 21, 2024 at 11:15 AM Feng Xue OS wrote: > > The work for RFC > (https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657860.html) > involves not a little code change, so I have to separate it into several > batches > of patchset. This and the following patches constitute the first bat

RE: [PATCH 5/8]AArch64: Update Generic Armv9-a cost model to release costs

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, July 26, 2024 1:35 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH 5/8]AArch64: Update Generic Armv9-a cost m

[PATCH] LoongArch: Expand some SImode operations through "si3_extend" instructions if TARGET_64BIT

2024-07-26 Thread Xi Ruoyao
We already had "si3_extend" insns and we hoped the fwprop or combine passes can use them to remove unnecessary sign extensions. But this does not always work: for cases like x << 1 | y, the compiler tends to do (sign_extend:DI (ior:SI (ashift:SI (reg:SI $r4) (co

Re: [RFC] Generalize formation of lane-reducing ops in loop reduction

2024-07-26 Thread Richard Biener
On Sun, Jul 21, 2024 at 11:12 AM Feng Xue OS wrote: > > Hi, > > I composed some patches to generalize lane-reducing (dot-product is a > typical representative) pattern recognition, and prepared a RFC document so > as to help > review. The original intention was to make a complete solution for

Re: [PATCH] fold: Allow SSA names in inverse_conditions_p and fold VCOND_MASK.

2024-07-26 Thread Richard Biener
On Thu, Jul 25, 2024 at 3:34 PM Robin Dapp wrote: > > Hi, > > In preparation for the maskload else operand I split off this patch. The > patch > looks through SSA names for the conditions passed to inverse_conditions_p > which > helps match.pd recognize more redundant vec_cond expressions. It

Re: [PATCH 8/8]AArch64: take gather/scatter decode overhead into account

2024-07-26 Thread Kyrylo Tkachov
Hi Tamar, > On 26 Jul 2024, at 11:21, Tamar Christina wrote: > > Hi All, > > Gathers and scatters are not usually beneficial when the loop count is small. > This is because there's not only a cost to their execution within the loop

Re: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This updates the cost for Neoverse V2 to reflect the updated > Software Optimization Guide. > > It also makes Cortex-X3 use the Neoverse V2 cost model. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > > Thanks, > Tamar >

RE: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, July 26, 2024 2:12 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release > costs

Re: [PATCH v3 2/2] Prevent divide-by-zero

2024-07-26 Thread Richard Biener
On Thu, May 30, 2024 at 2:11 AM Patrick O'Neill wrote: > > From: Greg McGary > > gcc/ChangeLog: > * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent > divide-by-zero. > * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove dg-ice. > --- > No changes in v3. Depends

Re: [PATCH v1] Match: Support .SAT_SUB with IMM op for form 1-4

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 11:20 AM wrote: > > From: Pan Li > > This patch adds support for .SAT_SUB when one of the operands > is an immediate (IMM), i.e. forms 1-4 below. > > Form 1: > #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ > T __attribute__((noinline)) \ > sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)

arm: Prevent ICE when doloop dec_set is not PLUS_EXPR

2024-07-26 Thread Andre Vieira (lists)
This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter was making an assumption about the form of a candidate for a dec_insn. It also makes sure that if it does not initially encounter a 'set' in such a form it tries to find another set that could be the right one.

[to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter

2024-07-26 Thread Jeff Law
pr116085 is a long standing (since late 2022) regression on the riscv port. A patch introduced a pattern to avoid unnecessary extensions when doing a min/max operation where one of the values is a 32 bit positive constant. (define_insn_and_split "*minmax" [(set (match_operand:DI 0 "registe

Re: [PATCH] c++: Implement C++26 P2558R2 - Add @, $, and ` to the basic character set [PR110343]

2024-07-26 Thread Jason Merrill
On 7/17/24 6:04 PM, Jakub Jelinek wrote: Hi! The following patch implements the easy parts of the paper. When @$` are added to the basic character set, it means that R"@$`()@$`" should now be valid (here I've noticed most of the raw string tests were tested solely with -std=c++11 or -std=gnu++11

Re: [PATCH] c++: Implement C++26 P2558R2 - Add @, $, and ` to the basic character set [PR110343]

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 11:43:13AM -0400, Jason Merrill wrote: > I'm now seeing a -std=c++26 failure on g++.dg/cpp/ucn-1.C. I don't remember seeing it when I wrote the patch, but today I see it as well. The following patch seems to fix that, tested on i686-linux, ok for trunk? 2024-07-26 Jakub

[PATCH] MAINTAINERS: Add myself to write after approval

2024-07-26 Thread Sam James
ChangeLog: * MAINTAINERS: Add myself. --- Pushed. MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 542d058d727c..595140b6f64f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -550,6 +550,7 @@ Andreas Jaeger aj H

[PATCH v3 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-26 Thread Claudio Bantaloukas
This introduces the relevant flags to enable access to the fpmr register and fp8 intrinsics, which will be added subsequently. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (fp8): New. * config/aarch64/aarch64.h (TARGET_FP8): Likewise. * doc/invoke.texi (

[PATCH v3 2/3] aarch64: Add support for moving fpm system register

2024-07-26 Thread Claudio Bantaloukas
Unlike most system registers, fpmr can be heavily written to in code that exercises the fp8 functionality. That is because every fp8 intrinsic call can potentially change the value of fpmr. Rather than just use an unspec, we treat the fpmr system register like all other registers and use a move

[PATCH v3 0/3] aarch64: Add initial support for +fp8 arch extensions

2024-07-26 Thread Claudio Bantaloukas
This series introduces initial flags and functionality for the fp8 feature. Specifically, the following are added: - functions that enable constructing valid fpm register values. - support for the '+fp8' -march modifier. - support for reading and writing the new system register FPMR (Floating Po

[PATCH v3 3/3] aarch64: Add fpm register helper functions.

2024-07-26 Thread Claudio Bantaloukas
The ACLE declares several helper types and functions to facilitate construction of `fpm` arguments. These are available when one of the arm_neon.h, arm_sve.h, or arm_sme.h headers is included. These helpers don't map to specific FP8 instructions and there's no expectation that they will produce a

[RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Jakub Jelinek
Hi! I've tried to implement the C++26 fold expanded constraints paper but ran into issues (see below). Would appreciate some guidance/help, I'm afraid I'm stuck. The patch introduces a FOLD_CONSTR tree to represent fold expanded constraints, normalizes for C++26 some {U,BI}NARY_{LEFT,RIGHT}_FOLD

Re: [PATCH 1/5] RISC-V: Small stack tie changes

2024-07-26 Thread Jeff Law
On 7/24/24 12:00 PM, Raphael Moreira Zinsly wrote: Enable the register used by riscv_emit_stack_tie () to be passed as an argument so we can tie the stack with other registers besides hard_frame_pointer_rtx. Also don't allow operand 1 of stack_tie to be optimized to sp in preparation for the s

Re: [PATCH] rs6000, Add new overloaded vector shift builtin int128, varients

2024-07-26 Thread Carl Love
Segher: On 7/24/24 11:47 AM, Segher Boessenkool wrote: Hi! On Wed, Jul 24, 2024 at 11:38:11AM -0700, Carl Love wrote: On 7/24/24 10:03 AM, Segher Boessenkool wrote: So much manual stuff needed, sigh. On Fri, Jul 19, 2024 at 01:04:12PM -0700, Carl Love wrote: gcc/ChangeLog:     * config/rs6

Re: [PATCH v3 2/2] Prevent divide-by-zero

2024-07-26 Thread Patrick O'Neill
On 7/26/24 06:30, Richard Biener wrote: On Thu, May 30, 2024 at 2:11 AM Patrick O'Neill wrote: From: Greg McGary gcc/ChangeLog: * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent divide-by-zero. * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove dg-ice. -

Re: [PATCH] gimple-ssa-sprintf: Fix typo in range check

2024-07-26 Thread Jakub Jelinek
On Thu, Jul 25, 2024 at 07:48:38PM -0400, Siddhesh Poyarekar wrote: > The code to scale ranges for wide chars in format_string incorrectly > checks range.likely to scale range.unlikely, which is a copy-paste typo > from the immediate previous condition. > > gcc/ChangeLog: > > gimple-ssa-spr
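The class of bug described in the quoted message can be illustrated with a small C sketch. This is not the code from gimple-ssa-sprintf: the struct, the function name, the cap parameter and the scaling factor of 2 are all assumptions, and only the field names likely and unlikely come from the message.

```c
/* Hypothetical stand-in for the range record; only the field names are
   taken from the message.  */
struct range_sketch
{
  unsigned long likely;
  unsigned long unlikely;
};

/* Scale each bound, guarding every update with a check on the same
   field that is being updated.  */
void
scale_for_wide_sketch (struct range_sketch *r, unsigned long cap)
{
  if (r->likely < cap)
    r->likely *= 2;

  /* The copy-paste mistake would test r->likely here as well.  */
  if (r->unlikely < cap)
    r->unlikely *= 2;
}
```

The point of the sketch is that a guard copied from the previous statement can let the second field be scaled, or left unscaled, based on the wrong value.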

Re: [PATCH] c++/modules: Ensure deduction guides are always reachable [PR115231]

2024-07-26 Thread Jason Merrill
On 7/26/24 12:52 AM, Nathaniel Shead wrote: On Tue, Jul 23, 2024 at 04:17:22PM -0400, Jason Merrill wrote: On 6/15/24 10:29 PM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? This probably isn't the most efficient approach, since we need to do name look

Re: [PATCH] c++: Implement C++26 P2558R2 - Add @, $, and ` to the basic character set [PR110343]

2024-07-26 Thread Jason Merrill
On 7/26/24 11:55 AM, Jakub Jelinek wrote: On Fri, Jul 26, 2024 at 11:43:13AM -0400, Jason Merrill wrote: I'm now seeing a -std=c++26 failure on g++.dg/cpp/ucn-1.C. I don't remember seeing it when I wrote the patch, but today I see it as well. The following patch seems to fix that, tested on i

Re: [PATCH] gimple-ssa-sprintf: Fix typo in range check

2024-07-26 Thread Siddhesh Poyarekar
On 2024-07-26 13:11, Jakub Jelinek wrote: On Thu, Jul 25, 2024 at 07:48:38PM -0400, Siddhesh Poyarekar wrote: The code to scale ranges for wide chars in format_string incorrectly checks range.likely to scale range.unlikely, which is a copy-paste typo from the immediate previous condition. gcc/C

Re: [PATCH] testsuite: Add dg-do run to even more tests, fix typo

2024-07-26 Thread Sam James
Sam James writes: > All of these are for wrong-code bugs. Confirmed to be used before but > with no execution. > > Tested on x86_64-pc-linux-gnu and checked test logs before/after. > Pushed as obvious after discussion on IRC. Thanks.

Re: [pushed] c++: #pragma target and deferred instantiation [PR115403]

2024-07-26 Thread Patrick Palka
On Thu, 25 Jul 2024, Jason Merrill wrote: > Tested x86_64-pc-linux-gnu, applying to trunk. > > Also built highway to check. > > -- 8< -- > > My patch for 109753 applies the current #pragma target/optimize to a > function when we compile it, which was a problem for a template > instantiation def

[RFC/RFA][PATCH v2 01/12] Implement internal functions for efficient CRC computation

2024-07-26 Thread Mariam Arutunian
Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster CRC generation. One performs bit-forward and the other bit-reversed CRC computation. If CRC optabs are supported, they are used for the CRC computation. Otherwise, table-based CRC is generated. The supported dat
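As a rough illustration of the difference between bit-forward and bit-reversed CRC computation, here is a plain bitwise sketch for a 32-bit CRC. It is not the code the patch generates: the function names crc32_forward and crc32_reversed, the byte-buffer interface, and the fixed 32-bit width are assumptions made for the example, and the caller is assumed to apply any initial value or final XOR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-forward (MSB-first) update: data bits enter at the top of the
   register and the register shifts left.  POLY is the polynomial in
   normal form, without the implicit x^32 term.  */
static uint32_t crc32_forward(uint32_t crc, const uint8_t *data, size_t len,
                              uint32_t poly)
{
  assert(data != NULL || len == 0);
  for (size_t i = 0; i < len; i++) {
    crc ^= (uint32_t)data[i] << 24;
    for (int bit = 0; bit < 8; bit++)
      crc = (crc & 0x80000000u) ? (crc << 1) ^ poly : crc << 1;
  }
  return crc;
}

/* Bit-reversed (LSB-first) update: data bits enter at the bottom of the
   register and the register shifts right.  REV_POLY is the same
   polynomial with its 32 bits reversed.  */
static uint32_t crc32_reversed(uint32_t crc, const uint8_t *data, size_t len,
                               uint32_t rev_poly)
{
  assert(data != NULL || len == 0);
  for (size_t i = 0; i < len; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++)
      crc = (crc & 1u) ? (crc >> 1) ^ rev_poly : crc >> 1;
  }
  return crc;
}
```

For the string "123456789" with an initial value and final XOR of 0xFFFFFFFF, crc32_reversed with 0xEDB88320 gives the familiar CRC-32 check value 0xCBF43926, and crc32_forward with 0x04C11DB7 gives the CRC-32/BZIP2 check value 0xFC891918. A table-based variant replaces the inner eight-iteration loop with one lookup per byte.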

[Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107]

2024-07-26 Thread Tobias Burnus
The main idea of 'link' is to permit putting only a subset of a huge array on the device. Well, in order to make this work properly, it requires that one can map an array section that does not start with the first element. This patch adjusts the pointers so that this actually works. (Tested

[RFC/RFA][PATCH v2 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs

2024-07-26 Thread Mariam Arutunian
This patch introduces new built-in functions to GCC for computing bit-forward and bit-reversed CRCs. These builtins aim to provide efficient CRC calculation capabilities. When the target architecture supports CRC operations (as indicated by the presence of a CRC optab), the builtins wil

[RFC/RFA][PATCH v2 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-07-26 Thread Mariam Arutunian
If the target supports ZBC or ZBKC, the expander uses the clmul instruction for the CRC calculation. Otherwise, if the target supports ZBKB, it generates a table-based CRC but uses the bswap and brev8 instructions to reverse the inputs and the output. Add new tests to check CRC generation for ZBC, ZBKC and ZBKB targets.
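For readers unfamiliar with carry-less multiplication, the operation that clmul performs can be sketched in portable C. This is only a software model under stated assumptions: the name clmul32 and the choice of 32-bit operands with a full 64-bit result are made up for the example, whereas the hardware instruction works on XLEN-bit registers and returns only part of the product.

```c
#include <stdint.h>

/* Carry-less multiply: like schoolbook multiplication, but partial
   products are combined with XOR instead of addition, which is
   polynomial multiplication over GF(2).  */
static uint64_t clmul32(uint32_t a, uint32_t b)
{
  uint64_t result = 0;
  for (int i = 0; i < 32; i++)
    if ((b >> i) & 1u)
      result ^= (uint64_t)a << i;
  return result;
}
```

For example, clmul32(3, 3) is 5, since (x + 1)^2 = x^2 + 1 over GF(2). CRC computation is a polynomial remainder over GF(2), which is why such an instruction helps.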

[RFC/RFA][PATCH v2 05/12] i386: Implement new expander for efficient CRC computation

2024-07-26 Thread Mariam Arutunian
This patch introduces two new expanders for the i386 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the pclmulqdq or crc32 instructions wh

[RFC/RFA][PATCH v3 06/12] aarch64: Implement new expander for efficient CRC computation

2024-07-26 Thread Mariam Arutunian
This patch introduces two new expanders for the aarch64 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the crc32, crc32c and pmull instructi

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Patrick Palka
On Fri, 26 Jul 2024, Jakub Jelinek wrote: > Hi! > > I've tried to implement the C++26 fold expanded constraints paper but ran > into issues (see below). Would appreciate some guidance/help, I'm afraid > I'm stuck. > > The patch introduces a FOLD_CONSTR tree to represent fold expanded > constrai

Re: [PATCH] rs6000, Add new overloaded vector shift builtin int128, varients

2024-07-26 Thread Peter Bergner
On 7/26/24 12:07 PM, Carl Love wrote: > On 7/24/24 11:47 AM, Segher Boessenkool wrote: > +/* { dg-do run { target { int128 } && { power10_hw } } } */ Everything power10 is int128 always. >>> OK, so don't need the power10_hw. Changed to just int128 for the target: >> No, the other way aro

Re: [PATCH] gimple-ssa-sprintf: Fix typo in range check

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 01:39:04PM -0400, Siddhesh Poyarekar wrote: > > What exactly the code really wants to do is unclear to me, what does > > the INT_MAX on the target have to do with the minimum/maximum/expected > > sizes of %S or %ls printed strings is unclear, target PTRDIFF_MAX > > I think

[Patch, v2] OpenMP/Fortran: Fix handling of 'declare target' with 'link' clause [PR11555]

2024-07-26 Thread Tobias Burnus
Updated patch - only change is to the testcase: * With the just posted patch for PR116107, array sections with offset work for 'link', hence, I updated the testcase. * For 'arr2', I added a reference to the associated PR. I intend to commit it once PR116107 has been committed. Tobias Tobias Burnus

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Patrick Palka
On Fri, 26 Jul 2024, Patrick Palka wrote: > On Fri, 26 Jul 2024, Jakub Jelinek wrote: > > > Hi! > > > > I've tried to implement the C++26 fold expanded constraints paper but ran > > into issues (see below). Would appreciate some guidance/help, I'm afraid > > I'm stuck. > > > > The patch introd

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 02:35:01PM -0400, Patrick Palka wrote: > > IIUC the way gen_elem_of_pack_expansion_instantiation handles this for > > ordinary pack expansions is by replacing each ARGUMENT_PACK with an > > ARGUMENT_PACK_SELECT. This ARGUMENT_PACK_SELECT contains the entire > > pack as well

Re: [PATCH 1/5] RISC-V: Small stack tie changes

2024-07-26 Thread Raphael Zinsly
On Fri, Jul 26, 2024 at 2:00 PM Jeff Law wrote: > On 7/24/24 12:00 PM, Raphael Moreira Zinsly wrote: > ... > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > > index 46c46039c33..5780c5abacf 100644 > > --- a/gcc/config/riscv/riscv.md > > +++ b/gcc/config/riscv/riscv.md > > @@

Re: [to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter

2024-07-26 Thread Philipp Tomsich
Nitpick: a typo slipped into the comment — "regsiter" -> "register". On Fri, 26 Jul 2024 at 16:18, Jeff Law wrote: > > pr116085 is a long standing (since late 2022) regression on the riscv > port. > > A patch introduced a pattern to avoid unnecessary extensions when doing > a min/max operation w
