Re: [PATCH][testsuite]: remove -fwrapv from signbit-5.c

2024-09-04 Thread Torbjorn SVENSSON
On 2024-09-03 20:23, Richard Biener wrote: On 03.09.2024 at 19:00, Tamar Christina wrote: Hi All, The meaning of the testcase was changed by passing it -fwrapv. The test failures on some platforms occurred because the test was testing some implementation-defined behavior wrt

RE: [PATCH] Match: Fix ordered and nonequal

2024-09-04 Thread Hu, Lin1
I typed hongtao's e-mail address wrong. > -Original Message- > From: Hu, Lin1 > Sent: Wednesday, September 4, 2024 1:44 PM > To: gcc-patches@gcc.gnu.org > Cc: hontao@intel.com; ubiz...@gmail.com; rguent...@suse.de; > ja...@redhat.com; pins...@gmail.com > Subject: [PATCH] Match: Fix order

[PATCH v1 1/2] Genmatch: Support new flow for phi on condition

2024-09-04 Thread pan2 . li
From: Pan Li The gen_phi_on_cond has supported only the control flow below for cond since day 1, namely:
+------+
| def  |
| ...  |   +-----+
| cond |-->| def |
+------+   | ... |
   |       +-----+
   |          |
   v          |
+-----+       |
| PHI |<------+
+-----+
U

[PATCH v1 2/2] Match: Support form 3 for scalar signed integer .SAT_ADD

2024-09-04 Thread pan2 . li
From: Pan Li This patch adds support for form 3 of the scalar signed integer .SAT_ADD, as in the example below. Form 3:
#define DEF_SAT_S_ADD_FMT_3(T, UT, MIN, MAX) \
T __attribute__((noinline))                  \
sat_s_add_##T##_fmt_3 (T x, T y)
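
As a rough illustration of what a signed saturating add computes, here is a minimal sketch. It is not the form 3 body from the patch, which is cut off in the preview above; the function name sat_add_i8 and the choice of int8_t are assumptions made for this example. It relies on the __builtin_add_overflow builtin available in GCC and Clang.

```c
#include <stdint.h>

/* Saturating signed 8-bit addition: clamps to the type's range
   instead of wrapping.  Hypothetical helper, for illustration only. */
int8_t sat_add_i8(int8_t x, int8_t y)
{
  int8_t sum;

  /* The builtin returns nonzero when the exact result does not fit in sum. */
  if (__builtin_add_overflow(x, y, &sum))
    /* Overflow needs both operands to share a sign, so the sign of x
       tells us which bound was exceeded. */
    return x < 0 ? INT8_MIN : INT8_MAX;

  return sum;
}
```

With this sketch, adding 100 and 100 should clamp to INT8_MAX, and adding -100 and -100 should clamp to INT8_MIN, while in-range sums are returned unchanged.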

Re: [PATCH 1/4]middle-end: have vect_recog_cond_store_pattern use pattern statement for cond if available

2024-09-04 Thread Richard Biener
On Tue, 3 Sep 2024, Tamar Christina wrote: > Hi All, > > When vectorizing a conditional operation we rely on the bool_recog pattern to > hit and convert the bool of the operand to a valid mask. > > However we are currently not using the converted operand as this is in a > pattern > statement.

Re: [PATCH] coros: mark .CO_YIELD as LEAF [PR106973]

2024-09-04 Thread Richard Biener
On Tue, Sep 3, 2024 at 8:11 PM Arsen Arsenović wrote: > > Tested on x86_64-pc-linux-gnu. OK for trunk? OK > -- >8 -- > We rely on .CO_YIELD calls being followed by an assignment (optionally) > and then a switch/if in the same basic block. This implies that a > .CO_YIELD can nev

Re: [PING] [PATCH] rust: avoid clobbering LIBS

2024-09-04 Thread Richard Biener
On Tue, Sep 3, 2024 at 8:42 PM Marc wrote: > > Richard Biener writes: > > > On Wed, Aug 28, 2024 at 11:10 AM Marc wrote: > >> > >> Hello, > >> > >> Gentle reminder for this simple autoconf patch :) > > > > OK. > > > > Note that completely wiping LIBS might remove requirements detected earlier, >

Re: [PATCH] object-size: Use simple_dce_from_worklist in object-size pass

2024-09-04 Thread Richard Biener
On Wed, Sep 4, 2024 at 1:58 AM Andrew Pinski wrote: > > While trying to see if there was a way to improve object-size pass > to use the ranger (for pointer plus), I noticed that it leaves around > the statement containing __builtin_object_size if it was reduced to a > constant. > This fixes that

Re: [PATCH] Match: Fix ordered and nonequal

2024-09-04 Thread Richard Biener
On Wed, Sep 4, 2024 at 9:15 AM Hu, Lin1 wrote: > > I typed hongtao's e-mail address wrong. > > > -Original Message- > > From: Hu, Lin1 > > Sent: Wednesday, September 4, 2024 1:44 PM > > To: gcc-patches@gcc.gnu.org > > Cc: hontao@intel.com; ubiz...@gmail.com; rguent...@suse.de; > > ja...@

Re: [PATCH v1 1/2] Genmatch: Support new flow for phi on condition

2024-09-04 Thread Richard Biener
On Wed, Sep 4, 2024 at 9:25 AM wrote: > > From: Pan Li > > The gen_phi_on_cond can only support below control flow for cond > from day 1. Aka:
> +------+
> | def  |
> | ...  |   +-----+
> | cond |-->| def |
> +------+   | ... |
>    |       +-----+
>    |          |
>    v

RE: [PATCH v1 1/2] Genmatch: Support new flow for phi on condition

2024-09-04 Thread Li, Pan2
> I'm lazy - can you please quote genmatch generated code for the condition for > one case? Sure thing, the before and after below cover all the changes to the generated code. Before this patch: basic_block _b1 = gimple_bb (_a1); if (gimple_phi_num_args (_a1) == 2

RE: [r15-3359 Regression] FAIL: gcc.target/i386/avx10_2-bf-vector-cmpp-1.c (test for excess errors) on Linux/x86_64

2024-09-04 Thread Jiang, Haochen
> -Original Message- > From: Richard Biener > Sent: Tuesday, September 3, 2024 2:40 PM > > On Tue, Sep 3, 2024 at 7:36 AM Jiang, Haochen > wrote: > > > > > > > > > From: Hongtao Liu > > > Sent: Tuesday, September 3, 2024 1:47 PM > > > > > > On Tue, Sep 3, 2024 at 9:45 AM Jiang, Haochen

[PATCH] SVE intrinsics: Fold svdiv with all-zero operands to zero vector

2024-09-04 Thread Jennifer Schmitz
This patch folds svdiv where one of the operands is all-zeros to a zero vector, if the predicate is ptrue or the predication is _x or _z. This case was not covered by the recent patch that implemented constant folding, because that covered only cases where both operands are constant vectors. Here,

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Toon Moene
On 9/3/24 15:07, Jan Hubicka wrote: Hi, We disable gathers for zen4. It seems that gather has improved a bit compared to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when the indices are known ahead of time. Vector loads followed by shuffles result in a higher load band

Re: [nvptx] Fix code-gen for alias attribute

2024-09-04 Thread Thomas Schwinge
Hi! Honza (or others, of course), there's a question about 'ultimate_alias_target'. On 2024-08-26T10:50:36+, Prathamesh Kulkarni wrote: > For the following test (adapted from pr96390.c): > > __attribute__((noipa)) int foo () { return 42; } > int bar () __attribute__((alias ("foo"))); > int b

Add 'gcc.target/nvptx/alias-weak-1.c' (was: [nvptx] Fix code-gen for alias attribute)

2024-09-04 Thread Thomas Schwinge
Hi! On 2024-09-04T11:45:20+0200, I wrote: > +int bar () __attribute__((weak, alias ("foo"))); > Now, that said: GCC/nvptx for such code currently diagnoses > "error: weak alias definitions not supported [...]" ;-| Pushed to trunk branch commit 2267d254eb6ad782cef7b462f2bb2128bc8ace30 "Add 'g

Add 'gcc.target/nvptx/alias-to-alias-1.c' (was: [nvptx] Fix code-gen for alias attribute)

2024-09-04 Thread Thomas Schwinge
Hi! On 2024-09-04T11:45:20+0200, I wrote: > On 2024-08-26T10:50:36+, Prathamesh Kulkarni > wrote: >> For the following test (adapted from pr96390.c): >> >> __attribute__((noipa)) int foo () { return 42; } >> int bar () __attribute__((alias ("foo"))); >> int baz () __attribute__((alias ("bar"

[PING] Handle 'NUM' in 'PUSH_INSERT_PASSES_WITHIN' (was: [PATCH 03/11] Handwritten part of conversion of passes to C++ classes)

2024-09-04 Thread Thomas Schwinge
Hi! Ping. On 2024-06-28T15:06:21+0200, I wrote: > As part of this: > > On 2013-07-26T11:04:33-0400, David Malcolm wrote: >> This patch is the hand-written part of the conversion of passes from >> C structs to C++ classes. > >> --- a/gcc/passes.c >> +++ b/gcc/passes.c > > ..., we did hard-code 'P

Fix gimple_debug_cfg declaration (was: [PATCH v2 2/N] Introduce dump_flags_t type and use it instead of int, type)

2024-09-04 Thread Thomas Schwinge
Hi! On 2017-05-17T11:02:09+0200, Martin Liška wrote: > On 05/17/2017 09:44 AM, Richard Biener wrote: >> On Tue, May 16, 2017 at 4:55 PM, Martin Liška wrote: >>> On 05/16/2017 03:48 PM, Richard Biener wrote: On Fri, May 12, 2017 at 3:00 PM, Martin Liška wrote: > Second part changes 'int

Fix branch prediction dump message (was: Predict loops containing recursive call with fewer iterations)

2024-09-04 Thread Thomas Schwinge
Hi! On 2016-06-26T21:36:56+0200, Jan Hubicka wrote: > this patch [...] > --- predict.c (revision 237789) > +++ predict.c (working copy) > @@ -3367,6 +3446,15 @@ pass_profile::execute (function *fun) > gimple_dump_cfg (dump_file, dump_flags); > if (profile_status_for_fn (fun) == PROFILE_A

Re: [PATCH v1 1/2] Genmatch: Support new flow for phi on condition

2024-09-04 Thread Richard Biener
On Wed, Sep 4, 2024 at 9:48 AM Li, Pan2 wrote: > > > I'm lazy - can you please quote genmatch generated code for the condition > > for > > one case? > > Sure thing, the before and after below cover all the changes to the generated > code. > > Before this patch: > basic_block _b

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Jan Hubicka
> On 9/3/24 15:07, Jan Hubicka wrote: > > > Hi, > > We disable gathers for zen4. It seems that gather has improved a bit > > compared > > to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions > > when > > the indices are known ahead of time. Vector loads followed by shuffles

[PATCH] RISC-V: Handle unused-only-live stmts in SLP discovery

2024-09-04 Thread Richard Biener
The following adds SLP discovery for roots that are only live but otherwise unused. These are usually inductions. This allows a few more testcases to be handled fully with SLP, for example gcc.dg/vect/no-scevccp-pr86725-1.c Bootstrap and regtest running on x86_64-unknown-linux-gnu. * tr

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Richard Biener
On Wed, Sep 4, 2024 at 12:56 PM Jan Hubicka wrote: > > > On 9/3/24 15:07, Jan Hubicka wrote: > > > > > Hi, > > > We disable gathers for zen4. It seems that gather has improved a bit > > > compared > > > to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions > > > when > > > th

[RFC] On Adding Support for Target-Dependent Loop-Specific Pragmas

2024-09-04 Thread Paul Iannetta
Hi, Currently, pragma directives added by a backend have access only to the information on the same line as the pragma, which is enough for modifying global state. This means that a loop target pragma could look like this: #pragma target begin keyword [options] #pragma ta

nvptx: Use 'enum ptx_version', 'enum ptx_isa' instead of 'int'

2024-09-04 Thread Thomas Schwinge
Hi! Pushed to trunk branch in commit fee2fbedbb43ad7a017a33ed2b820be79b75e7e5 "nvptx: Use 'enum ptx_version', 'enum ptx_isa' instead of 'int'", see attached. Regards, Thomas From fee2fbedbb43ad7a017a33ed2b820be79b75e7e5 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Mon, 22 Jul 2024 10:4

Re: [PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Evgeny Karpov
Monday, September 2, 2024 Martin Storsjö wrote: > The only non-obvious thing is that for IMAGE_REL_ARM64_PAGEBASE_REL21, > i.e. "adrp" instructions, the immediate that gets stored in the > instruction is the byte offset to the symbol. > > After linking, when the instruction is interpreted at ex

[PATCH RFC] c-family: add attribute flag_enum [PR46457]

2024-09-04 Thread Jason Merrill
Tested x86_64-pc-linux-gnu. Any objections? -- 8< -- Several PRs complain about -Wswitch warning about a case for a bitwise combination of enumerators. Clang has an attribute flag_enum to prevent this; let's adopt that approach as well. This also recognizes the attribute as [[clang::flag_enum]
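
To make the complaint concrete, here is a minimal sketch of the kind of switch that -Wswitch objects to. The enum perm, its enumerators, the FLAG_ENUM macro and the function perm_bits are all hypothetical names chosen for this example; the attribute spelling follows Clang's existing flag_enum attribute, and the macro expands to nothing on compilers that do not recognize it.

```c
/* Use the attribute only where the compiler reports support for it;
   otherwise FLAG_ENUM expands to nothing and the code still builds. */
#if defined(__has_attribute)
#  if __has_attribute(flag_enum)
#    define FLAG_ENUM __attribute__((flag_enum))
#  endif
#endif
#ifndef FLAG_ENUM
#  define FLAG_ENUM
#endif

/* Enumerators are meant to be combined with bitwise OR. */
enum FLAG_ENUM perm { PERM_READ = 1, PERM_WRITE = 2, PERM_EXEC = 4 };

/* Returns how many permission bits are set for a few known values,
   or -1 for anything else. */
int perm_bits(enum perm p)
{
  switch (p) {
  case PERM_READ:
  case PERM_WRITE:
  case PERM_EXEC:
    return 1;
  case PERM_READ | PERM_WRITE:  /* value 3 is not a named enumerator */
    return 2;
  default:
    return -1;
  }
}
```

Without the attribute, the case label for the combined value is the sort of line that draws a "case value not in enumerated type" warning; marking the enum as a flag enum is intended to tell the compiler that such combinations are expected.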

[PATCH v1 7/9] aarch64: Disable the anchors

2024-09-04 Thread Evgeny Karpov
Monday, September 2, 2024 Andrew Pinski wrote: > Could you expand on this and why you think disabling is correct? > It is so you could do: >         adrp    x0, .LANCHOR0 >         add     x2, x0, :lo12:.LANCHOR0 >         ldr     w1, [x0, #:lo12:.LANCHOR0] >         ldr     w0, [x2, 4] > > Rathe

[PATCH v2 01/36] arm: [MVE intrinsics] improve comment for orrq shape

2024-09-04 Thread Christophe Lyon
Add a comment about the lack of "n" forms for floating-point and 8-bit integers, to make it clearer why we use build_16_32 for MODE_n. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (binary_orrq_def): Improve comment. --- gcc/config/arm/arm-mve-builti

[PATCH v2 00/36] arm: [MVE intrinsics] Re-implement more intrinsics

2024-09-04 Thread Christophe Lyon
Hi, This is v2 of the patch series I sent in https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657065.html. I have taken into account the feedback I received, and added more patches to the series, converting more MVE intrinsics to the new framework. Changes v1-v2: - I kept patch #1 as-is (so,

[PATCH v2 10/36] arm: [MVE intrinsics] factorize vcvtaq vcvtmq vcvtnq vcvtpq

2024-09-04 Thread Christophe Lyon
Factorize vcvtaq vcvtmq vcvtnq vcvtpq builtins so that they use the same parameterized names. 2024-07-11 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VCVTAQ_M_S, VCVTAQ_M_U, VCVTAQ_S, VCVTAQ_U, VCVTMQ_M_S, VCVTMQ_M_U, VCVTMQ_S, VCVTMQ_U, VCVTNQ

[PATCH v2 07/36] arm: [MVE intrinsics] factorize vcvtbq vcvttq

2024-09-04 Thread Christophe Lyon
Factorize vcvtbq, vcvttq so that they use the same parameterized names. 2024-07-11 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VCVTBQ_F16_F32, VCVTTQ_F16_F32, VCVTBQ_F32_F16, VCVTTQ_F32_F16, VCVTBQ_M_F16_F32, VCVTTQ_M_F16_F32, VCVTBQ_M_F32_F16

[PATCH v2 09/36] arm: [MVE intrinsics] rework vcvtbq_f16_f32 vcvttq_f16_f32 vcvtbq_f32_f16 vcvttq_f32_f16

2024-09-04 Thread Christophe Lyon
Implement vcvtbq_f16_f32, vcvttq_f16_f32, vcvtbq_f32_f16 and vcvttq_f32_f16 using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vcvtxq_impl): New. (vcvtbq, vcvttq): New. * config/arm/arm-mve-builtins-

[PATCH v2 05/36] arm: [MVE intrinsics] add vcvt shape

2024-09-04 Thread Christophe Lyon
This patch adds the vcvt shape description. It needs to add a new type_suffix_info parameter to explicit_type_suffix_p (), because vcvt uses overloads for type suffixes for integer to floating-point conversions, but not for floating-point to integer. 2024-07-11 Christophe Lyon gcc/

[PATCH v2 02/36] arm: [MVE intrinsics] remove useless resolve from create shape

2024-09-04 Thread Christophe Lyon
vcreateq has no overloaded forms, so there's no need for resolve (). 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (create_def): Remove resolve. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 6 -- 1 file changed, 6 deletions(-) diff --

[PATCH v2 08/36] arm: [MVE intrinsics] add vcvt_f16_f32 and vcvt_f32_f16 shapes

2024-09-04 Thread Christophe Lyon
This patch adds the vcvt_f16_f32 and vcvt_f32_f16 shape descriptions. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vcvt_f16_f32) (vcvt_f32_f16): New. * config/arm/arm-mve-builtins-shapes.h (vcvt_f16_f32) (vcvt_f32_f16): New.

[PATCH v2 14/36] arm: [MVE intrinsics] factorize vorn

2024-09-04 Thread Christophe Lyon
Factorize vorn so that it uses parameterized names. 2024-07-11 Christophe Lyon gcc/ * config/arm/iterators.md (MVE_INT_M_BINARY_LOGIC): Add VORNQ_M_S, VORNQ_M_U. (MVE_FP_M_BINARY_LOGIC): Add VORNQ_M_F. (mve_insn): Add VORNQ_M_S, VORNQ_M_U, VORNQ_M_F.

[PATCH v2 06/36] arm: [MVE intrinsics] rework vcvtq

2024-09-04 Thread Christophe Lyon
Implement vcvtq using the new MVE builtins framework. In config/arm/arm-mve-builtins-base.def, the patch also restores the alphabetical order. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vcvtq_impl): New. (vcvtq): New. * config/

[PATCH v2 04/36] arm: [MVE intrinsics] factorize vcvtq

2024-09-04 Thread Christophe Lyon
Factorize vcvtq so that it uses parameterized names. 2024-07-11 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VCVTQ_FROM_F_S, VCVTQ_FROM_F_U, VCVTQ_M_FROM_F_S, VCVTQ_M_FROM_F_U, VCVTQ_M_N_FROM_F_S, VCVTQ_M_N_FROM_F_U, VCVTQ_M_N_TO_F_S,

[PATCH v2 11/36] arm: [MVE intrinsics] add vcvtx shape

2024-09-04 Thread Christophe Lyon
This patch adds the vcvtx shape description for vcvtaq, vcvtmq, vcvtnq, vcvtpq. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vcvtx): New. * config/arm/arm-mve-builtins-shapes.h (vcvtx): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 59

[PATCH v2 13/36] arm: [MVE intrinsics] rework vbicq

2024-09-04 Thread Christophe Lyon
Implement vbicq using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vbicq): New. * config/arm/arm-mve-builtins-base.def (vbicq): New. * config/arm/arm-mve-builtins-base.h (vbicq): New. * config/arm

[PATCH v2 20/36] arm: [MVE intrinsics] update v[id]dup tests

2024-09-04 Thread Christophe Lyon
Testing v[id]dup overloads with '1' as argument for uint32_t* does not make sense: instead of choosing the '_wb' overload, we choose the '_n', but we already do that in the '_n' tests. This patch removes all such bogus foo2 functions. 2024-08-28 Christophe Lyon gcc/testsuite/

[PATCH v2 15/36] arm: [MVE intrinsics] rework vorn

2024-09-04 Thread Christophe Lyon
Implement vorn using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vornq): New. * config/arm/arm-mve-builtins-base.def (vornq): New. * config/arm/arm-mve-builtins-base.h (vornq): New. * config/arm/

[PATCH v2 03/36] arm: [MVE intrinsics] Cleanup arm-mve-builtins-functions.h

2024-09-04 Thread Christophe Lyon
This patch brings no functional change but removes some code duplication in arm-mve-builtins-functions.h and makes it easier to read and maintain. It introduces a new expand_unspec () member of unspec_based_mve_function_base and makes a few classes inherit from it instead of function_base. This a

[PATCH v2 16/36] arm: [MVE intrinsics] rework vctp

2024-09-04 Thread Christophe Lyon
Implement vctp using the new MVE builtins framework. 2024-08-21 Christophe Lyon gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (class vctpq_impl): New. (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-bui

[PATCH v2 22/36] arm: [MVE intrinsics] fix checks of immediate arguments

2024-09-04 Thread Christophe Lyon
As discussed in [1], it is better to use "su64" for immediates in intrinsics signatures in order to provide better diagnostics (erroneous constants are not truncated for instance). This patch thus uses su64 instead of ss32 in binary_lshift_unsigned, binary_rshift_narrow, binary_rshift_narrow_unsig

[PATCH v2 21/36] arm: [MVE intrinsics] remove v[id]dup expanders

2024-09-04 Thread Christophe Lyon
We use code_for_mve_q_u_insn, rather than the expanders used by the previous implementation, so we can remove the expanders and their declaration as builtins. 2024-08-21 Christophe Lyon gcc/ * config/arm/arm_mve_builtins.def (vddupq_n_u, vidupq_n_u) (vddupq_m_n_u, vidup

Re: [PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Martin Storsjö
On Wed, 4 Sep 2024, Evgeny Karpov wrote: Monday, September 2, 2024 Martin Storsjö wrote: The only non-obvious thing is that for IMAGE_REL_ARM64_PAGEBASE_REL21, i.e. "adrp" instructions, the immediate that gets stored in the instruction is the byte offset to the symbol. After linking, when

[PATCH v2 12/36] arm: [MVE intrinsics] rework vcvtaq vcvtmq vcvtnq vcvtpq

2024-09-04 Thread Christophe Lyon
Implement vcvtaq vcvtmq vcvtnq vcvtpq using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins-base.def (

[PATCH v2 19/36] arm: [MVE intrinsics] rework vddup vidup

2024-09-04 Thread Christophe Lyon
Implement vddup and vidup using the new MVE builtins framework. We generate better code because we take advantage of the two outputs produced by the v[id]dup instructions. For instance, before: ldr r3, [r0] sub r2, r3, #8 str r2, [r0] mov r2, r3

[PATCH v2 17/36] arm: [MVE intrinsics] factorize vddup vidup

2024-09-04 Thread Christophe Lyon
Factorize vddup and vidup so that they use the same parameterized names. This patch updates only the (define_insn "@mve_q_u_insn") patterns and does not bother with the (define_expand "mve_vidupq_n_u") ones, because a subsequent patch avoids using them. 2024-08-21 Christophe Lyon gcc/

[PATCH v2 27/36] arm: [MVE intrinsics] remove useless v[id]wdup expanders

2024-09-04 Thread Christophe Lyon
Like with vddup/vidup, we use code_for_mve_q_wb_u_insn, so we can drop the expanders and their declarations as builtins, now useless. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-builtins.cc (arm_quinop_unone_unone_unone_unone_imm_pred_qualifiers): Delete. *

[PATCH v2 18/36] arm: [MVE intrinsics] add viddup shape

2024-09-04 Thread Christophe Lyon
This patch adds the viddup shape description for vidup and vddup. This requires the addition of report_not_one_of and function_checker::require_immediate_one_of to gcc/config/arm/arm-mve-builtins.cc (they are copies of the aarch64 SVE counterpart). This patch also introduces MODE_wb. 2024-08-21

[PATCH v2 33/36] arm: [MVE intrinsics] rework vadciq

2024-09-04 Thread Christophe Lyon
Implement vadciq using the new MVE builtins framework. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): New. (vadciq): New. * config/arm/arm-mve-builtins-base.def (vadciq): New. * config/arm/arm-mve-builtins-b

[PATCH v2 31/36] arm: [MVE intrinsics] add vadc_vsbc shape

2024-09-04 Thread Christophe Lyon
This patch adds the vadc_vsbc shape description. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vadc_vsbc): New. * config/arm/arm-mve-builtins-shapes.h (vadc_vsbc): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 36 ++

[PATCH v2 24/36] arm: [MVE intrinsics] add vidwdup shape

2024-09-04 Thread Christophe Lyon
This patch adds the vidwdup shape description for vdwdup and viwdup. It is very similar to viddup, but accounts for the additional 'wrap' scalar parameter. 2024-08-21 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vidwdup): New. * config/arm/arm-mve-buil

[PATCH v2 36/36] arm: [MVE intrinsics] use long_type_suffix / half_type_suffix helpers

2024-09-04 Thread Christophe Lyon
In several places we are looking for a type twice or half as large as the type suffix: this patch introduces helper functions to avoid code duplication. long_type_suffix is similar to the SVE counterpart, but adds an 'expected_tclass' parameter. half_type_suffix is similar to it, but does not exis

[PATCH v2 23/36] arm: [MVE intrinsics] factorize vdwdup viwdup

2024-09-04 Thread Christophe Lyon
Factorize vdwdup and viwdup so that they use the same parameterized names. Like with vddup and vidup, we do not bother with the corresponding expanders, as we stop using them in a subsequent patch. The patch also adds the missing attributes to vdwdupq_wb_u_insn and viwdupq_wb_u_insn patterns. 20

[PATCH v2 26/36] arm: [MVE intrinsics] update v[id]wdup tests

2024-09-04 Thread Christophe Lyon
Testing v[id]wdup overloads with '1' as argument for uint32_t* does not make sense: this patch adds a new 'uint32_t *a' parameter to foo2 in such tests. The difference with v[id]dup tests (where we removed 'foo2') is that in 'foo1' we test the overload with a variable 'wrap' parameter (b) and we n

[PATCH v2 28/36] arm: [MVE intrinsics] add vshlc shape

2024-09-04 Thread Christophe Lyon
This patch adds the vshlc shape description. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vshlc): New. * config/arm/arm-mve-builtins-shapes.h (vshlc): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 44 +++ gcc/confi

[PATCH v2 25/36] arm: [MVE intrinsics] rework vdwdup viwdup

2024-09-04 Thread Christophe Lyon
Implement vdwdup and viwdup using the new MVE builtins framework. In order to share more code with viddup_impl, the patch swaps operands 1 and 2 in @mve_v[id]wdupq_m_wb_u_insn, so that the parameter order is similar to what @mve_v[id]dupq_m_wb_u_insn uses. 2024-08-28 Christophe Lyon g

[PATCH v2 29/36] arm: [MVE intrinsics] rework vshlcq

2024-09-04 Thread Christophe Lyon
Implement vshlc using the new MVE builtins framework. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vshlc_impl): New. (vshlc): New. * config/arm/arm-mve-builtins-base.def (vshlcq): New. * config/arm/arm-mve-builtins-base.h

[PATCH v2 30/36] arm: [MVE intrinsics] remove vshlcq useless expanders

2024-09-04 Thread Christophe Lyon
Since we rewrote the implementation of vshlcq intrinsics, we no longer need these expanders. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-builtins.cc (arm_ternop_unone_none_unone_imm_qualifiers) (-arm_ternop_none_none_unone_imm_qualifiers): Delete. *

[PATCH v2 34/36] arm: [MVE intrinsics] rework vadcq

2024-09-04 Thread Christophe Lyon
Implement vadcq using the new MVE builtins framework. We re-use most of the code introduced by the previous patch to support vadciq: we just need to initialize carry from the input parameter. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vadcq_vsbc):

[PATCH v2 32/36] arm: [MVE intrinsics] factorize vadc vadci vsbc vsbci

2024-09-04 Thread Christophe Lyon
Factorize vadc/vsbc and vadci/vsbci so that they use the same parameterized names. 2024-08-28 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VADCIQ_M_S, VADCIQ_M_U, VADCIQ_U, VADCIQ_S, VADCQ_M_S, VADCQ_M_U, VADCQ_S, VADCQ_U, VSBCIQ_M_S, VSBCIQ_M_

[PATCH v2 35/36] arm: [MVE intrinsics] rework vsbcq vsbciq

2024-09-04 Thread Christophe Lyon
Implement vsbcq vsbciq using the new MVE builtins framework. We re-use most of the code introduced by the previous patches. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): Add support for vsbciq and vsbcq. (vadciq,

[Bug tree-optimization/109429] [PATCH] ivopts: fixed complexities

2024-09-04 Thread Aleksandar Rakic
From 0130d3cb01fd9d5c1c997003245ed57bbdeb00a2 Mon Sep 17 00:00:00 2001 From: Aleksandar Date: Fri, 23 Aug 2024 11:36:50 +0200 Subject: [PATCH] [Bug tree-optimization/109429] ivopts: fixed complexities This patch addresses a bug introduced in commit f9f69dd by correcting the complexity calculatio

[PATCH] Use dg-additional-options for gfortran.dg/vect/vect-8.f90 and RISC-V

2024-09-04 Thread Richard Biener
r14-9122-g67a29f99cc8138 disabled scheduling on a lot of testcases for RISC-V for PR113249 but using dg-options. This makes gfortran.dg/vect/vect-8.f90 UNRESOLVED as it relies on default flags to enable vectorization. The following uses dg-additional-options instead. Tested on riscv64-linux with

Re: [to-be-committed] [RISC-V][PR target/115921] Improve reassociation for rv64

2024-09-04 Thread Xi Ruoyao
Hi Jeff, On Mon, 2024-09-02 at 12:53 -0600, Jeff Law wrote: >  (define_insn_and_split "_shift_reverse" >    [(set (match_operand:X 0 "register_operand" "=r") > (any_bitwise:X (ashift:X (match_operand:X 1 "register_operand" "r") > @@ -2934,9 +2936,9 @@ (define_insn_and_split "_shift_reverse" >

[PATCH] RISC-V Handle non-grouped stores as single-lane SLP

2024-09-04 Thread Richard Biener
The following enables single-lane loop SLP discovery for non-grouped stores and adjusts vectorizable_store to properly handle those. For gfortran.dg/vect/vect-8.f90 we vectorize one additional loop, not running into the "not falling back to strided accesses" bail-out. I have not investigated in

[PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Evgeny Karpov
Monday, September 4, 2024 Martin Storsjö wrote: >> Let's consider the following example, when symbol is located at 3072. >> >> 1. Example without the fix >> compilation time >> adrp        x0, (3072 + 256) & ~0xFFF // x0 = 0 >> add         x0, x0, (3072 + 256) & 0xFFF // x0 = 3328 >> >> linking t
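
The compile-time arithmetic quoted in this preview can be checked with a small sketch. It only reproduces the masking shown above for the adrp and add operands (symbol at 3072, offset 256); the helper names page_base and page_offset are assumptions made for this example, and the link-time half of the discussion is cut off, so it is not modeled here.

```c
#include <stdint.h>

/* Upper part of the address with the low 12 bits cleared, matching the
   "& ~0xFFF" operand of the adrp line in the quoted example. */
uint64_t page_base(uint64_t addr)
{
  return addr & ~UINT64_C(0xFFF);
}

/* Low 12 bits of the address, matching the "& 0xFFF" operand of the
   add line in the quoted example. */
uint64_t page_offset(uint64_t addr)
{
  return addr & UINT64_C(0xFFF);
}
```

For the quoted values, page_base(3072 + 256) is 0 and page_offset(3072 + 256) is 3328, which agrees with the "x0 = 0" and "x0 = 3328" comments in the example.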

Re: [PATCH] RISC-V: Handle unused-only-live stmts in SLP discovery

2024-09-04 Thread Palmer Dabbelt
On Wed, 04 Sep 2024 04:10:52 PDT (-0700), rguent...@suse.de wrote: The following adds SLP discovery for roots that are only live but otherwise unused. These are usually inductions. This allows a few more testcases to be handled fully with SLP, for example gcc.dg/vect/no-scevccp-pr86725-1.c Boo

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Toon Moene
On 9/4/24 12:55, Jan Hubicka wrote: On 9/3/24 15:07, Jan Hubicka wrote: Hi, We disable gathers for zen4. It seems that gather has improved a bit compared to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when the indices are known ahead of time. Vector loads followed by

Re: [PATCH] c++: ICE with TTP [PR96097]

2024-09-04 Thread Jason Merrill
On 9/3/24 6:12 PM, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14? The change to return bool seems like unrelated cleanup; please push that separately on trunk only. + /* We can also have: + + template typename X> + void

Re: [PATCH] c++: noexcept and pointer to member function type [PR113108]

2024-09-04 Thread Jason Merrill
On 9/3/24 2:47 PM, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14? OK. -- >8 -- We ICE in nothrow_spec_p because it got a DEFERRED_NOEXCEPT. This DEFERRED_NOEXCEPT was created in implicitly_declare_fn when declaring Foo& operator=(Foo&&) = default; in

[pushed] c++: add a testcase for [PR 108620]

2024-09-04 Thread Arsen Arsenović
Pushed as obvious. -- >8 -- Fixed by r15-2540-g32e678b2ed7521. Add a testcase, as the original ones do not cover this particular failure mode. gcc/testsuite/ChangeLog: PR c++/108620 * g++.dg/coroutines/pr108620.C: New test. --- gcc/testsuite/g++.dg/coroutines/pr1

Re: [PATCH RFC] c-family: add attribute flag_enum [PR46457]

2024-09-04 Thread Marek Polacek
On Wed, Sep 04, 2024 at 08:15:25AM -0400, Jason Merrill wrote: > Tested x86_64-pc-linux-gnu. Any objections? Looks good except... > +/* Attributes also recognized in the clang:: namespace. */ > +const struct attribute_spec c_common_clang_attributes[] = { > + { "flag_enum", 0, 0, fal

Re: [PATCH] c++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-04 Thread Jason Merrill
On 9/2/24 1:49 PM, Jakub Jelinek wrote: Hi! The following testcase is miscompiled, because get_member_function_from_ptrfunc emits something like (((FUNCTION.__pfn & 1) != 0) ? ptr + FUNCTION.__delta + FUNCTION.__pfn - 1 : FUNCTION.__pfn) (ptr + FUNCTION.__delta, ...) or so, so FUNCTION tree

Re: [PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Martin Storsjö
On Wed, 4 Sep 2024, Evgeny Karpov wrote: Monday, September 4, 2024 Martin Storsjö wrote: Let's consider the following example, when symbol is located at 3072. 1. Example without the fix compilation time adrp        x0, (3072 + 256) & ~0xFFF // x0 = 0 add         x0, x0, (3072 + 256) & 0xFFF

Re: [PATCH] c++: Add missing auto_diagnostic_groups

2024-09-04 Thread Jason Merrill
On 9/2/24 7:43 AM, Nathaniel Shead wrote: Ping for https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659796.html OK. For clarity's sake, here's the full patch with the adjustment I mentioned earlier: -- >8 -- This patch goes through all .cc files in gcc/cp and adds in any auto_diagnosti

Re: Handle 'NUM' in 'PUSH_INSERT_PASSES_WITHIN' (was: [PATCH 03/11] Handwritten part of conversion of passes to C++ classes)

2024-09-04 Thread David Malcolm
On Fri, 2024-06-28 at 15:06 +0200, Thomas Schwinge wrote: > Hi! > > As part of this: > > On 2013-07-26T11:04:33-0400, David Malcolm > wrote: > > This patch is the hand-written part of the conversion of passes > > from > > C structs to C++ classes. > > > --- a/gcc/passes.c > > +++ b/gcc/passes.c

Re: [PATCH] c++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-04 Thread Jakub Jelinek
On Wed, Sep 04, 2024 at 11:06:22AM -0400, Jason Merrill wrote: > On 9/2/24 1:49 PM, Jakub Jelinek wrote: > > Hi! > > > > The following testcase is miscompiled, because > > get_member_function_from_ptrfunc > > emits something like > > (((FUNCTION.__pfn & 1) != 0) > > ? ptr + FUNCTION.__delta + FU

Re: [PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Martin Storsjö
On Wed, 4 Sep 2024, Martin Storsjö wrote: On Wed, 4 Sep 2024, Evgeny Karpov wrote: Monday, September 4, 2024 Martin Storsjö wrote: Let's consider the following example, when symbol is located at 3072. 1. Example without the fix compilation time adrp        x0, (3072 + 256) & ~0xFFF // x0 =

Re: [PATCH RFC] c-family: add attribute flag_enum [PR46457]

2024-09-04 Thread Eric Gallager
On Wed, Sep 4, 2024 at 8:18 AM Jason Merrill wrote: > > Tested x86_64-pc-linux-gnu. Any objections? > > -- 8< -- > > Several PRs complain about -Wswitch warning about a case for a bitwise > combination of enumerators. Clang has an attribute flag_enum to prevent > this; let's adopt that approach
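For readers unfamiliar with the complaint, here is a minimal sketch of the kind of code that triggers it. The enum Perm and the function describe are hypothetical and are not taken from the PRs or from the patch. The attribute itself is left out of the sketch, since its spelling in GCC is up to the patch. Clang spells it __attribute__((flag_enum)).

```cpp
// Hypothetical enum used as a set of bit flags.
enum Perm { Read = 1, Write = 2 };

// With -Wswitch (enabled by -Wall), GCC warns that the value 3 in the
// third case label is not an enumerator of Perm.
constexpr int describe(Perm p) {
    switch (p) {
    case Read:         return 1;
    case Write:        return 2;
    case Read | Write: return 3;  // bitwise combination of enumerators
    }
    return 0;
}
```

The code is valid. The warning is only about the case value not naming an enumerator, which is what marking the enum as a flag enum is meant to suppress.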

Re: [PATCH] c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]

2024-09-04 Thread Jason Merrill
On 9/1/24 2:51 PM, Simon Martin wrote: Hi Jason, On 26 Aug 2024, at 19:23, Jason Merrill wrote: On 8/25/24 12:37 PM, Simon Martin wrote: On 24 Aug 2024, at 23:59, Simon Martin wrote: On 24 Aug 2024, at 15:13, Jason Merrill wrote: On 8/23/24 12:44 PM, Simon Martin wrote: We currently emit

Re: [PATCH] c++, coroutines: Instrument missing return_void UB.

2024-09-04 Thread Jason Merrill
On 9/1/24 12:17 PM, Iain Sandoe wrote: This came up in discussion of an earlier patch. I'm in two minds as to whether it's a good idea or not - the underlying issue being that libubsan does not yet (AFAICT) have the concept of a coroutine, so that the diagnostics are not very specific and might

Re: [PATCH] c++, coroutines: Revise promise construction/destruction.

2024-09-04 Thread Jason Merrill
On 8/31/24 12:37 PM, Iain Sandoe wrote: tested on x86_64-darwin/linux powerpc64le-linux, OK for trunk? alternate suggestions? thanks, Iain --- 8< --- In examining the coroutine testcases for unexpected diagnostic output for 'Wall', I found a 'statement has no effect' warning for the promise con

Re: [PATCH] c++: fn redecl in fn scope wrongly accepted [PR116239]

2024-09-04 Thread Jason Merrill
On 8/30/24 3:40 PM, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- Redeclaration such as void f(void); consteval void f(void); is invalid. In a namespace scope, we detect the collision in validate_constexpr_redeclaration, but not when one decl

Re: [PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Martin Storsjö
On Wed, 4 Sep 2024, Evgeny Karpov wrote: Monday, September 4, 2024 Martin Storsjö wrote: compilation time adrp x0, symbol + 256 9000 adrp x0, 0 As the symbol offset is 256, you will need to encode the offset "256" in the instruction immediate field. Not "256 >> 12". This is the somewhat

[PATCH] libstdc++: hashing support for chrono value classes (P2592R2)

2024-09-04 Thread Giuseppe D'Angelo
Hello, The attached patch implements P2592, adding std::hash specializations for std::chrono classes. One aspect I'm quite unhappy with is the hash combiner I've used. I'm not sure if there's some longer-term goal for libstdc++ here -- would you prefer to roll something à la Boost.HashCombin
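For context, a Boost-style hash combiner can be sketched as below. This is an assumption made for illustration and is not necessarily the combiner used in the patch. The names hash_combine, Date and hash_date are hypothetical, and Date merely stands in for a chrono value class with several integer fields.

```cpp
#include <cstddef>
#include <functional>

// Boost-style combiner (assumption: not necessarily what the patch uses).
// Mixes the hash h of one field into the running seed.
constexpr std::size_t hash_combine(std::size_t seed, std::size_t h) {
    return seed ^ (h + 0x9e3779b9u + (seed << 6) + (seed >> 2));
}

// Hypothetical stand-in for a calendar date with three integer fields.
struct Date {
    int year;
    unsigned month;
    unsigned day;
};

// Hash each field with std::hash and fold the results together in order.
inline std::size_t hash_date(const Date& d) {
    std::size_t seed = 0;
    seed = hash_combine(seed, std::hash<int>{}(d.year));
    seed = hash_combine(seed, std::hash<unsigned>{}(d.month));
    seed = hash_combine(seed, std::hash<unsigned>{}(d.day));
    return seed;
}
```

The fold is order-sensitive, so swapping month and day generally gives a different result. Whether this mixing is good enough for libstdc++ is the open question raised in the message.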

Re: [PATCH] c++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-04 Thread Jason Merrill
On 9/4/24 11:15 AM, Jakub Jelinek wrote: On Wed, Sep 04, 2024 at 11:06:22AM -0400, Jason Merrill wrote: On 9/2/24 1:49 PM, Jakub Jelinek wrote: Hi! The following testcase is miscompiled, because get_member_function_from_ptrfunc emits something like (((FUNCTION.__pfn & 1) != 0) ? ptr + FUNCT

Re: [to-be-committed] [RISC-V][PR target/115921] Improve reassociation for rv64

2024-09-04 Thread Jeff Law
On 9/4/24 8:08 AM, Xi Ruoyao wrote: Hi Jeff, On Mon, 2024-09-02 at 12:53 -0600, Jeff Law wrote:  (define_insn_and_split "_shift_reverse"    [(set (match_operand:X 0 "register_operand" "=r") (any_bitwise:X (ashift:X (match_operand:X 1 "register_operand" "r") @@ -2934,9 +2936,9 @@ (def

[PATCH] c++, v2: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-04 Thread Jakub Jelinek
On Wed, Sep 04, 2024 at 12:34:04PM -0400, Jason Merrill wrote: > > So, one possibility would be to call save_expr unconditionally in > > get_member_function_from_ptrfunc as well. > > > > Or build a TARGET_EXPR (force_target_expr or similar). > > Yes. I don't have a strong preference between the

[PATCH v2] c++: fn redecl in fn scope wrongly accepted [PR116239]

2024-09-04 Thread Marek Polacek
On Wed, Sep 04, 2024 at 12:28:49PM -0400, Jason Merrill wrote: > On 8/30/24 3:40 PM, Marek Polacek wrote: > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > -- >8 -- > > Redeclaration such as > > > >void f(void); > >consteval void f(void); > > > > is invalid. In a

Re: [PATCH] c++, v2: Partially implement CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-09-04 Thread Jason Merrill
On 8/30/24 1:37 PM, Jakub Jelinek wrote: On Wed, Aug 21, 2024 at 02:08:16PM -0400, Jason Merrill wrote: I was concerned about the use of a single boolean to guard the destruction of multiple objects, suspecting that it would break in obscure EH cases. When I finally managed to construct a testca

[pushed] c++: cleanup coerce_template_template_parm

2024-09-04 Thread Marek Polacek
Split out from https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662261.html which was tested on x86_64-pc-linux-gnu. I'm checking this in. -- >8 -- This function could use some sprucing up. gcc/cp/ChangeLog: * pt.cc (coerce_template_template_parm): Return bool instead of int. --

[PATCH] c++, v3: Partially implement CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-09-04 Thread Jakub Jelinek
On Wed, Sep 04, 2024 at 01:22:47PM -0400, Jason Merrill wrote: > > @@ -8985,6 +9003,13 @@ cp_finish_decl (tree decl, tree init, bo > > if (var_definition_p) > > abstract_virtuals_error (decl, type); > > + if (decomp && !processing_template_decl) > > + { > > + need_decomp_init

[committed][RISC-V] Fix scan test output after recent path-splitting changes

2024-09-04 Thread Jeff Law
The recent path splitting changes from Andrew result in identifying more saturation idioms instead of just identifying an overflow check. As a result many of the tests in the RISC-V port started failing a scan check on the .expand output. As expected, identifying a saturation idiom is more

Re: [PATCH v3 0/5] aarch64: Fix intrinsic availability [PR112108]

2024-09-04 Thread Andrew Carlotti
On Mon, Aug 19, 2024 at 03:52:58PM +0100, Andrew Carlotti wrote: > On Fri, Aug 16, 2024 at 07:17:24AM +, Kyrylo Tkachov wrote: > > > On 15 Aug 2024, at 18:48, Andrew Carlotti wrote: > > > On Thu, Aug
