RE: [PATCH] Remove SPR/GNR/DMR from avx512_{move,store}_by pieces tune.

2025-09-16 Thread Liu, Hongtao
> -----Original Message----- > From: Richard Biener > Sent: Wednesday, September 17, 2025 2:55 PM > To: Hongtao Liu > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org; > hjl.to...@gmail.com > Subject: Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move,store}_by > pieces tune. > > On Wed, Sep 17, 2025

Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move, store}_by pieces tune.

2025-09-16 Thread Richard Biener
On Wed, Sep 17, 2025 at 7:59 AM Hongtao Liu wrote: > > On Wed, Sep 17, 2025 at 2:12 PM Hongtao Liu wrote: > > > > On Wed, Sep 17, 2025 at 2:08 PM Hongtao Liu wrote: > > > > > > On Tue, Sep 16, 2025 at 3:53 PM Liu, Hongtao > > > wrote: > > > > > > > > > > > > > > > > > -----Original Message

Re: [PATCH] forwprop: Don't loop on the stmt when optimize_aggr_zeroprop or optimize_agr_copyprop return true

2025-09-16 Thread Richard Biener
On Wed, Sep 17, 2025 at 4:55 AM Andrew Pinski wrote: > > Since now optimize_aggr_zeroprop and optimize_agr_copyprop work by forward > walk to prop > the zero/aggregate and do not change the statement at hand, there is no > reason to > repeat the loop if they do anything. This will prevent pro

Re: [PATCH] uninclude: Add lib/gcc//include as an possible include dir

2025-09-16 Thread Richard Biener
On Wed, Sep 17, 2025 at 3:18 AM Andrew Pinski wrote: > > While running uninclude on PR99912's preprocessed source uninclude > didn't uninclude some of the x86_64 target headers. This was because > `lib/gcc//include` was not noticed as a possible system > include dir. It supported `gcc-lib//includ

Re: [PATCH 2/2] forwprop: Fix up "nop" copies after recent changes [PR121962]

2025-09-16 Thread Richard Biener
On Wed, Sep 17, 2025 at 12:33 AM Andrew Pinski wrote: > > After r16-3887-g597b50abb0d2fc, the check to see if the copy is > a nop copy becomes inefficient. The code goes into an infinite > loop as the copy keeps on being propagated over and over again. > > That is if we have: > ``` > struct s1

Re: [PATCH] [RFC] Delayed parsing for bounds safety attributes

2025-09-16 Thread Bill Wendling
On Tue, Sep 16, 2025 at 6:25 PM Yeoul Na wrote: > On Sep 16, 2025, at 4:32 PM, Bill Wendling wrote: > On Tue, Sep 16, 2025 at 11:39 AM Yeoul Na wrote: > >> Hi folks, >> >> Hi Yeoul, > > I wanted to share some updates from our WG14 meeting in Brno, where we >> presented our proposal on dependent

Re: [PATCH 1/2] forwprop: Add a quick out for new_src_based_on_copy when both are decls

2025-09-16 Thread Richard Biener
On Wed, Sep 17, 2025 at 12:33 AM Andrew Pinski wrote: > > If both operands that are being compared are decls, operand_equal_p will > already > handle that case so an early out can be done here. > > Bootstrapped and tested on x86_64-linux-gnu. OK. > gcc/ChangeLog: > > * tree-ssa-forwprop

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 4:15 PM Robin Dapp wrote: > > > Well, what you want to catch now isn't single-lane anymore. But I guess > > since > > we now check the permute before this we can rely on check for n_perms == 0 > > to catch the "no actual permutation required" case? > > I'm seeing n_perms =

[PATCH] [X86] Fixes for AMD znver5 enablement

2025-09-16 Thread Umesh Kalvakuntla
From: Umesh Kalvakuntla - cpuid bit for prefetchi is different from Intel (https://docs.amd.com/v/u/en-US/24594_3.37) - Fix cpu family model numbers --- gcc/common/config/i386/cpuinfo.h | 11 +++ gcc/config/i386/cpuid.h | 4 2 files changed, 15 insertions(+) diff --git a

Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move, store}_by pieces tune.

2025-09-16 Thread Hongtao Liu
On Tue, Sep 16, 2025 at 3:53 PM Liu, Hongtao wrote: > > > > > -----Original Message----- > > From: Richard Biener > > Sent: Tuesday, September 16, 2025 3:03 PM > > To: Liu, Hongtao > > Cc: gcc-patches@gcc.gnu.org; hjl.to...@gmail.com > > Subject: Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move,

Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move, store}_by pieces tune.

2025-09-16 Thread Hongtao Liu
On Wed, Sep 17, 2025 at 2:12 PM Hongtao Liu wrote: > > On Wed, Sep 17, 2025 at 2:08 PM Hongtao Liu wrote: > > > > On Tue, Sep 16, 2025 at 3:53 PM Liu, Hongtao wrote: > > > > > > > > > > > > > -----Original Message----- > > > > From: Richard Biener > > > > Sent: Tuesday, September 16, 2025 3:03

Re: [PATCH v3][PR119702] rs6000: Use vector addition when left shifting by 1

2025-09-16 Thread Michael Meissner
On Mon, Sep 08, 2025 at 02:54:53PM +0530, Avinash Jayakar wrote: > Hi, > > This is the third version of the patch proposed for master aiming to fix > PR119702. Requesting review of this patch. Some minor comments inline below. > The following sequence of assembly in powerpc64le > vspltisw

Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move, store}_by pieces tune.

2025-09-16 Thread Hongtao Liu
On Tue, Sep 16, 2025 at 4:59 PM Richard Biener wrote: > > On Tue, Sep 16, 2025 at 9:53 AM Liu, Hongtao wrote: > > > > > > > > > -----Original Message----- > > > From: Richard Biener > > > Sent: Tuesday, September 16, 2025 3:03 PM > > > To: Liu, Hongtao > > > Cc: gcc-patches@gcc.gnu.org; hjl.to.

[PATCH] LoongArch: Add isnan expander [PR 66462]

2025-09-16 Thread Xi Ruoyao
Add an expander for isnan using fclass. Since isnan is just a compare, enable it only with -fsignaling-nans to avoid generating spurious exceptions. This fixes part of PR66462. int isnan1 (float x) { return __builtin_isnan (x); } With -fno-signaling-nans: fcmp.cun.s $fcc0,$f0,$f0
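For readers skimming the index, the "isnan is just a compare" remark can be illustrated with a minimal C sketch. It assumes default floating-point flags (no -ffast-math) and a compiler that provides __builtin_isnan, such as GCC or Clang. The function names below are hypothetical, and nothing here reproduces the LoongArch expander itself.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical name: the builtin form, as in the test case quoted in the
   message above.  */
int
isnan_builtin (float x)
{
  return __builtin_isnan (x);
}

/* Hypothetical name: the same predicate written as an unordered
   self-compare.  A NaN is the only value that compares unequal to itself.  */
int
isnan_compare (float x)
{
  return x != x;
}

/* Hypothetical helper: check that both forms agree for one input.  */
void
check_isnan_agree (float x)
{
  assert ((isnan_builtin (x) != 0) == (isnan_compare (x) != 0));
}
```

Calling check_isnan_agree with NAN, 0.0f or INFINITY should pass silently under the stated assumptions.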

[PATCH v2 3/4] RISC-V: Add test for vec_duplicate + vwsubu.vv signed combine with GR2VR cost 0, 1 and 15

2025-09-16 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vwsubu.vv combine to vwsubu.vx, with the GR2VR cost being 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check for vwsubu.vx. * gcc.target/riscv/rvv/autovec/vx_v

[PATCH] uninclude: Add lib/gcc//include as an possible include dir

2025-09-16 Thread Andrew Pinski
While running uninclude on PR99912's preprocessed source uninclude didn't uninclude some of the x86_64 target headers. This was because `lib/gcc//include` was not noticed as a possible system include dir. It supported `gcc-lib//include` though. contrib/ChangeLog: * uninclude: Add `lib/gc

Re: [PATCH] forwprop: Add a simple DSE after a clobber

2025-09-16 Thread Andrew Pinski
On Tue, Sep 16, 2025 at 5:54 AM Richard Biener wrote: > > On Tue, Sep 16, 2025 at 6:34 AM Andrew Pinski > wrote: > > > > After copy propagation for aggregates patches we might end up with > > now: > > ``` > > tmp = a; > > b = a; // was b = tmp; > > tmp = {CLOBBER}; > > ``` > > To help out ESRA, i

Re: [PATCH] [RFC] Delayed parsing for bounds safety attributes

2025-09-16 Thread Bill Wendling
On Tue, Sep 16, 2025 at 11:39 AM Yeoul Na wrote: > Hi folks, > > Hi Yeoul, I wanted to share some updates from our WG14 meeting in Brno, where we > presented our proposal on dependent attributes. > > # Our Proposal > > We presented N3656 (www.open-std.org/jtc1/sc22/wg14/www/docs/n3656.pdf), > wh

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-16 Thread Michael Matz
Hello, On Tue, 16 Sep 2025, Richard Biener wrote: > > Do you mean TREE_READONLY or TREE_THIS_NOTRAP is a useless flag? In my view > > they mean different and orthogonal things. We do propagate TREE_READONLY in > > the inliner and the tree rewriting routines too. > > I mean TREE_READONLY on ...

[PATCH 1/2] forwprop: Add a quick out for new_src_based_on_copy when both are decls

2025-09-16 Thread Andrew Pinski
If both operands that are being compared are decls, operand_equal_p will already handle that case so an early out can be done here. Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: * tree-ssa-forwprop.cc (new_src_based_on_copy): An early out if both are decls. Signed-

Re: [patch, Fortran] Implement -fexternal-blas64, PR 121161

2025-09-16 Thread Steve Kargl
On Tue, Sep 16, 2025 at 10:10:16PM +0200, Thomas Koenig wrote: > Hello world, > > the attached patch implements a new option, -fexternal-blas64, > so people can use 64-bit libraries for external BLAS, like Intel > MKL. > > Regression-tested. OK for trunk? > Thanks. I've read through the patch,

Re: [PATCH] c: Reject gimple and rtl functions as needed functions [PR121421]

2025-09-16 Thread Joseph Myers
On Tue, 16 Sep 2025, Andrew Pinski wrote: > These two don't make sense as nested functions as they both don't handle > the unnesting and/or have support for the static chain. > > So let's reject them. > > Bootstrapped and tested on x86_64-linux-gnu. > > PR c/121421 > > gcc/c/ChangeLog: >

[patch, Fortran] Implement -fexternal-blas64, PR 121161

2025-09-16 Thread Thomas Koenig
Hello world, the attached patch implements a new option, -fexternal-blas64, so people can use 64-bit libraries for external BLAS, like Intel MKL. Regression-tested. OK for trunk? Best regards Thomas Implement -fexternal-blas64 option. Libraries like Intel MKL use 64-bit integers in

Re: [PATCH] fortran: allow character in conditional expression

2025-09-16 Thread Tobias Burnus
Hi Yuao, Yuao Ma wrote: On Tue, Sep 16, 2025 at 5:11 PM Tobias Burnus wrote: PS: Already with the current code, we may run into the issue of passing an actual argument like '(cond ? "abc" : "cdfg")' to 'class(*)' – and I am not sure whether we handle this correctly or not. That is a great test

Re: [PATCH v2 0/4] RISC-V: Combine vec_duplicate + v{widen}u.vv to v{widen}u.vx on GR2VR cost

2025-09-16 Thread Robin Dapp
> This patch would like to introduce the combine of vec_dup + v{widen}u.vv > into v{widen}u.vx on the cost value of GR2VR. The late-combine will take > place if the cost of GR2VRlike 1, 2, 15 in test. This series LGTM, thanks. -- Regards Robin

Re: [PATCH] [RFC] Delayed parsing for bounds safety attributes

2025-09-16 Thread Yeoul Na
Hi folks, I wanted to share some updates from our WG14 meeting in Brno, where we presented our proposal on dependent attributes. # Our Proposal We presented N3656 (www.open-std.org/jtc1/sc22/wg14/www/docs/n3656.pdf), which introduces "Dependent Attributes" as a new category of attributes that

[PATCH] c: Reject gimple and rtl functions as needed functions [PR121421]

2025-09-16 Thread Andrew Pinski
These two don't make sense as nested functions as they both don't handle the unnesting and/or have support for the static chain. So let's reject them. Bootstrapped and tested on x86_64-linux-gnu. PR c/121421 gcc/c/ChangeLog: * c-parser.cc (c_parser_declaration_or_fndef): Error

Re: [PATCH v4] preprocessor: More escapes for Makefile rules (-M option) [PR41329, PR121450]

2025-09-16 Thread joergboe
On 16.09.25 at 6:34 PM, Jakub Jelinek wrote: On Tue, Sep 16, 2025 at 06:19:41PM +0200, Joerg Boehmer wrote: -/* Apply Make quoting to STR, TRAIL. Note that it's not possible to - quote all such characters - e.g. \n, %, *, ?, [, \ (in some - contexts), and ~ are not properly handled. It

Re: [PATCH] libstdc++: Optimize determination of std::tuple_cat return type

2025-09-16 Thread Patrick Palka
On Tue, 16 Sep 2025, Jonathan Wakely wrote: > The std::tuple_cat function has to determine a std::tuple return type > from zero or more tuple-like arguments. This uses the __make_tuple class > template to transform a tuple-like type into a std::tuple, and the > __combine_tuples class template to c

Re: [PATCH] libstdc++: Fix missing change to views::pairwise from P2165R4 [PR121956]

2025-09-16 Thread Jonathan Wakely
On Tue, 16 Sept 2025 at 17:15, Patrick Palka wrote: > > On Tue, 16 Sep 2025, Jonathan Wakely wrote: > > > ranges::adjacent_view::_Iterator::value_type should have been changed by > > r14-8710-g65b4cba9d6a9ff to always produce std::tuple, even for the > > N == 2 views::pairwise specialization. > >

Re: [PATCH] libstdc++: ranges::rotate should use ranges::iter_move [PR121913]

2025-09-16 Thread Jonathan Wakely
On Tue, 16 Sept 2025 at 17:12, Patrick Palka wrote: > > On Tue, 16 Sep 2025, Jonathan Wakely wrote: > > > The r16-3835-g7801236069a95c change to use ranges::iter_move should also > > have used iter_value_t<_Iter> to ensure we get an object of the value > > type, not a proxy reference. > > > > libs

Re: [PATCH v4] preprocessor: More escapes for Makefile rules (-M option) [PR41329, PR121450]

2025-09-16 Thread Jakub Jelinek
On Tue, Sep 16, 2025 at 06:19:41PM +0200, Joerg Boehmer wrote: > -/* Apply Make quoting to STR, TRAIL. Note that it's not possible to > - quote all such characters - e.g. \n, %, *, ?, [, \ (in some > - contexts), and ~ are not properly handled. It isn't possible to > - get this right in any

[PATCH v4] preprocessor: More escapes for Makefile rules (-M option) [PR41329, PR121450]

2025-09-16 Thread Joerg Boehmer
This patch adds support for more characters that are special to GNU make in file-names. Especially GNU make expects in rules that #, %, :, *, ? and [ characters are preceded by a backslash to remove their special meaning. PR preprocessor/41329 PR preprocessor/121450 libcpp/Change
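The escaping rule described in this message can be sketched in C as follows. This is a hypothetical helper, not the libcpp code from the patch. It assumes only what the message states, namely that each of #, %, :, *, ? and [ gets a preceding backslash. Other make quoting concerns, such as spaces or dollar signs, are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from libcpp: copy SRC into DST and put a
   backslash before each of the characters # % : * ? [ .  Returns the number
   of characters written, excluding the terminating NUL, or (size_t) -1 if
   DST_SIZE is too small to hold the result.  */
size_t
make_escape (const char *src, char *dst, size_t dst_size)
{
  static const char special[] = "#%:*?[";
  size_t out = 0;

  assert (src != NULL && dst != NULL);
  if (dst_size == 0)
    return (size_t) -1;

  for (; *src != '\0'; src++)
    {
      /* Two output characters for a special one, otherwise one.  */
      size_t need = strchr (special, *src) != NULL ? 2 : 1;

      /* Always keep one byte free for the terminating NUL.  */
      if (out + need >= dst_size)
        return (size_t) -1;
      if (need == 2)
        dst[out++] = '\\';
      dst[out++] = *src;
    }

  dst[out] = '\0';
  return out;
}
```

With a sufficiently large buffer, the input a#b%c.o would come out as a\#b\%c.o. Returning (size_t) -1 on a short buffer keeps the sketch from ever writing past DST.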

Re: [PATCH] libstdc++: Fix missing change to views::pairwise from P2165R4 [PR121956]

2025-09-16 Thread Patrick Palka
On Tue, 16 Sep 2025, Jonathan Wakely wrote: > ranges::adjacent_view::_Iterator::value_type should have been changed by > r14-8710-g65b4cba9d6a9ff to always produce std::tuple, even for the > N == 2 views::pairwise specialization. LGTM. I missed this part of P2165R4 because it didn't use the __tup

Re: [PATCH] xtensa: Simplify the definition of REGNO_OK_FOR_BASE_P() and avoid calling it directly

2025-09-16 Thread Max Filippov
On Mon, Sep 15, 2025 at 5:42 PM Takayuki 'January June' Suwa wrote: > > In recent gcc versions, REGNO_OK_FOR_BASE_P() is not called directly, but > rather via regno_ok_for_base_p() which is a wrapper in gcc/addresses.h. > The wrapper obtains a hard register number from pseudo via reg_renumber > ar

Re: [PATCH] testsuite: arm: Simplify fp16-aapcs tests

2025-09-16 Thread Richard Earnshaw (lists)
On 27/08/2025 16:07, Torbjörn SVENSSON wrote: > Reduce fp16-aapcs testcases to return value testing since parameter > passing is already tested in aapcs/vfp*.c > > gcc/testsuite/ChangeLog: > * gcc.target/arm/fp16-aapcs.c: New test. > * gcc.target/arm/fp16-aapcs-1.c: Removed. > *

Re: [PATCH v2] RISC-V: Improve slide patterns recognition

2025-09-16 Thread Jeff Law
On 9/16/25 08:21, Raphael Moreira Zinsly wrote: Changes since v1: - Fixed permutations with two pivots and repeated elements. -- >8 -- Improve shuffle_slide_patterns to better recognize permutations that can be constructed by a slideup or slidedown, covering more cases: Slideup one v

Re: [PATCH] libstdc++: Explicitly pass -Wsystem-headers to tests that need it

2025-09-16 Thread Jonathan Wakely
On Tue, 16 Sept 2025 at 16:36, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and > perhaps 15/14? OK for trunk/15/14 > > -- >8 -- > > When running libstdc++ tests using an installed gcc (as opposed to > an in-tree gcc), warnings from within system headers a

[PATCH] libstdc++: ranges::rotate should use ranges::iter_move [PR121913]

2025-09-16 Thread Jonathan Wakely
The r16-3835-g7801236069a95c change to use ranges::iter_move should also have used iter_value_t<_Iter> to ensure we get an object of the value type, not a proxy reference. libstdc++-v3/ChangeLog: PR libstdc++/121913 * include/bits/ranges_algo.h (__rotate_fn::operator()): Use

[PATCH] libstdc++: Fix missing change to views::pairwise from P2165R4 [PR121956]

2025-09-16 Thread Jonathan Wakely
ranges::adjacent_view::_Iterator::value_type should have been changed by r14-8710-g65b4cba9d6a9ff to always produce std::tuple, even for the N == 2 views::pairwise specialization. libstdc++-v3/ChangeLog: PR libstdc++/121956 * include/std/ranges (adjacent_view::_Iterator::value_typ

[PATCH] libstdc++: Optimize determination of std::tuple_cat return type

2025-09-16 Thread Jonathan Wakely
The std::tuple_cat function has to determine a std::tuple return type from zero or more tuple-like arguments. This uses the __make_tuple class template to transform a tuple-like type into a std::tuple, and the __combine_tuples class template to combine zero or more std::tuple types into a single st

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 10:30 AM Eric Botcazou wrote: > > > I mean TREE_READONLY on ..._REF nodes. We can't rely on the absence of > > TREE_READONLY on ..._REF meaning the object is writable, so the flag does > > not add any information (but maybe some costing hint that the object is > > definite

Re: [PATCH] libstdc++: Fix algorithms to use iterators' difference_type for arithmetic [PR121890]

2025-09-16 Thread Patrick Palka
On Fri, 12 Sep 2025, Jonathan Wakely wrote: > On Thu, 11 Sept 2025 at 22:41, Jonathan Wakely wrote: > > > > Whenever we use operator+ or similar operators on random access > > iterators we need to be careful to use the iterator's difference_type > > rather than some other integer type. It's no

[PATCH] testsuite: Fix vector-subscript-4.c [PR116421]

2025-09-16 Thread Stefan Schulze Frielinghaus
From: Stefan Schulze Frielinghaus Verify we don't have any vector temporaries in the IL at least until ISEL which may introduce VEC_EXTRACTs on targets which support non-constant indices (see PR116421). As a pass I chose NRV for no particular reason except that it is literally the last pass prio

[PATCH v2] RISC-V: Improve slide patterns recognition

2025-09-16 Thread Raphael Moreira Zinsly
Changes since v1: - Fixed permutations with two pivots and repeated elements. -- >8 -- Improve shuffle_slide_patterns to better recognize permutations that can be constructed by a slideup or slidedown, covering more cases: Slideup one vector into the middle of the other like {0, 4, 5, 3}.

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-16 Thread Robin Dapp
> Well, what you want to catch now isn't single-lane anymore. But I guess > since > we now check the permute before this we can rely on check for n_perms == 0 > to catch the "no actual permutation required" case? I'm seeing n_perms == 1 for {0, 1, 2, 3} as well as for {1, 0, 2, 3}. We initializ

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 3:07 PM Robin Dapp wrote: > > > I think this now conflicts a bit with what I just pushed (sorry). > > > >>&& loop_vinfo) > >> { > >> + unsigned i, j; > >> + bool simple_perm_series = true; > >> + FOR_EACH_VEC_ELT (SLP_TREE_LOAD_PERMUTATION (slp_n

[Patch] libgomp: Add Fortran version of acc_copyout_finalize_async and acc_delete_finalize_async

2025-09-16 Thread Tobias Burnus
I stumbled over the following: GCC misses* two OpenACC 2.5 functions in Fortran but not in C; actually, when looking deeper at it, .texi and .map already contain everything, just the actual implementation (in openacc.f90) and the interface (in openacc.f90 and in openacc_lib.h) were missing. The a

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-16 Thread Robin Dapp
> I think this now conflicts a bit with what I just pushed (sorry). > >>&& loop_vinfo) >> { >> + unsigned i, j; >> + bool simple_perm_series = true; >> + FOR_EACH_VEC_ELT (SLP_TREE_LOAD_PERMUTATION (slp_node), i, j) >> + if (i != j) >> + simple_perm_series

Re: [PATCH][PR104116] Add vectorization logic for floor_{mod,div}

2025-09-16 Thread Richard Biener
On Mon, 15 Sep 2025, Avinash Jayakar wrote: > Hello Richard, > > Thank you for reviewing the patch! I have made changes based on your > comments, but I have some doubts for a few comments as mentioned below. > > On Thu, 2025-09-11 at 13:08 +0200, Richard Biener wrote: > > On Wed, 10 Sep 2025, Av

[PATCH] s390x: Fix fmin/fmax patterns

2025-09-16 Thread Juergen Christ
s390x floating point minimum and maximum functions unfortunately do not canonicalize NaNs. Hence, test pr105414.c fails since c476f554e3f. Fix this by only allowing fmin/fmax pattern if trapping math is disabled. Bootstrapped and reg-tested on s390x. Ok for trunk? gcc/ChangeLog: * con

Re: [PATCH 0/2] [aarch64] sme/nonlocal_goto_* tests fail remat with PIE

2025-09-16 Thread Martin Uecker
On Monday, 15.09.2025 at 16:04 -0700, Andrew Pinski wrote: > On Mon, Sep 15, 2025 at 3:59 PM Alexandre Oliva wrote: > > > > On Sep 13, 2025, Alexandre Oliva wrote: > > > > > gcc.target/aarch64/sme/nonlocal_goto_[123].c fail on aarch64 targets > > > configured with --enable-default-pie. > >

Re: [PATCH v2 1/2] Match: Add form 5 of unsigned SAT_MUL for widen-mul

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 5:22 AM wrote: > > From: Pan Li > > This patch would like to try to match the unsigned > SAT_MUL form 5, aka below: > > #define DEF_SAT_U_MUL_FMT_5(NT, WT) \ > NT __attribute__((noinline))\ > sat_u_mul_##NT##_from_##WT##_fmt_5 (NT

[PATCH v2 2/4] RISC-V: Add test for vec_duplicate + vwaddu.vv signed combine with GR2VR cost 0, 1 and 15

2025-09-16 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vwaddu.vv combine to vwaddu.vx, with the GR2VR cost being 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check for vwaddu.vx. * gcc.target/riscv/rvv/autovec/vx_v

[PATCH v2 1/4] RISC-V: Combine vec_duplicate + vwaddu.vv to vwaddu.vx on GR2VR cost

2025-09-16 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vwaddu.vv to the vwaddu.vx. For example, see the code below. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if

[PATCH v2 4/4] RISC-V: Add test for vec_duplicate + vwmulu.vv signed combine with GR2VR cost 0, 1 and 15

2025-09-16 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vwmulu.vv combine to vwmulu.vx, with the GR2VR cost being 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check for vwmulu.vx. * gcc.target/riscv/rvv/autovec/vx_v

[PATCH v2 0/4] RISC-V: Combine vec_duplicate + v{widen}u.vv to v{widen}u.vx on GR2VR cost

2025-09-16 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + v{widen}u.vv into v{widen}u.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VRlike 1, 2, 15 in test. The below insn from uint32_t to uint64_t are included. * vwaddu.vx * vwsubu.vx * vwmulu

Re: [PATCH] xtensa: Renovate support for load/store of hardware FP register with pre/post-modify side-effect

2025-09-16 Thread Max Filippov
Hi Suwa-san, On Sat, Sep 13, 2025 at 3:43 AM Takayuki 'January June' Suwa wrote: > > The Xtensa ISA has hardware FP register load/store instructions that have > pre- or post-modify side effects (LSIU/SSIU or LSIP/SSIP machine instruc- > tions) depending on the configuration and currently these ar

Re: [PATCH] i386/testsuite: Fix non unique name tests

2025-09-16 Thread Hongtao Liu
On Mon, Sep 15, 2025 at 3:33 PM Haochen Jiang wrote: > > Hi all, > > After r16-3651, compare_tests script will explicitly mention those > tests have the same name. This helps us review all the tests we have. > > Among them, most of them are unintentional typos (e.g., keep testing > the same vector

Re: [PATCH] vect: Handle grouped accesses via gather/scatter.

2025-09-16 Thread Andrew Stubbs
On 15/09/2025 10:13, Richard Biener wrote: diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc index df1c1a5b19b..b018f96e8bc 100644 --- a/gcc/config/gcn/gcn.cc +++ b/gcc/config/gcn/gcn.cc @@ -5368,7 +5368,7 @@ gcn_preferred_vector_alignment (const_tree type) static bool gcn_vectorize

[COMMITTED 2/2] ada: Fix error message for Stream_Size

2025-09-16 Thread Marc Poulhiès
From: Ronan Desplanques Before this patch, confirming Stream_Size aspect specifications on elementary types were incorrectly rejected when the stream size was 128, and the error messages emitted for Stream_Size aspect errors gave incorrect possible values. This patch fixes this. The most signifi

Re: [PATCH] fortran: allow character in conditional expression

2025-09-16 Thread Tobias Burnus
Hi Yuao, Tobias Burnus wrote: For BT_DERIVED: (i) The type needs to be the same – or compatible ('SEQUENCE' attribute) I was referring to: "7.5.2.4 Determination of derived types": "Data entities also have the same type if they are declared with reference to different derived-type definition

Re: [PATCH] i386/testsuite: Fix scan tree dump in vect-epilogue-4.c

2025-09-16 Thread Richard Biener
On Tue, 16 Sep 2025, Haochen Jiang wrote: > vect-epilogue-4.c uses mask 64 byte to vectorize in epilogue part. > Similar as r16-876 fix for vect-epilogue-5.c, we need to adjust the > scan tree dump. > > Ok for trunk? OK. Thanks, Richard. > Thx, > Haochen > > gcc/testsuite/ChangeLog: > >

Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move, store}_by pieces tune.

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 9:53 AM Liu, Hongtao wrote: > > > > > -----Original Message----- > > From: Richard Biener > > Sent: Tuesday, September 16, 2025 3:03 PM > > To: Liu, Hongtao > > Cc: gcc-patches@gcc.gnu.org; hjl.to...@gmail.com > > Subject: Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move,

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-16 Thread Eric Botcazou
> I mean TREE_READONLY on ..._REF nodes. We can't rely on the absence of > TREE_READONLY on ..._REF meaning the object is writable, so the flag does > not add any information (but maybe some costing hint that the object is > definitely _not_ writable(?)). OK, I agree that it may not say much for

[COMMITTED 1/2] ada: Revert "Remove dependence on secondary stack for type with controlled component"

2025-09-16 Thread Marc Poulhiès
From: Gary Dismukes This reverts commit 91b51fc42b167eedaaded6360c490a4306bc5c55. Tested on x86_64-pc-linux-gnu, committed on master. --- gcc/ada/exp_ch6.adb | 49 ++--- gcc/ada/exp_ch6.ads | 6 -- gcc/ada/exp_ch7.adb | 20 +- 3 file

Re: [patch,wwwdocs,applied] Notice AVR32EB14/20/28/32 addition in v15.3.

2025-09-16 Thread Georg-Johann Lay
On 15.09.25 at 20:05, Gerald Pfeifer wrote: On Mon, 15 Sep 2025, Georg-Johann Lay wrote: + Support for the following devices has been added in v15.3: Thanks for documenting this. Can we make the GCC reference "GCC 15.3" instead of "v15.3" which is a form we generally don't use? Happy to mak

[PATCH v2] Remove SPR/GNR/DMR from avx512_move_by_pieces tune.

2025-09-16 Thread liuhongt
From: "hongtao.liu" Update in V2: Only remove SPR/GNR/DMR from avx512_move_by_pieces. Align move_max with prefer_vector_width for SPR/GNR/DMR similar as below commit. commit 6ea25c041964bf63014fcf7bb68fb1f5a0a4e123 Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_m

RE: [PATCH] Remove SPR/GNR/DMR from avx512_{move,store}_by pieces tune.

2025-09-16 Thread Liu, Hongtao
> -----Original Message----- > From: Richard Biener > Sent: Tuesday, September 16, 2025 3:03 PM > To: Liu, Hongtao > Cc: gcc-patches@gcc.gnu.org; hjl.to...@gmail.com > Subject: Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move,store}_by > pieces tune. > > On Tue, Sep 16, 2025 at 7:53 AM liuhong

[PATCH] i386/testsuite: Fix scan tree dump in vect-epilogue-4.c

2025-09-16 Thread Haochen Jiang
vect-epilogue-4.c uses mask 64 byte to vectorize in epilogue part. Similar as r16-876 fix for vect-epilogue-5.c, we need to adjust the scan tree dump. Ok for trunk? Thx, Haochen gcc/testsuite/ChangeLog: * gcc.target/i386/vect-epilogues-4.c: Fix for epilogue vect tree dump. ---

Re: [PATCH] forwprop: Handle memcpy for arguments with respect to copies

2025-09-16 Thread Richard Biener
On Mon, Sep 15, 2025 at 7:20 PM Andrew Pinski wrote: > > This moves the code used in optimize_agr_copyprop_1 (r16-3887-g597b50abb0d) > to handle this same case into its new function and use it inside > optimize_agr_copyprop_arg. This allows to remove more copies that show up only > in arguments. >

Re: [PATCH] Ada, libgnarl: Fix Ada bootstrap for Darwin.

2025-09-16 Thread Eric Botcazou
> Tested on x86_64-darwin, OK for trunk? Sure, thanks for fixing it, but... > > --- 8< --- > > Recent changes to Ada have produced a new diagnostic: > s-osinte.adb:34:18: warning: unit "Interfaces.C.Extensions"... > which causes a bootstrap fail on Darwin when Ada is enabled. > > Fixed thus. >

Re: Re: RISC-V: Improve slide patterns recognition

2025-09-16 Thread 钟居哲
OK juzhe.zh...@rivai.ai From: Raphael Zinsly Date: 2025-09-16 02:34 To: 钟居哲 CC: gcc-patches; kito.cheng; Robin Dapp; jeffreyalaw Subject: Re: RISC-V: Improve slide patterns recognition On Sun, Sep 14, 2025 at 11:12 PM 钟居哲 wrote: > > +/* Recognize patterns like [4 5 6 7 12 13 14 15] where a co

Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move, store}_by pieces tune.

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 7:53 AM liuhongt wrote: > > From: "hongtao.liu" > > Align move_max with prefer_vector_width for SPR/GNR/DMR to avoid STLF issue. > It's similar as previous commit. > > commit 6ea25c041964bf63014fcf7bb68fb1f5a0a4e123 > Author: liuhongt > Date: Thu Aug 15 12:54:07 2024 +0

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-16 Thread Richard Biener
On Mon, Sep 15, 2025 at 9:29 PM Eric Botcazou wrote: > > > Yes. So I read the comment in a way to say that TREE_THIS_NOTRAP does not > > mean the reference is writable. In some context we check > > > > || tree_could_trap_p (lhs) > > > > /* tree_could_trap_p is a predicate for

[PATCH] Ada, libgnarl: Fix Ada bootstrap for Darwin.

2025-09-16 Thread Iain Sandoe
Tested on x86_64-darwin, OK for trunk? thanks, Iain --- 8< --- Recent changes to Ada have produced a new diagnostic: s-osinte.adb:34:18: warning: unit "Interfaces.C.Extensions"... which causes a bootstrap fail on Darwin when Ada is enabled. Fixed thus. gcc/ada/ChangeLog: * libgnarl/s-o