Re: [PATCH] sched1: debug/model: dump predecessor list and BB num [NFC]

2024-11-24 Thread Vineet Gupta
On 11/6/24 14:20, Vineet Gupta wrote: > This is broken out of predecessor promotion patch so that debugging can > proceed during stage1 restrictions. > > Signed-off-by: Vineet Gupta ping ! > --- > gcc/haifa-sched.cc | 10 +- > gcc/sched-rgn.cc | 14 -- > 2 files changed,

Re: [PATCH v2] sched1: parameterize pressure scheduling spilling agressiveness [PR/114729]

2024-11-24 Thread Vineet Gupta
On 11/6/24 12:11, Vineet Gupta wrote: > changes since v1 > * Changed target hook to --param > * squash addon patch for RISC-V opting-in, testcase here > * updated changelog with latest perf numbers ping ! > --- > > sched1 computes ECC (Excess Change Cost) for each insn, which represents > t

Re: [PATCH 4/4] RISC-V: Add -fcf-protection=[full|branch|return] to enable zicfiss, zicfilp.

2024-11-24 Thread Kito Cheng
I guess this should also adjust the testcase as well? On Fri, Nov 15, 2024 at 6:55 PM Monk Chiang wrote: > > gcc/ChangeLog: > * gcc/config/riscv/riscv.cc > (is_zicfilp_p): New function. > (is_zicfiss_p): New function. > * gcc/config/riscv/riscv-zicfilp.cc: Upd

Re: [PATCH 2/2] RISC-V: Use dynamic shadow offset

2024-11-24 Thread Kito Cheng
committed :) On Wed, Nov 20, 2024 at 3:26 AM Jeff Law wrote: > > > > On 11/14/24 9:14 PM, Kito Cheng wrote: > > Switch to dynamic offset so that we can support Sv39, Sv48, and Sv57 at > > the same time without building multiple libasan versions! > > > > [1] > > https://github.com/llvm/llvm-proje

Re: [PATCH 1/2] asan: Support dynamic shadow offset

2024-11-24 Thread Kito Cheng
Committed with changelog update and minor tweak (move RISC-V bits to second patch) On Wed, Nov 20, 2024 at 4:18 AM Jeff Law wrote: > > > > On 11/14/24 9:14 PM, Kito Cheng wrote: > > AddressSanitizer has supported dynamic shadow offsets since 2016[1], but > > GCC hasn't implemented this yet becaus

Re: Patch ping - [PATCH] [APX EGPR] Fix indirect call prefix

2024-11-24 Thread Hongtao Liu
On Mon, Nov 25, 2024 at 2:32 PM Kong, Lingling wrote: > > Hi, > > LGTM. > Now Hongyu and Hongtao are working on APX. Ok. > > Thanks, > Lingling > > > -----Original Message----- > > From: Gregory Kanter > > Sent: Saturday, November 23, 2024 8:16 AM > > To: gcc-patches@gcc.gnu.org > > Cc: Kong, Lin

RE: Patch ping - [PATCH] [APX EGPR] Fix indirect call prefix

2024-11-24 Thread Kong, Lingling
Hi, LGTM. Now Hongyu and Hongtao are working on APX. Thanks, Lingling > -----Original Message----- > From: Gregory Kanter > Sent: Saturday, November 23, 2024 8:16 AM > To: gcc-patches@gcc.gnu.org > Cc: Kong, Lingling ; Gregory Kanter > > Subject: Patch ping - [PATCH] [APX EGPR] Fix indirect ca

Re: [PATCH] i386/testsuite: Correct AVX10.2 FP8 test mask usage

2024-11-24 Thread Hongtao Liu
On Fri, Nov 22, 2024 at 4:08 PM Haochen Jiang wrote: > > Hi all, > > Under FP8, we should not use AVX512F_LEN_HALF to get the mask size since > it will get 16 instead of 8 and fall into the wrong if condition. Correct > the usage for the vcvtneph2[b,h]f8[,s] runtime tests. > > Tested under sde. Ok for trun

Re: [PATCH] Optimize 128-bit vector permutation with pand, pandn and por.

2024-11-24 Thread Hongtao Liu
On Wed, Nov 20, 2024 at 8:03 PM Cui, Lili wrote: > > Hi, all > > This patch aims to handle certain vector shuffle operations using pand, pandn > and por more efficiently. > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? Although it's stage 3, I think this one is low risk, so O

[PATCH] aarch64: Use SVE ASRD instruction with Neon modes.

2024-11-24 Thread Soumya AR
The ASRD instruction on SVE performs an arithmetic shift right by an immediate for divide. This patch enables the use of ASRD with Neon modes. For example: int in[N], out[N]; void foo (void) { for (int i = 0; i < N; i++) out[i] = in[i] / 4; } compiles to: ldr q31, [x1, x0]
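As a rough scalar illustration of what "arithmetic shift right for divide" means, the sketch below divides by a power of two with rounding toward zero, which is the behaviour `in[i] / 4` requires. The helper name `asrd_scalar` is hypothetical and not taken from the patch, and the sketch assumes that right-shifting a negative `int32_t` is an arithmetic shift (implementation-defined in ISO C, but documented as such by GCC).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a "shift right for divide":
   computes x / (1 << shift), rounding toward zero like C's '/'.
   Assumes >> on a negative int32_t is an arithmetic shift.  */
static int32_t
asrd_scalar (int32_t x, unsigned shift)
{
  assert (shift >= 1 && shift <= 30);

  /* A plain arithmetic shift rounds toward negative infinity.  For
     negative x, add (2^shift - 1) first so the result rounds toward
     zero instead.  */
  int32_t bias = (x < 0) ? (int32_t) ((1u << shift) - 1) : 0;
  return (x + bias) >> shift;
}
```

For instance, `asrd_scalar (-7, 2)` yields the same value as `-7 / 4`, whereas a bare `-7 >> 2` would round the other way.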

Re: [PATCH] [RFC] Add extra 64bit SSE vector epilogue in some cases

2024-11-24 Thread Hongtao Liu
On Sun, Nov 24, 2024 at 8:05 PM Richard Biener wrote: > > > > > On 24.11.2024 at 09:17, Hongtao Liu wrote: > > > > On Fri, Nov 22, 2024 at 9:33 PM Richard Biener wrote: > >> > >> Similar to the X86_TUNE_AVX512_TWO_EPILOGUES tuning which enables > >> an extra 128bit SSE vector epilogue when doi

[PATCH v1 2/2] RISC-V: Refactor the testcases for RVV gather/scatter

2024-11-24 Thread pan2 . li
From: Pan Li This patch refactors the gather/scatter testcases after fixing the passing of the various optimization options to the testcases. It includes: * Remove unnecessary optimization options. * Adjust dg-final by any-opts and/or no-opts if the rtl dump changes on different optimization options (l

[PATCH v1 1/2] RISC-V: Fix incorrect optimization options passing to gather/scatter

2024-11-24 Thread pan2 . li
From: Pan Li Like the strided load/store, the testcases of vector gather/scatter are designed to pick up different sorts of optimization options, but these options are actually ignored according to the execution log in gcc.log. This patch corrects this in almost the same way as what we fi

Pushed: [PATCH] pa: Remove pa_section_type_flags

2024-11-24 Thread Xi Ruoyao
On Sun, 2024-11-24 at 14:12 -0500, John David Anglin wrote: > I don't see any regressions with this change.  Patch is okay > if you remove declaration of pa_section_type_flags in pa.cc. Pushed https://gcc.gnu.org/r15-5641 with the declaration of pa_section_type_flags removed. -- Xi Ruoyao Schoo

RE: [committed] c: Default to -std=gnu23

2024-11-24 Thread Jiang, Haochen
> From: Joseph Myers > Sent: Saturday, November 16, 2024 7:47 AM > > Change the default language version for C compilation from -std=gnu17 > to -std=gnu23. A few tests are updated to remove local definitions of > bool, true and false (where making such an unconditional test change > seemed to ma
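To illustrate why local definitions of bool, true and false have to go: in C23 these are keywords, so a test that defines them itself no longer compiles. The fragment below is only a sketch of a version-guarded alternative, not what the commit does (the commit removes the definitions); the function `is_even` is a hypothetical placeholder.

```c
/* Before C99 a test might define its own boolean type.  In C23,
   bool/true/false are keywords, so such definitions are errors and
   must be removed or guarded as shown here.  */
#if !defined (__STDC_VERSION__) || __STDC_VERSION__ < 199901L
typedef int bool;
#define true 1
#define false 0
#elif __STDC_VERSION__ < 202311L
#include <stdbool.h>   /* C99..C17: bool comes from the header */
#endif
/* C23 and later: nothing to declare.  */

/* Hypothetical helper that uses the boolean names.  */
static bool
is_even (int v)
{
  return (v % 2 == 0) ? true : false;
}
```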

[patch, fortran] PR117765 Impure function within a BLOCK construct within a DO CONCURRENT

2024-11-24 Thread Jerry D
I would like to commit the attached patch for Steve. Regression tested on x86-64-linux-gnu. OK for trunk? Author: Steve Kargl Date: Sun Nov 24 18:26:03 2024 -0800 Fortran: Check IMPURE in BLOCK inside DO CONCURRENT. PR fortran/117765 gcc/fortran/ChangeLog:

RE: [gcc-wwwdocs PATCH] gcc-15: Mention new ISA and Diamond Rapids support for x86_64 backend

2024-11-24 Thread Jiang, Haochen
> From: Gerald Pfeifer > Sent: Sunday, November 24, 2024 7:17 AM > > On Mon, 11 Nov 2024, Haochen Jiang wrote: > > This patch will add recent new ISA and arch support for x86_64 backend > > into gcc-wwwdocs. > > > + New ISA extension support for Intel AMX-AVX512 was added. > > In all these cas

[PATCH v2] I386: Add more testcases for unsigned SAT_ADD vector pattern

2024-11-24 Thread pan2 . li
From: Pan Li Update in v2: * Skip the LTO build as it produces no such dump files. * Scan the optimized dump instead. Original log: Some forms like the ones below failed to be recognized as the SAT_ADD pattern for target i386. This is related to some match pattern extraction, but was fixed by the refactor of the SAT_ADD

Re: [PATCH v8] Target-independent store forwarding avoidance.

2024-11-24 Thread Philipp Tomsich
Pushed to master with the following fixups: - new timevar added - nits addressed - whitespace fixes Philipp. On Mon, 25 Nov 2024 at 03:30, Jeff Law wrote: > > > > On 11/9/24 2:48 AM, Konstantinos Eleftheriou wrote: > > From: kelefth > > > > This pass detects cases of expensive store forw

Re: [PATCH] [x86] Fix uninitialized operands[2] in vec_unpacks_hi_v4sf.

2024-11-24 Thread Hongtao Liu
On Fri, Nov 22, 2024 at 9:16 PM Richard Biener wrote: > > On Fri, 22 Nov 2024, liuhongt wrote: > > > It could cause weird spills in RA when register pressure is high. > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for trunk? > > > > BTW, It's difficult to get a decent tes

RE: [PATCH v1] I386: Add more testcases for unsigned SAT_ADD vector pattern

2024-11-24 Thread Li, Pan2
> You're scanning ".SAT_ADD ", so maybe better with pass "optimized" instead of > "expand"? Sure, let me update in v2. Pan -----Original Message----- From: Liu, Hongtao Sent: Monday, November 25, 2024 10:09 AM To: Li, Pan2 ; gcc-patches@gcc.gnu.org Cc: ubiz...@gmail.com Subject: RE: [PATCH v1

RE: [PATCH v1] I386: Add more testcases for unsigned SAT_ADD vector pattern

2024-11-24 Thread Liu, Hongtao
> -----Original Message----- > From: Li, Pan2 > Sent: Monday, November 25, 2024 10:01 AM > To: gcc-patches@gcc.gnu.org > Cc: ubiz...@gmail.com; Liu, Hongtao ; Li, Pan2 > > Subject: [PATCH v1] I386: Add more testcases for unsigned SAT_ADD vector > pattern > > From: Pan Li > > There are some

[PATCH v1] I386: Add more testcases for unsigned SAT_ADD vector pattern

2024-11-24 Thread pan2 . li
From: Pan Li Some forms like the ones below failed to be recognized as the SAT_ADD pattern for target i386. This is related to some match pattern extraction, but was fixed by the refactor of the SAT_ADD pattern. Thus, add testcases to guard against a similar issue in the future. #define DEF_SAT_ADD(T)
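For readers unfamiliar with the idiom, here is one common branchless form of unsigned saturating addition, as a sketch only. It is not the DEF_SAT_ADD macro from the patch, whose body is cut off in the snippet above; the function name `sat_add_u8` and the choice of `uint8_t` are assumptions made for illustration.

```c
#include <stdint.h>

/* Illustrative only: a well-known branchless unsigned saturating add.
   Not the patch's DEF_SAT_ADD macro.  */
static uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);   /* wraps modulo 256 */

  /* (sum < x) is 1 exactly when the addition wrapped.  Negating it
     gives an all-ones mask that forces the result to 0xff.  */
  return (uint8_t) (sum | (uint8_t) -(sum < x));
}
```

Source forms of this kind are what the middle end may recognize as `.SAT_ADD`, which is what the quoted tests scan for in the dump.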

[PATCH v1] Match: Refactor the unsigned SAT_ADD match ADD_OVERFLOW pattern [NFC]

2024-11-24 Thread pan2 . li
From: Pan Li This patch refactors the unsigned SAT_ADD pattern that leverages the IFN ADD_OVERFLOW, namely: * Extract the type check outside. * Group the related match pattern forms together. * Remove unnecessary helper pattern matches. The test suites below pass with this patch.

Re: [Bug fortran/84869] [12/13/14/15 Regression] ICE in gfc_class_len_get, at fortran/trans-expr.c:233

2024-11-24 Thread Harald Anlauf
On 24.11.24 at 17:40, Paul Richard Thomas wrote: Fixed as 'obvious' on 13-branch to mainline with commit r15-5629-g470ebd31843db58fc503ccef38b82d0da93c65e4 An error with PR number in the mainline ChangeLogs will be corrected tomorrow. Fortran: Fix segfault in allocation of unlimited poly

[committed] testsuite/x86: Add -mfpmath=sse to add_options_for_float16

2024-11-24 Thread Uros Bizjak
Add -mfpmath=sse to add_options_for_float16 to avoid error: '-fexcess-precision=16' is not compatible with '-mfpmath=387' when compiling gcc.dg/tree-ssa/pow_fold_1.c. gcc/testsuite/ChangeLog: * lib/target-supports.exp (add_options_for_float16): Add -mfpmath=sse. Tested on x86_64-linux-gnu {

[committed] i386: x86 can use x >> -y for x >> 32-y [PR36503]

2024-11-24 Thread Uros Bizjak
x86 targets mask 32-bit shifts with a 5-bit mask (and 64-bit with 6-bit mask), so they can use x >> -y instead of x >> 32-y. This form is very common in bitstream readers, where it's used to read the top N bits from a word. The optimization converts: movl $32, %ecx subl %es
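The equivalence can be checked in portable C by modelling the hardware's count masking explicitly. This is a sketch: in ISO C a shift by a negative or out-of-range count is undefined, so the masking that x86 performs implicitly is written out here, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Model of a 32-bit x86 logical right shift: only the low 5 bits of
   the count are used.  */
static uint32_t
shr_masked (uint32_t x, uint32_t count)
{
  return x >> (count & 31);
}

/* Top-N-bits read as written in a bitstream reader: x >> (32 - n).  */
static uint32_t
top_bits_sub (uint32_t x, uint32_t n)
{
  return shr_masked (x, 32u - n);
}

/* Same result with the negated count, since
   (0 - n) & 31 == (32 - n) & 31 for every n.  */
static uint32_t
top_bits_neg (uint32_t x, uint32_t n)
{
  return shr_masked (x, 0u - n);
}
```

Because both counts agree modulo 32, the subtraction from 32 can be replaced by a single negation once the shift instruction does the masking.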

[PUSHED] opt.url: Regenerate the .opt.urls files

2024-11-24 Thread Andrew Pinski
Just regenerated them after the addition of msplit-bit-shift avr option. Pushed as obvious. gcc/ChangeLog: * config/avr/avr.opt.urls: Regenerate. * config/g.opt.urls: Regenerate. * config/i386/nto.opt.urls: Regenerate. * config/riscv/riscv.opt.urls: Regenerate.

Re: [PATCH] RISC-V: Ensure vtype for full-register moves [PR117544].

2024-11-24 Thread Jeff Law
On 11/22/24 10:48 AM, Robin Dapp wrote: Hi, as discussed in PR117544 the VTYPE register is not preserved across function calls. Even though vmv1r-like instructions operate independently of the actual vtype they still require a valid vtype. As we cannot guarantee that the vtype is valid we m

Re: [PATCH v8] Target-independent store forwarding avoidance.

2024-11-24 Thread Jeff Law
On 11/9/24 2:48 AM, Konstantinos Eleftheriou wrote: From: kelefth This pass detects cases of expensive store forwarding and tries to avoid them by reordering the stores and using suitable bit insertion sequences. For example it can transform this: strb w2, [x1, 1] ldr x0,
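At the source level, the idea can be sketched as follows. This is an illustration under assumptions, not the pass itself, which works on RTL: the function names are hypothetical, and the byte insertion is done with memcpy to stay endian-neutral where the pass would emit a bit-insert instruction.

```c
#include <stdint.h>
#include <string.h>

/* Narrow store followed by a wider overlapping load: the load has to
   wait for the byte store, which is the expensive forwarding case.  */
static uint64_t
store_then_load (unsigned char *p, uint8_t b)
{
  uint64_t v;
  p[1] = b;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Reordered form: load first, do the store, then patch the stored
   byte into the already-loaded value.  */
static uint64_t
load_then_insert (unsigned char *p, uint8_t b)
{
  uint64_t v;
  memcpy (&v, p, sizeof v);
  p[1] = b;
  memcpy ((unsigned char *) &v + 1, &b, 1);   /* stands in for a bit insert */
  return v;
}
```

Both functions leave memory in the same state and return the same value; only the dependency between the store and the load differs.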

[SPARC] Fix PR target/117715

2024-11-24 Thread Eric Botcazou
This fixes the vectorization regressions present on the SPARC by switching from vcond[u] patterns to vec_cmp[u] + vcond_mask_ patterns. While I was at it, I merged the patterns for V4HI/V2SI and V8QI enabled with VIS 3/VIS 4 to follow the model of those enabled with VIS 4B, and standardized the

Re: [PATCH] pa: Remove pa_section_type_flags

2024-11-24 Thread John David Anglin
I don't see any regressions with this change. Patch is okay if you remove declaration of pa_section_type_flags in pa.cc. Dave On Thu, Nov 21, 2024 at 09:04:52PM +0800, Xi Ruoyao wrote: > It's no longer needed since r15-4842 (when the target-independent code > started to handle the case). > > gc

Re: [RFC/RFA][PATCH v6 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-11-24 Thread Jeff Law
On 11/24/24 9:27 AM, Mariam Arutunian wrote: Thank you very much! I'll have a look. Please let me know if there's anything specific you’d like me to address. Not yet. Things are looking really good. Enough so that I've been diving into the small word targets (avr, pru, rl78). Jeff

Re: [PATCH v2 10/14] Support for 64-bit location_t: gimple parts

2024-11-24 Thread Hans-Peter Nilsson
On Sat, 16 Nov 2024, Lewis Hyatt wrote: > The size of struct gimple increases by 8 bytes with the change in size of > location_t from 32- to 64-bit Half-way scrolling through the patches, this seems a good time for a possibly disruptive comment from the side-line: ;-) For the size-critical types

[Bug fortran/84869] [12/13/14/15 Regression] ICE in gfc_class_len_get, at fortran/trans-expr.c:233

2024-11-24 Thread Paul Richard Thomas
Fixed as 'obvious' on 13-branch to mainline with commit r15-5629-g470ebd31843db58fc503ccef38b82d0da93c65e4 An error with PR number in the mainline ChangeLogs will be corrected tomorrow. Fortran: Fix segfault in allocation of unlimited poly array [PR84869] 2024-11-24 Paul Thomas g

Re: [RFC/RFA][PATCH v6 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-11-24 Thread Mariam Arutunian
On Sun, Nov 24, 2024, 08:59 Jeff Law wrote: > > > On 11/13/24 7:16 AM, Mariam Arutunian wrote: > > > > > > > To address this, I added code in |target-supports.exp| and modified the > > relevant tests. > > I've attached the patch. Could you please check whether it is correct? > Just a few more not

Re: [PATCH] wwwdocs: Align the DCO text for the GNU Toolchain to match community usage.

2024-11-24 Thread Mark Wielaard
Hi Jason, On Fri, Nov 22, 2024 at 05:13:30PM +0100, Jason Merrill wrote: > My take has been that this change is not necessary for us because > the FSF can accept copyright assignment for pseudonymous > contributions, so individual reviewers don't need to adjudicate > whether a particular pseudonym

Re: improve std::deque::_M_reallocate_map

2024-11-24 Thread Jan Hubicka
> Hi, > looking into the reason why we still do throw_bad_alloc in the clang binary, I noticed > that quite a few calls come from deque::_M_reallocate_map. This patch adds > unreachable to limit the size of realloc_map. _M_reallocate_map is called > only > if the new size is smaller than max_size. map is an a

Re: [PATCH] libsanitizer: Remove -pedantic from AM_CXXFLAGS [PR117732]

2024-11-24 Thread Jeff Law
On 11/22/24 5:44 PM, Jakub Jelinek wrote: Hi! We aren't the master repository for the sanitizers and clearly upstream introduces various extensions in the code. All we care about is whether it builds and works fine with GCC, so -pedantic flag is of no use to us, only maybe to upstream if they

Re: [PATCH] wwwdocs: Align the DCO text for the GNU Toolchain to match community usage.

2024-11-24 Thread Mark Wielaard
Hi Carlos, On Thu, Nov 21, 2024 at 02:26:39PM -0500, Carlos O'Donell wrote: > On 11/21/24 1:47 PM, Sam James wrote: > > Mark Wielaard writes: > >> On Thu, 2024-11-21 at 12:04 -0500, Carlos O'Donell wrote: > >> > >> I suggest including the actual clarification in the explanation, so > >> there is n

Re: [PATCH] testsuite: Fix up various powerpc tests after -std=gnu23 by default switch [PR117663]

2024-11-24 Thread Kewen Lin
Hi Jakub, On 2024/11/22 16:18, Jakub Jelinek wrote: > Hi! > > These tests use the K&R function style definitions or pass arguments > to () functions. > It seemed easiest to just use -std=gnu17 for all of those. Thanks for fixing! I slightly prefer passing -Wno-old-style-definition instead as the te

Re: [PATCH] rs6000: Add PowerPC inline asm redzone clobber support

2024-11-24 Thread Kewen Lin
Hi Jakub, Thanks for doing this! On 2024/11/7 20:16, Jakub Jelinek wrote: > Hi! > > The following patch on top of the > https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667949.html > patch adds rs6000 part of the support (the only other target I'm aware of > which clearly has red zone as well

Re: [PATCH ver2 4/4] rs6000, Add tests and documentation for vector, conversions between integer and float

2024-11-24 Thread Kewen Lin
Hi Carl, On 2024/10/1 23:28, Carl Love wrote: > > GCC maintainers: > > Version 2, added the argument changes for the __builtin_vsx_uns_double[e | o | > h | l ]_v4si built-ins. Added support to the vector {un,}signed int to vector > float builtins so they are supported using Altivec instructions if

Re: [PATCH ver2 3/4] rs6000, Remove redundant built-in __builtin_vsx_xvcvuxwdp

2024-11-24 Thread Kewen Lin
Hi Carl, On 2024/10/1 23:27, Carl Love wrote: > > > GCC maintainers: > > Version 2: Fixed the wording in the changelog per the feedback. With this > change the patch was approved by Kewen. > > The patch removed the built-in __builtin_vsx_xvcvuxwdp as it is covered by > the overloaded vec_doubleo

Re: [PATCH ver2 2/4] rs6000, remove built-ins __builtin_vsx_vperm_8hi and, __builtin_vsx_vperm_8hi_uns

2024-11-24 Thread Kewen Lin
Hi Carl, On 2024/10/1 23:27, Carl Love wrote: > > GCC maintainers: > > version 2, added the reference to the patch where the removal of the > built-ins was missed.  Note, patch was approved by Kewen with this change. > > The following patch removes two redundant built-ins __builtin_vsx_vperm_8hi

Re: [PATCH ver2 1/4] rs6000, add testcases to the overloaded vec_perm built-in

2024-11-24 Thread Kewen Lin
Hi Carl, On 2024/10/1 23:27, Carl Love wrote: > > > GCC maintainers: > > Version 2, fixed the changelog, updated the wording in the documentation and > updated the argument types in the vsx-builtin-3.c test file. > > The following patch adds missing test cases for the overloaded vec_perm > built

Re: [PATCH] rs6000, fix test builtins-1-p10-runnable.c

2024-11-24 Thread Kewen Lin
Hi Carl, On 2024/10/3 23:11, Carl Love wrote: > GCC maintainers: > > The builtins-1-p10-runnable.c has the debugging inadvertently enabled.  The > test uses #ifdef to enable/disable the debugging. Unfortunately, the #define > DEBUG was set to 0 to disable debugging and enable the call to abort in ca
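The pitfall described here is easy to reproduce in isolation: `#ifdef` tests whether a macro is defined at all, not whether its value is non-zero. The sketch below uses the `DEBUG` name from the message; the two helper functions are hypothetical and exist only to make the difference observable.

```c
#define DEBUG 0

/* Returns 1 if "#ifdef DEBUG" treats debugging as enabled.  */
static int
debug_enabled_ifdef (void)
{
#ifdef DEBUG   /* taken: DEBUG is defined, even though its value is 0 */
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 if "#if DEBUG" treats debugging as enabled.  */
static int
debug_enabled_if (void)
{
#if DEBUG      /* not taken: DEBUG expands to 0 */
  return 1;
#else
  return 0;
#endif
}
```

With `#define DEBUG 0`, only the `#if` variant disables the debug path; with `#ifdef`, the macro has to be left undefined instead.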

Re: [PATCH] gimplefe: Fix handling of ')'/'}' after a parse error [PR117741]

2024-11-24 Thread Richard Biener
> On 24.11.2024 at 02:36, Andrew Pinski wrote: > > The problem here is c_parser_skip_until_found stops at a closing nesting > delimiter without consuming it. So if we don't consume it in > c_parser_gimple_compound_statement, we would go into an infinite loop. The C > parser similar code in c

Re: [PATCH] [RFC] Add extra 64bit SSE vector epilogue in some cases

2024-11-24 Thread Richard Biener
> On 24.11.2024 at 09:17, Hongtao Liu wrote: > > On Fri, Nov 22, 2024 at 9:33 PM Richard Biener wrote: >> >> Similar to the X86_TUNE_AVX512_TWO_EPILOGUES tuning which enables >> an extra 128bit SSE vector epilogue when doing 512bit AVX512 >> vectorization in the main loop the following all

[PATCH v2] ada: PR target/117538 Traceback includes load address if executable is PIE.

2024-11-24 Thread Simon Wright
If s-trasym.adb (System.Traceback.Symbolic, used as a renaming by GNAT.Traceback.Symbolic) is given a traceback from a position-independent executable, it does not include the executable's load address in the report. This is necessary in order to decode the traceback report. Note, this has already

Re: [PATCH] [RFC] Add extra 64bit SSE vector epilogue in some cases

2024-11-24 Thread Hongtao Liu
On Fri, Nov 22, 2024 at 9:33 PM Richard Biener wrote: > > Similar to the X86_TUNE_AVX512_TWO_EPILOGUES tuning which enables > an extra 128bit SSE vector epilogue when doing 512bit AVX512 > vectorization in the main loop the following allows a 64bit SSE > vector epilogue to be generated when the pr