Re: [PATCH v1] Widening-Mul: Fix one ICE of gcall insertion for PHI match

2024-06-11 Thread Richard Biener
On Mon, Jun 10, 2024 at 4:49 PM wrote: > > From: Pan Li > > When enabled the PHI handling for COND_EXPR, we need to insert the gcall > to replace the PHI node. Unfortunately, I made a mistake that insert > the gcall to before the last stmt of the bb. See below gimple, the PHI > is located at

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Xi Ruoyao
On Sat, 2024-05-11 at 17:16 +0200, FX Coudert wrote: > * libgccjit.h: Include Per the C standard size_t should be provided by stddef.h. -- Xi Ruoyao School of Aerospace Science and Technology, Xidian University

Re: [PATCH v2] Target-independent store forwarding avoidance.

2024-06-11 Thread Richard Biener
On Mon, 10 Jun 2024, Jeff Law wrote: > > > On 6/10/24 1:55 AM, Manolis Tsamis wrote: > > >> > > There was an older submission of a load-pair specific pass but this is > > a complete reimplementation and indeed significantly more general. > > Apart from being target independent, it addresses a n

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Richard Biener
On Tue, 11 Jun 2024, FX Coudert wrote: > Hi > > I can’t seem to get a review of this one-line patch. Could a global reviewer > help? While stdio.h can be relied on to exist I do not think you can assume the same for sys/types.h without "configury", but libgccjit.h is an installed API. I would

Re: [PATCH] Add SLP_TREE_MEMORY_ACCESS_TYPE

2024-06-11 Thread Richard Sandiford
Richard Biener writes: > It turns out target costing code looks at STMT_VINFO_MEMORY_ACCESS_TYPE > to identify operations from (emulated) gathers for example. This > doesn't work for SLP loads since we do not set STMT_VINFO_MEMORY_ACCESS_TYPE > there as the vectorization strategy might differ be

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Jakub Jelinek
On Tue, Jun 11, 2024 at 09:27:37AM +0200, Richard Biener wrote: > On Tue, 11 Jun 2024, FX Coudert wrote: > > > Hi > > > > I can’t seem to get a review of this one-line patch. Could a global > > reviewer help? > > While stdio.h can be relied on to exist I do not think you can assume > the same f

RE: [PATCH v1] Widening-Mul: Fix one ICE of gcall insertion for PHI match

2024-06-11 Thread Li, Pan2
Thanks Richard for comments. > This should use gsi_after_labels (bb); otherwise you'll ICE when there's a > label > in the BB. > Please fix the label issue though. Sure. > You also have to look out for a first stmt that returns twice since > you may not insert anything before that. I would su

[PATCH v1] RISC-V: Implement .SAT_SUB for unsigned vector int

2024-06-11 Thread pan2 . li
From: Pan Li As the middle support of .SAT_SUB committed, implement the unsigned vector int of .SAT_SUB for the riscv backend. Consider below example code: void __attribute__((noinline)) \ vec_sat_u_sub_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit
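The example in the snippet above is cut off, so the following is only a minimal sketch of what an unsigned saturating subtraction loop can look like, not the test case from the patch. The helper name `sat_u_sub_u8`, the fixed `uint8_t` element type, and the branchless mask form are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branchless unsigned saturating subtract: x - y, clamped at 0.
   The mask is all-ones when x >= y and zero otherwise.  */
static inline uint8_t
sat_u_sub_u8 (uint8_t x, uint8_t y)
{
  return (uint8_t) ((x - y) & (-(uint8_t) (x >= y)));
}

/* Element-wise saturating subtract over LIMIT elements (hypothetical
   stand-in for the macro-generated function in the snippet).  */
void
vec_sat_u_sub_u8 (uint8_t *out, const uint8_t *op_1, const uint8_t *op_2,
                  size_t limit)
{
  assert (out && op_1 && op_2);
  for (size_t i = 0; i < limit; i++)
    out[i] = sat_u_sub_u8 (op_1[i], op_2[i]);
}
```

A loop of this shape is the kind of input a vectorizer could map to a vector saturating-subtract instruction where the target provides one.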

Re: [PATCH v1] RISC-V: Implement .SAT_SUB for unsigned vector int

2024-06-11 Thread Robin Dapp
Hi Pan, in general LGTM. Would you mind adding the coremark-pro testcase which should be working now, and, was the original reason for doing this? I believe the following should do: extern int wsize; typedef unsigned short Posf; #define NIL 0 void foo (Posf *p) { register unsigned n, m; d

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Iain Sandoe
> On 11 Jun 2024, at 08:44, Jakub Jelinek wrote: > > On Tue, Jun 11, 2024 at 09:27:37AM +0200, Richard Biener wrote: >> On Tue, 11 Jun 2024, FX Coudert wrote: >> >>> Hi >>> >>> I can’t seem to get a review of this one-line patch. Could a global >>> reviewer help? >> >> While stdio.h can be

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Richard Biener
On Tue, 11 Jun 2024, Iain Sandoe wrote: > > > > On 11 Jun 2024, at 08:44, Jakub Jelinek wrote: > > > > On Tue, Jun 11, 2024 at 09:27:37AM +0200, Richard Biener wrote: > >> On Tue, 11 Jun 2024, FX Coudert wrote: > >> > >>> Hi > >>> > >>> I can’t seem to get a review of this one-line patch. Co

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Richard Biener
On Tue, 11 Jun 2024, Richard Biener wrote: > On Tue, 11 Jun 2024, Iain Sandoe wrote: > > > > > > > > On 11 Jun 2024, at 08:44, Jakub Jelinek wrote: > > > > > > On Tue, Jun 11, 2024 at 09:27:37AM +0200, Richard Biener wrote: > > >> On Tue, 11 Jun 2024, FX Coudert wrote: > > >> > > >>> Hi > >

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Iain Sandoe
> On 11 Jun 2024, at 09:06, Richard Biener wrote: > > On Tue, 11 Jun 2024, Richard Biener wrote: > >> On Tue, 11 Jun 2024, Iain Sandoe wrote: >> >>> >>> On 11 Jun 2024, at 08:44, Jakub Jelinek wrote: On Tue, Jun 11, 2024 at 09:27:37AM +0200, Richard Biener wrote: > On

RE: [PATCH v1] RISC-V: Implement .SAT_SUB for unsigned vector int

2024-06-11 Thread Li, Pan2
Thanks Robin. > in general LGTM. Would you mind adding the coremark-pro > testcase which should be working now, and, was the original > reason for doing this? Yes, of course. Unfortunately, the pattern from coremark-pro is not working for now because it is branch form that generate PHI node du

Re: [PATCH v1] RISC-V: Implement .SAT_SUB for unsigned vector int

2024-06-11 Thread Robin Dapp
Thanks, the patch is OK then. Regards Robin

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Andreas Schwab
On Jun 11 2024, Richard Biener wrote: >> Don't you also need to add >> >> appropriate #define _POSIX_C_SOURCE or #define _XOPEN_SOURCE before the >> include in case somebody builds with -std=c99? Such feature macros can only be defined before the very first include of a system header. > Oh, and t

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Jakub Jelinek
On Tue, Jun 11, 2024 at 10:06:49AM +0200, Richard Biener wrote: > > appropriate #define _POSIX_C_SOURCE or #define _XOPEN_SOURCE before the > > include in case somebody builds with -std=c99? > > Oh, and the manpage says that also defines ssize_t which > is a bit odd since we already include that ..

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread FX Coudert
> While stdio.h can be relied on to exist I do not think you can assume > the same for sys/types.h without "configury", but libgccjit.h is an > installed API. sys/types.h is already included unconditionally in gcc/system.h and gcc/tsystem.h. The latter says: /* All systems have this header. */ #

[PATCH] s390: Extend two/four element integer vectors

2024-06-11 Thread Stefan Schulze Frielinghaus
For the moment I deliberately left out one-element QHS vectors since it is unclear whether these are pathological cases or whether they are really used. If we ever get an extend for V1DI -> V1TI we should reconsider this. As a side-effect this fixes PR115261. gcc/ChangeLog: target/PR115

RE: [PATCH v1] RISC-V: Implement .SAT_SUB for unsigned vector int

2024-06-11 Thread Li, Pan2
Committed, thanks Robin. Pan -Original Message- From: Robin Dapp Sent: Tuesday, June 11, 2024 4:19 PM To: Li, Pan2 ; gcc-patches@gcc.gnu.org Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com Subject: Re: [PATCH v1] RISC-V: Implement .SAT_SUB for

[PATCH] s390: Extend two element float vector

2024-06-11 Thread Stefan Schulze Frielinghaus
This implements a V2SF -> V2DF extend. gcc/ChangeLog: * config/s390/vector.md (*vmrhf): New. (extendv2sfv2df2): New. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-extend-3.c: New test. --- Bootstrap and regtested on s390. Ok for mainline? gcc/config/s390/vect

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-11 Thread Andreas Schwab
On Jun 11 2024, Iain Sandoe wrote: > well, afaict, all the code is c++ and we are building with a std >= 11, so > that > presumes c99 support. The C standard does not define ssize_t at all, it is only part of POSIX. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1
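The two points raised in this thread (ssize_t comes from POSIX rather than ISO C, and feature-test macros must precede the first system header) can be sketched as follows. This is an illustration under assumptions, not the libgccjit.h change itself: `find_byte` is a hypothetical helper, and the `200809L` value is simply one common choice for `_POSIX_C_SOURCE`.

```c
/* Feature-test macros only take effect if they are defined before the
   first system header is included.  */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stddef.h>     /* size_t: provided by ISO C */
#include <sys/types.h>  /* ssize_t: provided by POSIX, not ISO C */

/* Hypothetical helper: index of the first occurrence of BYTE in BUF,
   or -1 if it is absent.  A signed size is what ssize_t is for.  */
ssize_t
find_byte (const unsigned char *buf, size_t len, unsigned char byte)
{
  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if (buf[i] == byte)
      return (ssize_t) i;
  return -1;
}
```

On a non-POSIX host `<sys/types.h>` may be missing or may not declare ssize_t, which is the portability concern the reviewers discuss.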

Re: [PATCH] s390: Extend two/four element integer vectors

2024-06-11 Thread Andreas Krebbel
On 6/11/24 10:24, Stefan Schulze Frielinghaus wrote: For the moment I deliberately left out one-element QHS vectors since it is unclear whether these are pathological cases or whether they are really used. If we ever get an extend for V1DI -> V1TI we should reconsider this. As a side-effect t

[PATCH V4 2/2] split complicate 64bit to constant pool under -m32 -mpowerpc64

2024-06-11 Thread Jiufu Guo
Hi, For "-m32 -mpowerpc64", it is also ok to use just one instruction (p?ld) to load a 64bit constant from memory. So, splitting the complicate 64bit constant to constant pool should also work under this case. Bootstrap and regtest pass on ppc64{,le}. Also no regression for "-m32 -mpowerpc64" va

[PATCH V4 1/2] split complicate 64bit constant to memory

2024-06-11 Thread Jiufu Guo
Hi, Sometimes, a complicated constant is built via 3(or more) instructions. Generally speaking, it would not be as fast as loading it from the constant pool (as the discussions in PR63281): "ld" is one instruction. If consider "address/toc" adjust, we may count it as 2 instructions. And "pld" ma

Re: [PATCH] s390: Extend two element float vector

2024-06-11 Thread Andreas Krebbel
On 6/11/24 10:26, Stefan Schulze Frielinghaus wrote: This implements a V2SF -> V2DF extend. gcc/ChangeLog: * config/s390/vector.md (*vmrhf): New. (extendv2sfv2df2): New. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-extend-3.c: New test. Since we already have

[PATCH][v2] tree-optimization/114107 - avoid peeling for gaps in more cases

2024-06-11 Thread Richard Biener
The following refactors the code to detect necessary peeling for gaps, in particular the PR103116 case when there is no gap but the group size is smaller than the vector size. The testcase in PR114107 shows we fail to SLP for (int i=0; i

[PATCH] tree-optimization/115385 - handle more gaps with peeling of a single iteration

2024-06-11 Thread Richard Biener
The following makes peeling of a single scalar iteration handle more gaps, including non-power-of-two cases. This can be done by rounding up the remaining access to the next power-of-two which ensures that the next scalar iteration will pick at least the number of excess elements we access. I've
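The "round up to the next power-of-two" step mentioned above is plain integer arithmetic. A minimal sketch of that arithmetic alone follows; it is not the vectorizer code from the patch, the function name `round_up_pow2` is hypothetical, and a 32-bit `unsigned` is assumed.

```c
#include <assert.h>

/* Smallest power of two that is >= n, for 1 <= n <= 2^31
   (assumes a 32-bit unsigned).  */
unsigned
round_up_pow2 (unsigned n)
{
  assert (n >= 1 && n <= 0x80000000u);
  unsigned p = 1;
  while (p < n)
    p <<= 1;
  return p;
}
```

For example, a remaining access of 3 elements would be rounded up to 4, and one of 5 elements to 8.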

[PATCH] Improve code generation of strided SLP loads

2024-06-11 Thread Richard Biener
This avoids falling back to elementwise accesses for strided SLP loads when the group size is not a multiple of the vector element size. Instead we can use a smaller vector or integer type for the load. For stores we can do the same though restrictions on stores we handle and the fact that store-

Re: [PATCH] s390: Extend two element float vector

2024-06-11 Thread Stefan Schulze Frielinghaus
On Tue, Jun 11, 2024 at 10:42:26AM +0200, Andreas Krebbel wrote: > On 6/11/24 10:26, Stefan Schulze Frielinghaus wrote: > > This implements a V2SF -> V2DF extend. > > > > gcc/ChangeLog: > > > > * config/s390/vector.md (*vmrhf): New. > > (extendv2sfv2df2): New. > > > > gcc/testsuite/Chang

Re: [PATCH] rust: Do not link with libdl and libpthread unconditionally

2024-06-11 Thread Arthur Cohen
Thanks Richi! Tested again and pushed on trunk. Best, Arthur On 5/31/24 15:02, Richard Biener wrote: On Fri, May 31, 2024 at 12:24 PM Arthur Cohen wrote: Hi Richard, On 4/30/24 09:55, Richard Biener wrote: On Fri, Apr 19, 2024 at 11:49 AM Arthur Cohen wrote: Hi everyone, This patch c

Re: [PATCH v1] Widening-Mul: Fix one ICE of gcall insertion for PHI match

2024-06-11 Thread Richard Biener
On Tue, Jun 11, 2024 at 9:45 AM Li, Pan2 wrote: > > Thanks Richard for comments. > > > This should use gsi_after_labels (bb); otherwise you'll ICE when there's a > > label > > in the BB. > > Please fix the label issue though. > > Sure. > > > You also have to look out for a first stmt that returns

[PATCH v2] middle-end: Drop __builtin_prefetch calls in autovectorization [PR114061]

2024-06-11 Thread Victor Do Nascimento
At present the autovectorizer fails to vectorize simple loops involving calls to `__builtin_prefetch'. A simple example of such loop is given below: void foo(double * restrict a, double * restrict b, int n){ int i; for(i=0; i *references) clobbers_memory = true; break;
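The loop in the snippet above is garbled in this archive view, so here is a hedged reconstruction of the general shape only: a simple loop whose body contains a `__builtin_prefetch` call. The loop body and the prefetch distance of 8 are assumptions, not text recovered from the patch.

```c
#include <assert.h>

/* Adds b[i] into a[i] for i in [0, n), prefetching ahead in b.
   Assumption for this sketch: b has at least n + 8 elements, so the
   address expression &b[i + 8] stays within the array.  */
void
foo (double *restrict a, double *restrict b, int n)
{
  assert (a && b);
  for (int i = 0; i < n; i++)
    {
      a[i] = a[i] + b[i];
      /* A hint only; it does not change the values computed.  */
      __builtin_prefetch (&b[i + 8]);
    }
}
```

Because the prefetch has no effect on program results, dropping it during vectorization (as the subject line describes) leaves the computed values unchanged.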

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 10/06/24 3:58 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> On 10/06/24 3:20 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: On 10/06/24 2:52 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> On 10/06/24 2:12 pm, Richard Sandiford wrote: >>

Re: [PATCHv2 2/2] libiberty/buildargv: handle input consisting of only white space

2024-06-11 Thread Andrew Burgess
Jeff Law writes: > On 2/10/24 10:26 AM, Andrew Burgess wrote: >> GDB makes use of the libiberty function buildargv for splitting the >> inferior (program being debugged) argument string in the case where >> the inferior is not being started under a shell. >> >> I have recently been working to im

Re: [PATCHv2 2/2] libiberty/buildargv: handle input consisting of only white space

2024-06-11 Thread Andrew Burgess
Jeff Law writes: > On 2/10/24 10:26 AM, Andrew Burgess wrote: >> GDB makes use of the libiberty function buildargv for splitting the >> inferior (program being debugged) argument string in the case where >> the inferior is not being started under a shell. >> >> I have recently been working to im

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: > After LRA reload: > > (insn 9299 2472 2412 187 (set (reg:V2DF 51 19 [orig:240 vect__302.545 ] > [240]) > (mem:V2DF (plus:DI (reg:DI 8 8 [orig:1285 ivtmp.886 ] [1285]) > (const_int 16 [0x10])) [1 MEM > [(real(kind=8) *)_4

[PATCH] middle-end/115426 - wrong gimplification of "rm" asm output operand

2024-06-11 Thread Richard Biener
When the operand is gimplified to an extract of a register or a register we have to disallow memory as we otherwise fail to gimplify it properly. Instead of __asm__("" : "=rm" __imag ); we want __asm__("" : "=rm" D.2772); _1 = REALPART_EXPR ; r = COMPLEX_EXPR <_1, D.2772>; otherwise SS

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 4:36 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> After LRA reload: >> >> (insn 9299 2472 2412 187 (set (reg:V2DF 51 19 [orig:240 vect__302.545 ] >> [240]) >> (mem:V2DF (plus:DI (reg:DI 8 8 [orig:1285 ivtmp.886 ] [1285]) >>

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: > Hello Richard: > On 11/06/24 4:56 pm, Ajit Agarwal wrote: >> Hello Richard: >> >> On 11/06/24 4:36 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: After LRA reload: (insn 9299 2472 2412 187 (set (reg:V2DF 51 19 [orig:240 vect__302.545

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 5:15 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Richard: >> On 11/06/24 4:56 pm, Ajit Agarwal wrote: >>> Hello Richard: >>> >>> On 11/06/24 4:36 pm, Richard Sandiford wrote: Ajit Agarwal writes: > After LRA reload: > > (in

[committed] libstdc++: Add test for chrono::leap_seconds ostream insertion

2024-06-11 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8 -- Also add a comment to the three-way comparison operator for chrono::leap_seconds, noting the deviation from the spec (which is functionally equivalent). What we implement is the originally proposed resolution to LWG 3383, which should compile slightl

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 4:56 pm, Ajit Agarwal wrote: > Hello Richard: > > On 11/06/24 4:36 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: >>> After LRA reload: >>> >>> (insn 9299 2472 2412 187 (set (reg:V2DF 51 19 [orig:240 vect__302.545 ] >>> [240]) >>> (me

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 4:56 pm, Ajit Agarwal wrote: > Hello Richard: > > On 11/06/24 4:36 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: >>> After LRA reload: >>> >>> (insn 9299 2472 2412 187 (set (reg:V2DF 51 19 [orig:240 vect__302.545 ] >>> [240]) >>> (me

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: > Hello Richard: > > On 11/06/24 5:15 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: >>> Hello Richard: >>> On 11/06/24 4:56 pm, Ajit Agarwal wrote: Hello Richard: On 11/06/24 4:36 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> After

[PATCH v2 0/4] Libatomic: Cleanup ifunc selector and aliasing

2024-06-11 Thread Victor Do Nascimento
Changes in V2: As explained in patch v2 1/4, it has become clear that the current approach of querying assembler support for newer architectural extensions at compile time is undesirable both from a maintainability as well as a consistency standpoint - Different compiled versions of Libatomic may

[PATCH v2 1/4] Libatomic: AArch64: Convert all lse128 assembly to .insn directives

2024-06-11 Thread Victor Do Nascimento
Given the lack of support for the LSE128 instructions in all but the most up-to-date version of Binutils (2.42), having the build-time test for assembler support for these instructions often leads to the building of Libatomic without support for LSE128-dependent atomic function implementations.

[PATCH v2 3/4] Libatomic: Make ifunc selector behavior contingent on importing file

2024-06-11 Thread Victor Do Nascimento
By querying previously-defined file-identifier macros, `host-config.h' is able to get information about its environment and, based on this information, select more appropriate function-specific ifunc selectors. This reduces the number of unnecessary feature tests that need to be carried out in ord

[PATCH v2 4/4] Libatomic: Clean up AArch64 `atomic_16.S' implementation file

2024-06-11 Thread Victor Do Nascimento
At present, `atomic_16.S' groups different implementations of the same functions together in the file. Therefore, as an example, the LSE2 implementation of `load_16' follows on immediately from its core implementation, as does the `store_16' LSE2 implementation. Such architectural extension-depen

[PATCH v2 2/4] Libatomic: Define per-file identifier macros

2024-06-11 Thread Victor Do Nascimento
In order to facilitate the fine-tuning of how `libatomic_i.h' and `host-config.h' headers are used by different atomic functions, we define distinct identifier macros for each file which, in implementing atomic operations, imports these headers. The idea is that different parts of these headers co

Re: [COMMITTED] tree-optimization/115221 - Do not invoke SCEV if it will use a different range query.

2024-06-11 Thread Andrew MacLeod
On 5/29/24 03:19, Richard Biener wrote: On Tue, May 28, 2024 at 8:57 PM Andrew MacLeod wrote: The original patch causing the PR made ranger's cache re-entrant to enable SCEV to use the current range_query when called from within ranger.. SCEV uses the currently active range query (via get_ran

Re: [PATCH] gimple ssa: Teach switch conversion to optimize powers of 2 switches

2024-06-11 Thread Richard Biener
On Thu, 30 May 2024, Filip Kastl wrote: > Hi, > > This patch adds a transformation into the switch conversion pass -- > the "exponential index transform". This transformation can help switch > conversion convert switches it otherwise could not. The transformation is > intended for switches whos
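The description above is truncated, so the following is only a hand-written illustration of the idea suggested by the subject line: a switch whose case labels are powers of two can be turned into a dense table lookup by taking the base-2 logarithm of the index. The function names and table values are hypothetical, and this is not the code the pass generates.

```c
#include <assert.h>

/* Original form: the case labels are powers of two.  */
int
lookup_switch (unsigned x)
{
  switch (x)
    {
    case 1: return 10;
    case 2: return 20;
    case 4: return 30;
    case 8: return 40;
    default: return -1;
    }
}

/* Hand-written equivalent: check for exactly one set bit, take log2
   via count-trailing-zeros, then index a dense table.  */
int
lookup_table (unsigned x)
{
  static const int table[] = { 10, 20, 30, 40 };
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  unsigned idx = (unsigned) __builtin_ctz (x);
  return idx < 4 ? table[idx] : -1;
}

/* Cross-check the two forms over a small range.  */
void
check_equivalence (void)
{
  for (unsigned x = 0; x < 64; x++)
    assert (lookup_switch (x) == lookup_table (x));
}
```

After the logarithm the case values become 0, 1, 2, 3, which is the dense form that a table-based switch conversion can handle.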

[Patch, Fortran] 0/3 (PR90076) Setting _vptr correctly.

2024-06-11 Thread Andre Vehreschild
Hi GFortraneers, I like to present a small series of patches. While working on PR90076 and figuring how to best set the _vptr of class types, I discovered several ways of doing this in slightly different ways which are more or less complete (mostly rather less). I therefore decided to fix not only

[Patch, Fortran] 2/3 Refactor locations where _vptr is (re)set.

2024-06-11 Thread Andre Vehreschild
Hi all, this patch refactors most of the locations where the _vptr of a class data type is reset. The code was inconsistent in most of the locations. The goal of using only one routine for setting the _vptr is to be able to later modify it more easily. The ultimate goal being that every time one

[Patch, Fortran, 90076] 1/3 Fix Polymorphic Allocate on Assignment Memory Leak

2024-06-11 Thread Andre Vehreschild
Hi all, the attached patch fixes the last case in the bug report. The initial example code is already fixed by the combination of PR90068 and PR90072. The issue was the _vptr was not (re)set correctly, like in the __vtab_...-structure was not created. This made the compiler ICE. Regtests fine on x8

[Patch, Fortran] 3/3 RFC: Introduce gfc_class_set_vptr.

2024-06-11 Thread Andre Vehreschild
Hi all, although this mail has a patch attached, it is rather a request for comment. The attached patch introduces `gfc_class_set_vptr()` for consistently assigning the _vptr of a class data type. I figured that gfortran does these assignments in various locations and does them differently everywh

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 6:12 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Richard: >> >> On 11/06/24 5:15 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: Hello Richard: On 11/06/24 4:56 pm, Ajit Agarwal wrote: > Hello Richard: > > On 11/06/24 4:36 p

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: > Hello Richard: > On 11/06/24 6:12 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: >>> Hello Richard: >>> >>> On 11/06/24 5:15 pm, Richard Sandiford wrote: Ajit Agarwal writes: > Hello Richard: > On 11/06/24 4:56 pm, Ajit Agarwal wrote: >> Hello Richar

Re: [PATCH v2] Target-independent store forwarding avoidance.

2024-06-11 Thread Jeff Law
On 6/11/24 1:22 AM, Richard Biener wrote: Absolutely. But forwarding from a smaller store to a wider load is painful from a hardware standpoint and if we can avoid it from a codegen standpoint, we should. Note there's also the possibility to increase the distance between the store and the

Re: [PATCH v3 1/2] Factor out static_assert constexpr string extraction for reuse

2024-06-11 Thread Jason Merrill
On 6/5/24 00:45, Andi Kleen wrote: The only semantics changes are slightly more vague error messages to generalize. Just a few nits: +/* Extracting strings from constexpr. */ + +class cexpr_str +{ +public: + cexpr_str (tree message) : message(message) {} Space before paren. ... +/* Get

RE: [PATCH v1] Widening-Mul: Fix one ICE of gcall insertion for PHI match

2024-06-11 Thread Li, Pan2
Got it. Thanks Richard. Pan -Original Message- From: Richard Biener Sent: Tuesday, June 11, 2024 5:31 PM To: Li, Pan2 Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE of gcall insertion for PHI match On Tue, Jun

Re: [PATCH v1] Test: Move target independent test cases to gcc.dg/torture

2024-06-11 Thread Jeff Law
On 6/11/24 12:19 AM, pan2...@intel.com wrote: From: Pan Li The test cases of pr115387 are target independent, at least x86 and riscv are able to reproduce. Thus, move these cases to the gcc.dg/torture. The below test suites are passed. 1. The rv64gcv fully regression test. 2. The x86 ful

Re: [PATCH v1] Test: Move target independent test cases to gcc.dg/torture

2024-06-11 Thread Andrew Pinski
On Mon, Jun 10, 2024, 11:20 PM wrote: > From: Pan Li > > The test cases of pr115387 are target independent, at least x86 > and riscv are able to reproduce. Thus, move these cases to > the gcc.dg/torture. > > The below test suites are passed. > 1. The rv64gcv fully regression test. > 2. The x8

Re: [PATCH v2] Target-independent store forwarding avoidance.

2024-06-11 Thread Philipp Tomsich
On Tue, 11 Jun 2024 at 15:37, Jeff Law wrote: > > > > On 6/11/24 1:22 AM, Richard Biener wrote: > > >> Absolutely. But forwarding from a smaller store to a wider load is > >> painful > >> from a hardware standpoint and if we can avoid it from a codegen > >> standpoint, > >> we should. > > > >

[PATCH v1] Widening-Mul: Take gsi after_labels instead of start_bb for gcall insertion

2024-06-11 Thread pan2 . li
From: Pan Li We inserted the gcall of .SAT_ADD before the gsi_start_bb for avoiding the ssa def after use ICE issue. Unfortunately, there will be the potential ICE when the first stmt is label. We cannot insert the gcall before the label. Thus, we take gsi_after_labels to locate the 'really'

RE: [PATCH v1] Test: Move target independent test cases to gcc.dg/torture

2024-06-11 Thread Li, Pan2
> Since you are moving it to torture, please remove -O3 as it is already > supplied there as one of the torture options. Oh, I see. Thanks for comments, and will update it in v2. Pan From: Andrew Pinski Sent: Tuesday, June 11, 2024 9:45 PM To: Li, Pan2 Cc: GCC Patches ; 钟居哲 ; Kito Cheng ; Ri

Re: [PATCH v3 1/2] arm: Zero/Sign extends for CMSE security on Armv8-M.baseline [PR115253]

2024-06-11 Thread Richard Earnshaw (lists)
On 10/06/2024 15:04, Torbjörn SVENSSON wrote: > Properly handle zero and sign extension for Armv8-M.baseline as > Cortex-M23 can have the security extension active. > Currently, there is an internal compiler error on Cortex-M23 for the > epilog processing of sign extension. > > This patch addresse

Re: [PATCH v3 2/2] testsuite: Fix expand-return CMSE test for Armv8.1-M [PR115253]

2024-06-11 Thread Richard Earnshaw (lists)
On 10/06/2024 15:04, Torbjörn SVENSSON wrote: > For Armv8.1-M, the clearing of the registers is handled differently than > for Armv8-M, so update the test case accordingly. > > gcc/testsuite/ChangeLog: > > PR target/115253 > * gcc.target/arm/cmse/extend-return.c: Update test case >

[Patch, Fortran, 96418] Fix Test coarray_alloc_comp_4.f08 ICEs

2024-06-11 Thread Andre Vehreschild
Hi all, attached patch has already been present in 2020, but lost my attention. It fixes an ICE in the testsuite. The old mails description is: attached patch fixes PR96418 where the code in the testsuite when compiled with -fcoarray=single lead to an ICE. The reason was that the coarray object

Re: [PATCH v2] Target-independent store forwarding avoidance.

2024-06-11 Thread Jeff Law
On 6/11/24 7:52 AM, Philipp Tomsich wrote: On Tue, 11 Jun 2024 at 15:37, Jeff Law wrote: On 6/11/24 1:22 AM, Richard Biener wrote: Absolutely. But forwarding from a smaller store to a wider load is painful from a hardware standpoint and if we can avoid it from a codegen standpoint, we

[PATCH v2] fix PowerPC < 7 w/ Altivec not to default to power7

2024-06-11 Thread Rene Rebe
Hi Kewen, v2 with test case - I hope I worked all your nits in: Glibc uses .machine to determine assembler optimizations to use. However, since reworking the rs6000 .machine output selection in commit e154242724b084380e3221df7c08fcdbd8460674 22 May 2019, G5 as well as Cell, and even power4 w/ -ma

Re: [PATCH v3 1/2] arm: Zero/Sign extends for CMSE security on Armv8-M.baseline [PR115253]

2024-06-11 Thread Andre Vieira (lists)
On 11/06/2024 14:59, Richard Earnshaw (lists) wrote: You effectively have an 'else if' split across a comment here, and the indentation looks weird. Either write 'else if' on one line (and re-indent accordingly) or put this entire block inside braces. Apologies here, Torbjorn had this as

[PATCH V2] Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P

2024-06-11 Thread liuhongt
> > I think if you only handle CONST_INT_P, you should check just for that, and > in both places where you check for CONST_VECTOR_DUPLICATE_P (there is one > spot 2 lines above this). > So add > && CONST_INT_P (XVECEXP (XEXP (op0, 1), 0, 0)) > and > && CONST_INT_P (XVECEXP (op1, 0, 0)) > tests righ

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 7:07 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Richard: >> On 11/06/24 6:12 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: Hello Richard: On 11/06/24 5:15 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Ric

Re: [PATCH] [testsuite] [arm] test board cflags in multilib.exp

2024-06-11 Thread Richard Earnshaw (lists)
On 07/06/2024 05:47, Alexandre Oliva wrote: > > multilib.exp tests for multilib-altering flags in a board's > multilib_flags and skips the test, but if such flags appear in the > board's cflags, with the same distorting effects on tested multilibs, > we fail to skip the test. > > Extend the skipp

Re: [PATCH V2] Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P

2024-06-11 Thread Jakub Jelinek
On Tue, Jun 11, 2024 at 10:40:01PM +0800, liuhongt wrote: > gcc/ChangeLog: > > PR target/115384 > * simplify-rtx.cc (simplify_context::simplify_binary_operation_1): > Only do the simplification of (AND (ASHIFTRT A imm) mask) > to (LSHIFTRT A imm) when the component of const
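As background for the simplification named in the ChangeLog entry, the scalar identity behind it can be sketched in C: masking an arithmetic right shift so that the sign-extended high bits are cleared gives the same value as a logical right shift. This sketch assumes 32-bit operands and that `>>` on a negative signed value is an arithmetic shift (implementation-defined in ISO C, and the behavior GCC documents); the exact mask condition used by the patch is not visible in the truncated snippet.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic shift followed by a mask that keeps the low 32 - imm bits.  */
uint32_t
ashr_then_mask (int32_t a, unsigned imm)
{
  assert (imm < 32);
  uint32_t mask = UINT32_MAX >> imm;
  return (uint32_t) (a >> imm) & mask;
}

/* Logical shift of the same bit pattern.  */
uint32_t
lshr (int32_t a, unsigned imm)
{
  assert (imm < 32);
  return (uint32_t) a >> imm;
}
```

For instance, with a = -8 and imm = 1 both functions yield 0x7FFFFFFC.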

[PATCH v2] Test: Move target independent test cases to gcc.dg/torture

2024-06-11 Thread pan2 . li
From: Pan Li The test cases of pr115387 are target independent, at least x86 and riscv are able to reproduce. Thus, move these cases to the gcc.dg/torture. The below test suites are passed. 1. The rv64gcv fully regression test. 2. The x86 fully regression test. gcc/testsuite/ChangeLog:

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-11 Thread Robin Dapp
The attached v3 tracks the use of cond_earliest as you suggested and adds its cost in default_noce_conversion_profitable_p. Bootstrapped and regtested on x86 and p10, aarch64 still running. Regtested on riscv64. Regards Robin Before noce_find_if_block processes a block it sets up an if_info st

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: > On 11/06/24 7:07 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: >>> Hello Richard: >>> On 11/06/24 6:12 pm, Richard Sandiford wrote: Ajit Agarwal writes: > Hello Richard: > > On 11/06/24 5:15 pm, Richard Sandiford wrote: >> Ajit Agarwal writes:

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-11 Thread Richard Sandiford
Robin Dapp writes: > The attached v3 tracks the use of cond_earliest as you suggested > and adds its cost in default_noce_conversion_profitable_p. > > Bootstrapped and regtested on x86 and p10, aarch64 still > running. Regtested on riscv64. > > Regards > Robin > > Before noce_find_if_block proce

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 8:59 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> On 11/06/24 7:07 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: Hello Richard: On 11/06/24 6:12 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Richard: >> >> On

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: >>> Thanks a lot. Can I know what should we be doing with neg (fma) >>> correctness failures with load fusion. >> >> I think it would involve: >> >> - describing lxvp and stxvp as unspec patterns, as I mentioned >> in the previous reply >> >> - making plain movoo split lo

[PATCH v2] Arm: Fix disassembly error in Thumb-1 relaxed load/store [PR115188]

2024-06-11 Thread Wilco Dijkstra
Hi Christophe, > PR target/115153 I guess this is a typo (should be 115188)? Correct. > +/* { dg-options "-O2 -mthumb" } */ -mthumb is included in arm_arch_v6m, so I > think you don't need to add it here? Indeed, it's not strictly necessary. Fixed in v2: A Thumb-1 memory operand allows

[PATCH v2] Arm: Fix ldrd offset range [PR115153]

2024-06-11 Thread Wilco Dijkstra
v2: use a new arm_arch_v7ve_neon, fix use of DImode in output_move_neon The valid offset range of LDRD in arm_legitimate_index_p is increased to -1024..1020 if NEON is enabled since VALID_NEON_DREG_MODE includes DImode. Fix this by moving the LDRD check earlier. Passes bootstrap & regress, OK for

Re: [RFC 1/2] libbacktrace: add FDPIC support

2024-06-11 Thread Max Filippov
On Sun, May 26, 2024 at 11:50 PM Max Filippov wrote: > > Instead of a single base address FDPIC ELF files use load map: a > structure with an array of mappings for individual segments. Change > libbacktrace functions and structures to support that. Ping? > libbacktrace/ > > PR libbacktr

Re: [PATCH V4 1/2] split complicate 64bit constant to memory

2024-06-11 Thread Segher Boessenkool
Hi! On Tue, Jun 11, 2024 at 04:37:25PM +0800, Jiufu Guo wrote: > Sometimes, a complicated constant is built via 3 (or more) > instructions. Generally speaking, it would not be as fast > as loading it from the constant pool (as the discussions in > PR63281): > "ld" is one instruction. If consider

[committed] i386: Use CMOV in .SAT_{ADD|SUB} expansion for TARGET_CMOV [PR112600]

2024-06-11 Thread Uros Bizjak
For TARGET_CMOV targets emit insn sequence involving conditional move. .SAT_ADD: addl %esi, %edi movl $-1, %eax cmovnc %edi, %eax ret .SAT_SUB: subl %esi, %edi movl $0, %eax cmovnc %edi, %eax ret PR target/112600 gc
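
For readers unfamiliar with the operations, the .SAT_ADD and .SAT_SUB sequences quoted above compute unsigned saturating arithmetic: on carry the sum becomes all ones, and on borrow the difference becomes zero. A minimal C sketch of the same semantics follows. The function names sat_add_u32 and sat_sub_u32 are hypothetical, and whether GCC matches these exact source forms to .SAT_ADD/.SAT_SUB is an assumption, not something the message states.

```c
#include <stdint.h>

/* Unsigned saturating add: if the addition wraps (carry set),
   return the maximum value, mirroring "movl $-1 / cmovnc". */
uint32_t sat_add_u32(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;               /* wraps modulo 2^32 */
    return sum < x ? UINT32_MAX : sum;  /* sum < x means a carry occurred */
}

/* Unsigned saturating subtract: if the subtraction would borrow,
   return zero, mirroring "movl $0 / cmovnc". */
uint32_t sat_sub_u32(uint32_t x, uint32_t y)
{
    return x < y ? 0u : x - y;
}
```

With these definitions, adding 1 to UINT32_MAX stays at UINT32_MAX, and subtracting 2 from 1 gives 0 instead of wrapping.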

RE: [PATCH] aarch64: Add vector floating point trunc pattern

2024-06-11 Thread Pengxuan Zheng (QUIC)
> Pengxuan Zheng writes: > > This patch is a follow-up of r15-1079-g230d62a2cdd16c to add vector > > floating point trunc pattern for V2DF->V2SF and V4SF->V4HF conversions > > by renaming the existing > > aarch64_float_truncate_lo_ pattern to the standard > > optab one, i.e., trunc2. This allows t

Ping [PATCH] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-11 Thread Pengxuan Zheng (QUIC)
Ping https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650311.html > -Original Message- > From: Pengxuan Zheng (QUIC) > Sent: Tuesday, April 30, 2024 5:32 PM > To: gcc-patches@gcc.gnu.org > Cc: Andrew Pinski (QUIC) ; Pengxuan Zheng > (QUIC) > Subject: [PATCH] aarch64: Add vector popcoun

[Committed 1/3] RISC-V: Add basic Zaamo and Zalrsc support

2024-06-11 Thread Patrick O'Neill
From: Edwin Lu There is a proposal to split the A extension into two parts: Zaamo and Zalrsc. This patch adds basic support by making the A extension imply Zaamo and Zalrsc. Proposal: https://github.com/riscv/riscv-zaamo-zalrsc/tags gcc/ChangeLog: * common/config/riscv/riscv-common.cc:

[Committed 3/3] RISC-V: Add Zalrsc amo-op patterns

2024-06-11 Thread Patrick O'Neill
All amo patterns can be represented with lrsc sequences. Add these patterns as a fallback when Zaamo is not enabled. gcc/ChangeLog: * config/riscv/sync.md (atomic_): New expand pattern. (amo_atomic_): Rename amo pattern. (atomic_fetch_): New lrsc sequence pattern.
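
The claim above, that any atomic memory operation can be expressed as a load-reserved/store-conditional retry sequence, can be illustrated loosely in portable C with a compare-and-swap loop. This is only an analogy under stated assumptions: it uses the GCC/Clang __atomic builtins rather than the sync.md patterns from the patch, and fetch_add_retry is a hypothetical name.

```c
#include <stdint.h>

/* Atomic fetch-and-add built from a retry loop instead of a single
   AMO instruction. Returns the value stored at *addr before the add. */
uint32_t fetch_add_retry(uint32_t *addr, uint32_t value)
{
    /* Read the current value (comparable to the load-reserved step). */
    uint32_t old = __atomic_load_n(addr, __ATOMIC_RELAXED);

    /* Try to publish old + value (comparable to the store-conditional
       step). On failure the builtin refreshes 'old' with the value
       currently in memory, and the loop retries. */
    while (!__atomic_compare_exchange_n(addr, &old, old + value,
                                        1 /* weak: may fail spuriously */,
                                        __ATOMIC_SEQ_CST,
                                        __ATOMIC_RELAXED)) {
        /* nothing to do here: 'old' has already been reloaded */
    }
    return old;
}
```

The same loop shape works for other read-modify-write operations by replacing `old + value` with the desired operation.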

[Committed 2/3] RISC-V: Add Zalrsc and Zaamo testsuite support

2024-06-11 Thread Patrick O'Neill
Convert testsuite infrastructure to use Zalrsc and Zaamo rather than A. gcc/ChangeLog: * doc/sourcebuild.texi: Add docs for atomic extension testsuite infra. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo-table-a-6-amo-add-1.c: Use Zaamo rather than A. * gcc.target/risc

[Committed] RISC-V: Add basic Zaamo and Zalrsc support

2024-06-11 Thread Patrick O'Neill
On 6/10/24 21:33, Jeff Law wrote: On 6/10/24 3:46 PM, Patrick O'Neill wrote: The A extension has been split into two parts: Zaamo and Zalrsc. This patch adds basic support by making the A extension imply Zaamo and Zalrsc. Zaamo/Zalrsc spec: https://github.com/riscv/riscv-zaamo-zalrsc/tags R

Re: [PATCH] PHIOPT: Don't transform minmax if middle bb contains a phi [PR115143]

2024-06-11 Thread Andrew Pinski
On Mon, May 20, 2024 at 11:08 PM Richard Biener wrote: > > On Mon, May 20, 2024 at 11:37 PM Andrew Pinski (QUIC) > wrote: > > > > > -Original Message- > > > From: Richard Biener > > > Sent: Sunday, May 19, 2024 11:55 AM > > > To: Andrew Pinski (QUIC) > > > Cc: gcc-patches@gcc.gnu.org >

Re: [PATCH v3 0/3] RISC-V: Add basic Zaamo and Zalrsc support

2024-06-11 Thread Patrick O'Neill
On 6/10/24 21:32, Jeff Law wrote: On 6/10/24 6:15 PM, Andrea Parri wrote: On Mon, Jun 10, 2024 at 02:46:54PM -0700, Patrick O'Neill wrote: The A extension has been split into two parts: Zaamo and Zalrsc. This patch adds basic support by making the A extension imply Zaamo and Zalrsc. Zaamo/

[PATCH 0/3] RISC-V: Amo testsuite cleanup

2024-06-11 Thread Patrick O'Neill
This series moves the atomic-related riscv testcases into their own folder and fixes some minor bugs/rigidity of existing testcases. Patrick O'Neill (3): RISC-V: Move amo tests into subfolder RISC-V: Fix amoadd call arguments RISC-V: Allow any temp register to be used in amo tests .../risc

[PATCH 2/3] RISC-V: Fix amoadd call arguments

2024-06-11 Thread Patrick O'Neill
Update __atomic_add_fetch arguments to be a pointer and value rather than two pointers. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: Update __atomic_add_fetch args. * gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c: Ditto. * gcc.target/
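
As a reminder of the calling convention this fix restores, the GCC builtin __atomic_add_fetch takes a pointer to the object, the value to add, and a memory-order constant. A minimal sketch follows; the wrapper name add_fetch_relaxed is hypothetical and is not taken from the testcases.

```c
#include <stdint.h>

/* Atomically add 'value' to '*counter' and return the new value.
   The first argument is a pointer and the second is a plain value. */
int32_t add_fetch_relaxed(int32_t *counter, int32_t value)
{
    return __atomic_add_fetch(counter, value, __ATOMIC_RELAXED);
}
```

Passing a pointer as the second argument would add an address-derived integer instead of the intended operand, which is the kind of mistake the patch description refers to.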

[PATCH 1/3] RISC-V: Move amo tests into subfolder

2024-06-11 Thread Patrick O'Neill
There's a large number of atomic related testcases in the riscv folder. Move them into a subfolder similar to what was done for rvv testcases. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo-table-a-6-amo-add-1.c: Move to... * gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: ...her

[PATCH 3/3] RISC-V: Allow any temp register to be used in amo tests

2024-06-11 Thread Patrick O'Neill
We artificially restrict the temp registers to be a[0-9]+ when other registers like t[0-9]+ are valid too. Update to make the regex accept any register for the temp value. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo/amo-table-a-6-load-1.c: Update temp register regex. * gcc.tar

Re: [PATCH v2 2/3] RISC-V: Add Zalrsc and Zaamo testsuite support

2024-06-11 Thread Patrick O'Neill
On 6/10/24 09:39, Patrick O'Neill wrote: On 6/7/24 16:04, Jeff Law wrote: On 6/3/24 3:53 PM, Patrick O'Neill wrote: Convert testsuite infrastructure to use Zalrsc and Zaamo rather than A. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo-table-a-6-amo-add-1.c: Use Zaamo rather than A.
