Re: [PATCH 1/5] OpenMP, NVPTX: memcpy[23]D bias correction

2023-09-26 Thread Thomas Schwinge
Hi Julian! On 2023-09-06T02:34:30-0700, Julian Brown wrote: > This patch works around behaviour of the 2D and 3D memcpy operations in > the CUDA driver runtime. Particularly in Fortran, the "base pointer" > of an array (used for either source or destination of a host/device copy) > may lie outsi

Re: Re: [Committed] RISC-V: Fix mem-to-mem VLS move pattern[PR111566]

2023-09-26 Thread 钟居哲
Hi, Jeff. I removed the mem-to-mem patterns as you suggested, which means we don't have scalar move optimization for small size vector modes. Is it ok for trunk? Since it is a bug fix patch, I hope we can land it soon. We may find another way to optimize small size vector mode mem-to-mem. ju

Re: [PATCH 1/2] c++: remove NON_DEPENDENT_EXPR, part 1

2023-09-26 Thread Jason Merrill
On 9/25/23 16:43, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- This tree code dates all the way back to r69130[1] which implemented typing of non-dependent expressions. Its motivation was never clear (to me at least) since the do

RE: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-09-26 Thread Tamar Christina
Hi, I can't approve but hope you don't mind the review, > +/* Return true if this CODE describes a conditional (masked) > +internal_fn. */ > + > +bool > +cond_fn_p (code_helper code) > +{ > + if (!code.is_fn_code ()) > +return false; > + > + if (!internal_fn_p ((combined_fn) code)) > +

[PATCH]middle-end Fold vec_cond into conditional ternary or binary operation when sharing operand [PR109154]

2023-09-26 Thread Tamar Christina
Hi All, When we have a vector conditional on a masked target which is doing a selection on the result of a conditional operation where one of the operands of the conditional operation is the other operand of the select, then we can fold the vector conditional into the operation. Concretely this t

[PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 << signbit(x)) [PR109154]

2023-09-26 Thread Tamar Christina
Hi All, For targets that allow conversion between int and float modes this adds a new optimization transforming fneg (fabs (x)) into x | (1 << signbit(x)). Such sequences are common in scientific code working with gradients. The transformed instruction if the target has an inclusive-OR that take
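
The identity behind this transformation can be checked on scalar doubles. The following is only a minimal sketch, assuming IEEE 754 binary64 where the sign is bit 63; the helper names (bits_of, neg_abs_ref, neg_abs_or, neg_abs_forms_agree, self_check) are hypothetical and are not taken from the patch, which works on match.pd patterns rather than C source.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of x (assumes IEEE 754 binary64). */
static uint64_t bits_of(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Reference form: negate the absolute value. */
static double neg_abs_ref(double x)
{
    return -fabs(x);
}

/* Bitwise form: OR in the sign bit (bit 63), leaving all other bits alone. */
static double neg_abs_or(double x)
{
    uint64_t bits = bits_of(x) | (UINT64_C(1) << 63);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical helper: nonzero when both forms give the same bit pattern. */
static int neg_abs_forms_agree(double x)
{
    return bits_of(neg_abs_ref(x)) == bits_of(neg_abs_or(x));
}

/* Hypothetical self-check over a few ordinary values, including zero. */
static void self_check(void)
{
    assert(neg_abs_forms_agree(1.5));
    assert(neg_abs_forms_agree(-2.25));
    assert(neg_abs_forms_agree(0.0));
}
```

Comparing bit patterns rather than values matters here because -0.0 and +0.0 compare equal as doubles while differing in exactly the bit the transformation sets.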

[PATCH]AArch64 Add movi for 0 moves for scalar types [PR109154]

2023-09-26 Thread Tamar Christina
Hi All, Following the Neoverse N/V and Cortex-A optimization guides SIMD 0 immediates should be created with a movi of 0. At the moment we generate an `fmov .., xzr` which is slower and requires a GP -> FP transfer. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master?

[PATCH]AArch64: Use SVE unpredicated LOGICAL expressions when Advanced SIMD inefficient [PR109154]

2023-09-26 Thread Tamar Christina
Hi All, SVE has much bigger immediate encoding range for bitmasks than Advanced SIMD has and so on a system that is SVE capable if we need an Advanced SIMD Inclusive-OR by immediate and would require a reload then an unpredicated SVE ORR instead. This has both speed and size improvements. Bootst

[PATCH]AArch64 Add special patterns for creating DI scalar and vector constant 1 << 63 [PR109154]

2023-09-26 Thread Tamar Christina
Hi All, This adds a way to generate special sequences for creation of constants for which we don't have single instructions sequences which would have normally lead to a GP -> FP transfer or a literal load. The patch starts out by adding support for creating 1 << 63 using fneg (mov 0). Bootstrap
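
Why fneg (mov 0) produces 1 << 63 can be seen from the bit pattern of negative zero. This is a small sketch, assuming IEEE 754 binary64; the function name negated_zero_bits is hypothetical, and the code only illustrates the arithmetic fact, not the AArch64 patterns the patch adds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Negate +0.0 and return the resulting bit pattern.
   The volatile keeps the negation from being trivially folded away,
   so the value is produced by an actual floating-point negate. */
static uint64_t negated_zero_bits(void)
{
    volatile double zero = 0.0;
    double neg = -zero;          /* -0.0: only the sign bit is set */
    uint64_t bits;
    memcpy(&bits, &neg, sizeof bits);
    return bits;
}

/* Hypothetical self-check: the pattern equals 1 << 63. */
static void check_negated_zero(void)
{
    assert(negated_zero_bits() == (UINT64_C(1) << 63));
}
```

Since +0.0 is all zero bits and negation flips only the sign, the result is a value with bit 63 set and nothing else, which is the constant the snippet describes.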

[PATCH]AArch64 Rewrite simd move immediate patterns to new syntax

2023-09-26 Thread Tamar Christina
Hi All, This rewrites the simd MOV patterns to use the new compact syntax. No change in semantics is expected. This will be needed in follow on patches. This also merges the splits into the define_insn which will also be needed soon. Bootstrapped Regtested on aarch64-none-linux-gnu and no issue

Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 << signbit(x)) [PR109154]

2023-09-26 Thread Andrew Pinski
On Tue, Sep 26, 2023 at 5:51 PM Tamar Christina wrote: > > Hi All, > > For targets that allow conversion between int and float modes this adds a new > optimization transforming fneg (fabs (x)) into x | (1 << signbit(x)). Such > sequences are common in scientific code working with gradients. > > T

Re: [PATCH]AArch64 Rewrite simd move immediate patterns to new syntax

2023-09-26 Thread Ramana Radhakrishnan
On Wed, Sep 27, 2023 at 1:53 AM Tamar Christina wrote: > > Hi All, > > This rewrites the simd MOV patterns to use the new compact syntax. > No change in semantics is expected. This will be needed in follow on patches. > > This also merges the splits into the define_insn which will also be needed

Re: [PATCH]AArch64 Add movi for 0 moves for scalar types [PR109154]

2023-09-26 Thread Ramana Radhakrishnan
On Wed, Sep 27, 2023 at 1:51 AM Tamar Christina wrote: > > Hi All, > > Following the Neoverse N/V and Cortex-A optimization guides SIMD 0 immediates > should be created with a movi of 0. > > At the moment we generate an `fmov .., xzr` which is slower and requires a > GP -> FP transfer. > > Bootstr

RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 << signbit(x)) [PR109154]

2023-09-26 Thread Tamar Christina
> -Original Message- > From: Andrew Pinski > Sent: Wednesday, September 27, 2023 2:17 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; > j...@ventanamicro.com > Subject: Re: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 << > signbit(x)) [PR1

RE: [PATCH]AArch64 Rewrite simd move immediate patterns to new syntax

2023-09-26 Thread Tamar Christina
> -Original Message- > From: Ramana Radhakrishnan > Sent: Wednesday, September 27, 2023 2:28 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; Kyrylo Tkachov ; > Richard Sandiford > Subject: Re: [PATCH]AArch64 Rewrite simd move immedia

[PATCH] RISC-V: Bugfix for RTL check[PR111533]

2023-09-26 Thread Li Xu
From: xuli Consider the following situation: BB5: local_dem(RVV Insn 1, AVL(reg zero)) RVV Insn 1: vmv.s.x, AVL (const_int 1) RVV Insn 2: vredsum.vs, AVL(reg zero) vmv.s.x has vl operand, the following code will get avl (const_int) from RVV Insn 1. rtx avl = has_vl_op (insn->rtl ()) ? get_vl (insn

Re: [PATCH] RISC-V: Bugfix for RTL check[PR111533]

2023-09-26 Thread juzhe.zh...@rivai.ai
+ vid sequence. The elt (i) can be either const_int or + const_poly_int. */ + HOST_WIDE_INT diff = rtx_to_poly_int64 (builder.elt (i)).to_constant () - i; How about: poly_int64 diff = rtx_to_poly_int64 (builder.elt (i)) - i; rtx avl - = has_vl_op (insn->rtl ()) ? get_vl (insn

[PATCH v1] RISC-V: Support FP trunc auto-vectorization

2023-09-26 Thread pan2 . li
From: Pan Li This patch would like to support auto-vectorization for the trunc API in math.h. It depends on the -ffast-math option. When we would like to call trunc/truncf like v2 = trunc (v1), we will convert it into below insns (reference the implementation of llvm). * vfcvt.rtz.x.f v3, v1 *
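
The first listed instruction converts floating point to integer with round-toward-zero. The rest of the sequence is cut off in the snippet, so the following scalar sketch simply assumes the integer is converted back to floating point afterwards; the name trunc_via_int is hypothetical, and the code is valid only for inputs that fit in int64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Truncate by converting to a signed integer (C conversion rounds toward
   zero) and back. Assumption: x is finite and within the int64_t range;
   out-of-range inputs are undefined behavior in C. */
static double trunc_via_int(double x)
{
    int64_t as_int = (int64_t)x;   /* round toward zero */
    return (double)as_int;         /* back to floating point */
}

/* Hypothetical self-check on a few in-range values. */
static void check_trunc_via_int(void)
{
    assert(trunc_via_int(2.7) == 2.0);
    assert(trunc_via_int(-2.7) == -2.0);
    assert(trunc_via_int(5.0) == 5.0);
}
```

Note that this round trip returns +0.0 for inputs in (-1, 0), whereas trunc returns -0.0 there, and it cannot represent NaN or infinities. That difference may be one reason such a lowering is restricted to -ffast-math, though the snippet does not say so.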

Re: [PATCH v1] RISC-V: Support FP trunc auto-vectorization

2023-09-26 Thread juzhe.zh...@rivai.ai
LGTM. juzhe.zh...@rivai.ai From: pan2.li Date: 2023-09-27 11:28 To: gcc-patches CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng Subject: [PATCH v1] RISC-V: Support FP trunc auto-vectorization From: Pan Li This patch would like to support auto-vectorization for the trunc API in math.h. I

[PATCH, obvious] OpenMP: GIMPLE_OMP_STRUCTURED_BLOCK bug fix

2023-09-26 Thread Sandra Loosemore
I'm planning to push the attached 1-liner bug fix to mainline tomorrow, when testing has completed. I think this qualifies as "obvious". There's no test case because I discovered it while testing the updated loop transformation patches, which are still a work in progress; it fixed an ICE in th

[PATCH] remove workaround for GCC 4.1-4.3

2023-09-26 Thread Jakub Jelinek
Hi! While looking into vec.h, I've noticed we still have a workaround for GCC 4.1-4.3 bugs. As we now use C++11 and thus need to be built by GCC 4.8 or later, I think this is now never used. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2023-09-27 Jakub Jelinek

Re: [PATCH] [11/12/13/14 Regression] ABI break in _Hash_node_value_base since GCC 11 [PR 111050]

2023-09-26 Thread François Dumont
Still no chance to get feedback from TC ? Maybe I can commit the below then ? AFAICS on gcc mailing list several gcc releases were done recently, too late. On 14/09/2023 06:46, François Dumont wrote: Author: TC Date:   Wed Sep 6 19:31:55 2023 +0200     libstdc++: Force _Hash_node_value_ba

Re: [PATCH] remove workaround for GCC 4.1-4.3

2023-09-26 Thread Bernhard Reutner-Fischer
On 27 September 2023 06:43:24 CEST, Jakub Jelinek wrote: >Hi! > >While looking into vec.h, I've noticed we still have a workaround for >GCC 4.1-4.3 bugs. This is https://gcc.gnu.org/PR105656 thanks, >As we now use C++11 and thus need to be built by GCC 4.8 or later, >I think this is now never u

[PATCH] vec.h: Make some ops work with non-trivially copy constructible and/or destructible types

2023-09-26 Thread Jakub Jelinek
Hi! We have some very limited support for non-POD types in vec.h (in particular grow_cleared will invoke default ctors on the cleared elements and vector copying invokes copy ctors. My pending work on wide_int/widest_int which makes those two non-trivially default/copy constructible, assignable a

Re: [PATCH-1v2, rs6000] Enable SImode in FP registers on P7 [PR88558]

2023-09-26 Thread Kewen.Lin
Hi, on 2023/9/25 09:57, HAO CHEN GUI wrote: > Hi Kewen, > > On 2023/9/18 15:34, Kewen.Lin wrote: >> Thanks for checking! So for P7, this patch looks neutral, but for P8 and >> later, it may cause a few differences in code gen. I'm curious how >> many total object files and different object f

Re: [PATCH-2v3, rs6000] Implement 32bit inline lrint [PR88558]

2023-09-26 Thread Kewen.Lin
Hi, on 2023/9/25 10:05, HAO CHEN GUI wrote: > Hi, > This patch implements 32bit inline lrint by "fctiw". It depends on > the patch1 to do SImode move from FP registers on P7. > > Compared to last version, the main change is to add some test cases. > https://gcc.gnu.org/pipermail/gcc-patches/2

Re: [PATCH] remove workaround for GCC 4.1-4.3

2023-09-26 Thread Richard Biener
> On 27.09.2023 at 06:43, Jakub Jelinek wrote: > > Hi! > > While looking into vec.h, I've noticed we still have a workaround for > GCC 4.1-4.3 bugs. > As we now use C++11 and thus need to be built by GCC 4.8 or later, > I think this is now never used. > > Bootstrapped/regtested on x86_64-l

[PATCH] rs6000: Make 32 bit stack_protect support prefixed insn [PR111367]

2023-09-26 Thread Kewen.Lin
Hi, As PR111367 shows, with prefixed insns supported, some checks assume that prefixed insns can be used for stack-protect-related loads/stores, but since we don't actually change the emitted assembly for 32 bit, this can cause the assembler error as exposed. Mike's commit r10-4547-gce6a6c

[PATCH] testsuite: Avoid uninit var in pr60510.f [PR111427]

2023-09-26 Thread Kewen.Lin
Hi, The uninitialized variable a in pr60510.f can cause some random failures as exposed in PR111427, see the details there. This patch is to make it initialized accordingly. As verified, it can fix the reported -m32 failures on P7 and P8 BE. It's also tested well on powerpc64-linux-gnu P9 and p

Re: RISC-V sign extension query

2023-09-26 Thread Vineet Gupta
Hi Jeff, On 9/19/23 07:59, Jeff Law wrote: On 9/18/23 21:37, Vineet Gupta wrote: On 9/18/23 19:41, Jeff Law wrote: On 9/18/23 13:45, Vineet Gupta wrote: For the cases which do require sign extends, but not being eliminated due to "missing definition(s)" I'm working on adapting Ajit's RE

Re: [PATCH]AArch64 Add movi for 0 moves for scalar types [PR109154]

2023-09-26 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > Following the Neoverse N/V and Cortex-A optimization guides SIMD 0 immediates > should be created with a movi of 0. > > At the moment we generate an `fmov .., xzr` which is slower and requires a > GP -> FP transfer. > > Bootstrapped Regtested on aarch64-none-l

[PATCH] ifcvt: Fix comments

2023-09-26 Thread Juzhe-Zhong
Fix comments since original comment is confusing. gcc/ChangeLog: * tree-if-conv.cc (is_cond_scalar_reduction): Fix comments. --- gcc/tree-if-conv.cc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc index 799f071965e..a8c
