Re: [PATCH] tree-ssa-dce: Fix up maybe_optimize_arith_overflow for BITINT_TYPE [PR112880]

2023-12-07 Thread Richard Biener
On Thu, 7 Dec 2023, Jakub Jelinek wrote: > Hi! > > The following testcase ICEs because maybe_optimize_arith_overflow > uses build_nonstandard_integer_type, which is inappropriate if > type is large BITINT_TYPE. > > Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, > ok for tru

Re: [PATCH] expr: Handle BITINT_TYPE in count_type_elements [PR112881]

2023-12-07 Thread Richard Biener
On Thu, 7 Dec 2023, Jakub Jelinek wrote: > Hi! > > The following testcase ICEs during gimplification, because > count_type_elements doesn't handle BITINT_TYPE. It should handle it like > other integral types. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. > 2023

Re: [PATCH] c-family: Fix up -fno-debug-cpp [PR111965]

2023-12-07 Thread Richard Biener
On Thu, Dec 7, 2023 at 8:54 AM Jakub Jelinek wrote: > > Hi! > > As can be seen in the second testcase, -fno-debug-cpp is actually > implemented the same as -fdebug-cpp and so doesn't turn the debugging > off. > > The following patch fixes that. > > Bootstrapped/regtested on x86_64-linux and i686-l

Re: [PATCH v3] LoongArch: Fix eh_return epilogue for normal returns

2023-12-07 Thread Xi Ruoyao
On Thu, 2023-12-07 at 14:18 +0800, Yang Yujie wrote: > On Thu, Dec 07, 2023 at 11:02:58AM +0800, Xi Ruoyao wrote: > > > > I don't like this pair of {} for the for statement.  It's not necessary > > and it changes the indent level, making the diff hard to review. > > > > Otherwise LGTM.  I'm not

[PATCH] Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411]

2023-12-07 Thread Jakub Jelinek
Hi! As documented, --param min-nondebug-insn-uid= is very useful in debugging -fcompare-debug issues in RTL dumps, without it it is really hard to find differences. With it, DEBUG_INSNs generally use low INSN_UIDs (1+) and non-DEBUG_INSNs use INSN_UIDs from the parameter up. For good results, the

[PATCH] v2: Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411]

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 09:36:22AM +0100, Jakub Jelinek wrote: > So, one way to fix the LRA issue would be just to use > lra_insn_recog_data_len = index * 3U / 2; > if (lra_insn_recog_data_len <= index) > lra_insn_recog_data_len = index + 1; > basically do what vec.cc does. I thought we ca
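The growth rule quoted in this message can be sketched in isolation. This is only an illustration under stated assumptions: `grow_len` is a hypothetical helper name rather than a function in lra.cc or vec.cc, and a fixed 32-bit unsigned type stands in for whatever type the real code uses. The point it shows is that multiplying by `3U` keeps the arithmetic unsigned, so a very large index wraps around (well defined) and is then caught by the `<=` check, instead of overflowing a signed int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: return a new capacity that is
   strictly greater than INDEX, growing by roughly 1.5x when possible.  */
static uint32_t grow_len(uint32_t index)
{
    /* index + 1 would wrap to 0 for the maximum value, so exclude it.  */
    assert(index != UINT32_MAX);

    /* Unsigned multiply: wraps modulo 2^32 instead of invoking the
       undefined behavior a signed "index * 3" could hit.  */
    uint32_t len = (uint32_t) (index * 3U / 2);

    /* For tiny indices (0, 1) or after wraparound the 1.5x result is
       not larger than index; fall back to the minimal growth.  */
    if (len <= index)
        len = index + 1;
    return len;
}
```

For example, `grow_len(10)` yields 15, while `grow_len(0)` and `grow_len(1)` fall back to `index + 1`.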

[committed] testsuite: Add testcase for already fixed PR [PR111068]

2023-12-07 Thread Jakub Jelinek
Hi! This one unfortunately can't be bisected, it ICEd until r14-3430 inclusive, but r14-3431 removed -mavx10.1-512 support and when it was readded in r14-5607 it doesn't ICE anymore. I'm just committing the testcase so that it doesn't reappear. Regtested on x86_64-linux and i686-linux, committed

Re: [PATCH] tree-optimization/PR112774 - SCEV: extend the chrec tree with a nonwrapping flag

2023-12-07 Thread Hao Liu OS
> Can you try to do some statistics on say SPEC CPU? I'm usually > building (with -j1) with -fopt-info-vec and diff build logs, you can then see > how many more loops (and which) we vectorize additionally? I tried this option with SPEC2017 intrate+fprate and count the "optimized: " lines. Five mo

Re: [PATCH] testsuite: Adjust for the new permerror -Wincompatible-pointer-types

2023-12-07 Thread Florian Weimer
* Yang Yujie: > With this patch, I also noticed a few errors in building unpatched older > software like expect-5.45.4, perl-5.28.3 and bash-5.0. Will this also be > the case when GCC 14 gets released? For Fedora, we keep pointers of the changes needed here:

[PATCH] c++: Unshare folded SAVE_EXPR arguments during cp_fold [PR112727]

2023-12-07 Thread Jakub Jelinek
Hi! The following testcase is miscompiled because two ubsan instrumentations run into each other. The first one is the shift instrumentation. Before the C++ FE calls it, it wraps the 2 shift arguments with cp_save_expr, so that side-effects in them aren't evaluated multiple times. And, ubsan_ins

Re: [PATCH] Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411]

2023-12-07 Thread Richard Biener
On Thu, 7 Dec 2023, Jakub Jelinek wrote: > Hi! > > As documented, --param min-nondebug-insn-uid= is very useful in debugging > -fcompare-debug issues in RTL dumps, without it it is really hard to > find differences. With it, DEBUG_INSNs generally use low INSN_UIDs > (1+) and non-DEBUG_INSNs use

Several test failures due to "Introduce strub: machine-independent stack scrubbing"

2023-12-07 Thread FX Coudert
Hi Alexandre, The commit https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=f0a90c7d7333fc7f554b906245c84bdf04d716d7 (Introduce strub: machine-independent stack scrubbing) has introduced many test failures on x86_64-apple-darwin21: +FAIL: c-c++-common/strub-apply2.c -std=gnu++98 (internal compiler

Re: [PATCH] v2: Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411]

2023-12-07 Thread Richard Biener
On Thu, 7 Dec 2023, Jakub Jelinek wrote: > On Thu, Dec 07, 2023 at 09:36:22AM +0100, Jakub Jelinek wrote: > > So, one way to fix the LRA issue would be just to use > > lra_insn_recog_data_len = index * 3U / 2; > > if (lra_insn_recog_data_len <= index) > > lra_insn_recog_data_len = index +

[PATCH] RISC-V: Support interleave vector with different step sequence for VLA SLP

2023-12-07 Thread Juzhe-Zhong
This patch fixes 400 ICEs in full coverage testing, since they all happen for the same reason. Before this patch: internal compiler error: in validate_change_or_fail, at config/riscv/riscv-v.cc:4597 appears 400 times in the full coverage testing report. The root cause is we didn't support interleave v

[PATCH V2 1/2] RISC-V: Add C intrinsics of Scalar Crypto Extension

2023-12-07 Thread Liao Shihua
This patch adds C intrinsics for Scalar Crypto Extension. gcc/ChangeLog: * config.gcc: Add riscv_crypto.h. * config/riscv/riscv_crypto.h: New file. gcc/testsuite/ChangeLog: * gcc.target/riscv/scalar_crypto_intrinsic-32.c: New test. * gcc.target/riscv/scalar_crypt

[PATCH V2 0/2] RISC-V: Add intrinsics for Bitmanip and Scalar Crypto extensions

2023-12-07 Thread Liao Shihua
In accordance with the suggestions of Christoph Müllner, the following amendments are made. Update v1 -> v2: 1. Rename *_intrinsic-* to *_intrinsic-XLEN. 2. Typo fix. 3. Intrinsics with immediate arguments will use macros at O0. It's a little patch that just provides a mapping from the RV i

[PATCH V2 2/2]RISC-V: Add C intrinsics of Bitmanip Extension

2023-12-07 Thread Liao Shihua
This patch adds C intrinsics for Bitmanip Extension. RISCV_BUILTIN_NO_PREFIX is a new riscv_builtin_description like RISCV_BUILTIN. But it uses CODE_FOR_##INSN rather than CODE_FOR_riscv_##INSN. gcc/ChangeLog: * config.gcc: Add riscv_bitmanip.h * config/riscv/riscv-builtins.cc (AV

[PATCH] RISC-V: Support interleave vector with different step sequence for VLA SLP

2023-12-07 Thread Juzhe-Zhong
This patch fixes 64 ICEs in full coverage testing, since they all happen for the same reason. Before this patch: internal compiler error: in expand_const_vector, at config/riscv/riscv-v.cc:1270 appears 400 times in the full coverage testing report. The root cause is we didn't support interleave vector

Re: [PATCH] RISC-V: Support interleave vector with different step sequence for VLA SLP

2023-12-07 Thread juzhe.zh...@rivai.ai
Resend the patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639728.html with changelog changes. No code changes. juzhe.zh...@rivai.ai From: Juzhe-Zhong Date: 2023-12-07 18:15 To: gcc-patches CC: kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc; Juzhe-Zhong Subject: [PATCH] RISC-V

Re: [PATCH v6] aarch64: New RTL optimization pass avoid-store-forwarding.

2023-12-07 Thread Manos Anagnostakis
On Thu, 7 Dec 2023, 09:39, Richard Biener < richard.guent...@gmail.com> wrote: > On Wed, Dec 6, 2023 at 6:15 PM Manos Anagnostakis > wrote: > > > > Hi again, > > > > I went and tested the requested changes and found out the following: > > > > 1. The pass is currently increasing insn_cn

Re: [PATCH v3 1/3] LoongArch: Adjust D version strings.

2023-12-07 Thread Iain Buclaw
Hi, Thanks for this. Excerpts from Yang Yujie's message of December 1, 2023 11:08 am: > diff --git a/gcc/d/dmd/cond.d b/gcc/d/dmd/cond.d > index 568b639e0b6..02af0cc9e29 100644 > --- a/gcc/d/dmd/cond.d > +++ b/gcc/d/dmd/cond.d > @@ -693,10 +693,10 @@ extern (C++) final class VersionCondition : DV

Re: [PATCH] v2: Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411]

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 11:12:39AM +0100, Richard Biener wrote: > > 2023-12-07 Jakub Jelinek > > > > PR middle-end/112411 > > * params.opt (-param=min-nondebug-insn-uid=): Add > > IntegerRange(0, 1073741824). > > * lra.cc (check_and_expand_insn_recog_data): Use 3U rather than 3

Re: [PATCH v3 2/3] libphobos: Update build scripts for LoongArch64.

2023-12-07 Thread Iain Buclaw
Excerpts from Yang Yujie's message of December 1, 2023 11:08 am: > libphobos/ChangeLog: > > * m4/druntime/cpu.m4: Support loongarch* targets. > * libdruntime/Makefile.am: Same. > * libdruntime/Makefile.in: Regenerate. > * configure: Regenerate. > --- > libphobos/configure

Re: [PATCH v3 3/3] libruntime: Add fiber context switch code for LoongArch.

2023-12-07 Thread Iain Buclaw
Excerpts from Yang Yujie's message of December 1, 2023 11:08 am: > libphobos/ChangeLog: > > * libdruntime/config/loongarch/switchcontext.S: New file. > --- OK. Thanks, Iain.

Re: [PATCH v2 3/3] libphobos: LoongArch hardware support.

2023-12-07 Thread Iain Buclaw
Excerpts from Yang Yujie's message of December 1, 2023 8:46 am: > libphobos/ChangeLog: > > * src/std/math/hardware.d: Implement FP control. > --- > libphobos/src/std/math/hardware.d | 53 +++ > > diff --git a/libphobos/src/std/math/hardware.d > b/libphobos/src/std/math/har

Re: [PATCH] Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411]

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 09:36:23AM +0100, Jakub Jelinek wrote: > Without the dg-skip-if I got on 64-bit host: > cc1: out of memory allocating 571230784744 bytes after a total of 2772992 > bytes I've looked at this and the problem is in haifa-sched.cc: 9047 h_i_d.safe_grow_cleared (3 * ge

Re: [PATCH v2] rs6000: Add new pass for replacement of contiguous addresses vector load lxv with lxvp

2023-12-07 Thread Ajit Agarwal
Hello Kewen: On 06/12/23 7:52 am, Kewen.Lin wrote: > on 2023/12/6 02:01, Ajit Agarwal wrote: >> Hello Kewen: >> >> >> On 05/12/23 7:13 pm, Ajit Agarwal wrote: >>> Hello Kewen: >>> >>> On 04/12/23 7:31 am, Kewen.Lin wrote: Hi Ajit, on 2023/12/1 17:10, Ajit Agarwal wrote: > Hello

Re: [PATCH] v2: Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411]

2023-12-07 Thread Richard Biener
On Thu, 7 Dec 2023, Jakub Jelinek wrote: > On Thu, Dec 07, 2023 at 11:12:39AM +0100, Richard Biener wrote: > > > 2023-12-07 Jakub Jelinek > > > > > > PR middle-end/112411 > > > * params.opt (-param=min-nondebug-insn-uid=): Add > > > IntegerRange(0, 1073741824). > > > * lra.cc (check_an

[PATCH 0/2] asan: Align .LASANPC on function boundary

2023-12-07 Thread Ilya Leoshkevich
Hi, this is another attempt to fix the .LASANPC alignment on s390x. Currently it's not only inefficient ([1]-[5]), but also causes linker errors in template-heavy code ([6]). The previous attempts to add a new constant for minimum code alignment value ([1]-[5]) did not arouse considerable enthusi

[PATCH 1/2] Implement ASM_DECLARE_FUNCTION_NAME using ASM_OUTPUT_FUNCTION_LABEL

2023-12-07 Thread Ilya Leoshkevich
gccint recommends using ASM_OUTPUT_FUNCTION_LABEL in ASM_DECLARE_FUNCTION_NAME, but many implementations use ASM_OUTPUT_LABEL instead. It's inconsistent and prevents changes to ASM_OUTPUT_FUNCTION_LABEL from affecting the respective targets. --- gcc/config/aarch64/aarch64.cc | 2 +- gcc/co

[PATCH 2/2] asan: Align .LASANPC on function boundary

2023-12-07 Thread Ilya Leoshkevich
GCC can emit code between the function label and the .LASANPC label, making the latter unaligned. Some architectures cannot load unaligned labels directly and require literal pool entries, which is inefficient. Move the invocation of asan_function_start to ASM_OUTPUT_FUNCTION_LABEL, which guarant

Re: [PATCH] combine: Fix ICE in try_combine on pr112494.c [PR112560]

2023-12-07 Thread Richard Biener
On Mon, Dec 4, 2023 at 10:34 AM Uros Bizjak wrote: > > On Wed, Nov 29, 2023 at 1:25 PM Richard Biener > wrote: > > > > On Wed, Nov 29, 2023 at 10:35 AM Uros Bizjak wrote: > > > > > > The compiler, configured with --enable-checking=yes,rtl,extra ICEs with: > > > > > > internal compiler error: RTL

[PATCH] RISC-V: Fix AVL propagation ICE for vleff/vlsegff

2023-12-07 Thread Juzhe-Zhong
This patch fixes 400 ICEs in full coverage testing: internal compiler error: in validate_change_or_fail, at config/riscv/riscv-v.cc:4597 The root cause is each operand is used in vleff/vlsegff twice: (define_insn "@pred_fault_load" [(set (match_operand:V 0 "register_operand" "=vd

Re: [PATCH] RISC-V: Fix AVL propagation ICE for vleff/vlsegff

2023-12-07 Thread Robin Dapp
LGTM. Btw your vsetvl patch from yesterday fixes the vectorized strlen/strcmp problems. Those use vleff as first instruction. Regards Robin

[PATCH] RISC-V: Add avail interface into function_group_info

2023-12-07 Thread Feng Wang
In order to support other vector extensions, this patch adds unsigned int (*avail) (void) to function_group_info to determine whether to register the intrinsic based on ISA info. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-functions.def (DEF_RVV_FUNCTION): Add AVAIL def.

Re: [PATCH v6] aarch64: New RTL optimization pass avoid-store-forwarding.

2023-12-07 Thread Richard Sandiford
Richard Biener writes: > On Wed, Dec 6, 2023 at 7:44 PM Philipp Tomsich > wrote: >> >> On Wed, 6 Dec 2023 at 23:32, Richard Biener >> wrote: >> > >> > On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis >> > wrote: >> > > >> > > This is an RTL pass that detects store forwarding from stores to l

[PATCH] LoongArch: Allow -mcmodel=extreme and model attribute with -mexplicit-relocs=auto

2023-12-07 Thread Xi Ruoyao
There seems to be no real reason to require -mexplicit-relocs=always for -mcmodel=extreme or model attribute. As the linker does not know how to relax a 3-operand la.local or la.global pseudo instruction, just emit explicit relocs for SYMBOL_PCREL64, and under TARGET_CMODEL_EXTREME also SYMBOL_GOT_DISP.

Re: [PATCH 2/2] asan: Align .LASANPC on function boundary

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 01:08:27PM +0100, Ilya Leoshkevich wrote: > GCC can emit code between the function label and the .LASANPC label, > making the latter unaligned. Some architectures cannot load unaligned > labels directly and require literal pool entries, which is inefficient. > > Move the i

Re: [PATCH] libssp: Fix gets-chk.c compilation on Solaris

2023-12-07 Thread Rainer Orth
Rainer Orth writes: > The recent warning patches broke the libssp build on Solaris: > > /vol/gcc/src/hg/master/local/libssp/gets-chk.c: In function '__gets_chk': > /vol/gcc/src/hg/master/local/libssp/gets-chk.c:67:12: error: implicit > declaration of function 'gets'; did you mean 'getw'? > [-Wimp

Re: [PATCH] RISC-V: Add avail interface into function_group_info

2023-12-07 Thread juzhe.zh...@rivai.ai
Also add avail.h into: riscv-vector-builtins.o: $(srcdir)/config/riscv/riscv-vector-builtins.cc \   $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) $(TM_P_H) \   memmodel.h insn-codes.h $(OPTABS_H) $(RECOG_H) $(DIAGNOSTIC_H) $(EXPR_H) \   $(FUNCTION_H) fold-const.h gimplify.h explow

Re: Re: [PATCH] RISC-V: Add avail interface into function_group_info

2023-12-07 Thread Feng Wang
2023-12-07 20:23 juzhe.zhong wrote: >Thanks for doing this! > >+AVAIL (always, true) >-> AVAIL (true, true) > >+DEF_RVV_FUNCTION (vmul, alu, full_preds, iu_vvv_ops, always) >-> DEF_RVV_FUNCTION (vmul, alu, full_preds, iu_vvv_ops, true) > >Btw, we have full coverage rvv -

Re: [PATCH v2 3/3] libphobos: LoongArch hardware support.

2023-12-07 Thread Xi Ruoyao
On Thu, 2023-12-07 at 11:41 +0100, Iain Buclaw wrote: > Excerpts from Yang Yujie's message of Dezember 1, 2023 8:46 am: > > libphobos/ChangeLog: > > > > * src/std/math/hardware.d: Implement FP control. > > --- > >  libphobos/src/std/math/hardware.d |  53 +++ > > > > diff --git

Re: [PATCH] RISC-V: Add avail interface into function_group_info

2023-12-07 Thread Kito Cheng
> #ifndef DEF_RVV_FUNCTION > -#define DEF_RVV_FUNCTION(NAME, SHAPE, PREDS, OPS_INFO) > +#define DEF_RVV_FUNCTION(NAME, SHAPE, PREDS, OPS_INFO, AVAIL) #define DEF_RVV_FUNCTION(NAME, SHAPE, PREDS, OPS_INFO, ...) And add a comment to mention 5th argument is optional for AVAIL. > #endif > > /* In

Re: [PATCH] libssp: Fix gets-chk.c compilation on Solaris

2023-12-07 Thread Jakub Jelinek
On Mon, Dec 04, 2023 at 11:42:09AM +0100, Rainer Orth wrote: > The recent warning patches broke the libssp build on Solaris: > > /vol/gcc/src/hg/master/local/libssp/gets-chk.c: In function '__gets_chk': > /vol/gcc/src/hg/master/local/libssp/gets-chk.c:67:12: error: implicit > declaration of funct

Re: [PATCH v2 6/7] aarch64,arm: Fix branch-protection= parsing

2023-12-07 Thread Richard Earnshaw
On 03/11/2023 15:36, Szabolcs Nagy wrote: Refactor the parsing to have a single API and fix a few parsing issues: - Different handling of "bti+none" and "none+bti": these should be rejected because "none" can only appear alone. - Accepted empty strings such as "bti++pac-ret" or "bti+", th
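The two list-shape rules named in this message ("none" may only appear alone, and empty items such as in "bti++pac-ret" or "bti+" are rejected) can be sketched as a small standalone check. This is an assumption-laden illustration, not the patch's parser: `bp_list_well_formed` is a hypothetical name, and the sketch checks only the shape of the '+'-separated list, not whether each item is a known protection type.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not GCC code: check only the shape of a
   '+'-separated branch-protection string.  */
static bool bp_list_well_formed(const char *s)
{
    assert(s != NULL);

    size_t count = 0;       /* number of items seen so far */
    bool saw_none = false;  /* whether any item was exactly "none" */
    const char *p = s;

    for (;;) {
        /* Length of the current item, up to the next '+' or the end.  */
        size_t len = strcspn(p, "+");
        if (len == 0)
            return false;   /* empty item: "bti++pac-ret", "bti+", "+bti" */
        if (len == 4 && strncmp(p, "none", 4) == 0)
            saw_none = true;
        count++;
        p += len;
        if (*p == '\0')
            break;
        p++;                /* skip the '+' separator */
    }

    /* "none" is only valid as the sole item, in either position.  */
    return !(saw_none && count > 1);
}
```

With this shape check, "bti+none" and "none+bti" are treated identically (both rejected), which is the symmetry the message asks for.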

Re: [PATCH v2 7/7] aarch64,arm: Move branch-protection data to targets

2023-12-07 Thread Richard Earnshaw
On 03/11/2023 15:36, Szabolcs Nagy wrote: The branch-protection types are target specific, not the same on arm and aarch64. This currently affects pac-ret+b-key, but there will be a new type on aarch64 that is not relevant for arm. gcc/ChangeLog: * config/aarch64/aarch64-opts.h (enu

Re: [PATCH] Reimplement __gnu_cxx::__ops operators

2023-12-07 Thread Jonathan Wakely
On Wed, 6 Dec 2023 at 20:55, François Dumont wrote: > > I think I still got no feedback about this cleanup proposal. Can you remind me why we have all those different functions in predefined_ops.h in the first place? I think it was to avoid having two versions of every algorithm, one that does *l

Re: [PATCH v2 5/6] libgomp, nvptx: Cuda pinned memory

2023-12-07 Thread Andrew Stubbs
@Thomas, there are questions for you below On 22/11/2023 17:07, Tobias Burnus wrote: Note before: Starting with TR11 alias OpenMP 6.0, OpenMP supports handling multiple devices for allocation. It seems as if after using:   my_memspace = omp_get_device_and_host_memspace( 5 , omp_default_me

Re: [PATCH] Reimplement __gnu_cxx::__ops operators

2023-12-07 Thread Jonathan Wakely
On Thu, 7 Dec 2023 at 13:41, Jonathan Wakely wrote: > > On Wed, 6 Dec 2023 at 20:55, François Dumont wrote: > > > > I think I still got no feedback about this cleanup proposal. > > Can you remind me why we have all those different functions in > predefined_ops.h in the first place? I think it was

Re: [PATCH v6] aarch64: New RTL optimization pass avoid-store-forwarding.

2023-12-07 Thread Richard Biener
On Thu, Dec 7, 2023 at 1:20 PM Richard Sandiford wrote: > > Richard Biener writes: > > On Wed, Dec 6, 2023 at 7:44 PM Philipp Tomsich > > wrote: > >> > >> On Wed, 6 Dec 2023 at 23:32, Richard Biener > >> wrote: > >> > > >> > On Wed, Dec 6, 2023 at 2:48 PM Manos Anagnostakis > >> > wrote: > >

Re: [PATCH] tree-optimization/PR112774 - SCEV: extend the chrec tree with a nonwrapping flag

2023-12-07 Thread Richard Biener
On Thu, Dec 7, 2023 at 9:59 AM Hao Liu OS wrote: > > > Can you try to do some statistics on say SPEC CPU? I'm usually > > building (with -j1) with -fopt-info-vec and diff build logs, you can then > > see > > how many more loops (and which) we vectorize additionally? > > I tried this option with

Re: [PATCH] driver: Fix memory leak.

2023-12-07 Thread Costas Argyris
Would that be something like this? Although it didn't fix the leak, which was the entire point of this exercise. Maybe because driver::finalize () is not getting called so the call to mdswitches.release () doesn't really happen, which was the reason I went with std::vector in the first place beca

Re: [PATCH] driver: Fix memory leak.

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 02:28:18PM +, Costas Argyris wrote: > Would that be something like this? Yes. Or perhaps even easier just change --- gcc/gcc.cc.jj 2023-12-07 08:31:59.970849379 +0100 +++ gcc/gcc.cc 2023-12-07 15:33:46.616886894 +0100 @@ -11368,6 +11368,7 @@ driver::finalize ()

[PATCH v3 08/11] aarch64: Generalize writeback ldp/stp patterns

2023-12-07 Thread Alex Coplan
Hi, This is a v3 patch which is rebased on top of the SME changes. Otherwise it is the same as v2, posted here: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639367.html Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? Thanks, Alex -- >8 -- Thus far the writeba

[PATCH v3 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-12-07 Thread Alex Coplan
Hi, This is a v3, rebased on top of the SME changes. v2 is here: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639361.html Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk? Thanks, Alex -- >8 -- This patch overhauls the load/store pair patterns with two main goa

Re: [ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR 110717.

2023-12-07 Thread Jeff Law
On 12/5/23 06:59, Roger Sayle wrote: This patch improves the code generated for bitfield sign extensions on ARC cpus without a barrel shifter. Compiling the following test case: int foo(int x) { return (x<<27)>>27; } with -O2 -mcpu=em, generates two loops: foo:mov lp_count,27

[PATCH v3 10/11] aarch64: Add new load/store pair fusion pass

2023-12-07 Thread Alex Coplan
Hi, This is a v5 of the aarch64 load/store pair fusion pass, rebased on top of the SME changes. v4 is here: https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639404.html There are no changes to the pass itself since v4, this is just a rebase. Bootstrapped/regtested as a series on aarch64-l

Re: OpenMP offloading vs. C++ static local variables

2023-12-07 Thread Thomas Schwinge
Hi! Jakub, would you please provide guidance? Elsewhere, I wrote: || I'm working on implementing (some) C++ standard library support for code || offloading in GCC, and ran into the following issue: per || , ||

Re: [PATCH] RISC-V: Support interleave vector with different step sequence for VLA SLP

2023-12-07 Thread Robin Dapp
Sorry for the delay, just a tiny naming/comment nit. Rest LGTM, no need for a v2. > +/* Return true each pattern has different 2 steps. > + TODO: We currently only support NPATTERNS = 2. */ Return true if the permutation consists of two interleaved patterns with a constant step each. > +bool

Re: [PATCH] driver: Fix memory leak.

2023-12-07 Thread Costas Argyris
> Still reachable memory at exit e.g. from valgrind is not a bug. Indeed, this is coming from a valgrind report here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93019 where it was noted that the driver memory leaks could be problematic for JIT. So, since using std::vector did reduce the valg

[PATCH v2 3/3] [GCC] arm: vld1q_types_x4 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vld1q intrinsic for the arm port. This patch adds the _x4 variants of the vld1q intrinsic. ACLE documents: https://developer.arm.com/documentation/ihi0053/latest/ ISA documents: https://developer.a

[PATCH v2 0/3] [GCC] arm: vld1q_types_xN ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
Add xN variants of vld1q_types intrinsic.

[PATCH v2 2/3] [GCC] arm: vld1q_types_x3 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vld1q intrinsic for the arm port. This patch adds the _x3 variants of the vld1q intrinsic. ACLE documents: https://developer.arm.com/documentation/ihi0053/latest/ ISA documents: https://developer.a

[PATCH v2 1/3] [GCC] arm: vld1q_types_x2 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vld1q intrinsic for the arm port. This patch adds the _x2 variants of the vld1q intrinsic. ACLE documents: https://developer.arm.com/documentation/ihi0053/latest/ ISA documents: https://developer.a

[PATCH v2 0/3] [GCC] arm: vst1_types_xN ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
Add xN variants of vst1_types intrinsic.

[PATCH v2 1/3] [GCC] arm: vst1_types_x2 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1 intrinsic for the arm port. This patch adds the _x2 variants of the vst1 intrinsic. ACLE documents: https://developer.arm.com/documentation/ihi0053/latest/ ISA documents: https://developer.arm

[PATCH v2 2/3] [GCC] arm: vst1_types_x3 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1 intrinsic for the arm port. This patch adds the _x3 variants of the vst1 intrinsic. ACLE documents: https://developer.arm.com/documentation/ihi0053/latest/ ISA documents: https://developer.arm

[PATCH v2 3/3] [GCC] arm: vst1_types_x4 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1 intrinsic for the arm port. This patch adds the _x4 variants of the vst1 intrinsic. ACLE documents: https://developer.arm.com/documentation/ihi0053/latest/ ISA documents: https://developer.arm

Re: OpenMP offloading vs. C++ static local variables

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 04:09:04PM +0100, Thomas Schwinge wrote: > > Yeah, I believe we should in the omp_discover_* sub-pass handle with > > a help of a langhook automatically mark the guard variables (possibly > > iff the guarded variable is marked?), > > Looking at 'gcc/omp-offload.cc:omp_disco

Re: [PATCH] gcc: Disallow trampolines when -fhardened

2023-12-07 Thread Eric Botcazou
> I don't know either of these languages to write a test, and I don't see > anything that mentions the word trampoline in gfortran.dg/. Ada has > gnat.dg/trampoline3.adb but: > > $ gcc -c -Wtrampolines trampoline3.adb > trampoline3.adb:6:03: warning: variable "A" is read but never assigned > [-gn

[PATCH v2 2/3] [GCC] arm: vst1q_types_x3 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1q intrinsic for the arm port. This patch adds the _x3 variants of the vst1q intrinsic. ACLE documents: https://developer.arm.com/documentation/ihi0053/latest/ ISA documents: https://developer.a

[PATCH v2 3/3] [GCC] arm: vst1q_types_x4 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1q intrinsic for the arm port. This patch adds the _x4 variants of the vst1q intrinsic. ACLE: https://developer.arm.com/documentation/ihi0053/latest/ ISA documents: https://developer.arm.com/doc

[PATCH v2 0/3] [GCC] arm: vst1q_types_xN ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
Add xN variants of vst1q_types intrinsic.

[PATCH v2 1/3] [GCC] arm: vst1q_types_x2 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vst1q intrinsic for the arm port. This patch adds the _x2 variants of the vst1q intrinsic. ACLE documents: https://developer.arm.com/documentation/ihi0053/latest/ ISA documents: https://developer.a

[PATCH v2 0/3] [GCC] arm: vld1_types_xN ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
Add xN variants of vld1_types intrinsic.

[PATCH v2 3/3] [GCC] arm: vld1_types_x4 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vld1 intrinsic for the arm port. This patch adds the _x4 variants of the vld1 intrinsic. The previous vld1_x4 has been updated to vld1q_x4 to take into account that it works with 4-word-length types

[PATCH v2 2/3] [GCC] arm: vld1_types_x3 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vld1 intrinsic for the arm port. This patch adds the _x3 variants of the vld1 intrinsic. The previous vld1_x3 has been updated to vld1q_x3 to take into account that it works with 4-word-length types

[PATCH v2 1/3] [GCC] arm: vld1_types_x2 ACLE intrinsics

2023-12-07 Thread Ezra.Sitorus
From: Ezra Sitorus This patch is part of a series of patches implementing the _xN variants of the vld1 intrinsic for the arm port. This patch adds the _x2 variants of the vld1 intrinsic. The previous vld1_x2 has been updated to vld1q_x2 to take into account that it works with 4-word-length types

Re: [gcc15] nested functions in C

2023-12-07 Thread Eric Botcazou
> I think from a language standpoint, the general idea that nested > functions are just any functions inside functions (which is how the C > nested functions essentially behave) is too broad and they should be > restricted to minimal implementations that, e.g. don't have side-effects > or if they d

Re: [PATCH] driver: Fix memory leak.

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 03:16:29PM +, Costas Argyris wrote: > > Still reachable memory at exit e.g. from valgrind is not a bug. > > Indeed, this is coming from a valgrind report here: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93019 > > where it was noted that the driver memory leaks

Re: [gcc15] nested functions in C

2023-12-07 Thread Siddhesh Poyarekar
On 2023-12-07 10:42, Eric Botcazou wrote: I think from a language standpoint, the general idea that nested functions are just any functions inside functions (which is how the C nested functions essentially behave) is too broad and they should be restricted to minimal implementations that, e.g. do

[PATCH V3 0/4] OpenMP: Improve data abstractions for context selectors

2023-12-07 Thread Sandra Loosemore
Here is a new version of my context selector implementation cleanup patch set, incorporating comments from Tobias on V2 of part 3. Parts 1 and 2 are unchanged from V1 except that I rebased them so they should apply cleanly to mainline head now. There's a new part 4 that adds new functionality, ha

[PATCH V3 2/4] OpenMP: Unify representation of name-list properties.

2023-12-07 Thread Sandra Loosemore
Previously, name-list properties specified as identifiers were stored in the TREE_PURPOSE/OMP_TP_NAME slot, while those specified as strings were stored in the TREE_VALUE/OMP_TP_VALUE slot. This patch puts both representations in OMP_TP_VALUE with a magic cookie in OMP_TP_NAME. gcc/ChangeLog

[PATCH V3 1/4] OpenMP: Introduce accessor macros and constructors for context selectors.

2023-12-07 Thread Sandra Loosemore
This patch hides the underlying nested TREE_LIST structure of context selectors behind accessor macros that have more meaningful names than the generic TREE_PURPOSE/TREE_VALUE accessors. There is a slight change to the representation in that the score expression in trait-selectors has a distinguis

[PATCH V3 4/4] OpenMP: Permit additional selector properties

2023-12-07 Thread Sandra Loosemore
This patch adds "hpe" to the known properties for the "vendor" selector, and support for "acquire" and "release" for "atomic_default_mem_order". gcc/ChangeLog * omp-general.cc (vendor_properties): Add "hpe". (atomic_default_mem_order_properties): Add "acquire" and "release".

[PATCH V3 3/4] OpenMP: Use enumerators for names of trait-sets and traits

2023-12-07 Thread Sandra Loosemore
This patch introduces enumerators to represent trait-set names and trait names, which makes it easier to use tables to control other behavior and for switch statements to dispatch on the tags. The tags are stored in the same place in the TREE_LIST structure (OMP_TSS_ID or OMP_TS_ID) and are encode

Re: [PATCH] c-family: Fix up -fno-debug-cpp [PR111965]

2023-12-07 Thread Marek Polacek
On Thu, Dec 07, 2023 at 08:53:37AM +0100, Jakub Jelinek wrote: > Hi! > > As can be seen in the second testcase, -fno-debug-cpp is actually > implemented the same as -fdebug-cpp and so doesn't turn the debugging > off. > > The following patch fixes that. > > Bootstrapped/regtested on x86_64-linux

Re: [PATCH] driver: Fix memory leak.

2023-12-07 Thread Costas Argyris
Thanks for all the explanations. In that case I restrict this patch to just freeing the buffer from within driver::finalize only (I think it should be XDELETEVEC instead of XDELETE, no?). On Thu, 7 Dec 2023 at 15:42, Jakub Jelinek wrote: > On Thu, Dec 07, 2023 at 03:16:29PM +, Costas Argyri

RE: [ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR 110717.

2023-12-07 Thread Roger Sayle
Hi Jeff, Doh! Great catch. The perils of not (yet) being able to actually run any ARC execution tests myself. > Shouldn't operands[4] be GEN_INT ((HOST_WIDE_INT_1U << tmp) - 1)? Yes(-ish), operands[4] should be GEN_INT(HOST_WIDE_INT_1U << (tmp - 1)). And the 32s in the test cases need to be 16
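
For readers following the constant discussed above, here is a minimal sketch of the generic mask/xor/subtract idiom for sign-extending the low n bits of a word. It is an illustration only, assuming the split uses this usual sequence; the preview does not confirm the actual ARC pattern. The function name sign_extend_low_bits is hypothetical. It shows why the xor/subtract constant is the sign bit, 1 << (n - 1), while (1 << n) - 1 is the AND mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: sign-extend the low N bits of X
   (1 <= N <= 32) with the mask/xor/subtract idiom.  */
static int32_t
sign_extend_low_bits (uint32_t x, unsigned n)
{
  assert (n >= 1 && n <= 32);

  /* Sign bit of the n-bit field: 1 << (n - 1).  */
  uint32_t sign = UINT32_C (1) << (n - 1);

  /* Low n bits set, i.e. (1 << n) - 1, built without shifting by 32.  */
  uint32_t mask = sign | (sign - 1);

  /* Keep the field, flip its sign bit, then subtract the sign bit.
     The arithmetic is unsigned, so wraparound is well defined.  */
  uint32_t low = x & mask;
  uint32_t res = (low ^ sign) - sign;

  /* Converting an out-of-range value to int32_t is implementation-defined;
     GCC reduces it modulo 2^32, which gives the two's-complement result.  */
  return (int32_t) res;
}
```

With n = 4, the input 0xF yields -1 and 0x7 yields 7. Using (1 << n) - 1 as the xor/subtract constant would instead clear the whole field before subtracting, so every input would give the same result.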

Re: [PATCH] driver: Fix memory leak.

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 04:01:11PM +, Costas Argyris wrote: > Thanks for all the explanations. > > In that case I restrict this patch to just freeing the buffer from > within driver::finalize only (I think it should be XDELETEVEC > instead of XDELETE, no?). Both macros are exactly the same, b
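
As a rough illustration of the point that both macros do the same thing, the sketch below uses local stand-ins modelled on libiberty's allocation macros. The MY_ names and the fill_and_release helper are hypothetical, and the definitions are an assumption about the shape of the real ones, not a copy of libiberty.h. If both deletion macros expand to a plain free, choosing between them only documents whether the pointer came from a scalar or an array allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins, assumed to mirror libiberty's macros in shape only.  */
#define MY_XNEWVEC(T, N)   ((T *) malloc (sizeof (T) * (N)))
#define MY_XDELETE(P)      free ((void *) (P))
#define MY_XDELETEVEC(P)   free ((void *) (P))

/* Allocate an N-byte buffer (N > 0), fill it, count the filled bytes,
   and release it with the array-flavoured macro.  */
static size_t
fill_and_release (size_t n)
{
  char *buf = MY_XNEWVEC (char, n);
  assert (buf != NULL);

  memset (buf, 'x', n);

  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (buf[i] == 'x')
      count++;

  /* MY_XDELETE (buf) would expand to the same call to free.  */
  MY_XDELETEVEC (buf);
  return count;
}
```

Pairing the VEC allocator with the VEC deleter keeps the code self-documenting even though the generated code is identical.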

Re: [PATCH] aarch64: add -fno-stack-protector to tests

2023-12-07 Thread Richard Sandiford
Marek Polacek writes: > Bootstrapped/regtested on aarch64-pc-linux-gnu, ok for trunk/13? > > -- >8 -- > These tests fail when the testsuite is executed with -fstack-protector-strong. > To avoid this, this patch adds -fno-stack-protector to dg-options. > > The list of FAILs is appended. As you can

Re: [PATCH] aarch64: add -fno-stack-protector to tests

2023-12-07 Thread Marek Polacek
On Thu, Dec 07, 2023 at 04:05:47PM +, Richard Sandiford wrote: > Marek Polacek writes: > > Bootstrapped/regtested on aarch64-pc-linux-gnu, ok for trunk/13? > > > > -- >8 -- > > These tests fail when the testsuite is executed with > > -fstack-protector-strong. > > To avoid this, this patch add

[PATCH] testsuite: add missing dg-require ifunc in pr105554.c

2023-12-07 Thread Marc Poulhiès
The 'target_clones' attribute depends on the ifunc support. gcc/testsuite/ChangeLog: * gcc.target/i386/pr105554.c: Add dg-require ifunc. --- Tested on x86_64-linux and x86_64-elf. Ok for master? gcc/testsuite/gcc.target/i386/pr105554.c | 1 + 1 file changed, 1 insertion(+) diff --git a

[PATCH] testsuite: adjust call to abort in excess-precision-12

2023-12-07 Thread Marc Poulhiès
abort() is not always available, using the builtin as done in other tests. gcc/testsuite/ChangeLog: * g++.target/i386/excess-precision-12.C: call builtin_abort instead of abort. --- Tested on x86_64-linux and x86_64-elf. Ok for master? gcc/testsuite/g++.target/i386/excess-precision-12

[PATCH] testsuite: require avx_runtime for vect-simd-clone-17f

2023-12-07 Thread Marc Poulhiès
The test fails parsing the 'vect' dump when not using -mavx. Make the dependency explicit. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-simd-clone-17f.c: Add dep on avx_runtime. --- Tested on x86_64-linux and x86_64-elf. Ok for master? gcc/testsuite/gcc.dg/vect/vect-simd-clone-17f.c | 3

Re: [PATCH] testsuite: scev: expect fail on ilp32

2023-12-07 Thread Hans-Peter Nilsson
> Date: Mon, 4 Dec 2023 12:58:03 +0100 (CET) > From: Richard Biener > On Sat, 2 Dec 2023, Hans-Peter Nilsson wrote: > > > Date: Fri, 1 Dec 2023 08:07:14 +0100 (CET) > > > From: Richard Biener > > > I read from your messages that the testcases pass on arm*-*-*? > > Yes: they pass (currently XPASS

Re: [PATCH] testsuite: add missing dg-require ifunc in pr105554.c

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 05:25:39PM +0100, Marc Poulhiès wrote: > The 'target_clones' attribute depends on the ifunc support. > > gcc/testsuite/ChangeLog: > * gcc.target/i386/pr105554.c: Add dg-require ifunc. > --- > Tested on x86_64-linux and x86_64-elf. > > Ok for master? > > gcc/testsui

Re: [PATCH] testsuite: require avx_runtime for vect-simd-clone-17f

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 05:28:09PM +0100, Marc Poulhiès wrote: > The test fails parsing the 'vect' dump when not using -mavx. Make the > dependency explicit. > > gcc/testsuite/ChangeLog: > > * gcc.dg/vect/vect-simd-clone-17f.c: Add dep on avx_runtime. > --- > Tested on x86_64-linux and x86_

Re: [PATCH] testsuite: adjust call to abort in excess-precision-12

2023-12-07 Thread Jakub Jelinek
On Thu, Dec 07, 2023 at 05:27:28PM +0100, Marc Poulhiès wrote: > abort() is not always available, using the builtin as done in other > tests. > > gcc/testsuite/ChangeLog: > > * g++.target/i386/excess-precision-12.C: call builtin_abort instead of > abort. > --- > Tested on x86_64-linux and

Re: [PATCH v2 0/3] [GCC] arm: vst1_types_xN ACLE intrinsics

2023-12-07 Thread Richard Earnshaw
Pushed, thanks. R. On 07/12/2023 15:28, ezra.sito...@arm.com wrote: Add xN variants of vst1_types intrinsic.
