[PATCH] Middle-end: Adjust decrement IV style partial vectorization COST model

2023-12-13 Thread Juzhe-Zhong
Hi, before this patch, a simple conversion case for RVV codegen: foo: ble a2,zero,.L8 addiw a5,a2,-1 li a4,6 bleu a5,a4,.L6 srliw a3,a2,3 slli a3,a3,3 add a3,a3,a0 mv a5,a0 mv a4,a1 vse

[PATCH] i386: Make most MD builtins nothrow, leaf [PR112962]

2023-12-13 Thread Jakub Jelinek
Hi! The following patch makes most of x86 MD builtins nothrow,leaf (like most middle-end builtins are). For -fnon-call-exceptions it doesn't nothrow, better might be to still add it if the builtins don't read or write memory and can't raise floating point exceptions, but we don't have such inform

Re: [PATCH v3 3/4] RISC-V: Add crypto machine descriptions

2023-12-13 Thread juzhe.zh...@rivai.ai
+(define_insn "@pred_vandn_scalar" + [(set (match_operand:VI 0 "register_operand" "=vd, vr,vd, vr") +(if_then_else:VI + (unspec: +[(match_operand: 1 "vector_mask_operand" " vm,Wc1,vm,Wc1") + (match_operand 5 "vector_length_operand"" rK, rK,rK,

[PATCH] c++: Fix tinst_level::to_list [PR112968]

2023-12-13 Thread Jakub Jelinek
Hi! With valgrind checking, there are various errors reported on some C++26 libstdc++ tests, like: ==2009913== Conditional jump or move depends on uninitialised value(s) ==2009913== at 0x914C59: gt_ggc_mx_lang_tree_node(void*) (gt-cp-tree.h:107) ==2009913== by 0x8AB7A5: gt_ggc_mx_tinst_level

[PATCH] lower-bitint: Fix lowering of non-_BitInt to _BitInt cast merged with some wider cast [PR112940]

2023-12-13 Thread Jakub Jelinek
Hi! The following testcase ICEs, because a PHI argument from latch edge uses a SSA_NAME set only in a conditionally executed block inside of the loop. This happens when we have some outer cast which lowers its operand several times, under some condition with variable index, under different conditi

Re: [PATCH v2 1/4] RISC-V:Add crypto vector implied ISA info.

2023-12-13 Thread Kito Cheng
LGTM On Wed, Dec 13, 2023 at 5:14 PM Feng Wang wrote: > > Patch v2: Change the implied ISA info using the minimum set and add > dependencies info into the python script. > > Since the crypto vector extension depends on the Vector extension, > the "v" info is added into implied ISA info wit

Re: [PATCH] lower-bitint: Fix lowering of non-_BitInt to _BitInt cast merged with some wider cast [PR112940]

2023-12-13 Thread Richard Biener
On Wed, 13 Dec 2023, Jakub Jelinek wrote: > Hi! > > The following testcase ICEs, because a PHI argument from latch edge > uses a SSA_NAME set only in a conditionally executed block inside of the > loop. > This happens when we have some outer cast which lowers its operand several > times, under so

Re: [PATCH] Middle-end: Adjust decrement IV style partial vectorization COST model

2023-12-13 Thread Richard Biener
On Wed, 13 Dec 2023, Juzhe-Zhong wrote: > Hi, before this patch, a simple conversion case for RVV codegen: > > foo: > ble a2,zero,.L8 > addiw a5,a2,-1 > li a4,6 > bleu a5,a4,.L6 > srliw a3,a2,3 > slli a3,a3,3 > add a3,

Re: [PATCH v3 2/4] RISC-V: Add crypto vector builtin function.

2023-12-13 Thread juzhe.zh...@rivai.ai
+multiple_p (GET_MODE_BITSIZE (e.arg_mode (0)), +GET_MODE_BITSIZE (e.arg_mode (1)), &nunits); Change it into gcc_assert (multiple_p (...)) +/* A list of all Vector Crypto intrinsic functions. */ +static function_group_info cryoto_function_groups[] = { +#define DEF_R

Re: Re: [PATCH v2 1/4] RISC-V:Add crypto vector implied ISA info.

2023-12-13 Thread juzhe.zh...@rivai.ai
Hi, Kito. The vector crypto ISA is ratified, but the intrinsics are not. I wonder what the schedule for the vector crypto intrinsics is? Will they be ratified before the GCC-14 release (I personally think intrinsics stuff can be considered for merging until the end of GCC-14, like I did in GCC-13 push rvv-intrinsic

RE: [ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR 110717.

2023-12-13 Thread Claudiu Zissulescu
Hi Roger, It looks good to me. Thank you for your contribution, Claudiu -Original Message- From: Roger Sayle Sent: Tuesday, December 5, 2023 4:00 PM To: gcc-patches@gcc.gnu.org Cc: 'Claudiu Zissulescu' Subject: [ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR 110717. This pa

Re: [PATCH v2 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-12-13 Thread Alex Coplan
On 12/12/2023 15:58, Richard Sandiford wrote: > Alex Coplan writes: > > Hi, > > > > This is a v2 version which addresses feedback from Richard's review > > here: > > > > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637648.html > > > > I'll reply inline to address specific comments. > >

[PATCH] Fix tests for gomp

2023-12-13 Thread Andre Vieira (lists)
Hi, Apologies for the delay and this mixup. I need to do something different This is to fix testisms initially introduced by: commit f5fc001a84a7dbb942a6252b3162dd38b4aae311 Author: Andre Vieira Date: Mon Dec 11 14:24:41 2023 + aarch64: enable mixed-types for aarch64 simdclones gcc/

[r14-6468 Regression] FAIL: std/time/year/io.cc -std=gnu++26 execution test on Linux/x86_64

2023-12-13 Thread haochen.jiang
On Linux/x86_64, a01462ae8bafa86e7df47a252917ba6899d587cf is the first bad commit commit a01462ae8bafa86e7df47a252917ba6899d587cf Author: Jonathan Wakely Date: Mon Dec 11 15:33:59 2023 + libstdc++: Fix std::format output of %C for negative years caused FAIL: std/time/year/io.cc -std

[r14-6470 Regression] FAIL: g++.dg/pr112822.C -std=gnu++98 (test for excess errors) on Linux/x86_64

2023-12-13 Thread haochen.jiang
On Linux/x86_64, 788e0d48ec639d44294434f4f20ae94023c3759d is the first bad commit commit 788e0d48ec639d44294434f4f20ae94023c3759d Author: Peter Bergner Date: Tue Dec 12 16:46:16 2023 -0600 testsuite: Add testcase for already fixed PR [PR112822] caused FAIL: g++.dg/pr112822.C -std=gnu++1

Re: [PATCH] Fix tests for gomp

2023-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2023 at 10:43:16AM +, Andre Vieira (lists) wrote: > Hi, > > Apologies for the delay and this mixup. I need to do something different > > This is to fix testisms initially introduced by: > commit f5fc001a84a7dbb942a6252b3162dd38b4aae311 > Author: Andre Vieira > Date: Mon Dec

Re: [PATCH] Fix tests for gomp

2023-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2023 at 11:55:52AM +0100, Jakub Jelinek wrote: > On Wed, Dec 13, 2023 at 10:43:16AM +, Andre Vieira (lists) wrote: > > --- a/libgomp/testsuite/libgomp.c/declare-variant-1.c > > +++ b/libgomp/testsuite/libgomp.c/declare-variant-1.c > > @@ -40,16 +40,17 @@ f04 (int a) > > int > >

Re: [PATCH] Fix tests for gomp

2023-12-13 Thread Andre Vieira (lists)
On 13/12/2023 10:55, Jakub Jelinek wrote: On Wed, Dec 13, 2023 at 10:43:16AM +, Andre Vieira (lists) wrote: Hi, Apologies for the delay and this mixup. I need to do something different This is to fix testisms initially introduced by: commit f5fc001a84a7dbb942a6252b3162dd38b4aae311 Autho

[PATCH] extend.texi: Fix typos in LSX intrinsics

2023-12-13 Thread Jiajie Chen
Several typos have been found and fixed: missing semicolons, using variable name instead of type and wrong types. gcc/ChangeLog: * doc/extend.texi(__lsx_vabsd_di): remove extra `i' in name. (__lsx_vfrintrm_d, __lsx_vfrintrm_s, __lsx_vfrintrne_d, __lsx_vfrintrne_s, __lsx_vf

Re: [PATCH] Fix tests for gomp

2023-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2023 at 11:03:50AM +, Andre Vieira (lists) wrote: > Hmm I think I understand what you are saying, but I'm not sure I agree. > So before I enabled simdclone testing for aarch64, this test had no target > selectors. So it checked the same for 'all simdclone test targets'. Whic

[PATCH v2] RISC-V: Fix dynamic lmul tests depended on abi

2023-12-13 Thread demin . han
Some toolchain configs would report: fatal error: gnu/stubs-ilp32.h: No such file or directory Fix method suggested by Juzhe-Zhong gcc/testsuite/ChangeLog: * gcc.dg/vect/costmodel/riscv/rvv/riscv_vector.h: New file. Signed-off-by: demin.han --- .../gcc.dg/vect/costmodel/riscv/rvv/

Re: [PATCH v2] RISC-V: Fix dynamic lmul tests depended on abi

2023-12-13 Thread juzhe.zh...@rivai.ai
LGTM. juzhe.zh...@rivai.ai From: demin.han Date: 2023-12-13 19:12 To: gcc-patches@gcc.gnu.org CC: juzhe.zh...@rivai.ai; pan2...@intel.com Subject: [PATCH v2] RISC-V: Fix dynamic lmul tests depended on abi Some toolchain configs would report: fatal error: gnu/stubs-ilp32.h: No such file or

Re: [PATCH v3] A new copy propagation and PHI elimination pass

2023-12-13 Thread Richard Biener
On Fri, 8 Dec 2023, Filip Kastl wrote: > > > Hi, > > > > > > this is a patch that I submitted two months ago as an RFC. I added some > > > polish > > > since. > > > > > > It is a new lightweight pass that removes redundant PHI functions and as a > > > bonus does basic copy propagation. With Jan

RE: [PATCH v2] RISC-V: Fix dynamic lmul tests depended on abi

2023-12-13 Thread Li, Pan2
Committed, thanks all. Pan From: juzhe.zh...@rivai.ai Sent: Wednesday, December 13, 2023 7:16 PM To: demin.han ; gcc-patches Cc: Li, Pan2 Subject: Re: [PATCH v2] RISC-V: Fix dynamic lmul tests depended on abi LGTM. juzhe.zh...@rivai.ai

Re: [PATCH] [ICE] Support vpcmov for V4HF/V4BF/V2HF/V2BF under TARGET_XOP.

2023-12-13 Thread Jakub Jelinek
On Fri, Dec 08, 2023 at 03:12:00PM +0800, liuhongt wrote: > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ready push to trunk. > > gcc/ChangeLog: > > PR target/112904 > * config/i386/mmx.md (*xop_pcmov_): New define_insn. > > gcc/testsuite/ChangeLog: > > * g++.ta

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
Hi Juzhe, in general looks OK to me. Just a question for understanding: > - if (header_info.valid_p () > - && (anticipated_exp_p (header_info) || block_info.full_available)) Why is full_available true if we cannot use it? > +/* { dg-do compile } */ It would be nice if we could make

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread juzhe.zhong
I don't choose to run since I didn't have issue run on my local simulator no matter qemu or spike. So it's better to check vsetvl asm. full available is not consistent between LCM analysis and earliest fusion, so it's safe to postpone it. Replied Message From: Robin Dapp Date: 12/13/2023 20:08 To: Ju

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
> I don't choose to run since I didn't have issue run on my local > simulator no matter qemu or spike. Yes it was flaky. That's kind of expected with the out-of-bounds writes we did. They can depend on runtime environment and other factors. Of course it's a bit counterintuitive to add a (befo

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread juzhe.zhong
Do you mean add some comments in tests? Replied Message From: Robin Dapp Date: 12/13/2023 20:16 To: juzhe.zhong Cc: rdapp@gmail.com, gcc-patches@gcc.gnu.org, kito.ch...@gmail.com, kito.ch...@sifive.com, jeffreya...@gmail.com Subject: Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-13 Thread Robin Dapp
Thanks. The attached v2 goes with your suggestion and adds a vec_extractbi expander. Apart from that it keeps the MODE_PRECISION changes from before and uses insn_data[icode].operand[0]'s mode. Apart from that no changes on the riscv side. Bootstrapped and regtested on x86 and aarch64. On cfar

Re: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936]

2023-12-13 Thread chenglulu
On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote: Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark results on LA464 and LA664. This allows optimizations like "x * 17" to alsl, and "x * 68" to alsl and slli. gcc/ChangeLog: PR target/112936 * confi
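The two strength reductions named in this message can be checked with a small C sketch. It assumes that alsl.w computes "(rj << sa) + rk" and leaves out the sign extension of the 32-bit result that the real instruction performs on 64-bit targets. The helper names (alsl_w, mul17, mul68, check_mul_identities) are made up for illustration and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Models the shift-and-add behaviour assumed for LoongArch alsl.w:
   rd = (rj << sa) + rk.  Unsigned arithmetic keeps wraparound well
   defined in C.  */
static uint32_t alsl_w(uint32_t rj, uint32_t rk, unsigned sa)
{
    return (rj << sa) + rk;
}

/* x * 17 as a single alsl: (x << 4) + x.  */
static uint32_t mul17(uint32_t x)
{
    return alsl_w(x, x, 4);
}

/* x * 68 as alsl followed by slli: ((x << 4) + x) << 2.  */
static uint32_t mul68(uint32_t x)
{
    return alsl_w(x, x, 4) << 2;
}

/* Compares both identities against plain multiplication.  */
static void check_mul_identities(void)
{
    for (uint32_t x = 0; x < 1000; x++) {
        assert(mul17(x) == x * 17u);
        assert(mul68(x) == x * 68u);
    }
}
```

Calling check_mul_identities() from a test driver exercises both identities; whether the compiler actually picks these sequences depends on the cost tables the patch adjusts.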

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread Robin Dapp
> Do you mean add some comments in tests? I meant add it as a run test as well and comment that the test has caused out-of-bounds writes before and passed by the time of adding it (or so) and is kept regardless. Regards Robin

Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS]

2023-12-13 Thread juzhe.zhong
OK. will add it later. Replied Message From: Robin Dapp Date: 12/13/2023 20:23 To: juzhe.zhong Cc: rdapp@gmail.com, gcc-patches@gcc.gnu.org, kito.ch...@gmail.com, kito.ch...@sifive.com, jeffreya...@gmail.com Subject: Re: [PATCH] RISC-V: Postpone full available optimization [VSETVL PASS] > Do you mean ad

[PATCH][0/6][RFC] Relax single-vector-size restriction

2023-12-13 Thread Richard Biener
I've been asked to look into how to best relax the current restriction of the vectorizer that it prefers to use a single vector size throughout loop vectorization. That size is determined by the preferred_simd_mode and the autovectorize_vector_modes hook for other-than-first iterations. The tar

[PATCH 1/6] Reduce the number of get_vectype_for_scalar_type calls

2023-12-13 Thread Richard Biener
The following removes get_vectype_for_scalar_type calls when we already have the vector type computed. It also avoids some premature and possibly redundant or unnecessary check during data-ref analysis for gathers. * tree-vect-data-refs.cc (vect_analyze_data_refs): Do not check fo

[committed] libstdc++: Fix regression in std::format output of %Y for negative years

2023-12-13 Thread Jonathan Wakely
It seems that what I pushed didn't match what I tested, due to testing on a different machine! Tested x86_64-linux, on the right machine this time. Pushed to trunk. -- >8 -- The change in r14-6468-ga01462ae8bafa8 was only supposed to apply to %C formats, not %Y. libstdc++-v3/ChangeLog:

[PATCH 4/6] More explicit vector types

2023-12-13 Thread Richard Biener
This reduces more calls to get_vectype_for_scalar_type. * tree-vect-loop.cc (vect_transform_cycle_phi): Specify the vector type for invariant/external defs. * tree-vect-stmts.cc (vectorizable_shift): For invariant or external shifted operands use the result vector t

[PATCH 2/6] Set LOOP_VINFO_VECT_FACTOR only when it is final

2023-12-13 Thread Richard Biener
The following makes sure to keep LOOP_VINFO_VECT_FACTOR at the undetermined value zero until it is final, making LOOP_VINFO_VECT_FACTOR an rvalue and changing some direct references to use the macro. * tree-vectorizer.h (LOOP_VINFO_VECT_FACTOR): Make an rvalue. * tree-vect-loop.cc

[PATCH 5/6] Allow poly_uint64 for group_size args to vector type query routines

2023-12-13 Thread Richard Biener
The following changes the unsigned group_size argument to a poly_uint64 one to avoid too much special-casing in callers for VLA vectors when passing down the effective maximum desirable vector size to vector type query routines. The intent is to be able to pass down the vectorization factor (times

[PATCH 3/6] Query an appropriate offset vector type in vect_gather_scatter_fn_p

2023-12-13 Thread Richard Biener
The gather_load optab and friends require the offset vector mode to have the same number of lanes as the data vector mode. Restrict the vector type query to that when searching for a proper offset type. * tree-vect-data-refs.cc (vect_gather_scatter_fn_p): Use get_related_vectype_f

[PATCH 6/6] Defer assigning vector types until after VF is determined

2023-12-13 Thread Richard Biener
The following defers, for non-gather/scatter and non-pattern stmts, setting of STMT_VINFO_VECTYPE until after we computed the desired vectorization factor. This allows us to use larger vector types when the vectorization factor and the preferred vector mode allow, reducing the number of vector stm

Re: [PATCH 1/3] LoongArch: Include rtl.h for COSTS_N_INSNS instead of hard coding our own

2023-12-13 Thread chenglulu
LGTM! Thanks. On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote: With loongarch-def.cc switched from C to C++, we can include rtl.h for COSTS_N_INSNS, instead of hard coding our own. This is a non-functional change for now, but it will make the code more future-proof in case COSTS_N_INSNS in rtl.h would be ch

Re: [PATCH 3/3] LoongArch: Add alslsi3_extend

2023-12-13 Thread chenglulu
LGTM! Thanks! On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote: Following the instruction cost fix, we are generating alsl.w $a0, $a0, $a0, 4 instead of li.w $t0, 17 mul.w $a0, $t0 for "x * 17", because alsl.w is 4 times faster than mul.w. But we didn't have a sign-extending pattern for al

Re: Re: [PATCH v3 2/4] RISC-V: Add crypto vector builtin function.

2023-12-13 Thread Feng Wang
2023-12-13 18:18 juzhe.zhong wrote: > > >+    multiple_p (GET_MODE_BITSIZE (e.arg_mode (0)), >+    GET_MODE_BITSIZE (e.arg_mode (1)), &nunits); > >Change it into gcc_assert (multiple_p (...)) > >+/* A list of all Vector Crypto intrinsic functions.  */ >+static function_group_in

[committed] RISC-V:Add crypto vector implied ISA info.

2023-12-13 Thread Feng Wang
Since the crypto vector extension depends on the Vector extension, add the implied ISA info for the corresponding crypto vector extensions. gcc/ChangeLog: * common/config/riscv/riscv-common.cc: Modify implied ISA info. * config/riscv/arch-canonicalize: Add crypto vector impl

Re: [PATCH 2/3] LoongArch: Fix instruction costs [PR112936]

2023-12-13 Thread Xi Ruoyao
On Wed, 2023-12-13 at 20:22 +0800, chenglulu wrote: On 2023/12/10 at 1:03 AM, Xi Ruoyao wrote: Replace the instruction costs in loongarch_rtx_cost_data constructor based on micro-benchmark results on LA464 and LA664. This allows optimizations like "x * 17" to alsl, and "x * 68" to alsl and slli. gcc/Cha

[PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread pan2 . li
From: Pan Li Refine the test cases for: * Name convention. * Add run case. PR target/112929 PR target/112988 gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr112929.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr112929-1.c: ...here. * gcc.target

Re: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread juzhe.zhong
lgtm from my side. But I'd like to see Robin's comments. Thanks Replied Message From: pan2...@intel.com Date: 12/13/2023 21:49 To: gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai, pan2...@intel.com, rdapp@gmail.com Subject: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

Re: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread Robin Dapp
Thanks, LGTM but please add a comment like: These test cases used to cause out-of-bounds writes to the stack and therefore showed unreliable behavior. Depending on the execution environment they can either pass or fail. As of now, with the latest QEMU version, they will pass even without the und

[committed] aarch64 testsuite: Only run aarch64-ssve tests once

2023-12-13 Thread Andrew Carlotti
Results verified by running `RUNTESTFLAGS="aarch64-ssve.exp=*" make -k -j 56 check-gcc` before and after the change. I initially spotted the issue because the tests were being run a nondeterministic number of times during unrelated regression testing. Committed as obvious. gcc/testsuite/ChangeLog:

Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-13 Thread Richard Sandiford
Robin Dapp writes: > @@ -1758,16 +1759,19 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 > bitsize, poly_uint64 bitnum, >if (VECTOR_MODE_P (outermode) && !MEM_P (op0)) > { >scalar_mode innermode = GET_MODE_INNER (outermode); >enum insn_code icode > = convert_optab

Re: Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773].

2023-12-13 Thread 钟居哲
Thanks Richard. LGTM for RISC-V part. Thanks Robin for fixing it. juzhe.zh...@rivai.ai From: Richard Sandiford Date: 2023-12-13 22:05 To: Robin Dapp CC: Richard Biener; gcc-patches; juzhe.zhong\@rivai.ai Subject: Re: [PATCH] expmed: Perform mask extraction via QImode [PR112773]. Robin Dapp

RE: [PATCH v1] RISC-V: Refine test cases for both PR112929 and PR112988

2023-12-13 Thread Li, Pan2
Committed with below comments, thanks Juzhe and Robin. Pan -Original Message- From: Robin Dapp Sent: Wednesday, December 13, 2023 9:56 PM To: Li, Pan2 ; gcc-patches@gcc.gnu.org Cc: rdapp@gmail.com; juzhe.zh...@rivai.ai Subject: Re: [PATCH v1] RISC-V: Refine test cases for both PR112

RE: [PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

2023-12-13 Thread Tamar Christina
> > > else if (vect_use_mask_type_p (stmt_info)) > > > { > > > unsigned int precision = stmt_info->mask_precision; > > > scalar_type = build_nonstandard_integer_type (precision, 1); > > > vectype = get_mask_type_for_scalar_type (vinfo, scalar_type, > > > group_size); > > >

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Peter Bergner
On 12/13/23 2:05 AM, Jakub Jelinek wrote: > On Wed, Dec 13, 2023 at 08:51:16AM +0100, Richard Biener wrote: >> On Tue, 12 Dec 2023, Peter Bergner wrote: >> >>> On 12/12/23 8:36 PM, Jason Merrill wrote: This test is failing for me below C++17, I think you need // { dg-do compile { tar

[PATCH] LoongArch: Use the movcf2gr instruction to implement cstore4

2023-12-13 Thread Xi Ruoyao
We used a branch to load floating-point comparison results into GPR. This is very slow when the branch is not predictable. Use the movcf2gr instruction to implement cstore4 if movcf2gr is fast enough. gcc/ChangeLog: * config/loongarch/genopts/loongarch.opt.in (muse-movcf2gr): New

Re: [PATCH v3 1/6] libgomp: basic pinned memory on Linux

2023-12-13 Thread Andrew Stubbs
On 12/12/2023 09:02, Tobias Burnus wrote: On 11.12.23 18:04, Andrew Stubbs wrote: Implement the OpenMP pinned memory trait on Linux hosts using the mlock syscall.  Pinned allocations are performed using mmap, not malloc, to ensure that they can be unpinned safely when freed. This implementati
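The approach described in this message (get the memory from mmap rather than malloc, then pin it with mlock) can be sketched in C as follows. This is a minimal illustration under stated assumptions and is not the libgomp implementation: pinned_alloc and pinned_free are hypothetical names, and real code would also have to deal with RLIMIT_MEMLOCK and page-size rounding.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: returns a pinned anonymous mapping of SIZE bytes,
   or NULL if either the mapping or the pinning fails.  */
static void *pinned_alloc(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* mlock can fail, for example when RLIMIT_MEMLOCK is too small.  */
    if (mlock(p, size) != 0) {
        munmap(p, size);
        return NULL;
    }
    return p;
}

/* Hypothetical helper: unmapping the region also drops its locks, so no
   other allocation can be left unpinned by accident.  */
static void pinned_free(void *p, size_t size)
{
    if (p != NULL)
        munmap(p, size);
}
```

Because each pinned block owns whole pages, freeing it cannot unpin a neighbouring allocation that happens to share a page, which is the hazard the message attributes to malloc-backed memory.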

Re: [PATCH v4] aarch64: SVE/NEON Bridging intrinsics

2023-12-13 Thread Richard Sandiford
Richard Ball writes: > ACLE has added intrinsics to bridge between SVE and Neon. > > The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and > SVE vectors. > > This patch adds support to GCC for the following 3 intrinsics: > svset_neonq, svget_neonq and svdup_neonq > > gcc/Chan

Re: [RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64

2023-12-13 Thread Mark Rutland
On Wed, Dec 13, 2023 at 05:01:07PM +0800, Wang wrote: > On 2023/12/13 16:48, Dan Li wrote: > > + Likun > > > > On Tue, 28 Mar 2023 at 06:18, Sami Tolvanen wrote: > >> On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra wrote: > >>> On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote: > >>> > In

[committed v2] aarch64: Add missing driver-aarch64 dependencies

2023-12-13 Thread Andrew Carlotti
On Sat, Dec 09, 2023 at 06:42:17PM +, Richard Sandiford wrote: > Andrew Carlotti writes: > The .def files are included in TM_H by: > > TM_H += $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \ > $(srcdir)/config/aarch64/aarch64-tuning-flags.def \ > $(srcdir)/config/aarch64/aarch

Re: [PATCH v2 09/11] aarch64: Rewrite non-writeback ldp/stp patterns

2023-12-13 Thread Richard Sandiford
Alex Coplan writes: > On 12/12/2023 15:58, Richard Sandiford wrote: >> Alex Coplan writes: >> > Hi, >> > >> > This is a v2 version which addresses feedback from Richard's review >> > here: >> > >> > https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637648.html >> > >> > I'll reply inline to

[committed v2] aarch64 testsuite: Check entire .arch string

2023-12-13 Thread Andrew Carlotti
Add a terminating newline to various tests, and add missing extensions to some test strings. The current output is broken for options_set_4.c, so this test is left unchanged, to be fixed in a subsequent patch. Committed as obvious, with options_set_4.c removed compared to v1. gcc/testsuite/Chang

[PATCH v2] aarch64: Fix +nocrypto handling

2023-12-13 Thread Andrew Carlotti
Additionally, replace all checks for the AARCH64_FL_CRYPTO bit with checks for (AARCH64_FL_AES | AARCH64_FL_SHA2) instead. The value of the AARCH64_FL_CRYPTO bit within isa_flags is now ignored, but it is retained because removing it would make processing the data in option-extensions.def signific
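The flag test described here can be illustrated with a short C sketch. It assumes that "crypto" should count as enabled only when both the AES and SHA2 bits are set. The DEMO_* names and bit positions are invented for the example and are not GCC's real definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions only; these are NOT GCC's real values.  */
#define DEMO_FL_AES    (UINT64_C(1) << 0)
#define DEMO_FL_SHA2   (UINT64_C(1) << 1)
#define DEMO_FL_CRYPTO (UINT64_C(1) << 2)  /* legacy alias bit, ignored below */

/* "crypto" is treated as enabled only when both AES and SHA2 are enabled;
   the legacy alias bit does not influence the answer.  */
static int demo_has_crypto(uint64_t isa_flags)
{
    const uint64_t needed = DEMO_FL_AES | DEMO_FL_SHA2;
    return (isa_flags & needed) == needed;
}

/* Exercises the cases that matter for "+nocrypto"-style handling.  */
static void demo_check(void)
{
    assert(demo_has_crypto(DEMO_FL_AES | DEMO_FL_SHA2));
    assert(!demo_has_crypto(DEMO_FL_CRYPTO));               /* stale alias bit alone */
    assert(!demo_has_crypto(DEMO_FL_CRYPTO | DEMO_FL_AES)); /* SHA2 disabled */
}
```

Deriving the answer from the component bits means that disabling either feature also disables the combined check, even if a stale alias bit is still set.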

[PATCH v2] aarch64: Fix +nopredres, +nols64 and +nomops

2023-12-13 Thread Andrew Carlotti
On Sat, Dec 09, 2023 at 07:22:49PM +, Richard Sandiford wrote: > Andrew Carlotti writes: > > ... > > This is the only use of native_detect_p, so it'd be good to remove > the field itself. Done > > ... > > > > @@ -447,6 +451,13 @@ host_detect_local_cpu (int argc, const char **argv) > >i

[wwwdocs][patch] gcc-14/changes.html + project/gomp/: Update OpenMP status

2023-12-13 Thread Tobias Burnus
Attached is an in-between update for the release notes and also for the project status page. The latter contains an implementation-status page that is updated based on the libgomp.texi entries; I think there are more issues, but I found an incomplete update which is now fixed. I probably need

[PATCH v2] extend.texi: Fix typos in LSX intrinsics

2023-12-13 Thread Jiajie Chen
Several typos have been found and fixed: missing semicolons, using variable name instead of type, duplicate functions and wrong types. gcc/ChangeLog: * doc/extend.texi(__lsx_vabsd_di): remove extra `i' in name. (__lsx_vfrintrm_d, __lsx_vfrintrm_s, __lsx_vfrintrne_d, __lsx_

[committed] amdgcn: XNACK support

2023-12-13 Thread Andrew Stubbs
Some AMD GCN devices support an "XNACK" mode in which the device can handle page-misses (and maybe other traps in memory instructions), but it's not completely invisible to software. We need this now to support OpenMP Unified Shared Memory (I plan to post updated patches for that in January),

Re: Disable FMADD in chains for Zen4 and generic

2023-12-13 Thread Jan Hubicka
> > The difference is that Cores understand the fact that fmadd does not need > > all three parameters to start computation, while Zen cores don't. > > > > Since this seems noticeable win on zen and not loss on Core it seems like > > good > > default for generic. > > > > I plan to commit the pa

Re: [PATCH] tree-optimization/111807 - ICE in verify_sra_access_forest

2023-12-13 Thread Martin Jambor
Hi, sorry for getting to this only so late, my email backlog from my medical leave still isn't empty. On Mon, Oct 16 2023, Richard Biener wrote: > The following addresses build_reconstructed_reference failing to > build references with a different offset than the models and thus > the caller cond

[PATCH v4] A new copy propagation and PHI elimination pass

2023-12-13 Thread Filip Kastl
> > > Hi, > > > > > > this is a patch that I submitted two months ago as an RFC. I added some > > > polish > > > since. > > > > > > It is a new lightweight pass that removes redundant PHI functions and as a > > > bonus does basic copy propagation. With Jan Hubička we measured that it > > > is a

Re: [PATCH] tree-optimization/111807 - ICE in verify_sra_access_forest

2023-12-13 Thread Richard Biener
> On 13.12.2023 at 17:07, Martin Jambor wrote: > > Hi, > > sorry for getting to this only so late, my email backlog from my medical > leave still isn't empty. > >> On Mon, Oct 16 2023, Richard Biener wrote: >> The following addresses build_reconstructed_reference failing to >> build refere

Re: [PATCH v4] A new copy propagation and PHI elimination pass

2023-12-13 Thread Richard Biener
> On 13.12.2023 at 17:12, Filip Kastl wrote: > >> Hi, this is a patch that I submitted two months ago as an RFC. I added some polish since. It is a new lightweight pass that removes redundant PHI functions and as a bonus does basic copy propagat

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Jason Merrill
On 12/12/23 21:36, Jason Merrill wrote: On 12/12/23 17:50, Peter Bergner wrote: On 12/12/23 1:26 PM, Richard Biener wrote: On 12.12.2023 at 19:51, Peter Bergner wrote: On 12/12/23 12:45 PM, Peter Bergner wrote: +/* PR target/112822 */ Oops, this should be: /* PR tree-optimization/112822

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Jakub Jelinek
On Wed, Dec 13, 2023 at 11:24:42AM -0500, Jason Merrill wrote: > gcc/testsuite/ChangeLog: > > * g++.dg/pr112822.C: Require C++17. > --- > gcc/testsuite/g++.dg/pr112822.C | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/gcc/testsuite/g++.dg/pr112822.C b/gcc/testsuite/g++.dg/pr112822

[pushed 1/4] c++: copy location to AGGR_INIT_EXPR

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location information. gcc/cp/ChangeLog: * tree.cc (build_aggr_init_expr): Copy EXPR_LOCATION. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/constexpr-nsdmi7b.C

[pushed 3/4] c++: fix in-charge parm in constexpr

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- I was puzzled by the proposed patch for PR71093 specifically ignoring the in-charge parameter; the problem turned out to be that when cxx_eval_call_expression jumps from the clone to the cloned function, it assumes that the latter has the sa

[pushed 4/4] c++: End lifetime of objects in constexpr after destructor call [PR71093]

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. This is modified from Nathaniel's last version by adjusting for my recent CLOBBER changes and removing the special handling of __in_chrg which is no longer needed since my previous commit. -- 8< -- This patch adds checks for using objects after they

[pushed 2/4] c++: constant direct-initialization [PR108243]

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- When testing the proposed patch for PR71093 I noticed that it changed the diagnostic for consteval-prop6.C. I then noticed that the diagnostic wasn't very helpful either way; it was complaining about modification of the 'x' variable, but it

[PATCH] middle-end: Fix up constant handling in emit_conditional_move [PR111260]

2023-12-13 Thread Andrew Pinski
After r14-2667-gceae1400cf24f329393e96dd9720, we force a constant to a register if it is shared with one of the other operands. The problem is that we used the comparison mode for the register, but that could be different from the operand mode. This causes some issues on some targets. To fix it, we either

Re: [r14-6468 Regression] FAIL: std/time/year/io.cc -std=gnu++26 execution test on Linux/x86_64

2023-12-13 Thread Jonathan Wakely
On Wed, 13 Dec 2023 at 10:51, haochen.jiang wrote: > > On Linux/x86_64, > > a01462ae8bafa86e7df47a252917ba6899d587cf is the first bad commit > commit a01462ae8bafa86e7df47a252917ba6899d587cf > Author: Jonathan Wakely > Date: Mon Dec 11 15:33:59 2023 + > > libstdc++: Fix std::format outp

Re: [PATCH] c++: Fix tinst_level::to_list [PR112968]

2023-12-13 Thread Jason Merrill
On 12/13/23 04:49, Jakub Jelinek wrote: Hi! With valgrind checking, there are various errors reported on some C++26 libstdc++ tests, like: ==2009913== Conditional jump or move depends on uninitialised value(s) ==2009913== at 0x914C59: gt_ggc_mx_lang_tree_node(void*) (gt-cp-tree.h:107) ==20099

Re: [PATCH v2] aarch64: Fix +nocrypto handling

2023-12-13 Thread Richard Sandiford
Andrew Carlotti writes: > Additionally, replace all checks for the AARCH64_FL_CRYPTO bit with > checks for (AARCH64_FL_AES | AARCH64_FL_SHA2) instead. The value of the > AARCH64_FL_CRYPTO bit within isa_flags is now ignored, but it is > retained because removing it would make processing the data

Re: [committed v2] aarch64: Add missing driver-aarch64 dependencies

2023-12-13 Thread Richard Sandiford
Andrew Carlotti writes: > On Sat, Dec 09, 2023 at 06:42:17PM +, Richard Sandiford wrote: >> Andrew Carlotti writes: >> The .def files are included in TM_H by: >> >> TM_H += $(srcdir)/config/aarch64/aarch64-fusion-pairs.def \ >> $(srcdir)/config/aarch64/aarch64-tuning-flags.def \ >>

Re: [pushed 1/4] c++: copy location to AGGR_INIT_EXPR

2023-12-13 Thread Patrick Palka
On Wed, 13 Dec 2023, Jason Merrill wrote: > Tested x86_64-pc-linux-gnu, applying to trunk. > > -- 8< -- > > When building an AGGR_INIT_EXPR from a CALL_EXPR, we shouldn't lose location > information. > > gcc/cp/ChangeLog: > > * tree.cc (build_aggr_init_expr): Copy EXPR_LOCATION. I made

Re: [PATCH] SRA: Force gimple operand in an additional corner case (PR 112822)

2023-12-13 Thread Jason Merrill
On 12/13/23 11:26, Jakub Jelinek wrote: On Wed, Dec 13, 2023 at 11:24:42AM -0500, Jason Merrill wrote: gcc/testsuite/ChangeLog: * g++.dg/pr112822.C: Require C++17. --- gcc/testsuite/g++.dg/pr112822.C | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/testsuite/g++.dg/pr112822.C

Re: [RFC/RFT,V2] CFI: Add support for gcc CFI in aarch64

2023-12-13 Thread Kees Cook
On Wed, Dec 13, 2023 at 05:01:07PM +0800, Wang wrote: > On 2023/12/13 16:48, Dan Li wrote: > > + Likun > > > > On Tue, 28 Mar 2023 at 06:18, Sami Tolvanen wrote: > >> On Mon, Mar 27, 2023 at 2:30 AM Peter Zijlstra > >> wrote: > >>> On Sat, Mar 25, 2023 at 01:54:16AM -0700, Dan Li wrote: > >>> >

Re: [PATCH] libgccjit: Add ability to get CPU features

2023-12-13 Thread Antoni Boucher
David: Ping. I guess if we want to have this merged for this release, it should be sooner rather than later (if it's still an option). On Thu, 2023-11-09 at 18:04 -0500, David Malcolm wrote: > On Thu, 2023-11-09 at 17:27 -0500, Antoni Boucher wrote: > > Hi. > > This patch adds support for getting

[pushed] c++: TARGET_EXPR location in default arg [PR96997]

2023-12-13 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- My r14-6505-g52b4b7d7f5c7c0 change to copy the location in build_aggr_init_expr reopened PR96997; let's fix it properly this time, by clearing the location like we do for other trees. PR c++/96997 gcc/cp/ChangeLog: * tree.

Re: [PATCH] libcpp: Fix valgrind errors on pr88974.c [PR112956]

2023-12-13 Thread Jason Merrill
On 12/13/23 03:39, Jakub Jelinek wrote: Hi! On the c-c++-common/cpp/pr88974.c testcase I'm seeing ==600549== Conditional jump or move depends on uninitialised value(s) ==600549==at 0x1DD3A05: cpp_get_token_1(cpp_reader*, unsigned int*) (macro.cc:3050) ==600549==by 0x1DBFC7F: _cpp_parse_

Re: [PATCH] c++: unifying constants vs their type [PR99186, PR104867]

2023-12-13 Thread Jason Merrill
On 12/12/23 16:21, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? OK. -- >8 -- When unifying constants we need to generally treat constants of different types but same value as different, in light of auto template parameters. This patch
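
As background for the point about auto template parameters, here is a minimal C++17 sketch of why two constants with the same value but different types have to be kept distinct. The `constant_holder` template is a made-up name for illustration and is not taken from the patch or its testcases.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical template with an 'auto' non-type parameter: the type of
// the argument is part of the specialization's identity.
template <auto V>
struct constant_holder {};

// Same value, same type (int): one and the same specialization.
constexpr bool same_value_same_type =
    std::is_same_v<constant_holder<0>, constant_holder<0>>;

// Same value, different type (int vs long): distinct specializations.
constexpr bool same_value_diff_type =
    std::is_same_v<constant_holder<0>, constant_holder<0L>>;

static_assert(same_value_same_type);
static_assert(!same_value_diff_type);
```

If unification treated `0` and `0L` as interchangeable, these two specializations could be confused with each other.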

Fix 'libgomp/config/linux/allocator.c' 'size_t' vs. '%ld' format string mismatch (was: Build breakage)

2023-12-13 Thread Thomas Schwinge
Hi! On 2023-12-13T20:36:40+0100, I wrote: > On 2023-12-13T11:15:54-0800, Jerry D via Gcc wrote: >> I am getting this failure to build from clean trunk. > > This is due to commit r14-6499-g348874f0baac0f22c98ab11abbfa65fd172f6bdd > "libgomp: basic pinned memory on Linux", which supposedly was only
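
The class of mismatch named in the subject line can be illustrated with a small sketch. This is not the code from 'libgomp/config/linux/allocator.c', and the actual fix in the commit is not shown in the preview; `describe_size` is a hypothetical helper. It assumes only standard printf semantics: '%ld' expects a 'long', while 'size_t' is an unsigned type whose width can differ from 'long' (for example on 32-bit targets), so the portable conversion is '%zu'.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format an allocation size for a diagnostic.
std::string describe_size(std::size_t size)
{
  char buf[64];
  // "%zu" matches size_t on every target. Writing "%ld" here would
  // trigger -Wformat where size_t is not 'unsigned long', and would
  // break builds that treat format warnings as errors.
  std::snprintf(buf, sizeof buf, "size=%zu", size);
  return buf;
}
```

An alternative with the same effect is to keep '%ld' and cast the argument to 'long', at the cost of misprinting very large values.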

Re: [PATCH v3] c++: fix ICE with sizeof in a template [PR112869]

2023-12-13 Thread Jason Merrill
On 12/12/23 17:48, Marek Polacek wrote: On Fri, Dec 08, 2023 at 11:09:15PM -0500, Jason Merrill wrote: On 12/8/23 16:15, Marek Polacek wrote: On Fri, Dec 08, 2023 at 12:09:18PM -0500, Jason Merrill wrote: On 12/5/23 15:31, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok

RE: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-13 Thread Thomas Schwinge
Hi Lipeng! On 2023-12-12T02:05:26+, "Zhu, Lipeng" wrote: > On 2023/12/12 1:45, H.J. Lu wrote: >> On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng wrote: >> > On 2023/12/9 23:23, Jakub Jelinek wrote: >> > > On Sat, Dec 09, 2023 at 10:39:45AM -0500, Lipeng Zhu wrote: >> > > > This patch try to intro

[PATCH 1/2] emit-rtl, lra: Move lra's emit_inc to emit-rtl.cc

2023-12-13 Thread Alex Coplan
Hi, In PR112906 we ICE because we try to use force_reg to reload an auto-increment address, but force_reg can't do this. With the aim of fixing the PR by supporting reloading arbitrary addresses in pre-RA splitters, this patch generalizes lra-constraints.cc:emit_inc and makes it available to the

[PATCH 2/2] aarch64: Handle autoinc addresses in ld1rq splitter [PR112906]

2023-12-13 Thread Alex Coplan
This patch uses the new force_reload_address routine added by the previous patch to fix PR112906. Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? Thanks, Alex gcc/ChangeLog: PR target/112906 * config/aarch64/aarch64-sve.md (@aarch64_vec_duplicate_vq_le): Use f

[PATCH V3] RISC-V: XFAIL scan dump fails for autovec PR111311

2023-12-13 Thread Edwin Lu
Clean up scan dump failures on linux rv64 vector targets Juzhe mentioned could be ignored for now. This will help reduce noise and make it more obvious if a bug or regression is introduced. The failures that are still reported are either execution failures or failures that are also present on armv8

Re: [PATCH] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]

2023-12-13 Thread Peter Bergner
On 11/24/23 3:28 AM, Kewen.Lin wrote: >> + int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode); > > Is it intentional to keep GET_MODE_SIZE (V16QImode) instead of 16? > I think if one day NUM_POLY_INT_COEFFS isn't 1 on rs6000 any more, > we have to add one explicit .to_constant () here.

Re: [PATCH v3 10/11] aarch64: Add new load/store pair fusion pass

2023-12-13 Thread Richard Sandiford
Thanks for the update. The new comments are really nice, and I think make the implementation much easier to follow. I was going to say OK with the changes below, but there's one question/ comment near the end about the double list walk. Alex Coplan writes: > +// Convenience wrapper around strip

Re: [PATCH 1/2] emit-rtl, lra: Move lra's emit_inc to emit-rtl.cc

2023-12-13 Thread Richard Sandiford
Alex Coplan writes: > Hi, > > In PR112906 we ICE because we try to use force_reg to reload an > auto-increment address, but force_reg can't do this. > > With the aim of fixing the PR by supporting reloading arbitrary > addresses in pre-RA splitters, this patch generalizes > lra-constraints.cc:emit

Re: [PATCH 2/2] aarch64: Handle autoinc addresses in ld1rq splitter [PR112906]

2023-12-13 Thread Richard Sandiford
Alex Coplan writes: > This patch uses the new force_reload_address routine added by the > previous patch to fix PR112906. > > Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk? OK, thanks, and sorry for the breakage. Richard > > Thanks, > Alex > > gcc/ChangeLog: > > PR target/1129
