Re: [PATCH 5/5] passes: Remove limit on the number of params

2024-10-14 Thread Eric Gallager
On Mon, Oct 14, 2024 at 10:17 AM Andrew Pinski wrote: > > On Mon, Oct 14, 2024 at 6:10 AM Richard Biener > wrote: > > > > On Mon, Oct 14, 2024 at 4:32 AM Andrew Pinski > > wrote: > > > > > > Having a limit of 2 params for NEXT_PASS was just done because I didn't > > > think there was > > > a w

[PATCH] Add 'cobol' to Makefile.def, take 2

2024-10-14 Thread James K. Lowden
Consequent to advice, I'm preparing the Cobol front-end patches as a small number of hopefully meaningful patches covering many files. 1. meta files used by autotools etc. 2. gcc/cobol/*.h 3. gcc/cobol/*.{y,l,cc} 4. libgcobol 5. documentation 6. tests The patch below is step #1. It compr

[pushed: r15-4343] diagnostics: fix overload of emit_diagnostic [PR117109]

2024-10-14 Thread David Malcolm
I accidentally broke "make gcc.pot" in r15-4081 by adding a member function diagnostic_context::emit_diagnostic with a gmsgid in a different position to the existing emit_diagnostic functions, which exgettext's parser can't handle. Fixed thusly. Successfully bootstrapped & regrtested on x86_64-pc

[PATCH] Match: Remove dup match pattern for signed_integer_sat_sub [PR117141]

2024-10-14 Thread pan2 . li
From: Pan Li This patch would like to fix the warning as below: /home/slyfox/dev/git/gcc/gcc/match.pd:3424:3 warning: duplicate pattern (cond^ (ne (imagpart (IFN_SUB_OVERFLOW:c@2 @0 @1)) integer_zerop) ^ /home/slyfox/dev/git/gcc/gcc/match.pd:3397:3 warning: previous pattern defined here (con

Re: [PATCH] c++: Fix mangling of otherwise unattached class-scope lambdas [PR116568]

2024-10-14 Thread Nathaniel Shead
On Fri, Oct 11, 2024 at 10:37:11AM -0400, Jason Merrill wrote: > On 9/5/24 11:02 AM, Nathaniel Shead wrote: > > Bootstrapped and regtested (so far just dg.exp) on x86_64-pc-linux-gnu, > > OK for trunk if full regtest passes? Or would it be better to try to > > implement all the rules mentioned in

Re: [PATCH] expr: Don't clear whole unions [PR116416]

2024-10-14 Thread Jason Merrill
On 10/7/24 2:45 PM, Marek Polacek wrote: On Wed, Oct 02, 2024 at 05:52:13PM -0400, Jason Merrill wrote: On 10/2/24 3:20 PM, Marek Polacek wrote: On Sat, Sep 28, 2024 at 08:39:12AM +0200, Jakub Jelinek wrote: On Fri, Sep 27, 2024 at 04:01:33PM +0200, Jakub Jelinek wrote: So, I think we should

Re: [pushed] c++: address deduction and concepts [CWG2918]

2024-10-14 Thread Patrick Palka
On Mon, 14 Oct 2024, Jason Merrill wrote: > Tested x86_64-pc-linux-gnu, applying to trunk. > > -- 8< -- > > CWG2918 changes deduction from an overload set for the case where multiple > candidates succeed and have the same type; previously this made the overload > set a non-deduced context, now i

Re: Fortran test typebound_operator_7.f03 broken by non-Fortran commit. Confirm anyone?

2024-10-14 Thread Sam James
Sam James writes: > Andre Vehreschild writes: > >> Hi all, >> >> please note, that I don't know this bisecting very well, so this may very >> well >> be a wrong blame. During latest regression testing of the Fortran suite I got >> typebound_operator_7.f03 failing with: >> >> typebound_operator_

[PATCH] dce: Remove FIXME that has not been true for years

2024-10-14 Thread Andrew Pinski
This FIXME: FIXME: Aggressive mode before PRE doesn't work currently because the dominance info is not invalidated after DCE1. Has not been true since at least r0-104723-g5ac60b564faa85 which added a call to calculate_dominance_info. Plus we run aggressive mode before PRE since r0-8916

[PATCH 0/2] Automate creation of -O2 and -Os multilib variants

2024-10-14 Thread Keith Packard
When building toolchains for embedded development, some projects will want to optimize for speed while others are much more concerned about overall code size. We can do this using the existing GCC multilib infrastructure, adding suitable speed and size variants as another dimension of the multilib

[PATCH 1/2] libgcc: Use -Os/-Oz from CC or CFLAGS

2024-10-14 Thread Keith Packard
Override other optimization settings with any -Os or -Oz found in CC or CFLAGS. libgcc/ChangeLog: * Makefile.in: Use -Os or -Oz from CC or CFLAGS Signed-off-by: Keith Packard --- libgcc/Makefile.in | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/libgcc/Ma

[PATCH 2/2] gcc: Add --enable-multilib-space option

2024-10-14 Thread Keith Packard
This option adds a per-multilib variant that specifies -Os instead of the default. With this, all multilib libraries will be built with -Os as well as -O2; the appropriate one selected automatically at link time based upon the optimization level selected. This could be done in the target configura

[PATCH 2/2] [x86] Canonicalize (vec_merge (fma: op2 op1 op3) (match_dup 1)) mask) to (vec_merge (fma: op1 op2 op3) (match_dup 1)) mask)

2024-10-14 Thread liuhongt
For masked FMA, there're 2 forms of RTL representation 1) (vec_merge (fma: op2 op1 op3) op1) mask) 2) (vec_merge (fma: op1 op2 op3) op1) mask) It's because op1 op2 are commutative in RTL (the second op1 is written as (match_dup 1)) we once tried to replace (match_dup 1) with (match_operand:VFH_AV
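
The two forms above differ only in the order of the multiplication operands. A minimal scalar sketch in C, assuming a 4-lane vector and using a plain multiply-add in place of a fused one, shows why both orders give the same lanes. The names masked_fma and forms_agree are made up for this illustration and do not come from the patch.

```c
#include <stddef.h>

#define LANES 4

/* Scalar model of (vec_merge (fma a b c) keep mask): a lane whose mask bit
   is set receives a*b+c, every other lane keeps the value from `keep`.
   A plain multiply-add stands in for the fused operation here.  */
static void masked_fma(const double *a, const double *b, const double *c,
                       const double *keep, unsigned mask, double *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = ((mask >> i) & 1u) ? a[i] * b[i] + c[i] : keep[i];
}

/* Returns 1 if both operand orders produce identical lanes for `mask`.  */
static int forms_agree(unsigned mask)
{
    const double op1[LANES] = {1.0, 2.0, 3.0, 4.0};
    const double op2[LANES] = {0.5, 0.25, 2.0, 8.0};
    const double op3[LANES] = {10.0, 20.0, 30.0, 40.0};
    double form1[LANES], form2[LANES];

    masked_fma(op2, op1, op3, op1, mask, form1); /* (fma op2 op1 op3) */
    masked_fma(op1, op2, op3, op1, mask, form2); /* (fma op1 op2 op3) */

    for (size_t i = 0; i < LANES; i++)
        if (form1[i] != form2[i])
            return 0;
    return 1;
}
```

Because the results match for every mask, a matcher only needs one of the two orders once the other has been canonicalized away.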

[PATCH 1/2] [Middle-end] Canonicalize (vec_merge (fma op2 op1 op3) op1 mask) to (vec_merge (fma op1 op2 op3) op1 mask).

2024-10-14 Thread liuhongt
For x86 masked fma, there're 2 rtl representations 1) (vec_merge (fma op2 op1 op3) op1 mask) 2) (vec_merge (fma op1 op2 op3) op1 mask). 5894(define_insn "_fmadd__mask" 5895 [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v,v") 5896(vec_merge:VFH_AVX512VL 5897 (fma:VF

[PATCH 0/2] Canonicalize (vec_merge (fma op1 op2 op3) op1 mask) to (vec_merge (fma op1 op2 op3) op1 mask)

2024-10-14 Thread liuhongt
For x86 masked fma, there're 2 rtl representations 1) (vec_merge (fma op2 op1 op3) op1 mask) 2) (vec_merge (fma op1 op2 op3) op1 mask). 5894(define_insn "_fmadd__mask" 5895 [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v,v") 5896(vec_merge:VFH_AVX512VL 5897 (fma:

[PATCH] Support andn_optab for x86

2024-10-14 Thread Cui, Lili
Hi all, This patch is to add andn_optab for x86. Bootstrapped and regtested on x86-64-linux-pc, OK for trunk? Regards, Lili. Add new andn pattern to match the new optab added by r15-1890-gf379596e0ba99d. Only enable 64bit, 128bit and 256bit vector ANDN, X86-64 has mask mov instruction when avx

[PATCH] Introduce TARGET_FMV_ATTR_SEPARATOR

2024-10-14 Thread Yangyu Chen
Some architectures may use ',' in the attribute string, but it is not used as the separator for different targets. To avoid conflict, we introduce a new macro TARGET_FMV_ATTR_SEPARATOR to separate different clones. As an example, according to RISC-V C-API Specification [1], RISC-V allows ',' in th

[PATCH] c, libcpp: Partially implement C2Y N3353 paper [PR117028]

2024-10-14 Thread Jakub Jelinek
Hi! The following patch partially implements the N3353 paper. In particular, it adds support for the delimited escape sequences (\u{123}, \x{123}, \o{123}) which were added already for C++23, all I had to do is split the delimited escape sequence guarding from named universal character escape sequ
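
As a rough illustration of the syntax named above, the following C sketch parses one delimited escape sequence of the form \x{...}, \u{...} or \o{...}. It is not the libcpp implementation. The function names are hypothetical, overflow is ignored, and none of the value constraints a real front end enforces (for example on universal character names) are checked.

```c
#include <stddef.h>

/* Value of one digit character, or -1 if it is not a hex digit.  */
static int digit_value(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

/* Parses one delimited escape at the start of `s`.  \x{...} and \u{...}
   take hexadecimal digits, \o{...} takes octal digits.  On success the
   value is stored in *value and the number of characters consumed is
   returned.  Malformed input returns 0.  Overflow is not detected.  */
static size_t parse_delimited_escape(const char *s, unsigned long *value)
{
    unsigned base;
    unsigned long v = 0;
    size_t i = 3;

    if (s[0] != '\\')
        return 0;
    switch (s[1]) {
    case 'x': case 'u': base = 16; break;
    case 'o':           base = 8;  break;
    default:            return 0;
    }
    if (s[2] != '{')
        return 0;

    for (; s[i] != '\0' && s[i] != '}'; i++) {
        int d = digit_value(s[i]);
        if (d < 0 || (unsigned)d >= base)
            return 0;               /* digit not valid in this base */
        v = v * base + (unsigned)d;
    }
    if (s[i] != '}' || i == 3)
        return 0;                   /* missing brace or empty braces */

    *value = v;
    return i + 1;
}
```

The closing brace is what lets an escape be followed directly by further digits without ambiguity, which the undelimited forms cannot offer.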

[PATCH][wwwdoc] Mention O2 vectorization enhancement.

2024-10-14 Thread liuhongt
--- htdocs/gcc-15/changes.html | 10 ++ 1 file changed, 10 insertions(+) diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html index 6dc46a52..8a238256 100644 --- a/htdocs/gcc-15/changes.html +++ b/htdocs/gcc-15/changes.html @@ -36,6 +36,16 @@ a work-in-progress. General

[pushed] c++: address deduction and concepts [CWG2918]

2024-10-14 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- CWG2918 changes deduction from an overload set for the case where multiple candidates succeed and have the same type; previously this made the overload set a non-deduced context, now it succeeds since the result is consistent between the can

[PATCH 1/2] c++: some further concepts cleanups

2024-10-14 Thread Patrick Palka
This patch further cleans up the concepts code following the removal of Concepts TS support: * concept-ids are now the only kind of "concept check", so we can simplify some code accordingly. In particular resolve_concept_check seems like a no-op and can be removed. * In turn, deduce_c

[PATCH 2/2] c++: constrained auto NTTP vs associated constraints

2024-10-14 Thread Patrick Palka
According to [temp.param]/11, the constraint on an auto NTTP is an associated constraint and so should be checked as part of satisfaction of the overall associated constraints rather than checked individually during coercion/deduction. In order to implement this we mainly need to make handling of c

[PATCH] c++: unifying lvalue vs rvalue (non-forwarding) ref [PR116710]

2024-10-14 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- When unifying two (non-forwarding) reference types, unify immediately recurses into the reference type without first comparing rvalueness. (Note that at this point forwarding references have already been coll

[PATCH] c++: checking ICE w/ lambda targ inside constexpr if [PR117054]

2024-10-14 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- Here we're tripping over the assert in extract_locals_r which enforces that an extra-args tree appearing inside another extra-args tree doesn't actually have extra args. This invariant no longer always holds

Re: Fortran test typebound_operator_7.f03 broken by non-Fortran commit. Confirm anyone?

2024-10-14 Thread Sam James
Andre Vehreschild writes: > Hi all, > > please note, that I don't know this bisecting very well, so this may very well > be a wrong blame. During latest regression testing of the Fortran suite I got > typebound_operator_7.f03 failing with: > > typebound_operator_7.f03:94:25: > >94 | u = (u*

Re: [PATCH 0/5] Provide better definitions of NULL

2024-10-14 Thread Alejandro Colomar
Hi Jason, You've recently touched code about C++ modules. Do you have any idea of why my changes may be introducing regressions in the tests? Have a lovely night! Alex On Sun, Oct 13, 2024 at 11:56:55PM GMT, Alejandro Colomar wrote: > Hi Joseph, > > On Fri, Oct 11, 2024 at 01:44:36PM GMT, Alej

[PATCH] Android: Fix build for Android

2024-10-14 Thread yxj-github-437
gcc/ * config.gcc: fix target aarch64-linux-android, arm-linux-androideabi, i686-linux-android, x86_64-linux-android * config/linux-android.h: fix SPEC based on aarch64-linux-android-clang * config/aarch64/aarch64-elf.h: Add Macro DEFAULT_ASM_SPEC * config/aa

[PATCH] RISC-V: Fix UNRESOLVED testcases for SAT alu vector mode

2024-10-14 Thread pan2 . li
From: Pan Li Some saturation related alu testcases missed additional option for expand check, which result in some UNRESOLVED issues. This patch would like to fix it by adding the option back as other testcases. The below test are passed for this patch. * The rv64gcv fully regression test. It

Re: [PATCH] RISC-V: Fix UNRESOLVED testcases for SAT alu vector mode

2024-10-14 Thread Kito Cheng
LGTM, I just saw that yesterday as well, fortunately, I haven't started fixing yet. :P On Tue, Oct 15, 2024 at 9:43 AM wrote: > > From: Pan Li > > Some saturation related alu testcases missed additional option > for expand check, which result in some UNRESOLVED issues. This > patch would like

RE: [PATCH] RISC-V: Fix UNRESOLVED testcases for SAT alu vector mode

2024-10-14 Thread Li, Pan2
Thanks Kito. Just notice these some silly mistakes this morning, ;)! Pan -Original Message- From: Kito Cheng Sent: Tuesday, October 15, 2024 9:45 AM To: Li, Pan2 Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; jeffreya...@gmail.com; rdapp@gmail.com Subject: Re: [PATCH] RISC-V:

Re: [Patch] Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device

2024-10-14 Thread rep . dot . nop
On 14 October 2024 10:23:56 CEST, Thomas Schwinge wrote: >Hi Tobias! > >On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: >> Now pushed as r15-4298-g3269a722b7a036. > >> Tobias Burnus wrote: >>> Anyone feeling like reviewing this patch? > >Yes. But please allow for more than 1 1/2 work days. >

[PATCH] RISC-V: Fix feature_bits.c failed to compile on non-Linux targets

2024-10-14 Thread Yangyu Chen
The feature_bits.c file failed to compile on non-Linux targets because we forgot to remove the __riscv_vendor_feature_bits.vendorID set when target is not Linux. This commit fixed this and also has several improvements including: - Initialize all data to zero when syscall is not supported. - Add d

[PATCH]middle-end: copy STMT_VINFO_STRIDED_P when DR is replaced [PR116956]

2024-10-14 Thread Tamar Christina
Hi All, When move_dr copies a DR from one statement to another, it seems we've forgotten to copy the STMT_VINFO_STRIDED_P flag. This leaves the new DR in a broken state where it has a non constant stride but isn't marked as strided. This causes the ICE in the PR because dataref analysis fails du

[PATCH][simplify-rtx]: Fix incorrect folding of shift and AND [PR117012]

2024-10-14 Thread Tamar Christina
Hi All, The optimization added in r15-1047-g7876cde25cbd2f is using the wrong operation to check for uniform constant vectors. The Author intended to check that all the lanes in the vector are the same and so used CONST_VECTOR_DUPLICATE_P. However this only checks that the vector is created from

[PATCH]AArch64: rename the SVE2 psel intrinsics to psel_lane [PR116371]

2024-10-14 Thread Tamar Christina
Hi All, The psel intrinsics, similar to the pext, should be named psel_lane. This corrects the naming. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: PR target/116371 * config/aarch64/aarch64-sve-builtins-sve2.cc (c

[PATCH]middle-end: Save VMAT info in stmt_vec_info as well for SLP for costing.

2024-10-14 Thread Tamar Christina
Hi All, While chasing down a costing discrepancy between SLP and non-SLP I noticed that costing for different VMATs was not working. It looks like the vectorizer for non-SLP stores the VMAT type in STMT_VINFO_MEMORY_ACCESS_TYPE on the stmt_info, but for SLP it stores it in SLP_TREE_MEMORY_ACCESS_T

Re: [PATCH] RISC-V: Fix feature_bits.c failed to compile on non-Linux targets

2024-10-14 Thread Yangyu Chen
This patch can be dropped. I noticed Kito finally fixed that before committing to master after I submitted this patch. > On Oct 14, 2024, at 18:18, Yangyu Chen wrote: > > The feature_bits.c file failed to compile on non-Linux targets because > we forgot to remove the __riscv_vendor_feature_bits

Re: [Ping*4, Patch, Fortran, 77871, v1] Allow for class typed coarray parameter as dummy [PR77871]

2024-10-14 Thread Paul Richard Thomas
Hi Andre, This looks fine to me. OK for mainline. Thanks for the patch and sorry for the wait for review. Paul On Mon, 14 Oct 2024 at 08:50, Andre Vehreschild wrote: > Ping ^ 4. > > Really no one to review this 160 something patch? > > Regtests ok on x86_64-pc-linux-gnu /Fedora 39? Ok for ma

[PATCH] RISC-V: Add detailed comments on processing implied extensions.

2024-10-14 Thread Yangyu Chen
In some cases, we don't need to handle implied extensions. Add detailed comments to help developers understand what implied ISAs should be considered. libgcc/ChangeLog: * config/riscv/feature_bits.c (__init_riscv_features_bits_linux): Add detailed comments on processing implied

[PATCH 2/3]AArch64: support encoding integer immediates using floating point moves

2024-10-14 Thread Tamar Christina
Hi All, This patch extends our immediate SIMD generation cases to support generating integer immediates using floating point operation if the integer immediate maps to an exact FP value. As an example: uint32x4_t f1() { return vdupq_n_u32(0x3f80); } currently generates: f1: adr
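
The idea of an integer immediate that "maps to an exact FP value" can be checked on the host with a small C sketch. It assumes float is IEEE-754 binary32. The helper names and the constants 0x3f800000 (the encoding of 1.0f) and 0x40000000 (the encoding of 2.0f) are this sketch's own choices and are not taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Returns the 32-bit pattern of `f`.  Assumes float is IEEE-754 binary32.  */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Reinterprets a 32-bit pattern as a float, under the same assumption.  */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Returns 1 if the integer constant `imm` is bit-identical to the float
   constant `f`, so a lane holding `f` also holds `imm` when the register
   is read back as a vector of 32-bit integers.  */
static int int_imm_matches_float(uint32_t imm, float f)
{
    return float_bits(f) == imm;
}
```

When such a match exists, a floating-point move of the constant leaves the same bits in the register as materializing the integer directly.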

[PATCH 3/3]AArch64: use movi d0, #0 to clear SVE registers instead of mov z0.d, #0

2024-10-14 Thread Tamar Christina
Hi All, This patch changes SVE to use Adv. SIMD fmov 0 to clear SVE registers when not in SVE streaming mode. As the Neoverse Software Optimization guides indicate SVE mov #0 is not a zero cost move. When in streaming mode we continue to use SVE's mov to clear the registers. Tests have already

[PATCH 1/4]middle-end: support multi-step zero-extends using VEC_PERM_EXPR

2024-10-14 Thread Tamar Christina
Hi All, This patch series adds support for a target to do a direct conversion for zero extends using permutes. To do this it uses a target hook use_permute_for_promotio which must be implemented by targets. This hook is used to indicate: 1. can a target do this for the given modes. 2. is it p

[PATCH 3/4]AArch64: enable zero-extends using TBLs for Adv. SIMD

2024-10-14 Thread Tamar Christina
Hi All, In this patch series I'm adding support for zero extending using permutes instead of requiring multi-step decomposition. This codegen has the benefit of needing fewer instructions and having much higher throughput than uxtl. We previously replaced pairs of uxtl/uxtl2s with ZIPs to increa
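
Zero extension by permute can be modelled in scalar C as one byte-level shuffle of the input vector with a vector of zeros. The sketch below assumes 8 input lanes and a little-endian lane layout (low byte first). All names are hypothetical and nothing here reflects the actual TBL-based code generation in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define N 8

/* Byte permute over the concatenation of two N-byte vectors: a selector
   below N picks a byte from `a`, a selector in N..2N-1 picks from `b`.  */
static void permute_bytes(const uint8_t *a, const uint8_t *b,
                          const uint8_t *sel, uint8_t *out, size_t n_out)
{
    for (size_t i = 0; i < n_out; i++)
        out[i] = sel[i] < N ? a[sel[i]] : b[sel[i] - N];
}

/* Zero-extends N bytes to N 16-bit values with a single permute: each
   output lane takes its low byte from the input and its high byte from
   the zero vector.  */
static void zext_by_permute(const uint8_t in[N], uint16_t out[N])
{
    const uint8_t zeros[N] = {0};
    uint8_t sel[2 * N];
    uint8_t bytes[2 * N];

    for (size_t i = 0; i < N; i++) {
        sel[2 * i] = (uint8_t)i;           /* low byte: input lane i */
        sel[2 * i + 1] = (uint8_t)(N + i); /* high byte: a zero byte */
    }
    permute_bytes(in, zeros, sel, bytes, 2 * N);

    /* Reassemble the lanes explicitly so the result does not depend on
       the endianness of the host running this sketch.  */
    for (size_t i = 0; i < N; i++)
        out[i] = (uint16_t)(bytes[2 * i] | (bytes[2 * i + 1] << 8));
}
```

Extending further, for example from 8 to 32 bits, only changes the selector table, which is why one permute can replace a chain of widening steps.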

[PATCH 2/4]middle-end: Fix VEC_PERM_EXPR lowering since relaxation of vector sizes

2024-10-14 Thread Tamar Christina
Hi All, In GCC 14 VEC_PERM_EXPR was relaxed to be able to permute to a 2x larger vector than the size of the input vectors. However various passes and transformations were not updated to account for this. I have patches in these area that I will be upstreaming with individual patches that expose

[PATCH 4/4]middle-end: create the longest possible zero extend chain after overwidening

2024-10-14 Thread Tamar Christina
Hi All, Consider loops such as: void test9(unsigned char *x, long long *y, int n, unsigned char k) { for(int i = 0; i < n; i++) { y[i] = k + x[i]; } } where today we generate: .L5: ldr q29, [x5], 16 add x4, x4, 128 uaddl v1.8h, v29.8b, v30.8b

[PATCH 03/11] RISC-V: Implement vector SAT_TRUNC for signed integer

2024-10-14 Thread pan2 . li
From: Pan Li This patch would like to implement the sstrunc for vector signed integer. Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *i

[PATCH 01/11] Match: Support form 1 for vector signed integer SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li This patch would like to support the form 1 of the vector signed integer SAT_TRUNC. Aka below example: Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT#
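
The macro in the snippet above is cut off, so the following is only a generic C sketch of what a signed saturating truncation computes, here from int16_t to int8_t. It is not necessarily the source form that the new pattern matches, and the function names are invented for the example.

```c
#include <stdint.h>

/* Signed saturating truncation of one element: values outside the range
   of int8_t clamp to INT8_MIN or INT8_MAX instead of wrapping.  */
static int8_t sat_trunc_s16_to_s8(int16_t x)
{
    if (x > INT8_MAX)
        return INT8_MAX;
    if (x < INT8_MIN)
        return INT8_MIN;
    return (int8_t)x;
}

/* Element-wise loop of the kind a vectorizer could turn into a vector
   SAT_TRUNC once the scalar idiom is recognized.  */
static void vec_sat_trunc_s16_to_s8(int8_t *out, const int16_t *in,
                                    unsigned limit)
{
    for (unsigned i = 0; i < limit; i++)
        out[i] = sat_trunc_s16_to_s8(in[i]);
}
```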

[PATCH 09/11] RISC-V: Add testcases for form 6 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 6: #define DEF_VEC_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_6 (NT *out, WT *in, unsigned limit) \ {

[PATCH 06/11] RISC-V: Add testcases for form 3 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 3: #define DEF_VEC_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_3 (NT *out, WT *in, unsigned limit) \ {

[PATCH 07/11] RISC-V: Add testcases for form 4 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 4: #define DEF_VEC_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_4 (NT *out, WT *in, unsigned limit) \ {

[PATCH 04/11] RISC-V: Add testcases for form 1 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 1: #define DEF_VEC_SAT_S_TRUNC_FMT_1(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_1 (NT *out, WT *in, unsigned limit) \ {

fold-const: Fix BIT_INSERT_EXPR folding for BYTES_BIG_ENDIAN [PR116997]

2024-10-14 Thread Andre Vieira (lists)
Hi, This patch fixes constant folding of BIT_INSERT_EXPR for BYTES_BIG_ENDIAN targets. Regression tested on aarch64be-none-elf. Almost committed this as obvious, but I wanted to double check the testcase with a maintainer. I decided to not make the test be big-endian specific, nor to add any

[PATCH 02/11] Vect: Try the pattern of vector signed integer SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Almost the same as vector unsigned integer SAT_TRUNC, try to match the signed version during the vector pattern matching. The below test suites are passed for this patch. * The rv64gcv fully regression test. * The x86 bootstrap test. * The x86 fully regression test. gcc/ChangeLog:

[PATCH 08/11] RISC-V: Add testcases for form 5 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 5: #define DEF_VEC_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_5 (NT *out, WT *in, unsigned limit) \ {

[PATCH 05/11] RISC-V: Add testcases for form 2 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 2: #define DEF_VEC_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_2 (NT *out, WT *in, unsigned limit) \ {

Re: [PATCH 3/4]AArch64: enable zero-extends using TBLs for Adv. SIMD

2024-10-14 Thread Kyrylo Tkachov
Hi Tamar, > On 14 Oct 2024, at 12:56, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > In this patch series I'm adding support for zero extending using permutes > instead of requiring multi-step decomposition. > > This codegen has the bene

[PATCH 10/11] RISC-V: Add testcases for form 7 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 7: #define DEF_VEC_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_7 (NT *out, WT *in, unsigned limit) \ {

[PATCH 11/11] RISC-V: Add testcases for form 8 of vector signed SAT_TRUNC

2024-10-14 Thread pan2 . li
From: Pan Li Form 8: #define DEF_VEC_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \ void __attribute__((noinline))\ vec_sat_s_trunc_##NT##_##WT##_fmt_8 (NT *out, WT *in, unsigned limit) \ {

[PATCH v8] RISC-V: Implement __init_riscv_feature_bits, __riscv_feature_bits, and __riscv_vendor_feature_bits

2024-10-14 Thread Yangyu Chen
From: Kito Cheng This provides a common abstraction layer to probe the available extensions at run-time. These functions can be used to implement function multi-versioning or to detect available extensions. The advantages of providing this abstraction layer are: - Easy to port to other new platf

Re: [PATCH][simplify-rtx]: Fix incorrect folding of shift and AND [PR117012]

2024-10-14 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > The optimization added in r15-1047-g7876cde25cbd2f is using the wrong > operation to check for uniform constant vectors. > > The Author intended to check that all the lanes in the vector are the same and > so used CONST_VECTOR_DUPLICATE_P. However this only c

Re: [PATCH]AArch64: rename the SVE2 psel intrinsics to psel_lane [PR116371]

2024-10-14 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > The psel intrinsics, similar to the pext, should be named psel_lane. This > corrects the naming. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? OK for trunk and for GCC 14, thanks. Richard > Thanks, > Tamar > > > gcc/

Re: [PATCH v7] RISC-V: Implement __init_riscv_feature_bits, __riscv_feature_bits, and __riscv_vendor_feature_bits

2024-10-14 Thread Yangyu Chen
> On Oct 14, 2024, at 15:15, Kito Cheng wrote: > >>> I am OK with minimal extension implication rules, like what you implement >>> now, >>> but I am still concerned about implementing full rules. >> >> I didn't know the reason for the concerns. Maybe I didn't tell it >> clearly. > > Current

RE: [PATCH] i386: Fix scalar VCOMSBF16 which only compares low word

2024-10-14 Thread Liu, Hongtao
> -Original Message- > From: Kong, Lingling > Sent: Thursday, October 10, 2024 9:57 AM > To: gcc-patches@gcc.gnu.org > Cc: Liu, Hongtao ; Xu, Liwei > Subject: [PATCH] i386: Fix scalar VCOMSBF16 which only compares low word > > Hi, > > Fixed scalar VCOMSBF16 misused in AVX10.2. > Boot

Re: [PATCH v8] RISC-V: Implement __init_riscv_feature_bits, __riscv_feature_bits, and __riscv_vendor_feature_bits

2024-10-14 Thread Kito Cheng
Pushed with minor fixes for non-linux build :) On Mon, Oct 14, 2024 at 4:25 PM Kito Cheng wrote: > > LGTM, will commit after pass my local build/test :) > > On Mon, Oct 14, 2024 at 4:08 PM Yangyu Chen > wrote: > > > > From: Kito Cheng > > > > This provides a common abstraction layer to probe t

[PATCH] libstdc++: debug output in 22_locale/time_get/get/wchar_t/5.cc [PR117135]

2024-10-14 Thread Jonathan Wakely
This is not going to be committed. This is just for arm-none-eabi CI testing, re PR 117135. libstdc++-v3/ChangeLog: * testsuite/22_locale/time_get/get/wchar_t/5.cc: Dump debugging info. --- .../testsuite/22_locale/time_get/get/wchar_t/5.cc | 13 + 1 file changed, 1

Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Revert 'gimple_fold_builtin_acc_on_device' change

2024-10-14 Thread Thomas Schwinge
Hi! On 2024-10-14T10:23:56+0200, I wrote: > On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: >> Now pushed as r15-4298-g3269a722b7a036. > * (new) For OpenACC, use a builtin for acc_on_device + actually do > compile-time optimization when offloading is not configured. > > No. 2. This r

Re: [Ping*4, Patch, Fortran, 77871, v1] Allow for class typed coarray parameter as dummy [PR77871]

2024-10-14 Thread Andre Vehreschild
Ping ^ 4. Really no one to review this 160 something patch? Regtests ok on x86_64-pc-linux-gnu /Fedora 39? Ok for mainline? - Andre On Mon, 7 Oct 2024 12:52:29 +0200 Andre Vehreschild wrote: > Hi all, > > this patch somehow slipped my attention. Anyone for a review? Third time ping! > > Rebas

Re: [PATCH v8] RISC-V: Implement __init_riscv_feature_bits, __riscv_feature_bits, and __riscv_vendor_feature_bits

2024-10-14 Thread Kito Cheng
LGTM, will commit after pass my local build/test :) On Mon, Oct 14, 2024 at 4:08 PM Yangyu Chen wrote: > > From: Kito Cheng > > This provides a common abstraction layer to probe the available extensions at > run-time. These functions can be used to implement function multi-versioning > or > to

Re: [Patch] Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device

2024-10-14 Thread Thomas Schwinge
Hi Tobias! On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: > Now pushed as r15-4298-g3269a722b7a036. > Tobias Burnus wrote: >> Anyone feeling like reviewing this patch? Yes. But please allow for more than 1 1/2 work days. >> Tobias Burnus write: >>> Tobias Burnus wrote: Sometimes waiti

Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Fix effective-target keyword in 'libgomp.oacc-fortran/acc_on_device-2.f90'

2024-10-14 Thread Thomas Schwinge
Hi! On 2024-10-14T10:23:56+0200, I wrote: > On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: >> Now pushed as r15-4298-g3269a722b7a036. > Tested on x86-64 without and with offloading configured, running > with nvptx offloading. I see an UNRESOLVED: +PASS: libgomp.oacc-fortran/acc

Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Harmonize 'libgomp.oacc-fortran/acc_on_device-1-*'

2024-10-14 Thread Thomas Schwinge
Hi! On 2024-10-14T10:23:56+0200, I wrote: > On 2024-10-13T10:21:01+0200, Tobias Burnus wrote: >> Now pushed as r15-4298-g3269a722b7a036. >> --- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 >> +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90 >> -! TODO: Have t

[PATCH v3] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2024-10-14 Thread jeevitha
Hi All, The following patch has been bootstrapped and regtested on powerpc64le-linux. PTImode assists in generating even/odd register pairs on 128 bits. When the user specifies PTImode as an attribute, it breaks because there is no internal type to handle this mode. To fix this, we have created a

Re: [PATCH] RISC-V: Add detailed comments on processing implied extensions.

2024-10-14 Thread Kito Cheng
Pushed to the trunk :) On Mon, Oct 14, 2024 at 6:31 PM Yangyu Chen wrote: > > In some cases, we don't need to handle implied extensions. Add detailed > comments to help developers understand what implied ISAs should be > considered. > > libgcc/ChangeLog: > > * config/riscv/feature_bits.c

[PATCH] rs6000: Correct the function code for _AMO_LD_DEC_BOUNDED

2024-10-14 Thread jeevitha
Hi All, Corrected the function code for the Atomic Memory Operation "Fetch and Decrement Bounded", changing it from 0x1A to 0x1C. 2024-10-14 Jeevitha Palanisamy gcc/ * config/rs6000/amo.h (enum _AMO_LD): Correct the function code for _AMO_LD_DEC_BOUNDED. diff --git a/gcc/config

Re: [PATCH 2/4] Match: Support form 3 for vector signed integer SAT_SUB

2024-10-14 Thread Jakub Jelinek
On Sat, Oct 12, 2024 at 02:10:49PM +0200, Richard Biener wrote: > > gcc/ChangeLog: > > > > * match.pd: Add matching pattern for vector signed SAT_SUB form 3. I now see ../../gcc/match.pd:3424:3 warning: duplicate pattern (cond^ (ne (imagpart (IFN_SUB_OVERFLOW:c@2 @0 @1)) integer_zerop)
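
The pattern quoted in the warning tests the overflow flag of a subtraction. A scalar C sketch of signed saturating subtraction built on the GCC and Clang builtin __builtin_sub_overflow is given below for orientation. It is an illustration only, not one of the SAT_SUB forms from the patch series, and the function name is hypothetical.

```c
#include <stdint.h>

/* Signed saturating subtraction: returns x - y, clamped to the range of
   int32_t when the exact result does not fit.  */
static int32_t sat_sub_s32(int32_t x, int32_t y)
{
    int32_t diff;

    /* The builtin returns false when x - y fits and stores it in diff.  */
    if (!__builtin_sub_overflow(x, y, &diff))
        return diff;

    /* Overflow needs operands of opposite sign, so the sign of x tells
       the direction: negative x overflows downward, otherwise upward.  */
    return x < 0 ? INT32_MIN : INT32_MAX;
}
```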

Re: [PATCH]middle-end: copy STMT_VINFO_STRIDED_P when DR is replaced [PR116956]

2024-10-14 Thread Richard Biener
On Mon, 14 Oct 2024, Tamar Christina wrote: > Hi All, > > When move_dr copies a DR from one statement to another, it seems we've > forgotten to copy the STMT_VINFO_STRIDED_P flag. This leaves the new DR in a > broken state where it has a non constant stride but isn't marked as strided. > > This

Fortran test typebound_operator_7.f03 broken by non-Fortran commit. Confirm anyone?

2024-10-14 Thread Andre Vehreschild
Hi all, please note, that I don't know this bisecting very well, so this may very well be a wrong blame. During latest regression testing of the Fortran suite I got typebound_operator_7.f03 failing with: typebound_operator_7.f03:94:25: 94 | u = (u*2.0*4.0) + u*4.0 |

Re: [PATCH v2] testsuite: Sanitize pacbti test cases for Cortex-M

2024-10-14 Thread Christophe Lyon
On 10/13/24 19:50, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? Changes since v1: - Dropped changes to dg- instructions. These will be addressed in a separate set of patches later. LGTM, let's avoid mixing changes. Thanks, Christophe -- Some of the test cases were scan

Re: [PATCH 03/11] RISC-V: Implement vector SAT_TRUNC for signed integer

2024-10-14 Thread Robin Dapp
LGTM. -- Regards Robin

Re: [PATCH] genmatch: Revert recent genmatch changes, instead add custom diag_vfprintf routine [PR117110]

2024-10-14 Thread Richard Biener
On Mon, 14 Oct 2024, Jakub Jelinek wrote: > Hi! > > My recent changes to genmatch apparently broke bootstrap on FreeBSD > and Darwin and perhaps others, and also broke $build != $host > builds including canadian cross. > > The change was to link in libcommon.a into build/genmatch, so that > we c

Re: ping: [PATCH] libcpp: Support extended characters for #pragma {push,pop}_macro [PR109704]

2024-10-14 Thread Lewis Hyatt
On Fri, Oct 11, 2024 at 08:52:45PM +, Joseph Myers wrote: > On Wed, 25 Sep 2024, Lewis Hyatt wrote: > > > Hello- > > > > May I please ping this one? Is there something maybe sub-optimal about > > how I organized it? I can adjust or break it into two maybe if that's > > helpful. Or else, if it

Re: [PATCH 1/3] bpf: make sure CO-RE relocs are never typed with a BTF_KIND_CONST

2024-10-14 Thread Cupertino Miranda
Hi David, On 30-09-2024 18:24, David Faust wrote: On 9/27/24 09:49, Cupertino Miranda wrote: Based on observation within bpf-next selftests and comparison of GCC and clang compiled code, the BPF loader expects all CO-RE relocations to point to BTF non const type nodes. --- gcc/btfout.cc

[PATCH v2 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-10-14 Thread Evgeny Karpov
Friday, September 13, 2024 Martin Storsjö wrote: >> When the offset is >= 1MB: >> >> adrp x0, symbol + offset % (1 << 20) // it prevents relocation overflow in >> IMAGE_REL_ARM64_PAGEBASE_REL21 >> add x0, x0, (offset & ~0xf) >> 12, lsl #12 // a workaround to support >> 4GB offset >> add x0,

Re: [PATCH] SVE intrinsics: Fold svmul with constant power-of-2 operand to svlsl

2024-10-14 Thread Richard Sandiford
Jennifer Schmitz writes: > [...] > @@ -54,25 +56,121 @@ TEST_UNIFORM_ZX (mul_w0_s16_m_untied, svint16_t, int16_t, >z0 = svmul_m (p0, z1, x0)) > > /* > -** mul_2_s16_m_tied1: > -** mov (z[0-9]+\.h), #2 > +** mul_4dupop1_s16_m_tied1: > +** mov (z[0-9]+)\.h, #4 > +**

Re: [PATCH v4 2/2] arm: [MVE intrinsics] Improve vdupq_n implementation

2024-10-14 Thread Richard Earnshaw (lists)
On 30/07/2024 22:39, Christophe Lyon wrote: > Hi, > > v4 of patch 2/2 fixes a small mistake in 3 testcases, by relaxing the > expected q0 as result register into q[0-9]+ to account for codegen > differences depending on if the test is compiled with > -mfloat-abi=softfp or -mfloat-abi=hard. > > I

Re: [PATCH v2 01/36] arm: [MVE intrinsics] improve comment for orrq shape

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > Add a comment about the lack of "n" forms for floating-point nor 8-bit > integers, to make it clearer why we use build_16_32 for MODE_n. > > 2024-07-11 Christophe Lyon > > gcc/ > * config/arm/arm-mve-builtins-shapes.cc (binary_orrq_def)

Re: [PATCH v2 02/36] arm: [MVE intrinsics] remove useless resolve from create shape

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > vcreateq has no overloaded forms, so there's no need for resolve (). > > 2024-07-11 Christophe Lyon > > gcc/ > * config/arm/arm-mve-builtins-shapes.cc (create_def): Remove > resolve. Wouldn't it be more usual to write (create_de

Re: [PATCH v2 18/36] arm: [MVE intrinsics] add viddup shape

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > This patch adds the viddup shape description for vidup and vddup. > > This requires the addition of report_not_one_of and > function_checker::require_immediate_one_of to > gcc/config/arm/arm-mve-builtins.cc (they are copies of the aarch64 SVE > counter

Re: [PATCH v2 20/36] arm: [MVE intrinsics] update v[id]dup tests

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > Testing v[id]dup overloads with '1' as argument for uint32_t* does not > make sense: instead of choosing the '_wb' overload, we choose the > '_n', but we already do that in the '_n' tests. > > This patch removes all such bogus foo2 functions. > > 2024

Re: [PATCH] libquadmath: Typo in powq summary

2024-10-14 Thread Joseph Myers
On Mon, 14 Oct 2024, Ivan Agarsky wrote: > Hello, > > since step 9 in "Basics: Contributing to GCC in 10 easy steps" says I > should not commit the first few patches, I kindly ask someone to commit > the following for me: > > libquadmath\ChangeLog: > > * math/powq.c: This file comes fr

Re: [PATCH 0/5] Provide better definitions of NULL

2024-10-14 Thread Joseph Myers
On Sun, 13 Oct 2024, Alejandro Colomar wrote: > There are some regressions. Below is the diff of .sum files. In the > .log files, I see some errors saying `CRC mismatch`. Did I do anything > wrong? I am not familiar with the C++ modules implementation and do not have any suggestions for why c

Re: [PATCH v2 19/36] arm: [MVE intrinsics] rework vddup vidup

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > Implement vddup and vidup using the new MVE builtins framework. > > We generate better code because we take advantage of the two outputs > produced by the v[id]dup instructions. > > For instance, before: > ldr r3, [r0] > sub r2, r3
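As a rough scalar model of the "two outputs" mentioned in the message above, the following C sketch assumes the usual VIDUP behaviour for 32-bit lanes: four lanes are filled from an incrementing start value, and the updated start value is also produced. The struct and function names are hypothetical and this is not the code from the patch.

```c
#include <stdint.h>

/* Hypothetical model type: the vector result plus the written-back scalar. */
typedef struct {
    uint32_t lanes[4];   /* the four 32-bit lanes of the result vector */
    uint32_t next;       /* updated start value (second output) */
} vidup_u32_result;

/* Scalar model of an incrementing duplicate on 32-bit lanes.
   Assumption: step is one of 1, 2, 4 or 8, as for the real instruction;
   this model does not enforce that. */
vidup_u32_result vidup_u32_model(uint32_t start, uint32_t step)
{
    vidup_u32_result r;
    uint32_t cur = start;
    for (int i = 0; i < 4; i++) {
        r.lanes[i] = cur;   /* lane i holds start + i * step */
        cur += step;        /* unsigned arithmetic wraps modulo 2^32 */
    }
    r.next = cur;           /* start + 4 * step */
    return r;
}
```

A caller that needs both the vector and the advanced start value gets them from one call in this model, which is the property the message says the reworked implementation exploits.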

Re: [PATCH v2 21/36] arm: [MVE intrinsics] remove v[id]dup expanders

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > We use code_for_mve_q_u_insn, rather than the expanders used by the > previous implementation, so we can remove the expanders and their > declaration as builtins. > > 2024-08-21 Christophe Lyon > > gcc/ > * config/arm/arm_mve_builtins.d

Re: [PATCH v2 22/36] arm: [MVE intrinsics] fix checks of immediate arguments

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > As discussed in [1], it is better to use "su64" for immediates in > intrinsics signatures in order to provide better diagnostics > (erroneous constants are not truncated for instance). This patch thus > uses su64 instead of ss32 in binary_lshift_unsign
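The truncation concern in the message above can be illustrated with a small standalone C sketch. It is not taken from the patch: the helper names and the 1..32 range are hypothetical, chosen only to show how narrowing a constant to 32 bits before a range check can let an out-of-range value through, whereas checking the full 64-bit value rejects it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical immediate range, for illustration only. */
enum { IMM_MIN = 1, IMM_MAX = 32 };

/* Range check done after narrowing to 32 bits: the high bits are
   discarded first, so 0x100000001 is seen as 1. */
bool imm_ok_after_truncation(uint64_t imm)
{
    uint32_t narrowed = (uint32_t)imm;   /* reduces the value modulo 2^32 */
    return narrowed >= IMM_MIN && narrowed <= IMM_MAX;
}

/* Range check done on the full 64-bit value: nothing is lost, so an
   oversized constant is rejected. */
bool imm_ok_full_width(uint64_t imm)
{
    return imm >= IMM_MIN && imm <= IMM_MAX;
}
```

With these helpers, the constant `0x100000001` passes the truncating check but fails the full-width one, which is the kind of erroneous constant a wider immediate type lets the compiler diagnose.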

Re: [PATCH v2 24/36] arm: [MVE intrinsics] add vidwdup shape

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > This patch adds the vidwdup shape description for vdwdup and viwdup. > > It is very similar to viddup, but accounts for the additional 'wrap' > scalar parameter. > > 2024-08-21 Christophe Lyon > > gcc/ > * config/arm/arm-mve-builtins-s

Re: [PATCH v2 25/36] arm: [MVE intrinsics] rework vdwdup viwdup

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > Implement vdwdup and viwdup using the new MVE builtins framework. > > In order to share more code with viddup_impl, the patch swaps operands > 1 and 2 in @mve_v[id]wdupq_m_wb_u_insn, so that the parameter > order is similar to what @mve_v[id]dupq_m_wb_

Re: libstdc++ fetch_add & fenv -- ecosystem questions

2024-10-14 Thread Joseph Myers
On Mon, 14 Oct 2024, Matthew Malcomson wrote: > 4. __atomic_feraiseexcept should be a builtin to avoid previously > unnecessary requirement to link libatomic. libatomic should be linked by default (with --as-needed); see bug 81358. But if your concern is e.g. libstdc++.so having DT_NEEDED fo

Re: [PATCH v2 26/36] arm: [MVE intrinsics] update v[id]wdup tests

2024-10-14 Thread Richard Earnshaw (lists)
On 04/09/2024 14:26, Christophe Lyon wrote: > Testing v[id]wdup overloads with '1' as argument for uint32_t* does > not make sense: this patch adds a new 'uint32_t *a' parameter to foo2 > in such tests. > > The difference with v[id]dup tests (where we removed 'foo2') is that > in 'foo1' we test th

Re: [PATCH v7] RISC-V: Implement __init_riscv_feature_bits, __riscv_feature_bits, and __riscv_vendor_feature_bits

2024-10-14 Thread Kito Cheng
> > I am OK with minimal extension implication rules, like what you implement > > now, > > but I am still concerned about implementing full rules. > > I didn't know the reason for the concerns. Maybe I didn't tell it > clearly. Current implementation didn't implement full implication rule accordi
