Re: [PATCH 6/8] gcn: Add else operand to masked loads.

2024-09-06 Thread Robin Dapp
> There were absolutely problems without this. It's a while ago now, so I'm > struggling with the details, but as GCC only applies the mask to selected > operations there were all sorts of issues that crept in. Zeroing the > undefined lanes seemed to match the middle end assumptions (or, at least i

[PATCH] gimple ssa: Don't use __builtin_popcount in switch exp transform [PR116616]

2024-09-06 Thread Filip Kastl
Hi, bootstrapped and regtested on x86_64-linux. Ok to push? Thanks, Filip Kastl 8< Switch exponential transformation in the switch conversion pass currently generates tmp1 = __builtin_popcount (var); tmp2 = tmp1 == 1; when inserting code to determine if var is power of two. If t
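As a rough illustration of the power-of-two check this message describes, here is a C sketch. It is not the patch itself: the replacement sequence is cut off in the snippet above, and the helper names below are hypothetical. The second helper shows a commonly used alternative that avoids popcount.

```c
#include <stdbool.h>

/* Hypothetical helper: the form the message says the pass currently
   generates, i.e. popcount (var) == 1.  */
static bool
is_pow2_popcount (unsigned int var)
{
  return __builtin_popcount (var) == 1;
}

/* Hypothetical helper: a well-known alternative that needs no popcount.
   The arithmetic is done in an unsigned type so that var - 1 wraps
   around instead of overflowing when var is 0.  */
static bool
is_pow2_bittrick (unsigned int var)
{
  return var != 0 && (var & (var - 1)) == 0;
}
```

Both helpers should agree for every input; the bit trick relies only on unsigned wrap-around, which is well defined in C.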

[PATCH] RISC-V: Fix illegal operands "th.vsetvli zero, 0, e32, m8" for xtheadvector

2024-09-06 Thread Jin Ma
Since the THeadVector vsetvli does not support vl as an immediate, we need to convert 0 to zero when outputting asm. Ref: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116592 gcc/ChangeLog: * config/riscv/thead.cc (th_asm_output_opcode): Change '0' to "zero" gcc/testsuite/ChangeL

Re: [PATCH] gimple ssa: Don't use __builtin_popcount in switch exp transform [PR116616]

2024-09-06 Thread Andrew Pinski
On Fri, Sep 6, 2024 at 12:07 AM Filip Kastl wrote: > > Hi, > > bootstrapped and regtested on x86_64-linux. Ok to push? > > Thanks, > Filip Kastl > > > 8< > > > Switch exponential transformation in the switch conversion pass > currently generates > > tmp1 = __builtin_popcount (var); > tm

Re: [PATCH] gimple ssa: Don't use __builtin_popcount in switch exp transform [PR116616]

2024-09-06 Thread Jakub Jelinek
On Fri, Sep 06, 2024 at 12:18:30AM -0700, Andrew Pinski wrote: > You need to do this in an unsigned type. Otherwise you get the wrong > answer and also introduce undefined code. > So you need to use: > tree utype = unsigned_type_for (type); > tree tmp3; > if (types_compatible_p (type, utype) > t

[PATCH] fab: Factor out the main folding part of pass_fold_builtins::execute [PR116601]

2024-09-06 Thread Andrew Pinski
This is an alternative patch to fix PR tree-optimization/116601 by factoring out the main part of pass_fold_builtins::execute into its own function so that we don't need to repeat the code for doing the eh cleanup. It also fixes the problem I saw with the atomics which might skip over a statement;

Re: [PATCH] RISC-V: Fix illegal operands "th.vsetvli zero, 0, e32, m8" for xtheadvector

2024-09-06 Thread Xi Ruoyao
On Fri, 2024-09-06 at 15:10 +0800, Jin Ma wrote: > Since the THeadVector vsetvli does not support vl as an immediate, we > need to convert 0 to zero when outputting asm. > > Ref: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116592 See the "bug number" section of https://gcc.gnu.org/contribute.h

Re: [PATCH] fab: Factor out the main folding part of pass_fold_builtins::execute [PR116601]

2024-09-06 Thread Jakub Jelinek
On Fri, Sep 06, 2024 at 12:21:20AM -0700, Andrew Pinski wrote: > This is an alternative patch to fix PR tree-optimization/116601 by factoring > out the main part of pass_fold_builtins::execute into its own function so that > we don't need to repeat the code for doing the eh cleanup. It also fixes t

[PATCH] ada: Fix gcc-interface/misc.cc compilation on SPARC

2024-09-06 Thread Rainer Orth
This patch commit 72c6938f29cbeddb3220720e68add4cf09ffd794 Author: Eric Botcazou Date: Sun Aug 25 15:20:59 2024 +0200 ada: Streamline handling of low-level peculiarities of record field layout broke the Ada build on SPARC: In file included from ./tm_p.h:4, from /vol/gcc

Re: [PATCH] fab: Cleanup eh after optimize_memcpy [PR116601]

2024-09-06 Thread Richard Biener
On Fri, Sep 6, 2024 at 3:00 AM Andrew Pinski wrote: > > On Thu, Sep 5, 2024 at 12:26 AM Richard Biener > wrote: > > > > On Thu, Sep 5, 2024 at 8:25 AM Andrew Pinski > > wrote: > > > > > > When optimize_memcpy was added in r7-5443-g7b45d0dfeb5f85, > > > a path was added such that a statement was

Re: [PATCH] gimple ssa: Don't use __builtin_popcount in switch exp transform [PR116616]

2024-09-06 Thread Richard Biener
On Fri, 6 Sep 2024, Jakub Jelinek wrote: > On Fri, Sep 06, 2024 at 12:18:30AM -0700, Andrew Pinski wrote: > > You need to do this in an unsigned type. Otherwise you get the wrong > > answer and also introduce undefined code. > > So you need to use: > > tree utype = unsigned_type_for (type); > > t

Re: [PATCH] RISC-V: Fix illegal operands "th.vsetvli zero, 0, e32, m8" for xtheadvector

2024-09-06 Thread Jin Ma
> See the "bug number" section of https://gcc.gnu.org/contribute.html for > how to refer to a PR correctly, instead of putting an URL here. I am very sorry to make this mistake, thank you for reminding me. I will make corrections. BR Jin

Re: [PATCH] fab: Factor out the main folding part of pass_fold_builtins::execute [PR116601]

2024-09-06 Thread Richard Biener
On Fri, Sep 6, 2024 at 9:31 AM Jakub Jelinek wrote: > > On Fri, Sep 06, 2024 at 12:21:20AM -0700, Andrew Pinski wrote: > > This is an alternative patch to fix PR tree-optimization/116601 by factoring > > out the main part of pass_fold_builtins::execute into its own function so > > that > > we don

[PATCH v2] RISC-V: Fix illegal operands "th.vsetvli zero, 0, e32, m8" for XTheadVector

2024-09-06 Thread Jin Ma
Since the THeadVector vsetvli does not support vl as an immediate, we need to convert 0 to zero when outputting asm. PR target/116592 gcc/ChangeLog: * config/riscv/thead.cc (th_asm_output_opcode): Change '0' to "zero" gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/xthe

Re: [PATCH] fab: Factor out the main folding part of pass_fold_builtins::execute [PR116601]

2024-09-06 Thread Jakub Jelinek
On Fri, Sep 06, 2024 at 09:51:38AM +0200, Richard Biener wrote: > On Fri, Sep 6, 2024 at 9:31 AM Jakub Jelinek wrote: > > > > On Fri, Sep 06, 2024 at 12:21:20AM -0700, Andrew Pinski wrote: > > > This is an alternative patch to fix PR tree-optimization/116601 by > > > factoring > > > out the main

Re: [PATCH v2] RISC-V: Fix illegal operands "th.vsetvli zero, 0, e32, m8" for XTheadVector

2024-09-06 Thread 钟居哲
I think it's better to add a "vsetvli" assembly check in testcase. juzhe.zh...@rivai.ai From: Jin Ma Date: 2024-09-06 15:52 To: gcc-patches CC: jeffreyalaw; juzhe.zhong; pan2.li; kito.cheng; christoph.muellner; shuizhuyuanluo; pinskia; xry111; jinma.contrib; Jin Ma Subject: [PATCH v2] RISC-V:

Re: [PATCH] ada: Fix gcc-interface/misc.cc compilation on SPARC

2024-09-06 Thread Eric Botcazou
> commit 72c6938f29cbeddb3220720e68add4cf09ffd794 > Author: Eric Botcazou > Date: Sun Aug 25 15:20:59 2024 +0200 > > ada: Streamline handling of low-level peculiarities of record field > layout > > broke the Ada build on SPARC: > > In file included from ./tm_p.h:4, > from

Re: [PATCH v2] RISC-V: Fix illegal operands "th.vsetvli zero,0,e32,m8" for XTheadVector

2024-09-06 Thread Jin Ma
> I think it's better to add a "vsetvli" assembly check in testcase. > juzhe.zh...@rivai.ai Yeah, apparently I forgot to modify it :) Thanks. Jin

[PATCH v3] RISC-V: Fix illegal operands "th.vsetvli zero, 0, e32, m8" for XTheadVector

2024-09-06 Thread Jin Ma
Since the THeadVector vsetvli does not support vl as an immediate, we need to convert 0 to zero when outputting asm. PR target/116592 gcc/ChangeLog: * config/riscv/thead.cc (th_asm_output_opcode): Change '0' to "zero" gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/xthe

Re: [PATCH 6/8] gcn: Add else operand to masked loads.

2024-09-06 Thread Andrew Stubbs
On 06/09/2024 08:06, Robin Dapp wrote: There were absolutely problems without this. It's a while ago now, so I'm struggling with the details, but as GCC only applies the mask to selected operations there were all sorts of issues that crept in. Zeroing the undefined lanes seemed to match the middl

Re: [PATCH 6/8] gcn: Add else operand to masked loads.

2024-09-06 Thread Robin Dapp
> > So we only found two instances of this problem and both were related to > > _Bools. In case you have more cases, it would be greatly appreciated > > to verify the series with them. If you don't mind, would it be possible > > to comment out the zeroing, re-run the testsuite and check for FAILs

RE: [nvptx] Pass -m32/-m64 to host_compiler if it has multilib support

2024-09-06 Thread Thomas Schwinge
Hi! On 2024-08-16T15:36:29+, Prathamesh Kulkarni wrote: >> > On 13.08.2024 at 17:48, Thomas Schwinge >> wrote: >> > On 2024-08-12T07:50:07+, Prathamesh Kulkarni >> wrote: >> >>> From: Thomas Schwinge >> >>> Sent: Friday, August 9, 2024 12:55 AM >> > >> >>> On 2024-08-08T06:46:25-0700,

Re: [PATCH v3 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2024-09-06 Thread Richard Sandiford
"Kong, Lingling" writes: > Hi, > > This version has added a new optab named 'cfmovcc'. The new optab is used > in the middle end to expand to cfcmov. And simplified my patch by trying to > generate the conditional faulting movcc in noce_try_cmove_arith function. > > All the changes passed bootstra

[PATCH v2] testsuite: Sanitize pacbti test cases for Cortex-M

2024-09-06 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14? Changes since v1: - Corrected changelog entry for pac-15.c - Added a tab before all the asm instructions in the pac-*.c and bti-*.c tests - Corrected the expected number of bti instructions for bti-2.c as it previously counted the .file directive -- Some of th

Re: [PATCH RFA] libstdc++: avoid __GLIBCXX__ redefinition

2024-09-06 Thread Jonathan Wakely
On Fri, 6 Sept 2024 at 02:47, Jason Merrill wrote: > > On 8/28/24 6:22 AM, Jason Merrill wrote: > > On 8/28/24 6:09 AM, Jonathan Wakely wrote: > >> On Wed, 28 Aug 2024 at 10:58, Jason Merrill wrote: > >>> > >>> On 8/28/24 5:55 AM, Jonathan Wakely wrote: > On Wed, 28 Aug 2024 at 10:54, Jason

Re: [PATCH 3/3] Handle non-grouped stores as single-lane SLP

2024-09-06 Thread Richard Biener
On Thu, 5 Sep 2024, Richard Biener wrote: > The following enables single-lane loop SLP discovery for non-grouped stores > and adjusts vectorizable_store to properly handle those. > > For gfortran.dg/vect/vect-8.f90 we vectorize one additional loop, > not running into the "not falling back to stri

[patch,reload] PR116326: Add #define IN_RELOAD1_CC

2024-09-06 Thread Georg-Johann Lay
The reason for PR116326 is that LRA and reload require different ELIMINABLE_REGS for a multi-register frame pointer. As ELIMINABLE_REGS is used to initialize static const objects, it is not possible to make ELIMINABLE_REGS to depend on options or patch it in some target hook. It was also conclud

[PATCH] RISC-V: Add more vector-vector extract cases.

2024-09-06 Thread Robin Dapp
Hi, this adds a V16SI -> V4SI and related i.e. "quartering" vector-vector extract expander for VLS modes. It helps with unnecessary spills in x264. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (vec_extract): Add quarter vec-vec extrac

New Ukrainian PO file for 'gcc' (version 14.2.0)

2024-09-06 Thread Translation Project Robot
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Ukrainian team of translators. The file is available at: https://translationproject.org/latest/gcc/uk.po (This file, 'gcc-14.2.0.uk.po', has

[PATCH] RISC-V: Fixed incorrect semantic description in DF to DI pattern in the Zfa extension on rv32.

2024-09-06 Thread Jin Ma
In the process of DF to SI, we generally use "unsigned_fix" rather than "truncate" for conversion. Although this has no effect in general, unexpected ICE often occurs when precise semantic analysis is required, such as analysis in function "simplify_const_unary_operation" in simplify-rtx.cc. gcc/C

Re: [PATCH v1 4/9] aarch64: Exclude symbols using GOT from code models

2024-09-06 Thread Richard Sandiford
Evgeny Karpov writes: > Monday, September 2, 2024 5:00 PM > Richard Sandiford wrote: > >> I think we should instead patch the callers that are using >> aarch64_symbol_binds_local_p for GOT decisions. The function itself >> is checking for a more general property (and one that could be useful >>

Re: [PATCH] RISC-V: Add more vector-vector extract cases.

2024-09-06 Thread 钟居哲
Thanks. lgtm. juzhe.zh...@rivai.ai From: Robin Dapp Date: 2024-09-06 17:56 To: gcc-patches CC: pal...@dabbelt.com; kito.ch...@gmail.com; juzhe.zh...@rivai.ai; jeffreya...@gmail.com; pan2...@intel.com; rdapp@gmail.com Subject: [PATCH] RISC-V: Add more vector-vector extract cases. Hi, thi

Re: [PATCH v3] RISC-V: Fix illegal operands "th.vsetvli zero, 0, e32, m8" for XTheadVector

2024-09-06 Thread 钟居哲
Sorry, I still don't see assembly check. juzhe.zh...@rivai.ai From: Jin Ma Date: 2024-09-06 16:32 To: gcc-patches CC: jeffreyalaw; juzhe.zhong; pan2.li; kito.cheng; christoph.muellner; shuizhuyuanluo; pinskia; xry111; jinma.contrib; Jin Ma Subject: [PATCH v3] RISC-V: Fix illegal operands "th.

Re: [PATCH] RISC-V: Fixed incorrect semantic description in DF to DI pattern in the Zfa extension on rv32.

2024-09-06 Thread Robin Dapp
> In the process of DF to SI, we generally use "unsigned_fix" rather than > "truncate" for conversion. Although this has no effect in general, > unexpected ICE often occurs when precise semantic analysis is required, > such as analysis in function "simplify_const_unary_operation" in > simplify-rtx.

Re: [PATCH v2] testsuite: Sanitize pacbti test cases for Cortex-M

2024-09-06 Thread Christophe Lyon
On 9/6/24 11:17, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? Changes since v1: - Corrected changelog entry for pac-15.c - Added a tab before all the asm instructions in the pac-*.c and bti-*.c tests - Corrected the expected number of bti instructions for bti-2.c as it previou

[PATCH] c++: Properly mangle CONST_DECL without a INTEGER_CST value [PR116511]

2024-09-06 Thread Simon Martin
We ICE upon the following *valid* code when mangling the requires clause === cut here === template struct s1 { enum { e1 = 1 }; }; template struct s2 { enum { e1 = s1::e1 }; s2() requires(0 != e1) {} }; s2<8> a; === cut here === The problem is that the mangler wrongly assumes that the DEC

Re: [PATCH] RISC-V: Fixed incorrect semantic description in DF to DI pattern in the Zfa extension on rv32.

2024-09-06 Thread Jin Ma
> Do you have a test case for this or does it fail already in the test suite? > > -- > Regards > Robin Sorry, I'll try to write it. BR Jin

[PATCH v2] RISC-V: Fixed incorrect semantic description in DF to DI pattern in the Zfa extension on rv32.

2024-09-06 Thread Jin Ma
In the process of DF to SI, we generally use "unsigned_fix" rather than "truncate" for conversion. Although this has no effect in general, unexpected ICE often occurs when precise semantic analysis is required. gcc/ChangeLog: * config/riscv/riscv.md: Change "truncate" to "unsigned_fix" f

[PATCH v2 0/9] SMALL code model fixes, optimization fixes, LTO and minimal C++ enablement

2024-09-06 Thread Evgeny Karpov
Hello, Thank you for reviewing v1! v2 Changes: - Add extra comments and extend patch descriptions. - Extract libstdc++ changes to a separate patch. - Minor style refactoring based on the reviews. - Unify mingw_pe_declare_type for functions and objects. Regards, Evgeny Evgeny Karpov (9): Suppo

[PATCH v2 1/9] Support weak references

2024-09-06 Thread Evgeny Karpov
The patch adds support for weak references. The original MinGW implementation targets ix86, which handles weak symbols differently compared to AArch64. In AArch64, the weak symbols are replaced by other symbols which reference the original weak symbols, and the compiler does not track the original

[PATCH v2 2/9] aarch64: Add debugging information

2024-09-06 Thread Evgeny Karpov
This patch enables DWARF and allows compilation with debugging information by using "gcc -g". The unwind info is disabled for the moment and will be revisited after SEH implementation for the target. gcc/ChangeLog: * config/aarch64/aarch64.cc (TARGET_ASM_UNALIGNED_HI_OP): Enable D

[PATCH v2 4/9] aarch64: Exclude symbols using GOT from code models

2024-09-06 Thread Evgeny Karpov
Symbols using GOT are not supported by the aarch64-w64-mingw32 target and should be excluded from the code models. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_classify_symbol): Disable GOT for PECOFF target. --- gcc/config/aarch64/aarch64.cc | 6 -- 1 file changed, 4

[PATCH v2 3/9] aarch64: Add minimal C++ support

2024-09-06 Thread Evgeny Karpov
The patch resolves compilation issues for the C++ language. Previous patch series contributed to C++ as well, however, C++ could not be tested until we got a C++ compiler and could build at least a "Hello World" C++ program, and in reality, more than that. Another issue has been fixed in the libst

[PATCH v2 5/9] aarch64: Multiple adjustments to support the SMALL code model correctly

2024-09-06 Thread Evgeny Karpov
LOCAL_LABEL_PREFIX has been changed to help the assembly compiler recognize local labels. Emitting locals has been replaced with the .lcomm directive to declare uninitialized data without defining an exact section. Functions and objects were missing declarations. Binutils was not able to distinguis

[PATCH v2 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-06 Thread Evgeny Karpov
aarch64.cc has been updated to prevent emitting "symbol + offset" for SYMBOL_SMALL_ABSOLUTE for the PECOFF target. "symbol + offset" cannot be used in relocations for aarch64-w64-mingw32 due to relocation requirements. Instead, it will adjust the address by an offset with the "add" instruction. T

[PATCH v2 7/9] aarch64: Disable the anchors

2024-09-06 Thread Evgeny Karpov
The anchors have been disabled as they use symbol + offset, which is not applicable for COFF AArch64. gcc/ChangeLog: * config/aarch64/aarch64.cc (TARGET_MIN_ANCHOR_OFFSET): Keep default TARGET_MAX_ANCHOR_OFFSET for PECOFF target. (TARGET_MAX_ANCHOR_OFFSET): Likewise. ---

[PATCH] c++, v2: Implement for static locals CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-09-06 Thread Jakub Jelinek
Hi! On Wed, Aug 14, 2024 at 06:11:35PM +0200, Jakub Jelinek wrote: > Here is the I believe ABI compatible version, which uses the separate > guard variables, so different structured binding variables can be > initialized in different threads, but the thread that did the artificial > base initializ

[PATCH v2 8/9] Add LTO support

2024-09-06 Thread Evgeny Karpov
The patch reuses the configuration for LTO from ix86 and adds the aarch64 architecture to the list of supported COFF headers. gcc/ChangeLog: * config/aarch64/cygming.h (TARGET_ASM_LTO_START): New. (TARGET_ASM_LTO_END): Likewise. * config/i386/cygming.h (TARGET_ASM_LTO_STAR

[PATCH v2 9/9] aarch64: Handle alignment when it is bigger than BIGGEST_ALIGNMENT

2024-09-06 Thread Evgeny Karpov
In some cases, the alignment can be bigger than BIGGEST_ALIGNMENT. The issue was detected while building FFmpeg. It creates structures, most likely for AVX optimization. For instance: float __attribute__((aligned (32))) large_aligned_array[3]; BIGGEST_ALIGNMENT could be up to 512 bits on x64. Th
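A small C sketch of the over-aligned declaration quoted in this message may help. The array declaration is taken from the message; the helper `is_aligned_32` is hypothetical and only checks the resulting address on whatever host compiles it. It does not model the aarch64-w64-mingw32 handling that the patch adds.

```c
#include <stdint.h>

/* Declaration quoted in the message: a request for 32-byte alignment.  */
float __attribute__((aligned (32))) large_aligned_array[3];

/* Hypothetical helper: returns nonzero if P sits on a 32-byte boundary.  */
int
is_aligned_32 (const void *p)
{
  return ((uintptr_t) p % 32) == 0;
}
```

On a target that honours the attribute, `is_aligned_32 (large_aligned_array)` is expected to return nonzero.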

Re: [PATCH v2 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-06 Thread Martin Storsjö
On Fri, 6 Sep 2024, Evgeny Karpov wrote: aarch64.cc has been updated to prevent emitting "symbol + offset" for SYMBOL_SMALL_ABSOLUTE for the PECOFF target. "symbol + offset" cannot be used in relocations for aarch64-w64-mingw32 due to relocation requirements. Instead, it will adjust the address

Re: [PATCH] aarch64: Use is_attribute_namespace_p and get_attribute_name inside aarch64_lookup_shared_state_flags [PR116598]

2024-09-06 Thread Richard Sandiford
Andrew Pinski writes: > The code in aarch64_lookup_shared_state_flags assumed all C++11 attributes on the > function type > had a namespace associated with them. But with the addition of > reproducible/unsequenced, > this is not true. > > This fixes the issue by using is_attribute_namespace_p instead of

[PATCH] Fix SLP double-reduction support

2024-09-06 Thread Richard Biener
When doing SLP discovery I forgot to handle double reductions even though they are already queued in LOOP_VINFO_REDUCTIONS. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. * tree-vect-slp.cc (vect_analyze_slp): Also handle discovery for double reductions. --- gcc/tre

[PATCH] x86-64: Don't use temp for argument in a TImode register

2024-09-06 Thread H.J. Lu
Don't use temp for a PARALLEL BLKmode argument of an EXPR_LIST expression in a TImode register. Otherwise, the TImode variable will be put in the GPR save area which guarantees only 8-byte alignment. gcc/ PR target/116621 * config/i386/i386.cc (ix86_gimplify_va_arg): Don't use te

[PATCH] match: Change (A * B) + (-C) to (B - C/A) * A, if C multiple of A [PR109393]

2024-09-06 Thread konstantinos . eleftheriou
From: kelefth The following function: int foo(int *a, int j) { int k = j - 1; return a[j - 1] == a[k]; } does not fold to `return 1;` using -O2 or higher. The cause of this is that the expression `4 * j + (-4)` for the index computation is not folded to `4 * (j - 1)`. Existing simplificatio
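To make the arithmetic in this message concrete, here is a C sketch. The function `foo` is quoted from the message; the two index helpers are hypothetical and assume a 4-byte `int`, which is where the `4 * j + (-4)` in the message comes from.

```c
/* The function quoted in the message.  Both subscripts name the same
   element, so the comparison is always true.  */
int
foo (int *a, int j)
{
  int k = j - 1;
  return a[j - 1] == a[k];
}

/* Hypothetical helpers: the byte offset as computed before and after
   the fold, assuming sizeof (int) == 4.  */
long
idx_unfolded (long j)
{
  return 4 * j + (-4);
}

long
idx_folded (long j)
{
  return 4 * (j - 1);
}
```

The two helpers return the same value for any `j` that does not overflow. This sketch only shows the algebraic identity, not the match.pd pattern itself.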

[PATCH]middle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognition [PR116628]

2024-09-06 Thread Tamar Christina
Hi All, Because the vect_recog_bool_pattern can at the moment still transition out of GIMPLE and back into GENERIC the vect_recog_cond_store_pattern can end up using an expression as a mask rather than an SSA_NAME. This adds an explicit check that we have a mask and not an expression. Bootstrapp

Re: [PATCH RFC] c-family: add attribute flag_enum [PR46457]

2024-09-06 Thread Jonathan Wakely
On 05/09/24 21:44 -0400, Jason Merrill wrote: On 9/4/24 11:02 AM, Marek Polacek wrote: +handle_flag_enum_attribute (tree *node, tree ARG_UNUSED(name), tree args, + int ARG_UNUSED (flags), bool *no_add_attrs) +{ + if (args) +warning (OPT_Wattributes, "%qE attribute

Re: [PATCH]middle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognition [PR116628]

2024-09-06 Thread Richard Biener
On Fri, 6 Sep 2024, Tamar Christina wrote: > Hi All, > > Because the vect_recog_bool_pattern can at the moment still transition > out of GIMPLE and back into GENERIC the vect_recog_cond_store_pattern can > end up using an expression as a mask rather than an SSA_NAME. > > This adds an explicit ch

Re: [PATCH 2/4]middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

2024-09-06 Thread Richard Biener
On Tue, 3 Sep 2024, Tamar Christina wrote: > Hi All, > > Currently the vectorizer cheats when lowering COND_EXPR during bool recog. > In the cases where the conditional is loop invariant or non-boolean it instead > converts the operation back into GENERIC and hides much of the operation from > the

[PATCH v3] GCC Driver : Enable very long gcc command-line option

2024-09-06 Thread Deepthi . Hemraj
From: Deepthi Hemraj For excessively long environment variables (i.e. >128KB), store the arguments in a temporary file and collect them back together in collect2. This commit addresses the COLLECT_GCC_OPTIONS issue: GCC should not limit the length of the command line passed to collect2. https://gcc.gnu.o

Re: [PATCH]middle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognition [PR116628]

2024-09-06 Thread Kyrylo Tkachov
Hi Tamar, > On 6 Sep 2024, at 14:56, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > Because the vect_recog_bool_pattern can at the moment still transition > out of GIMPLE and back into GENERIC the vect_recog_cond_store_pattern can > end

Re: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

2024-09-06 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This adds vector constant simplification for EQ and NE. This is useful since > the vectorizer generates a lot more vector compares now, in particular NE and > EQ > and so these help us optimize cases where the values were not known at GIMPLE > but instead on

RE: [PATCH 2/4]middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

2024-09-06 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, September 6, 2024 2:09 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH 2/4]middle-end: lower COND_EXPR into gimple form in > vect_recog_bool_pattern > > On Tue, 3 Sep 2024

[r15-3509 Regression] FAIL: gcc.target/i386/pr88531-2c.c scan-assembler-times vmulps 1 on Linux/x86_64

2024-09-06 Thread haochen.jiang
On Linux/x86_64, d34cda720988674bcf8a24267c9e1ec61335d6de is the first bad commit commit d34cda720988674bcf8a24267c9e1ec61335d6de Author: Richard Biener Date: Fri Sep 29 12:54:17 2023 +0200 Handle non-grouped stores as single-lane SLP caused FAIL: gcc.dg/vect/slp-19c.c -flto -ffat-lto-ob

[PATCH][PR116569] match.pd: Check trunc_mod vector obtap before folding.

2024-09-06 Thread Jennifer Schmitz
In the pattern X - (X / Y) * Y to X % Y, this patch guards the simplification for vector types by a check for: 1) Support of the mod optab for vectors OR 2) Application during early gimple passes (using PROP_gimple_any). This is to prevent reverting vectorization of modulo to div/mult/sub if the ta
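The identity behind this pattern can be sketched in scalar C. The patch concerns vector types and optab availability, which this sketch does not model, and the helper names are hypothetical.

```c
/* Hypothetical helper: the expanded div/mult/sub form.  */
int
rem_expanded (int x, int y)
{
  return x - (x / y) * y;
}

/* Hypothetical helper: the folded form.  C99 and later define integer
   division as truncating toward zero, so (x / y) * y + x % y == x
   whenever x / y is representable.  */
int
rem_direct (int x, int y)
{
  return x % y;
}
```

Both helpers require `y != 0` and that `x / y` does not overflow, which excludes `INT_MIN / -1`.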

RE: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons

2024-09-06 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, September 6, 2024 2:21 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH 3/4][rtl]: simplify boolean vector EQ and NE comparisons > > Tamar Christina writes: > > Hi All, > > > > This adds vect

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-06 Thread Qing Zhao
> On Sep 5, 2024, at 18:22, Bill Wendling wrote: > > Hi Qing, > > Sorry for my late reply. > > On Thu, Aug 29, 2024 at 7:22 AM Qing Zhao wrote: >> >> Hi, >> >> Thanks for the information. >> >> Yes, providing a unary operator similar as __counted_by(PTR) as suggested by >> multiple peopl

Re: [PATCH][PR116569] match.pd: Check trunc_mod vector obtap before folding.

2024-09-06 Thread Jakub Jelinek
On Fri, Sep 06, 2024 at 01:46:01PM +, Jennifer Schmitz wrote: > In the pattern X - (X / Y) * Y to X % Y, this patch guards the > simplification for vector types by a check for: > 1) Support of the mod optab for vectors OR > 2) Application during early gimple passes (using PROP_gimple_any). > Th

[PATCH] vect: Do not try to duplicate_and_interleave one-element mode.

2024-09-06 Thread Robin Dapp
Hi, PR112694 shows that we try to create sub-vectors of single-element vectors because can_duplicate_and_interleave_p returns true. The problem resurfaced in PR116611. This patch makes can_duplicate_and_interleave_p return false if count / nvectors > 0 and removes the corresponding check in the r

Re: [PATCH][PR116569] match.pd: Check trunc_mod vector obtap before folding.

2024-09-06 Thread Kyrylo Tkachov
> On 6 Sep 2024, at 16:00, Jakub Jelinek wrote: > > External email: Use caution opening links or attachments > > > On Fri, Sep 06, 2024 at 01:46:01PM +, Jennifer Schmitz wrote: >> In the pattern X - (X / Y) * Y to X % Y, this patch guards the >> simplification for vector types by a check

Re: [PATCH][PR116569] match.pd: Check trunc_mod vector obtap before folding.

2024-09-06 Thread Jakub Jelinek
On Fri, Sep 06, 2024 at 02:10:19PM +, Kyrylo Tkachov wrote: > > This is certainly wrong. > > PROP_gimple_any is set already at the end of gimplification, so certainly > > doesn't include any other early gimple passes. > > And, not all statements are folded during gimplification, e.g. in OpenMP

[r15-3509 Regression] FAIL: gcc.target/i386/pr88531-2c.c scan-assembler-times vmulps 1 on Linux/x86_64

2024-09-06 Thread haochen.jiang
On Linux/x86_64, d34cda720988674bcf8a24267c9e1ec61335d6de is the first bad commit commit d34cda720988674bcf8a24267c9e1ec61335d6de Author: Richard Biener Date: Fri Sep 29 12:54:17 2023 +0200 Handle non-grouped stores as single-lane SLP caused FAIL: gcc.dg/vect/fast-math-vect-call-2.c scan

Re: [PATCH 2/4]middle-end: lower COND_EXPR into gimple form in vect_recog_bool_pattern

2024-09-06 Thread Richard Biener
> On 06.09.2024 at 15:28, Tamar Christina wrote: > > >> >> -Original Message- >> From: Richard Biener >> Sent: Friday, September 6, 2024 2:09 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com >> Subject: Re: [PATCH 2/4]middle-end: lower COND_EXP

Re: [PATCH] vect: Do not try to duplicate_and_interleave one-element mode.

2024-09-06 Thread Richard Biener
> On 06.09.2024 at 16:05, Robin Dapp wrote: > > Hi, > > PR112694 shows that we try to create sub-vectors of single-element > vectors because can_duplicate_and_interleave_p returns true. Can we avoid querying the function? CCing Richard who should know more about this. Richard > The pr

[PATCH v2 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-06 Thread Evgeny Karpov
Friday, September 6, 2024 Martin Storsjö wrote: > Sorry, but no. > > You can't just redefine how relocations in your object file format works, > just because you feel like it. This patch changes how symbol with offset will be emitted. It will change: adrp x0, symbol + offset to: adrp x0, sym

RE: [PATCH]middle-end: check that the lhs of a COND_EXPR is an SSA_NAME in cond_store recognition [PR116628]

2024-09-06 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, September 6, 2024 2:15 PM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard Biener > ; j...@ventanamicro.com > Subject: Re: [PATCH]middle-end: check that the lhs of a COND_EXPR is an > SSA_NAME in cond_store recognition [PR1

Re: [PATCH] c++: template depth of lambda in default targ [PR116567]

2024-09-06 Thread Patrick Palka
On Thu, 5 Sep 2024, Jason Merrill wrote: > On 9/5/24 2:28 PM, Patrick Palka wrote: > > On Thu, 5 Sep 2024, Jason Merrill wrote: > > > > > On 9/5/24 1:26 PM, Patrick Palka wrote: > > > > On Thu, 5 Sep 2024, Jason Merrill wrote: > > > > > > > > > On 9/5/24 10:54 AM, Patrick Palka wrote: > > > > >

Re: [PATCH] c++: template depth of lambda in default targ [PR116567]

2024-09-06 Thread Jason Merrill
On 9/6/24 11:19 AM, Patrick Palka wrote: On Thu, 5 Sep 2024, Jason Merrill wrote: On 9/5/24 2:28 PM, Patrick Palka wrote: On Thu, 5 Sep 2024, Jason Merrill wrote: On 9/5/24 1:26 PM, Patrick Palka wrote: On Thu, 5 Sep 2024, Jason Merrill wrote: On 9/5/24 10:54 AM, Patrick Palka wrote: Boo

Re: [PATCH v2 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-06 Thread Martin Storsjö
On Fri, 6 Sep 2024, Evgeny Karpov wrote: Friday, September 6, 2024 Martin Storsjö wrote: Sorry, but no. You can't just redefine how relocations in your object file format works, just because you feel like it. This patch changes how symbol with offset will be emitted. It will change: adrp

Re: [PATCH v2] GCC Driver : Enable very long gcc command-line option

2024-09-06 Thread Dora, Sunil Kumar
Hi Andrew, Thank you for your feedback. Initially, we attempted to address the issue by utilizing GCC’s response files. However, we discovered that the COLLECT_GCC_OPTIONS variable already contains the expanded contents of the response files. As a result, using response files only mitigates th

[PATCH] RISC-V: Fix ICE for rvv in lto

2024-09-06 Thread Jin Ma
When we use flto, the function list of rvv will be generated twice, once in the cc1 phase and once in the lto phase. However, due to the different generation methods, the two lists are different. For example, when there is no zvfh or zvfhmin in arch, it is generated by calling function "riscv_prag

[PATCH v4] RISC-V: Fix illegal operands "th.vsetvli zero, 0, e32, m8" for XTheadVector

2024-09-06 Thread Jin Ma
Since the THeadVector vsetvli does not support vl as an immediate, we need to convert 0 to zero when outputting asm. PR target/116592 gcc/ChangeLog: * config/riscv/thead.cc (th_asm_output_opcode): Change '0' to "zero" gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/xthe

Re: [PATCH v3] RISC-V: Fix illegal operands "th.vsetvli zero,0,e32,m8" for XTheadVector

2024-09-06 Thread Jin Ma
> Sorry, I still don't see assembly check. I am very sorry, I uploaded the wrong patch and I tried to correct it :) By the way, there seems to be no exp here to really test the XTheadVector test cases, which may have been missed when the XTheadVector extension was first implemented. Maybe I s

Re: [PATCH v2] GCC Driver : Enable very long gcc command-line option

2024-09-06 Thread Andrew Pinski
On Fri, Sep 6, 2024, 9:38 AM Dora, Sunil Kumar < sunilkumar.d...@windriver.com> wrote: > Hi Andrew, > > Thank you for your feedback. Initially, we attempted to address the issue > by utilizing GCC’s response files. However, we discovered that the > COLLECT_GCC_OPTIONS variable already contains the

Re: New version of unsiged patch

2024-09-06 Thread Steve Kargl
On Thu, Sep 05, 2024 at 09:07:20AM +0200, Thomas Koenig wrote: > Ping (a little bit)? > > With another weekend coming up, I would have some time to > work on incorporating any feedback, or on putting in > more intrinsics. > In the documentation, you have +Generally, unsigned integers are only p

[PATCH] c++: Fix up pedantic handling of alignas [PR110345]

2024-09-06 Thread Jakub Jelinek
Hi! The following patch on top of the PR110345 P2552R3 series: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661904.html https://gcc.gnu.org/pipermail/gcc-patches/2024-Aug

[PATCH] c++, v3: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-06 Thread Jakub Jelinek
On Wed, Sep 04, 2024 at 10:31:48PM +0200, Franz Sirl wrote: > Hmm, it just occurred to me, how about adding !NONVIRTUAL here? When > NONVIRTUAL is true, there is no conditional stmt at all, or? Yeah, that makes sense, the problem doesn't happen in that case. Here is an adjusted patch, bootstrapped

[PATCH] libiberty: Fix up > 64K section handling in simple_object_elf_copy_lto_debug_section [PR116614]

2024-09-06 Thread Jakub Jelinek
Hi! cat abc.C #define A(n) struct T##n {} t##n; #define B(n) A(n##0) A(n##1) A(n##2) A(n##3) A(n##4) A(n##5) A(n##6) A(n##7) A(n##8) A(n##9) #define C(n) B(n##0) B(n##1) B(n##2) B(n##3) B(n##4) B(n##5) B(n##6) B(n##7) B(n##8) B(n##9) #define D(n) C(n##0) C(n##1) C(n##2) C(n##3) C(n##4) C(n##5) C

Re: New version of unsiged patch

2024-09-06 Thread Steve Kargl
On Thu, Sep 05, 2024 at 09:07:20AM +0200, Thomas Koenig wrote: > Ping (a little bit)? > > With another weekend coming up, I would have some time to > work on incorporating any feedback, or on putting in > more intrinsics. > Last comment as I've made it to the end of the patch. Your testcases ar

Re: New version of unsiged patch

2024-09-06 Thread Steve Kargl
On Sun, Aug 18, 2024 at 12:10:18PM +0200, Thomas Koenig wrote: > Hello world, > > this version of the patch includes DOT_PRODUCT, MATMUL and quite > a few improvements for simplification. > All, I have gone through Thomas's current patch and sent a few emails with comments to him. To keep thin

Re: [PATCH RFC] c-family: add attribute flag_enum [PR46457]

2024-09-06 Thread Jason Merrill
On 9/6/24 8:56 AM, Jonathan Wakely wrote: On 05/09/24 21:44 -0400, Jason Merrill wrote: On 9/4/24 11:02 AM, Marek Polacek wrote: +handle_flag_enum_attribute (tree *node, tree ARG_UNUSED(name), tree args, +    int ARG_UNUSED (flags), bool *no_add_attrs) +{ +  if (args) +    warning

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-06 Thread Martin Uecker
Am Freitag, dem 06.09.2024 um 13:59 + schrieb Qing Zhao: > > > On Sep 5, 2024, at 18:22, Bill Wendling wrote: > > > > Hi Qing, > > > > Sorry for my late reply. > > > > On Thu, Aug 29, 2024 at 7:22 AM Qing Zhao wrote: > > > > > > Hi, > > > > > > Thanks for the information. > > > > > > Y

[pushed] c++: adjust testcase to reveal failure [PR107919]

2024-09-06 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- This test appeared to be passing, but only because the warning was suppressed by #pragma system_header. PR tree-optimization/107919 gcc/testsuite/ChangeLog: * g++.dg/warn/Wuninitialized-pr107919-1.C: Add -Wsystem-headers a

[committed] libstdc++: Fix std::chrono::parse for TAI and GPS clocks

2024-09-06 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. This should be backported too. I noticed while testing this that all the from_stream overloads for time_point specializations use time_point_cast to convert to the correct result type. That's wrong, but I'll fix that separately, as it affects more than just GP

[PATCH] libstdc++: Adjust std::span::iterator to be ADL-proof

2024-09-06 Thread Jonathan Wakely
This proposed patch means that span is not an associated type of span::iterator, which means that we won't try to complete T when doing ADL in the constraints for const_iterator. This makes it more reliable to use std::span. See https://github.com/llvm/llvm-project/issues/107215 for more info on t

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-06 Thread Bill Wendling
On Fri, Sep 6, 2024 at 12:32 PM Martin Uecker wrote: > > Am Freitag, dem 06.09.2024 um 13:59 + schrieb Qing Zhao: > > > > > On Sep 5, 2024, at 18:22, Bill Wendling wrote: > > > > > > Hi Qing, > > > > > > Sorry for my late reply. > > > > > > On Thu, Aug 29, 2024 at 7:22 AM Qing Zhao wrote: >

Re: [PATCH RFC] c-family: add attribute flag_enum [PR46457]

2024-09-06 Thread Jonathan Wakely
On Fri, 6 Sept 2024 at 20:17, Jason Merrill wrote: > > On 9/6/24 8:56 AM, Jonathan Wakely wrote: > > On 05/09/24 21:44 -0400, Jason Merrill wrote: > >> On 9/4/24 11:02 AM, Marek Polacek wrote: > +handle_flag_enum_attribute (tree *node, tree ARG_UNUSED(name), tree > args, > +

[PATCH] gimple-fold: Move optimizing memcpy to memset to fold_stmt from fab

2024-09-06 Thread Andrew Pinski
I noticed this folding inside fab could be done elsewhere and could even improve inlining decisions and a few other things so let's move it to fold_stmt. It also fixes PR 116601 because places which call fold_stmt already have to deal with the stmt becoming a non-throw statement. For the fix for
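
As a rough source-level illustration of the kind of pattern the subject line refers to (a memcpy whose source is known to be all zeros can be replaced by a memset of the destination), consider the sketch below. It only shows that the two forms produce the same bytes. The struct and function names are invented for this example, and the exact conditions GCC checks are not reproduced here.

```cpp
#include <cassert>
#include <cstring>

// Illustrative type; the name and size are arbitrary.
struct Buf { unsigned char data[64]; };

// Form 1: zero a temporary, then copy it into the destination.
void copy_zeroed(Buf *dst)
{
  Buf tmp;
  std::memset(&tmp, 0, sizeof tmp);
  std::memcpy(dst, &tmp, sizeof tmp);
}

// Form 2: what the copy is equivalent to, a direct memset.
void set_zeroed(Buf *dst)
{
  std::memset(dst, 0, sizeof *dst);
}

// Returns true when both forms leave identical bytes behind.
bool same_result()
{
  Buf a, b;
  std::memset(&a, 0xff, sizeof a);  // start from non-zero contents
  std::memset(&b, 0xff, sizeof b);
  copy_zeroed(&a);
  set_zeroed(&b);
  return std::memcmp(&a, &b, sizeof a) == 0;
}

void check_equivalence()
{
  assert(same_result());
}
```

Because the second form never reads the temporary, a compiler that recognises the pattern can drop both the temporary and the copy.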

Re: [PATCH v4] RISC-V: Fix illegal operands "th.vsetvli zero, 0, e32, m8" for XTheadVector

2024-09-06 Thread 钟居哲
LGTM juzhe.zh...@rivai.ai From: Jin Ma Date: 2024-09-07 01:40 To: gcc-patches CC: jeffreyalaw; juzhe.zhong; pan2.li; kito.cheng; jinma.contrib; Jin Ma; nihui Subject: [PATCH v4] RISC-V: Fix illegal operands "th.vsetvli zero,0,e32,m8" for XTheadVector Since the THeadVector vsetvli does not sup

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-06 Thread Qing Zhao
Now, if 1. __builtin_get_counted_by should return a LVALUE instead of a pointer (required by CLANG’s design) And 2. It’s better not to change the behavior of __builtin_choose_expr. Then the solution left is: __builtin_get_counted_by (p->FAM) returns a LVALUE as p->COUNT if p->FAM has a counted

RE: [PATCH] RISC-V: Fix ICE for rvv in lto

2024-09-06 Thread Li, Pan2
> +/* Test that we do not have ice when compile */ > + > +/* { dg-do run } */ > +/* { dg-options "-march=rv64gcv -mabi=lp64d -mrvv-vector-bits=zvl -flto -O2 > -fno-checking" } */ > + > +#include > + > +int > +main () > +{ > + size_t vl = 8; > + vint32m1_t vs1 = {}; > + vint32m1_t vs2 = {}; > +
