Re: [PATCH] PR target/116365: Add user-friendly arguments to --param aarch64-autovec-preference=N

2024-08-21 Thread Jennifer Schmitz
On 21 Aug 2024, at 16:03, Richard Sandiford wrote: > Kyrylo Tkachov writes: >>> On 20 Aug 2024, at 19:11, Richard Sandiford >>> wrot>> Jennifer Schmitz writes: The param aarch64-autovec-preference=N is a useful tool for t

[GCC13/GCC12 PATCH] Fix testcase failure.

2024-08-21 Thread liuhongt
Looks like -mprefer-vector-width=128 doesn't impact store_max/mov_max for GCC13/GCC12 branch, explicitly use -mmove-max=128, -mstore-max=128 for those testcases. Committed as an obvious fix. gcc/testsuite/ChangeLog: * gcc.target/i386/pieces-memcpy-10.c: Use -mmove-max=256 and -mst

[PATCH v4] RISC-V: Enable -gvariable-location-views by default

2024-08-21 Thread Bernd Edlinger
This affects only the RISC-V targets, where the compiler options -gvariable-location-views and consequently also -ginline-points are disabled by default, which is unexpected and disables some useful features of the generated debug info. Due to a bug in the gas assembler the .loc statement is not u

Re: [PATCH v3] RISC-V: Enable -gvariable-location-views by default

2024-08-21 Thread Bernd Edlinger
Hello Alexandre, On 8/22/24 05:39, Alexandre Oliva wrote: > Hello, Bernd, > > Thanks for the fixes and improvements, your patch looks good to me. I > stand behind its approval by someone with authority to do so. > > I believe location views are somewhat problematic on RISC-V, because of > the o

[PATCH V2] rs6000: add clober and guard for vsx_stxvd2x4_le_const[pr116030]

2024-08-21 Thread Jiufu Guo
Hi, Previously, vsx_stxvd2x4_le_const_ was introduced for the 'split1' pass, so it is guarded by "can_create_pseudo_p ()". However, it is possible to match the pattern of this insn during/after RA, so this insn could be updated to make it work for the split pass after RA. And this insn would not be the

[PATCH v2] Do not emit a redundant DW_TAG_lexical_block for inlined subroutines

2024-08-21 Thread Bernd Edlinger
While this already works correctly for the case when an inlined subroutine contains only one subrange, a redundant DW_TAG_lexical_block is still emitted when the subroutine has multiple blocks. Fixes: ac02e5b75451 ("re PR debug/37801 (DWARF output for inlined functions doesn'

[r13-8987 Regression] FAIL: gcc.target/i386/pieces-strcpy-2.c scan-assembler-times vmovdqu[ \\t]+[^\n]*%xmm 4 on Linux/x86_64

2024-08-21 Thread haochen.jiang
On Linux/x86_64, aea374238cec1a1e53fb79575d2f998e16926999 is the first bad commit commit aea374238cec1a1e53fb79575d2f998e16926999 Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_max,store_max} with vectorizer. caused FAIL: gcc.target/i386/pieces-memcpy-10.c scan-as

Re: [PATCH 3/9] c++, coroutines: Separate allocator work from the ramp body build.

2024-08-21 Thread Jason Merrill
On 8/21/24 3:10 PM, Iain Sandoe wrote: This splits out the building of the allocation and deallocation expressions and runs them early in the ramp build, so that we can exit if they are not usable, before we start building the ramp body. Likewise move checks for other required resources to the b

Re: [PATCH v3] RISC-V: Enable -gvariable-location-views by default

2024-08-21 Thread Alexandre Oliva
Hello, Bernd, Thanks for the fixes and improvements, your patch looks good to me. I stand behind its approval by someone with authority to do so. I believe location views are somewhat problematic on RISC-V, because of the object code relaxations. Its usefulness is also limited without support f

Re: [PATCH v3 2/2] Prevent divide-by-zero

2024-08-21 Thread Edwin Lu
Hi, Just wanted to ping this for more guidance. Edwin On 7/24/2024 12:03 PM, Edwin Lu wrote: On 7/24/2024 3:52 AM, Richard Biener wrote: On Wed, Jul 24, 2024 at 1:31 AM Edwin Lu wrote: On 7/23/2024 11:20 AM, Richard Sandiford wrote: Edwin Lu writes: On 7/23/2024 4:56 AM, Richard Biener

[committed][PR rtl-optimization/116437] Fix RTL checking issue in ext-dce

2024-08-21 Thread Jeff Law
Another RTL checking failure in ext-dce. An easy one to fix this time. When we optimize an extension we have to go back and cleanup with SUBREG_PROMOTED state. So we record the destination register into a bitmap as we make changes, then later do a single pass over the IL fixing any associate

Re: [PATCH 2/9] c++, coroutines: Separate the analysis, ramp and outlined function synthesis.

2024-08-21 Thread Jason Merrill
On 8/21/24 3:09 PM, Iain Sandoe wrote: This change is preparation for fixes to the ramp and codegen to follow. The primary motivation is that we have three activities; analysis, ramp synthesis and outlined coroutine body synthesis. These are currently carried out in sequence in the 'morph_fn_to_

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Bill Wendling
On Wed, Aug 21, 2024 at 2:54 PM Bill Wendling wrote: > > On Wed, Aug 21, 2024 at 10:44 AM Kees Cook wrote: > > > > On Wed, Aug 21, 2024 at 03:27:56PM +, Qing Zhao wrote: > > > > On Aug 21, 2024, at 10:45, Martin Uecker wrote: > > > > > > > > Am Mittwoch, dem 21.08.2024 um 16:34 +0200 schrieb

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Bill Wendling
On Wed, Aug 21, 2024 at 10:52 AM Qing Zhao wrote: > > On Aug 21, 2024, at 11:43, Martin Uecker wrote: > > Am Mittwoch, dem 21.08.2024 um 15:24 + schrieb Qing Zhao: > >>> > >>> But if we changed it to return a void pointer, we could make this > >>> a compile-time check: > >>> > >>> auto ret =

Re: [PATCH 1/9] c++, coroutines: Split the ramp build into a separate function.

2024-08-21 Thread Jason Merrill
On 8/21/24 3:09 PM, Iain Sandoe wrote: This is primarily preparation to partition the functionality of the coroutine transform into analysis, ramp generation and then (later) synthesis of the coroutine body. The patch does fix one latent issue in the ordering of DTORs for frame parameter copies

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Bill Wendling
On Wed, Aug 21, 2024 at 10:44 AM Kees Cook wrote: > > On Wed, Aug 21, 2024 at 03:27:56PM +, Qing Zhao wrote: > > > On Aug 21, 2024, at 10:45, Martin Uecker wrote: > > > > > > Am Mittwoch, dem 21.08.2024 um 16:34 +0200 schrieb Martin Uecker: > > >> Am Mittwoch, dem 21.08.2024 um 14:12 + sc

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Bill Wendling
On Tue, Aug 20, 2024 at 6:41 AM Qing Zhao wrote: > > On Aug 20, 2024, at 05:58, Richard Biener > > wrote: > > > > On Tue, Aug 13, 2024 at 5:34 PM Qing Zhao wrote: > >> > >> With the addition of the 'counted_by' attribute and its wide roll-out > >> within the Linux kernel, a use case has been fo

Re: [PATCH] warn-access: ignore template parameters when matching operator new/delete [PR109224]

2024-08-21 Thread Arsen Arsenović
Jason Merrill writes: > On 8/2/24 4:36 PM, Arsen Arsenović wrote: >> I'm not 100% clear on what the semantics of the matching here are meant >> to be - AFAICT, an operator new/delete pair matches (after falling >> through the other cases) if all their components (besides the actual >> operator na

[PATCH] RISC-V: Fix vector cfi notes for stack-clash protection

2024-08-21 Thread Raphael Moreira Zinsly
The stack-clash code is generating wrong cfi directives in riscv_v_adjust_scalable_frame because REG_CFA_DEF_CFA has a different encoding than REG_FRAME_RELATED_EXPR, this patch fixes the offset sign in prologue and starts using REG_CFA_DEF_CFA in the epilogue. gcc/ChangeLog: * config/ris

Re: [pushed] c++, coroutines: Check for malformed functions before splitting.

2024-08-21 Thread Jason Merrill
On 8/21/24 4:28 AM, Iain Sandoe wrote: tested on x86_64-darwin, powerpc64-linux and against cppcoro and folly coroutines tests, pushed to trunk as obvious, thanks, Iain --- 8< --- This performs the same basic check that is done by finish_function to catch cases where the function is so badly ma

[PATCH] libstdc++: Define operator== for hash table iterators [PR115939]

2024-08-21 Thread Jonathan Wakely
Tested x86_64-linux. I plan to push this soon. -- >8 -- Currently iterators for unordered containers do not directly define operator== and operator!= overloads. Instead they rely on the base class defining them, which is done so that iterator and const_iterator comparisons work using the same ove

Re: [PATCH] c++, coroutines: Tidy up awaiter variable checks.

2024-08-21 Thread Jason Merrill
On 8/21/24 4:34 AM, Iain Sandoe wrote: Tested on x86_64-darwin, powerpc64le-linux, and against cppcoro and folly coroutines testsuites, OK for trunk? thanks Iain --- 8< --- When we build an await expression, we might need to materialise the awaiter if it is a prvalue. This re-implements this u

Re: [PATCH] coroutines: diagnose usage of alloca in coroutines

2024-08-21 Thread Jason Merrill
On 8/7/24 9:15 AM, Arsen Arsenović wrote: Tested on x86_64-pc-linux-gnu. OK for trunk? -- >8 -- We do not support it currently, and the resulting memory can only be used inside a single resumption, so best not confuse the user with it. PR c++/115858 - Incompatibility of coroutin

Re: [PATCH] warn-access: ignore template parameters when matching operator new/delete [PR109224]

2024-08-21 Thread Jason Merrill
On 8/2/24 4:36 PM, Arsen Arsenović wrote: I'm not 100% clear on what the semantics of the matching here are meant to be - AFAICT, an operator new/delete pair matches (after falling through the other cases) if all their components (besides the actual operator name, of course) match, and the pair o

Re: [PATCH 2/2] libstdc++: Implement P2997R1 changes to the indirect invocability concepts

2024-08-21 Thread Jonathan Wakely
On Wed, 21 Aug 2024 at 18:18, Patrick Palka wrote: > > On Wed, 21 Aug 2024, Jonathan Wakely wrote: > > > On Wed, 21 Aug 2024 at 01:40, Patrick Palka wrote: > > > > > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps > > > 14? > > > > > > -- >8 -- > > > > > > This implements

Re: [PATCH v3 2/2] c++: improve diagnostic of 'return's in coroutines

2024-08-21 Thread Jason Merrill
On 8/21/24 1:52 PM, Arsen Arsenović wrote: We now point out why a function is a coroutine, and where (the first return) is in the function. OK. gcc/cp/ChangeLog: * coroutines.cc (struct coroutine_info): Rename first_coro_keyword -> first_coro_expr. The former name is no

Re: [PATCH v3 1/2] c++: improve location of parsed RETURN_EXPRs

2024-08-21 Thread Jason Merrill
On 8/21/24 3:34 PM, Arsen Arsenović wrote: Jason Merrill writes: On 8/21/24 1:52 PM, Arsen Arsenović wrote: For clarity, here's the entire split-up patch I intend to push, if it looks OK. Tested on x86_64-pc-linux-gnu. I've renamed the field we've discussed and also a few parameters that ref

Re: [PATCH v3 1/2] c++: improve location of parsed RETURN_EXPRs

2024-08-21 Thread Arsen Arsenović
Jason Merrill writes: > On 8/21/24 1:52 PM, Arsen Arsenović wrote: >> For clarity, here's the entire split-up patch I intend to push, if it >> looks OK. Tested on x86_64-pc-linux-gnu. >> I've renamed the field we've discussed and also a few parameters that >> refer to 'kw' to be less specific.

[PATCH 9/9] c++, coroutines: Look through initial_await target exprs [PR110635].

2024-08-21 Thread Iain Sandoe
In the case that the initial awaiter returns an object, the initial await can be a target expression and we need to look at its initializer to cast the await_resume() to void and to wrap in a compound expression that sets the initial_await_resume_called flag. PR c++/110635 gcc/cp/ChangeLog

[PATCH 3/9] c++, coroutines: Separate allocator work from the ramp body build.

2024-08-21 Thread Iain Sandoe
This splits out the building of the allocation and deallocation expressions and runs them early in the ramp build, so that we can exit if they are not usable, before we start building the ramp body. Likewise move checks for other required resources to the beginning of the ramp builder. This is pre

[PATCH 8/9] c++, coroutines: Rework handling of throwing_cleanups [PR102051].

2024-08-21 Thread Iain Sandoe
In the fix for PR95822 (r11-7402) we set throwing_cleanup false in the top level of the coroutine transform code. However, as the current PR shows, that is not sufficient. Any use of cxx_maybe_build_cleanup() can reset the flag, which causes the check_return_expr () logic to try to add a guard va

[PATCH 2/9] c++, coroutines: Separate the analysis, ramp and outlined function synthesis.

2024-08-21 Thread Iain Sandoe
This change is preparation for fixes to the ramp and codegen to follow. The primary motivation is that we have three activities; analysis, ramp synthesis and outlined coroutine body synthesis. These are currently carried out in sequence in the 'morph_fn_to_coro' code, which means that we are nesti

[PATCH 4/9] c++, coroutines: Fix handling of early exceptions [PR113773].

2024-08-21 Thread Iain Sandoe
The responsibility for destroying part of the frame content (promise, arg copies and the frame itself) transitions from the ramp to the body of the coroutine once we reach the await_resume () for the initial suspend. We added the variable that flags the transition, but failed to act on it. This c

[PATCH 7/9] c++, coroutines: Fix ordering of return object conversions [PR115908].

2024-08-21 Thread Iain Sandoe
[dcl.fct.def.coroutine]/7 says: The expression promise.get_return_object() is used to initialize the returned reference or prvalue result object of a call to a coroutine. The call to get_return_object is sequenced before the call to initial_suspend and is invoked at most once. The issue is about w

[PATCH 5/9] c++, coroutines: Only allow void get_return_object if the ramp is void [PR100476].

2024-08-21 Thread Iain Sandoe
Require that the value returned by get_return_object is convertible to the ramp return. This means that the only time we allow a void get_return_object, is when the ramp is also a void function. We diagnose this early to allow us to exit the ramp build if the return values are incompatible.

[PATCH 6/9] c++, coroutines: Allow convertible get_return_on_allocation_fail [PR109682].

2024-08-21 Thread Iain Sandoe
We have been requiring the get_return_on_allocation_fail() call to have the same type as the ramp. This is not intended by the standard, so relax that to allow anything convertible to the ramp return. PR c++/109682 gcc/cp/ChangeLog: * coroutines.cc (cp_coroutine_transfor

[PATCH 1/9] c++, coroutines: Split the ramp build into a separate function.

2024-08-21 Thread Iain Sandoe
This is primarily preparation to partition the functionality of the coroutine transform into analysis, ramp generation and then (later) synthesis of the coroutine body. The patch does fix one latent issue in the ordering of DTORs for frame parameter copies (to ensure that they are processed in rev

[PATCH 0/9] c++, coroutines: Patch set for ramp function fixes.

2024-08-21 Thread Iain Sandoe
This is a series of patches that addresses the majority of the open PRs related to the coroutine ramp function. It is presented as a series because the actual bug fixes depend on some preparatory patches (which are also used to resolve issues with other PR fixes - e.g. Arsen's fix for PR109867).

Re: [PATCH v3 1/2] c++: improve location of parsed RETURN_EXPRs

2024-08-21 Thread Jason Merrill
On 8/21/24 1:52 PM, Arsen Arsenović wrote: For clarity, here's the entire split-up patch I intend to push, if it looks OK. Tested on x86_64-pc-linux-gnu. I've renamed the field we've discussed and also a few parameters that refer to 'kw' to be less specific. The code is functionally identical.

Re: [PATCH] c++: Partially implement CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-08-21 Thread Jason Merrill
On 8/14/24 3:41 AM, Jakub Jelinek wrote: Hi! The following patch partially implements CWG 2867 - Order of initialization for structured bindings. The DR requires that initialization of e is sequenced before r_i and that r_i initialization is sequenced before r_j for j > i, we already do it that

[PATCH v3 2/2] c++: improve diagnostic of 'return's in coroutines

2024-08-21 Thread Arsen Arsenović
We now point out why a function is a coroutine, and where (the first return) is in the function. gcc/cp/ChangeLog: * coroutines.cc (struct coroutine_info): Rename first_coro_keyword -> first_coro_expr. The former name is no longer accurate. (coro_promise_type_foun

[PATCH v3 1/2] c++: improve location of parsed RETURN_EXPRs

2024-08-21 Thread Arsen Arsenović
For clarity, here's the entire split-up patch I intend to push, if it looks OK. Tested on x86_64-pc-linux-gnu. I've renamed the field we've discussed and also a few parameters that refer to 'kw' to be less specific. The code is functionally identical. OK for trunk? TIA, have a lovely day.

[patch] libgomp.texi: Document OpenMP's Interoperability Routines

2024-08-21 Thread Tobias Burnus
Add documentation for OpenMP's interoperability routines. This obviously depends on the actual implementation patch, posted at: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661035.html (albeit I will post a v2 in a moment). I am sure there will be comments, suggestions and remarks :

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Qing Zhao
> On Aug 21, 2024, at 11:43, Martin Uecker wrote: > > Am Mittwoch, dem 21.08.2024 um 15:24 + schrieb Qing Zhao: >>> >>> But if we changed it to return a void pointer, we could make this >>> a compile-time check: >>> >>> auto ret = __builtin_get_counted_by(__p->FAM); >>> >>> _Generic(ret

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Kees Cook
On Wed, Aug 21, 2024 at 05:43:42PM +0200, Martin Uecker wrote: > Am Mittwoch, dem 21.08.2024 um 15:24 + schrieb Qing Zhao: > > > > > > But if we changed it to return a void pointer, we could make this > > > a compile-time check: > > > > > > auto ret = __builtin_get_counted_by(__p->FAM); > >

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Kees Cook
On Wed, Aug 21, 2024 at 03:27:56PM +, Qing Zhao wrote: > > On Aug 21, 2024, at 10:45, Martin Uecker wrote: > > > > Am Mittwoch, dem 21.08.2024 um 16:34 +0200 schrieb Martin Uecker: > >> Am Mittwoch, dem 21.08.2024 um 14:12 + schrieb Qing Zhao: > >> > >>> > >>> Yes, I do feel that the ap

Re: [PATCH 2/2] libstdc++: Implement P2997R1 changes to the indirect invocability concepts

2024-08-21 Thread Patrick Palka
On Wed, 21 Aug 2024, Jonathan Wakely wrote: > On Wed, 21 Aug 2024 at 01:40, Patrick Palka wrote: > > > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps > > 14? > > > > -- >8 -- > > > > This implements the changes of this C++26 paper as a DR against C++20. > > > > libstdc++

Re: [Fortran, Patch, PR86468, v1] Follow up: Remove obsolete VIEW_CONVERT

2024-08-21 Thread Steve Kargl
On Wed, Aug 21, 2024 at 12:17:46PM +0200, Andre Vehreschild wrote: > > attached small patch removes a VIEW_CONVERT that I erroneously inserted during > patching pr110033. PR86468 fixes the (co-)rank computation and therefore this > VIEW_CONVERT is IMO obsolete. I think it may cause hard to find ru

[PATCH v2] tree-optimization/116024 - match.pd: add 4 int-compare simplifications

2024-08-21 Thread Artemiy Volkov
Hi, sending a v2 of https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659851.html after changing variable types in all new testcases from standard to fixed-width. Could anyone please assist with reviewing and/or pushing to trunk/14 since I don't have commit access? Many thanks, Artemiy

Re: [PATCH] testuite: Accept vmov.f64

2024-08-21 Thread Christophe Lyon
On Wed, 14 Aug 2024 at 22:04, Torbjörn SVENSSON wrote: > > Ok for trunk and releases/gcc-14? > > -- > > On Cortex-M55 with fpv5-d16, the vmov.f64 instruction is used. Hi Torbjorn, Thanks for the patch: after looking further I realized that we can always generate vmov.f64 with MVE, so I propose t

[PATCH] arm: Always use vmov.f64 instead of vmov.f32 with MVE

2024-08-21 Thread Christophe Lyon
With MVE, vmov.f64 is always supported (no need for +fp.dp extension). This patch updates two patterns: - in movdi_vfp, we incorrectly checked TARGET_VFP_SINGLE || TARGET_HAVE_MVE instead of TARGET_VFP_SINGLE && !TARGET_HAVE_MVE, and didn't take into account these two possibilities when comp
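The difference between the two conditions quoted above can be checked with a small truth table. The following is only a sketch: the boolean parameters stand in for the GCC target macros TARGET_VFP_SINGLE and TARGET_HAVE_MVE, the function names are hypothetical, and the assumption that the predicate selects the single-precision (vmov.f32) path is inferred from the patch subject rather than from the pattern itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition as described for the old movdi_vfp code (parameters are
   stand-ins for the target macros, not the macros themselves). */
static bool old_uses_single_path(bool vfp_single, bool have_mve)
{
    return vfp_single || have_mve;
}

/* Condition the patch description says should have been used. */
static bool new_uses_single_path(bool vfp_single, bool have_mve)
{
    return vfp_single && !have_mve;
}

/* The two predicates agree whenever MVE is absent and differ whenever
   MVE is present: with MVE the new one never takes the single path. */
static void check_predicates(void)
{
    for (int s = 0; s <= 1; s++) {
        assert(old_uses_single_path(s, false) == new_uses_single_path(s, false));
        assert(old_uses_single_path(s, true));
        assert(!new_uses_single_path(s, true));
    }
}
```

Under these assumptions, every MVE configuration moves from the single-precision path to the vmov.f64 path, which matches the stated intent of the patch.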

Re: [PATCH] c, v2: Add support for unsequenced and reproducible attributes

2024-08-21 Thread Joseph Myers
On Thu, 1 Aug 2024, Jakub Jelinek wrote: > +Unsequenced functions without pointer or reference arguments are similar > +to functions with the @code{const} attribute, except that @code{const} > +attribute also requires finitness. So, both functions with @code{const} s/finitness/finiteness/ (in al

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Martin Uecker
Am Mittwoch, dem 21.08.2024 um 15:24 + schrieb Qing Zhao: > > > > But if we changed it to return a void pointer, we could make this > > a compile-time check: > > > > auto ret = __builtin_get_counted_by(__p->FAM); > > > > _Generic(ret, void*: (void)0, default: *ret = COUNT); > > Is there an

Re: [PATCH v1 2/2] RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 3

2024-08-21 Thread Robin Dapp
LGTM. -- Regards Robin

Re: [PATCH v1 1/2] RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 2

2024-08-21 Thread Robin Dapp
LGTM. -- Regards Robin

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Qing Zhao
> On Aug 21, 2024, at 10:45, Martin Uecker wrote: > > Am Mittwoch, dem 21.08.2024 um 16:34 +0200 schrieb Martin Uecker: >> Am Mittwoch, dem 21.08.2024 um 14:12 + schrieb Qing Zhao: >> >>> >>> Yes, I do feel that the approach __builtin_get_counted_by is not very good. >>> Maybe it’s bette

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Qing Zhao
> On Aug 21, 2024, at 10:34, Martin Uecker wrote: > > Am Mittwoch, dem 21.08.2024 um 14:12 + schrieb Qing Zhao: > > ... >> >>> + if (__builtin_get_counted_by (__p->FAM)) \ + *(__builtin_get_counted_by(__p->FAM)) = COUNT; \ How to improve it? (Thanks a lot for your

[PATCH v2] combine.cc (make_more_copies): Copy attributes from the original pseudo, PR115883

2024-08-21 Thread Hans-Peter Nilsson
The only thing that's changed with the patch in v2 since the first version (pinged once) is the commit message. CC to the nexts-of-kin as a heads-up. Regtested cross to cris-elf and native x86_64-linux-gnu at r15-3043-g64028d626a50. The gcc.dg/guality/pr54200.c magically being fixed was also not

Re: [wwwdocs v2] gcc-15: Mention c++ header dependency changes () in porting_to.html

2024-08-21 Thread Filip Kastl
On Wed 2024-08-21 09:50:39, Jonathan Wakely wrote: > On Wed, 21 Aug 2024 at 09:48, Filip Kastl wrote: > > > > Hi, > > > > this is the second version of my patch. See version 1 here: > > > > https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659584.html > > > > Changes made: > > - Removed plura

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Martin Uecker
Am Mittwoch, dem 21.08.2024 um 16:34 +0200 schrieb Martin Uecker: > Am Mittwoch, dem 21.08.2024 um 14:12 + schrieb Qing Zhao: > > > > > Yes, I do feel that the approach __builtin_get_counted_by is not very good. > > Maybe it’s better to provide > > A. __builtin_set_counted_by > > or > > B.

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Martin Uecker
Am Mittwoch, dem 21.08.2024 um 14:12 + schrieb Qing Zhao: ... > > > > > > + if (__builtin_get_counted_by (__p->FAM)) \ > > > + *(__builtin_get_counted_by(__p->FAM)) = COUNT; \ > > > > > > How to improve it? (Thanks a lot for your suggestion). > > > > There's lack of syntactic guarantee t

[PATCH v3] Update LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook

2024-08-21 Thread H.J. Lu
This hook allows the BFD linker plugin to distinguish calls to claim_file_handler that know the object is being used by the linker (from ldmain.c:add_archive_element), from calls that don't know it's being used by the linker (from elf_link_is_defined_archive_symbol); in the latter case, the plugin

Re: [PATCH] optabs-query: Guard smallest_int_mode_for_size [PR115495].

2024-08-21 Thread Richard Sandiford
Richard Biener writes: > On Wed, Aug 21, 2024 at 8:37 AM Robin Dapp wrote: >> >> Hi, >> >> in get_best_extraction_insn we use smallest_int_mode_for_size with >> struct_bits as size argument. In PR115495 struct_bits = 256 and we >> don't have a mode for that. This patch just bails for such cases

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-21 Thread Qing Zhao
(Resend since the previous one has no subject). > On Aug 21, 2024, at 04:44, Richard Biener wrote: > > On Tue, Aug 20, 2024 at 3:41 PM Qing Zhao wrote: >> >> >> >>> On Aug 20, 2024, at 05:58, Richard Biener >>> wrote: >>> >>> On Tue, Aug 13, 2024 at 5:34 PM Qing Zhao wrote: Wi

Re: [PATCH v2] aarch64: Implement popcountti2 pattern [PR113042]

2024-08-21 Thread Richard Sandiford
Andrew Pinski writes: > When CSSC is not enabled, 128bit popcount can be implemented > just via the vector (v16qi) cnt instruction followed by a reduction, > like how the 64bit one is currently implemented instead of > splitting into 2 64bit popcount. > > Changes since v1: > * v2: Make operand 0 b
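The idea described above, counting bits per byte across all 16 bytes and then reducing, can be modelled in portable C. This is a scalar sketch only, not the aarch64 pattern from the patch: the 128-bit value is passed as two 64-bit halves, and the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Bit count of one byte: the scalar analogue of one lane of a
   byte-wise vector count. */
static unsigned byte_popcount(uint8_t b)
{
    unsigned n = 0;
    while (b) {
        b &= (uint8_t)(b - 1);  /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Count bits in each of the 16 bytes, then reduce by summing the
   per-byte counts (the "reduction" step). */
static unsigned popcount128_bytewise(uint64_t lo, uint64_t hi)
{
    unsigned total = 0;
    for (int i = 0; i < 8; i++) {
        total += byte_popcount((uint8_t)(lo >> (8 * i)));
        total += byte_popcount((uint8_t)(hi >> (8 * i)));
    }
    assert(total <= 128);  /* sanity invariant for a 128-bit input */
    return total;
}
```

The alternative mentioned in the message, two separate 64-bit popcounts added together, would give the same result; the sketch only illustrates why a single 16-lane count plus one reduction is enough.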

Re: [PATCH] PR target/116365: Add user-friendly arguments to --param aarch64-autovec-preference=N

2024-08-21 Thread Richard Sandiford
Kyrylo Tkachov writes: >> On 20 Aug 2024, at 19:11, Richard Sandiford >> wrot>> Jennifer Schmitz writes: >>> The param aarch64-autovec-preference=N is a useful tool for testing >>> auto-vectorisation in GCC as it allows the user to force a particular >>> strategy. So far, N could be an numerica

.

2024-08-21 Thread Qing Zhao
> On Aug 21, 2024, at 04:44, Richard Biener wrote: > > On Tue, Aug 20, 2024 at 3:41 PM Qing Zhao wrote: >> >> >> >>> On Aug 20, 2024, at 05:58, Richard Biener >>> wrote: >>> >>> On Tue, Aug 13, 2024 at 5:34 PM Qing Zhao wrote: With the addition of the 'counted_by' attribute a

Re: [RFC/RFA][PATCH v4 06/12] aarch64: Implement new expander for efficient CRC computation

2024-08-21 Thread Richard Sandiford
Mariam Arutunian writes: > This patch introduces two new expanders for the aarch64 backend, > dedicated to generate optimized code for CRC computations. > The new expanders are designed to leverage specific hardware capabilities > to achieve faster CRC calculations, > particularly using the crc32,

Re: [PATCH] Update LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook

2024-08-21 Thread H.J. Lu
On Wed, Aug 21, 2024 at 6:23 AM Richard Biener wrote: > > On Wed, Aug 21, 2024 at 2:27 PM H.J. Lu wrote: > > > > On Wed, Aug 21, 2024 at 2:38 AM Richard Biener > > wrote: > > > > > > On Tue, Aug 20, 2024 at 3:24 PM H.J. Lu wrote: > > > > > > > > On Tue, Aug 20, 2024 at 2:03 AM Richard Biener >

Re: [PATCH 2/2] libstdc++: Implement P2997R1 changes to the indirect invocability concepts

2024-08-21 Thread Jonathan Wakely
On Wed, 21 Aug 2024 at 01:40, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps > 14? > > -- >8 -- > > This implements the changes of this C++26 paper as a DR against C++20. > > libstdc++-v3/ChangeLog: > > * include/bits/iterator_concepts.h (in

[PATCH v1 2/2] RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 3

2024-08-21 Thread pan2 . li
From: Pan Li This patch would like to add test cases for the unsigned vector .SAT_TRUNC form 3. Aka: Form 3: #define DEF_VEC_SAT_U_TRUNC_FMT_3(NT, WT) \ void __attribute__((noinline))\ vec_sat_u_trunc_##NT##_##WT##_fmt_3

[PATCH v1 1/2] RISC-V: Add testcases for unsigned vector .SAT_TRUNC form 2

2024-08-21 Thread pan2 . li
From: Pan Li This patch would like to add test cases for the unsigned vector .SAT_TRUNC form 2. Aka: Form 2: #define DEF_VEC_SAT_U_TRUNC_FMT_2(NT, WT) \ void __attribute__((noinline))\ vec_sat_u_trunc_##NT##_##WT##_fmt_2
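The macro bodies for forms 2 and 3 are cut off in this listing, so the following does not reproduce either form. It is only a sketch of what an unsigned saturating truncation computes, with a 16-bit to 8-bit narrowing chosen as an assumption and hypothetical function names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar unsigned saturating truncation: values that do not fit in
   the narrow type clamp to its maximum instead of wrapping. */
static uint8_t sat_u_trunc_u16_u8(uint16_t x)
{
    return x > UINT8_MAX ? UINT8_MAX : (uint8_t)x;
}

/* Element-wise loop of the kind an auto-vectorizer could turn into a
   vector saturating truncation (hypothetical helper). */
static void vec_sat_u_trunc_sketch(uint8_t *out, const uint16_t *in, size_t n)
{
    assert(n == 0 || (out != NULL && in != NULL));
    for (size_t i = 0; i < n; i++)
        out[i] = sat_u_trunc_u16_u8(in[i]);
}
```

The different numbered forms in the test cases are, as far as the subjects suggest, alternative source spellings of this same operation.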

Re: [PATCH 1/2] libstdc++: Implement P2609R3 changes to the indirect invocability concepts

2024-08-21 Thread Jonathan Wakely
On Wed, 21 Aug 2024 at 01:40, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps > 14? > > -- >8 -- > > This implements the changes of this C++23 paper as a DR against C++20. It's a little unfortunate that we can't bump the __cpp_lib_ranges macro for C+

[PATCH] tree-optimization/116406 - ICE with int<->float punning prevention

2024-08-21 Thread Richard Biener
The following does away with the idea to use non-symmetrical testing of mode_can_transfer_bits in hash-table equality testing. It isn't feasible to always control query order to maintain consistency. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/116406

Re: [PATCH] testsuite: Add -fwrapv to signbit-5.c

2024-08-21 Thread Torbjorn SVENSSON
On 2024-08-20 14:37, Tamar Christina wrote: -Original Message- From: Richard Biener Sent: Tuesday, August 20, 2024 12:33 PM To: Torbjorn SVENSSON Cc: Jeff Law ; gcc-patches@gcc.gnu.org; Richard Earnshaw ; quic_apin...@quicinc.com; yvan.r...@foss.st.com; Tamar Christina Subject: Re:

Re: [PATCH] Update LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook

2024-08-21 Thread Richard Biener
On Wed, Aug 21, 2024 at 2:27 PM H.J. Lu wrote: > > On Wed, Aug 21, 2024 at 2:38 AM Richard Biener > wrote: > > > > On Tue, Aug 20, 2024 at 3:24 PM H.J. Lu wrote: > > > > > > On Tue, Aug 20, 2024 at 2:03 AM Richard Biener > > > wrote: > > > > > > > > On Wed, Aug 14, 2024 at 3:15 PM H.J. Lu wrot

Re: [PATCH 3/2] libstdc++: Optimize std::projected

2024-08-21 Thread Jonathan Wakely
On Wed, 21 Aug 2024 at 13:58, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk? I'm not > sure if the current specification of 'projected' strictly speaking > allows for this optimization, but it seems like a natural one that > should be allowed. Yeah, I can't s

[PATCH] rs6000: Fix PTImode handling in power8 swap optimization pass [PR116415]

2024-08-21 Thread Peter Bergner
Our power8 swap optimization pass has some special handling for optimizing swaps of TImode variables. The test case reported in bugzilla uses a call to __atomic_compare_exchange, which introduces a variable of PTImode and that does not get the same treatment as TImode leading to wrong code genera

[PATCH 3/2] libstdc++: Optimize std::projected

2024-08-21 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk? I'm not sure if the current specification of 'projected' strictly speaking allows for this optimization, but it seems like a natural one that should be allowed. -- >8 -- Algorithms that are generalized to take projections usually defaul

[patch] libgomp: Add interop types and routines to OpenMP's headers and module

2024-08-21 Thread Tobias Burnus
This patch adds 'interop' to C/C++'s omp.h and Fortran's omp_lib.h and omp_lib module. The implementation should match OpenMP 5.1 (which added interop) and also TR13; the Fortran routine support is new in TR13. It also adds 'hsa' as foreign object enum/parameter, which is currently being added

Re: [PATCH 1/2] SVE intrinsics: Fold constant operands for svdiv

2024-08-21 Thread Richard Sandiford
Jennifer Schmitz writes: > thank you for the feedback. I would like to summarize what I understand from > your suggestions before I start revising to make sure we are on the same page: > > 1. The new setup for constant folding of SVE intrinsics for binary operations > where both operands are con

Re: [PATCH 1/2] Makefile.tpl: drop leftover intermodule cruft

2024-08-21 Thread Richard Biener
On Thu, Aug 15, 2024 at 12:14 AM Sam James wrote: > > intermodule support was dropped in r0-103106-gde6ba7aee152a0 with some > remaining bits for Fortran removed in r14-1696-gecc96eb5d2a0e5. OK > Remove some small leftovers. > > * Makefile.in: Regenerate. > * Makefile.tpl (STAG

Re: [PING] [PATCH v2] Support if conversion for switches

2024-08-21 Thread Richard Biener
On Tue, Aug 13, 2024 at 7:34 PM Andi Kleen wrote: > > Andi Kleen writes: > > I wanted to ping this patch. I believe Richard ok'ed most of it earlier > but need an ok for the changes resulting from his review too > (but they were mostly only test suite and comment fixes > apart from some minor twe

Re: [PATCH] Update LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook

2024-08-21 Thread H.J. Lu
On Wed, Aug 21, 2024 at 2:38 AM Richard Biener wrote: > > On Tue, Aug 20, 2024 at 3:24 PM H.J. Lu wrote: > > > > On Tue, Aug 20, 2024 at 2:03 AM Richard Biener > > wrote: > > > > > > On Wed, Aug 14, 2024 at 3:15 PM H.J. Lu wrote: > > > > > > > > The new hook allows the linker plugin to distingu

[PATCH] tree-optimization/116380 - bogus SSA update with loop distribution

2024-08-21 Thread Richard Biener
When updating LC PHIs after copying loops we have to handle defs defined outside of the loop appropriately (by not setting them to NULL ...). This mimics how we handle this in the SSA updating code of the vectorizer. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard.

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-21 Thread Robin Dapp
> And we fail to fold vect_patt_384.36_436 | { 1, ... } to { 1, ... }? > Or is the issue that vector masks contain padding and with > non-zero masking we'd have garbage in the padding and that leaks > here? That is, _47 ? 1 : iftmp.0_113 -> _47 | iftmp.0_113 assumes > there's exactly one bit in a

Re: [Ping, Patch, Fortran, 77871, v1] Allow for class typed coarray parameter as dummy [PR77871]

2024-08-21 Thread Andre Vehreschild
Hi all, pinging this patch for the first time. Rebased and regtested ok on x86_64-pc-linux-gnu / Fedora 39. Ok for mainline? - Andre On Thu, 15 Aug 2024 14:39:25 +0200 Andre Vehreschild wrote: > Hi all, > > attached patch fixes another regression on coarrays. This time for class typed > coarr

Re: [PATCH] Do not emit a redundant DW_TAG_lexical_block for inlined subroutines

2024-08-21 Thread Richard Biener
On Wed, 21 Aug 2024, Bernd Edlinger wrote: > On 8/21/24 10:45, Richard Biener wrote: > > On Wed, 21 Aug 2024, Richard Biener wrote: > > > >> On Tue, 20 Aug 2024, Bernd Edlinger wrote: > >> > >>> On 8/20/24 13:00, Richard Biener wrote: > On Fri, Aug 16, 2024 at 12:49 PM Bernd Edlinger >

RE: Re-compute TYPE_MODE and DECL_MODE while streaming in for accelerator

2024-08-21 Thread Richard Biener
On Wed, 21 Aug 2024, Prathamesh Kulkarni wrote: > > > > -Original Message- > > From: Richard Biener > > Sent: Tuesday, August 20, 2024 10:36 AM > > To: Richard Sandiford > > Cc: Prathamesh Kulkarni ; Thomas Schwinge > > ; gcc-patches@gcc.gnu.org > > Subject: Re: Re-compute TYPE_MODE an

Re: [PATCH] Do not emit a redundant DW_TAG_lexical_block for inlined subroutines

2024-08-21 Thread Bernd Edlinger
On 8/21/24 10:45, Richard Biener wrote: > On Wed, 21 Aug 2024, Richard Biener wrote: > >> On Tue, 20 Aug 2024, Bernd Edlinger wrote: >> >>> On 8/20/24 13:00, Richard Biener wrote: On Fri, Aug 16, 2024 at 12:49 PM Bernd Edlinger wrote: > > While this already works correctly for t

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-21 Thread Richard Biener
On Wed, 21 Aug 2024, Robin Dapp wrote: > > > > > _Bool iftmp.0_113; > > > > > _Bool iftmp.0_114; > > > > > iftmp.0_113 = .MASK_LOAD (_170, 8B, _169, _171(D)); > > > > > iftmp.0_114 = _47 | iftmp.0_113; > > > > _BoolD.2746 _47; > > > iftmp.0_114 = _47 ? 1 : iftmp.0_113; > > > which is

Re: [PATCH] vect: Multistep float->int conversion only with no trapping math

2024-08-21 Thread Richard Biener
On Tue, Aug 20, 2024 at 3:35 PM Juergen Christ wrote: > > Am Tue, Aug 20, 2024 at 02:51:02PM +0200 schrieb Richard Biener: > > On Tue, Aug 20, 2024 at 11:16 AM Juergen Christ > > wrote: > > > > > > Am Tue, Aug 20, 2024 at 10:15:22AM +0200 schrieb Richard Biener: > > > > On Fri, Aug 9, 2024 at 2:

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-21 Thread Robin Dapp
> > > > _Bool iftmp.0_113; > > > > _Bool iftmp.0_114; > > > > iftmp.0_113 = .MASK_LOAD (_170, 8B, _169, _171(D)); > > > > iftmp.0_114 = _47 | iftmp.0_113; > > _BoolD.2746 _47; > > iftmp.0_114 = _47 ? 1 : iftmp.0_113; > > which is folded into > > iftmp.0_114 = _47 | iftmp.0_113; > >
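As a rough illustration of the fold quoted above (a sketch only, not code from the patch): when both operands are strictly 0 or 1, `cond ? 1 : val` and `cond | val` agree, which is what makes the fold valid for _Bool. The helper names select_form and or_form are hypothetical, and modelling the loaded value as a uint8_t that may carry bits other than bit 0 is an assumption, made here to mimic a masked-off lane whose else value is not known to be zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unfolded form, as in: iftmp.0_114 = _47 ? 1 : iftmp.0_113;
   VAL stands in for the masked-load result (assumption: it may
   hold an arbitrary byte when the lane was masked off).  */
static uint8_t select_form (bool cond, uint8_t val)
{
  return cond ? 1 : val;
}

/* Folded form, as in: iftmp.0_114 = _47 | iftmp.0_113;  */
static uint8_t or_form (bool cond, uint8_t val)
{
  return (uint8_t) (cond | val);
}
```

For val restricted to 0 or 1 the two helpers return the same result for either value of cond. With cond true and val equal to 2, select_form returns 1 while or_form returns 3, so the two forms are only interchangeable if the else value of the load is known to be a valid boolean.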

Re: [RFC] Support single lane SLP early break

2024-08-21 Thread Richard Biener
On Tue, 20 Aug 2024, Tamar Christina wrote: > Hi, > > I've been working on a prototype of moving early break to SLP. > > As we've discussed on IRC I've decided to first try adding the gconds as roots > and start SLP discovery using them as roots. > > This works great and doesn't require any cha

RE: Re-compute TYPE_MODE and DECL_MODE while streaming in for accelerator

2024-08-21 Thread Prathamesh Kulkarni
> -Original Message- > From: Richard Biener > Sent: Tuesday, August 20, 2024 10:36 AM > To: Richard Sandiford > Cc: Prathamesh Kulkarni ; Thomas Schwinge > ; gcc-patches@gcc.gnu.org > Subject: Re: Re-compute TYPE_MODE and DECL_MODE while streaming in for > accelerator > > External emai

Re: [PATCH] libstdc++: Check ios::uppercase for ios::fixed floating-point output [PR114862]

2024-08-21 Thread Jonathan Wakely
This is still pending a decision by LEWG, but I've pushed it to trunk anyway. We can always revert it before GCC 15 is released if the committee decides against it, but this way we might get user feedback on it. On Thu, 1 Aug 2024 at 22:41, Jonathan Wakely wrote: > > Tested x86_64-linux. > > --

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-21 Thread Richard Biener
On Wed, 21 Aug 2024, Robin Dapp wrote: > > > > Why? I don't think the vectorizer relies on a particular else > > > > value? I'd say it would be appropriate for if-conversion to > > > > use "ANY" and for the vectorizer to then pick a supported > > > > version and/or enforce the else value it need

Ping^2: C++/ME patch ping

2024-08-21 Thread Arsen Arsenović
Hi, Pinging these patches again: - https://inbox.sourceware.org/20240807131613.526335-1-ar...@aarsen.me/ - https://inbox.sourceware.org/20240802211503.3992610-2-ar...@aarsen.me/ Thanks in advance, have a lovely day. -- Arsen Arsenović

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-21 Thread Robin Dapp
> > > Why? I don't think the vectorizer relies on a particular else > > > value? I'd say it would be appropriate for if-conversion to > > > use "ANY" and for the vectorizer to then pick a supported > > > version and/or enforce the else value it needs via a blend? > > > > In PR115336 we have some

[Fortran, Patch, PR86468, v1] Follow up: Remove obsolete VIEW_CONVERT

2024-08-21 Thread Andre Vehreschild
Hi all, the attached small patch removes a VIEW_CONVERT that I erroneously inserted while patching pr110033. PR86468 fixes the (co-)rank computation, so this VIEW_CONVERT is IMO obsolete. I think it may cause hard-to-find runtime bugs in the future and would therefore like to remove it. Regtests
