Re: Fix type of malloc call in trans-expr.cc

2024-11-14 Thread Paul Richard Thomas
Hi Jakub, Good catch! Does it fix any specific PR? If you don't have the time, I would be happy to apply the correction to 13-branch through to mainline. Regards Paul On Thu, 14 Nov 2024 at 22:24, Jakub Jelinek wrote: > On Thu, Nov 14, 2024 at 08:58:26PM +0100, Jan Hubicka wrote: > > fortra

Re: tree-nested: Do not inline or clone functions with nested functions with VM return type [PR117164]

2024-11-14 Thread Richard Biener
On Fri, Nov 15, 2024 at 3:51 AM Joseph Myers wrote: > > Bug 117164 is an ICE on an existing test with -std=gnu23 involving a > nested function returning a variable-size structure (and I think the > last bug needing to be resolved before switching to -std=gnu23 as the > default, as without fixing t

Re: [RFC][RISC-V] Add target dependent pass to optimize related permutation constants

2024-11-14 Thread Richard Biener
On Thu, Nov 14, 2024 at 10:41 PM Jeff Law wrote: > > > Several weeks ago I was looking at SATD and realized that we had loads > of permutation constants that could be implemented as a trivial > adjustment to a prior loaded permutation constant. > > For example if we had loaded a permutation consta

[PATCH v1] RISC-V: Remove unnecessary option for scalar SAT_ADD testcase

2024-11-14 Thread pan2 . li
From: Pan Li After we create an isolated folder to hold all SAT scalar tests, we have full control over which optimization options are passed to the testcase. Thus, it is better to remove the unnecessary workaround for the flto option, as well as the -O3 option for each case. The riscv.exp will pass sor

Re: [PATCH] Report the section name in case of section type conflicts

2024-11-14 Thread Florian Weimer
* Jeff Law: > On 11/14/24 11:42 AM, Florian Weimer wrote: >> The section name might give the user a hint of what is going on. >> I tried to include the flags as well, but there didn't seem to be >> consensus about including the internal section flags in the diagnostics: >>[RFC PATCH] More detailed

[COMMITTED] RISC-V: Move scalar SAT_ADD test cases to a isolated folder

2024-11-14 Thread pan2 . li
From: Pan Li Move the scalar SAT_ADD test cases, for both signed and unsigned integers, to the folder gcc.target/riscv/sat. According to the implementation, the options below will be appended for each test case. * -O2 * -O3 * -Ofast * -Os * -Oz Then we can see a test log similar to the one below: Execut

Re: [PATCH] Fix test failures for enum-alias-{1,2,3} on arm-eabi [PR117419]

2024-11-14 Thread Thiago Jung Bauermann
Hello, Martin Uecker writes: > I added a max element as suggested by Richard to force > the type to an int. > > Regression tested on x86_64 but needs testing on arm-eabi. > > Thiago, could you test this? > > https://linaro.atlassian.net/browse/GNU-1224 Thanks! I can confirm that this patch fixe

Re: Add testcase that we optimize away empty std::vector

2024-11-14 Thread Marek Polacek
On Wed, Nov 13, 2024 at 02:59:05PM +0100, Jan Hubicka wrote: > > On Tue, Nov 12, 2024 at 04:00:03PM +0100, Jan Hubicka wrote: > > > Hi, > > > with __builtin_operator_new we now can optimize away unused std::vectors. > > > This adds testcases mentioned in the PR. > > > > > > Regtested x86_64-linux

[PATCH 2/2] RISC-V: Use dynamic shadow offset

2024-11-14 Thread Kito Cheng
Switch to dynamic offset so that we can support Sv39, Sv48, and Sv57 at the same time without building multiple libasan versions! [1] https://github.com/llvm/llvm-project/commit/da0c8b275564f814a53a5c19497669ae2d99538d gcc/ChangeLog: * config/riscv/riscv.cc (riscv_asan_shadow_offset): U

[PATCH 1/2] asan: Support dynamic shadow offset

2024-11-14 Thread Kito Cheng
AddressSanitizer has supported dynamic shadow offsets since 2016[1], but GCC hasn't implemented this yet because targets using dynamic shadow offsets, such as Fuchsia and iOS, are mostly unsupported in GCC. However, RISC-V 64 switched to dynamic shadow offsets this year[2] because virtual memory s

[PATCH] aarch64: Use SVE SUBR instruction with Neon modes

2024-11-14 Thread Soumya AR
The SVE SUBR instruction performs a reversed subtract from an immediate. This patch enables the emission of SUBR for Neon modes and avoids the need to materialise an explicit constant. For example, the test case below: typedef long long __attribute__ ((vector_size (16))) v2di; v2di subr_v2di
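
A minimal sketch of the source shape this patch is about, assuming GCC's generic vector extensions. The function name reverse_sub_from_imm and the constant 15 are hypothetical and chosen only for illustration; the subr_v2di test case from the mail is cut off in this excerpt and is not reproduced here. Whether SUBR is actually emitted depends on the patch and on the target options in use.

```c
/* Same 128-bit vector type as in the mail: two 64-bit lanes.  */
typedef long long __attribute__ ((vector_size (16))) v2di;

/* Hypothetical example of a "reversed subtract from an immediate":
   the constant is the minuend and the vector is the subtrahend.
   The vector extension broadcasts the scalar 15 to both lanes.  */
v2di
reverse_sub_from_imm (v2di x)
{
  return 15 - x;
}
```

Calling reverse_sub_from_imm on the vector {1, 20} yields {14, -5}; without a reversed-subtract instruction, the compiler would first have to materialise the vector {15, 15} in a register.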

tree-nested: Do not inline or clone functions with nested functions with VM return type [PR117164]

2024-11-14 Thread Joseph Myers
Bug 117164 is an ICE on an existing test with -std=gnu23 involving a nested function returning a variable-size structure (and I think the last bug needing to be resolved before switching to -std=gnu23 as the default, as without fixing this would be a clear regression from a change in default). The

Re: tree-nested: Do not inline or clone functions with nested functions with VM return type [PR117164]

2024-11-14 Thread Sam James
Joseph Myers writes: > Bug 117164 is an ICE on an existing test with -std=gnu23 involving a > nested function returning a variable-size structure (and I think the > last bug needing to be resolved before switching to -std=gnu23 as the > default, as without fixing this would be a clear regression

Re: [PATCH v2] MATCH: Simplify `min(a, b) op max(a, b)` to `a op b` [PR109401]

2024-11-14 Thread Jeff Law
On 11/11/24 9:32 PM, Eikansh Gupta wrote: Hi, >          It seems to me this ought to work when the min/max reversed as well, or >am I missing something? Yes, it should work when min/max are reversed. So are you going to add support for both? Seems to me like we'd want to support b

Re: [PATCH v2] MATCH: Simplify `min(a, b) op max(a, b)` to `a op b` [PR109401]

2024-11-14 Thread Andrew Pinski
On Thu, Nov 14, 2024 at 5:30 PM Jeff Law wrote: > > > > On 11/11/24 9:32 PM, Eikansh Gupta wrote: > > Hi, > > > > > It seems to me this ought to work when the min/max reversed > > as well, or > > >am I missing something? > > > > Yes, it should work when min/max are reversed. > So ar
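
The identity behind this simplification can be checked by brute force. The sketch below assumes `op` is commutative and uses +, *, &, | and ^ as examples; which operators the patch actually covers is not visible in this excerpt. The helper names umin, umax and min_max_identity_holds are hypothetical. Both operand orders are checked, since the thread asks about min/max being reversed.

```c
#include <assert.h>

/* Hypothetical stand-ins for MIN_EXPR and MAX_EXPR on unsigned values.  */
static unsigned umin (unsigned a, unsigned b) { return a < b ? a : b; }
static unsigned umax (unsigned a, unsigned b) { return a > b ? a : b; }

/* Return 1 if, for every pair (a, b) in [0, limit) x [0, limit),
   min(a, b) op max(a, b) and max(a, b) op min(a, b) both equal a op b
   for the commutative operators +, *, &, | and ^.  Return 0 otherwise.  */
int
min_max_identity_holds (unsigned limit)
{
  for (unsigned a = 0; a < limit; a++)
    for (unsigned b = 0; b < limit; b++)
      {
        unsigned lo = umin (a, b);
        unsigned hi = umax (a, b);

        /* min first, max second.  */
        if (lo + hi != a + b) return 0;
        if (lo * hi != a * b) return 0;
        if ((lo & hi) != (a & b)) return 0;
        if ((lo | hi) != (a | b)) return 0;
        if ((lo ^ hi) != (a ^ b)) return 0;

        /* Reversed: max first, min second.  */
        if (hi + lo != a + b) return 0;
        if (hi * lo != a * b) return 0;
        if ((hi & lo) != (a & b)) return 0;
        if ((hi | lo) != (a | b)) return 0;
        if ((hi ^ lo) != (a ^ b)) return 0;
      }
  return 1;
}
```

The check works because {min(a, b), max(a, b)} is always the same pair of values as {a, b}, so a commutative operator cannot tell the two apart. A non-commutative operator such as subtraction would fail this check.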

Re: [PATCH V2 3/11] Do not allow -mvsx to boost processor to power7.

2024-11-14 Thread Peter Bergner
On 11/8/24 1:48 PM, Michael Meissner wrote: > This patch restructures the code so that -mvsx for example will not silently > convert the processor to power7. The user must now use -mcpu=power7 or > higher. > This means if the user does -mvsx and the default processor does not have VSX > support,

Re: [PATCH V2 10/11] Add support for -mcpu=future

2024-11-14 Thread Peter Bergner
On 11/8/24 1:56 PM, Michael Meissner wrote: > This patch adds the support that can be used in developing GCC support for > future PowerPC processors. We used to have support for -mcpu=future. Unfortunately when we added Power10 support, rather than adding new support for the -mcpu=power10 option,

Re: [PATCH] Report the section name in case of section type conflicts

2024-11-14 Thread Jeff Law
On 11/14/24 11:42 AM, Florian Weimer wrote: The section name might give the user a hint of what is going on. I tried to include the flags as well, but there didn't seem to be consensus about including the internal section flags in the diagnostics: [RFC PATCH] More detailed diagnostics for sect

Re: [PATCH V2 11/11] Add -mcpu=future tuning support.

2024-11-14 Thread Peter Bergner
On 11/8/24 1:57 PM, Michael Meissner wrote: > This patch makes -mtune=future use the same tuning decision as -mtune=power11. > > 2024-11-06 Michael Meissner > > gcc/ > > * config/rs6000/power10.md (all reservations): Add future as an > alterntive to power10 and power11. Obviously

Re: [PATCH] RSIC-V: Fix ICE for unrecognizable insn `UNSPEC_VSETVL` for XTheadVector

2024-11-14 Thread Jeff Law
On 11/13/24 3:04 AM, Jin Ma wrote: Since XTheadvector does not support vsetivli, vl needs to be put into registers during the expand phase. PR 116593 gcc/ChangeLog: * config/riscv/riscv-vector-builtins.cc (function_expander::add_input_operand): Put const to GPR for

Re: [PATCH] RISC-V: Tie MUL and DIV masks to the M extension

2024-11-14 Thread Jeff Law
On 11/13/24 9:50 AM, Dimitar Dimitrov wrote: When configuring GCC for RV32EC with: ./configure \ --target=riscv32-none-elf \ --with-multilib-generator="rv32ec-ilp32e--" \ --with-abi=ilp32e \

Re: [PATCH] RISC-V: Tie MUL and DIV masks to the M extension

2024-11-14 Thread Jeff Law
On 11/13/24 9:50 AM, Dimitar Dimitrov wrote: When configuring GCC for RV32EC with: ./configure \ --target=riscv32-none-elf \ --with-multilib-generator="rv32ec-ilp32e--" \ --with-abi=ilp32e \

Re: [PATCH v2] RISC-V:Auto vect for vector-bfloat16

2024-11-14 Thread Jeff Law
On 11/12/24 11:02 PM, wangf...@eswincomputing.com wrote: On 2024-11-13 07:30  Edwin Lu wrote: I took a look at the CI errors today since I remember Jeff checking the CI output. I don't remember if the errors were the main things blocking the patch or if there just wasn't any follow up. Juz

Re: [PATCH V2 9/11] Update tests to work with architecture flags changes.

2024-11-14 Thread Peter Bergner
On 11/8/24 1:55 PM, Michael Meissner wrote: > Two tests used -mvsx to raise the processor level to at least power7. These > tests were rewritten to add cpu=power7 support. Again, this cleanup patch like the TARGET_ -> TARGET_ patches is independent of the main patches in this series (ie, patche 1

Re: [PATCH v2] RISC-V: Improve vsetvl vconfig alignment

2024-11-14 Thread Jeff Law
On 10/8/24 2:11 PM, Jeff Law wrote: On 10/2/24 6:27 AM, Dusan Stojkovic wrote: This patch is a new version of: https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662745.html  > Can you elaborate a bit on that?  

Re: [PATCH V2 4/11] Change TARGET_POPCNTB to TARGET_POWER5

2024-11-14 Thread Peter Bergner
On 11/8/24 1:49 PM, Michael Meissner wrote: > As part of the architecture flags patches, this patch changes the use of > TARGET_POPCNTB to TARGET_POWER5. The POPCNTB instruction was added in ISA > 2.02 > (power5). I like what this patch and the other related clean up patches are doing, namely ch

[committed] libstdc++: Fix indentation in std::list::emplace_back

2024-11-14 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * include/bits/stl_list.h (list::emplace_back): Fix indentation. --- Committed as obvious. libstdc++-v3/include/bits/stl_list.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libstdc++-v3/include/bits/stl_list.h b/libstdc++-v3/include/bits/s

Re: [PATCH 3/8] ipa: Skip type conversions in jump function constructions

2024-11-14 Thread Martin Jambor
Hi, On Tue, Nov 05 2024, Jan Hubicka wrote: >> gcc/ChangeLog: >> >> 2024-11-01 Martin Jambor >> >> * ipa-prop.cc (skip_a_conversion_op): New function. >> (ipa_compute_jump_functions_for_edge): Use it. >> >> gcc/testsuite/ChangeLog: >> >> 2024-11-01 Martin Jambor >> >> * g

Re: Fix type of malloc call in trans-expr.cc

2024-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2024 at 08:58:26PM +0100, Jan Hubicka wrote: > fortran produces malloc call with signed size instead of unsigned. This > in turn makes gimple_call_builtin_p to fail type checking and we do not > treat the call as malloc call. > > regtested x86_64-linux, OK? > > gcc/fortran/ChangeL

[RFC][RISC-V] Add target dependent pass to optimize related permutation constants

2024-11-14 Thread Jeff Law
Several weeks ago I was looking at SATD and realized that we had loads of permutation constants that could be implemented as a trivial adjustment to a prior loaded permutation constant. For example if we had loaded a permutation constant like 1, 3, 1, 3, 5, 7, 5, 7 and we needed 0, 2, 0, 2,
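
A rough sketch of what a "trivial adjustment" to an already loaded permutation constant could look like. The excerpt is cut off before the second constant is complete, so the uniform per-lane offset used here is an assumption, not something the mail confirms; the function name derive_permutation and its parameters are hypothetical as well.

```c
#include <stddef.h>

/* Hypothetical helper: derive a new permutation constant from BASE by
   adding the same DELTA to every lane, writing N lanes to OUT.
   Assumes every adjusted index still fits in an unsigned char.  */
void
derive_permutation (const unsigned char *base, int delta,
                    unsigned char *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (unsigned char) (base[i] + delta);
}
```

With base = {1, 3, 1, 3, 5, 7, 5, 7} and delta = -1, the result starts with 0, 2, 0, 2, which matches the visible part of the second constant in the mail. On a vector target the same adjustment would be a single vector add of a small immediate rather than a second constant load.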

Re: Minor cleanup to cp/decl.cc

2024-11-14 Thread Jason Merrill
On 11/14/24 4:04 PM, Jan Hubicka wrote: Hi, this patch slightly refactors cp/decl.cc so code producing new and delete operators is not duplicated multiple times. I also broke out cxx_init_operator_new_delete_decls which I originally intended to use for builtin_operator_new/delete which is solved

Re: [PATCH] Fortran: fix passing of NULL() actual argument to character dummy [PR104819]

2024-11-14 Thread Harald Anlauf
Hi Jerry, Am 14.11.24 um 01:17 schrieb Jerry D: On 11/13/24 2:26 PM, Harald Anlauf wrote: Dear all, the attached patch is the third part of a series to fix the handling of NULL() passed to pointer dummy arguments.  This one addresses character dummy arguments (scalar, assumed-shape, assumed-ra

Re: [PATCH 09/11] doc: Mention floating point atomic fetch_add etc in docs

2024-11-14 Thread Gerald Pfeifer
On Thu, 14 Nov 2024, Sandra Loosemore wrote: > I am 100% in favor of Oxford commas. :-) : >> I'm not sure I understand what "same format" means here? Do we need this >> sentence, or can we actually drop it? > I think "same format" means "same arguments", but the whole sentence can > be dropped. >

Re: [PATCH] rs6000, fix test builtins-1-p10-runnable.c

2024-11-14 Thread Carl Love
Ping 5 On 11/5/24 8:27 AM, Carl Love wrote: Ping 4 On 10/28/24 4:28 PM, Carl Love wrote: Ping 3 On 10/17/24 1:31 PM, Carl Love wrote: Ping 2 On 10/9/24 7:43 AM, Carl Love wrote: Ping, FYI this is a fairly simple fix to a testcase. On 10/3/24 8:11 AM, Carl Love wrote: GCC maintainer

Re: [PATCH ver2 0/4] rs6000, remove redundant built-ins and add more test cases

2024-11-14 Thread Carl Love
Ping 5 On 11/5/24 8:28 AM, Carl Love wrote: Ping 4 On 10/28/24 4:29 PM, Carl Love wrote: Ping 3 On 10/17/24 1:31 PM, Carl Love wrote: Ping 2 On 10/9/24 7:44 AM, Carl Love wrote: Ping On 10/1/24 8:12 AM, Carl Love wrote: GCC maintainers: The following version 2 of a series of patc

Re: [PATCH 1/8] ipa: Fix jump function copying

2024-11-14 Thread Jan Hubicka
> Hello Josef, > > On Tue, Nov 05 2024, Josef Melcr wrote: > > Hi! > > > > On 11/5/24 12:06, Martin Jambor wrote: > >> +/* Copy information from SRC_JF to DST_JF which correstpond to call graph > >> edges > >> + SRC and DST. */ > >> + > >> +static void > >> +ipa_duplicate_jump_function (cgraph

Minor cleanup to cp/decl.cc

2024-11-14 Thread Jan Hubicka
Hi, this patch slightly refactors cp/decl.cc so code producing new and delete operators is not duplicated multiple times. I also broke out cxx_init_operator_new_delete_decls which I originally intended to use for builtin_operator_new/delete which is solved better by Jakub's patch, but it seems to

Re: [patch,lra] PR117191 remove unnecessary CLOBBER insns after LRA

2024-11-14 Thread Denis Chertykov
чт, 14 нояб. 2024 г. в 23:10, Vladimir Makarov : > > > On 11/13/24 14:10, Denis Chertykov wrote: > > The fix for PR117191 > > > > Wrong code appears after dse2 pass because it removes necessary insns. > > (ie insn 554 - store to frame spill slot) > > This happened because LRA pass doesn't cleanup t

[COMMITTED] gcc: regenerate configure

2024-11-14 Thread Sam James
r15-5257-g56ded80b96b0f6 didn't regenerate configure correctly. See https://inbox.sourceware.org/gcc-patches/zzzf69gorvpro...@zen.kayari.org/. gcc/ChangeLog: * configure: Regenerate. --- Pushed as obvious. gcc/configure | 14 -- 1 file changed, 8 insertions(+), 6 deletions(

Re: [PATCH v4 01/23] aarch64: Add -mbranch-protection=gcs option

2024-11-14 Thread Jonathan Wakely
On 14/11/24 12:36 +, Yury Khrustalev wrote: From: Szabolcs Nagy This enables Guarded Control Stack (GCS) compatible code generation. The "standard" branch-protection type enables it, and the default depends on the compiler default. gcc/ChangeLog: * config/aarch64/aarch64-protos.h

[PING #3][PATCH v2] Add new warning Wmissing-designated-initializers [PR39589]

2024-11-14 Thread Peter Frost
Hi all, Pinging https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662590.html for a review if anyone has a moment. Many thanks, Peter

Re: [PATCH] libgccjit: Add support for machine-dependent builtins

2024-11-14 Thread Antoni Boucher
It seems we don't need to do the cleanup in i386-builtins.cc anymore, so I removed it. David: Is it possible that your recent fixes for the GC within libgccjit also fixed the issue here? Here's the updated patch and answers below. (GitHub link if you find it easier for review: https://github.

[to-be-committed][RISC-V][V2] Fix type on vector move patterns

2024-11-14 Thread Jeff Law
Updated version of my prior patch to fix type attributes on the pre-allocation vector move pattern. This version just adds a suitable set of attributes to a second pattern that was obviously wrong. Passed on my tester for rv64 and rv32 crosses. Bootstrapped and regression tested on riscv64-l

Fix type of malloc call in trans-expr.cc

2024-11-14 Thread Jan Hubicka
Hi, Fortran produces a malloc call with a signed size instead of an unsigned one. This in turn makes gimple_call_builtin_p fail type checking, so we do not treat the call as a malloc call. regtested x86_64-linux, OK? gcc/fortran/ChangeLog: * trans-expr.cc (gfc_trans_subcomponent_assign): Convert m

Re: [PATCH] Introduce feeble_inline attribute [PR93008]

2024-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2024 at 08:38:38PM +0100, Jan Hubicka wrote: > Concerning the other uses outside of inliner: > - tree_inlinable_function_p and expand_call_inline check it to warn. We >could silence the warning via NO_INLINE_WARNING flag. I know and initially I've even had that flag set in c-

[PATCH v4 2/5] aarch64: specify fpm mode in function instances and groups

2024-11-14 Thread Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the specific asm opcode required. In order to simplify review, this patch: - adds the fpm_mode_index attribute to function_group_info and function_instance objects - updates existing initialisations and call sites. - updates equalit

Re: [RFA/RFC][RISC-V] Fix type on vector move pattern

2024-11-14 Thread Jeff Law
On 11/12/24 7:32 AM, Robin Dapp wrote: So I was looking into a horrific schedule for SAD a week or so ago and came across this gem. Basically we were treating a vector load as a vector move from a scheduling standpoint during sched1. Naturally we didn't expose much ILP during sched1. That i

Re: [PATCH] Introduce feeble_inline attribute [PR93008]

2024-11-14 Thread Jan Hubicka
> > If it is in ipa_fn_summary, where would I do the lookup_attribute? It is constructed in ipa-fnsummary.cc:analyze_function and then needs to be duplicated by the duplicate hooks > > Anyway, looking at the spots where I've used DECL_OPTIMIZABLE_INLINE_P > (agree it is a bad name), I think tree_

[PATCH v4 5/5] aarch64: add SVE2 FP8DOT2 and FP8DOT4 intrinsics

2024-11-14 Thread Claudio Bantaloukas
This patch adds support for the following intrinsics: - svdot[_f32_mf8]_fpm - svdot_lane[_f32_mf8]_fpm - svdot[_f16_mf8]_fpm - svdot_lane[_f16_mf8]_fpm The first two are available under a combination of the FP8DOT4 and SVE2 features. Alternatively under the SSVE_FP8DOT4 feature under streaming m

[PATCH v4 3/5] aarch64: add svcvt* FP8 intrinsics

2024-11-14 Thread Claudio Bantaloukas
This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm - svcvtlt2_f16[_mf8]_fpm - svcvtn_mf8[_f16_x2]_fpm (unpredicated) - svcvtnb_mf8[_f32_

[PATCH v4 4/5] aarch64: add SVE2 FP8 multiply accumulate intrinsics

2024-11-14 Thread Claudio Bantaloukas
This patch adds support for the following intrinsics: - svmlalb[_f16_mf8]_fpm - svmlalb[_n_f16_mf8]_fpm - svmlalt[_f16_mf8]_fpm - svmlalt[_n_f16_mf8]_fpm - svmlalb_lane[_f16_mf8]_fpm - svmlalt_lane[_f16_mf8]_fpm - svmlallbb[_f32_mf8]_fpm - svmlallbb[_n_f32_mf8]_fpm - svmlallbt[_f32_mf8]_fpm - svml

[PATCH v4 0/5] aarch64: Add fp8 sve foundation

2024-11-14 Thread Claudio Bantaloukas
The ACLE defines a new set of fp8 vector types and intrinsics that operate on these, some of them operating on the vectors as if they were bags of bits and some requiring an additional argument of type fpm_t. The following patches introduce: - the types - intrinsics that operate without the fpm_

Re: [PATCH] c: Introduce -Wmissing-parameter-name

2024-11-14 Thread Joseph Myers
On Thu, 14 Nov 2024, Florian Weimer wrote: > * c-family/c-opts.cc (c_common_post_options): Initialize > warn_missing_parameter_name. > * c-family/c.opt (Wmissing-parameter-name): New. > * c/c-decl.cc (store_parm_decls_newstyle): Use > OPT_Wmissing_parameter_name for m

Re: [PATCH] c, v2: Add _Decimal64x support

2024-11-14 Thread Joseph Myers
On Thu, 14 Nov 2024, Jakub Jelinek wrote: > Here is an updated patch, in addition to removing the DEC64X_SUBNORMAL_MIN > redefinition I've also adjusted the c23-decimal64x-4.c test to use > the __STDC_WANT_IEC_60559_TYPES_EXT__ macro which is now needed and > test DEC64X_TRUE_MIN rather than DEC64

Re: [PATCH] c: Introduce -Wmissing-parameter-name

2024-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2024 at 07:09:04PM +, Joseph Myers wrote: > On Thu, 14 Nov 2024, Florian Weimer wrote: > > > * c-family/c-opts.cc (c_common_post_options): Initialize > > warn_missing_parameter_name. > > * c-family/c.opt (Wmissing-parameter-name): New. > > * c/c-decl.cc (store_p

Re: [patch,lra] PR117191 remove unnecessary CLOBBER insns after LRA

2024-11-14 Thread Vladimir Makarov
On 11/13/24 14:10, Denis Chertykov wrote: The fix for PR117191 Wrong code appears after dse2 pass because it removes necessary insns. (ie insn 554 - store to frame spill slot) This happened because LRA pass doesn't cleanup the code exactly like reload does. The reload1.c has a special pass f

Re: [PATCH] Introduce feeble_inline attribute [PR93008]

2024-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2024 at 07:07:48PM +0100, Jan Hubicka wrote: > > Hi! > > > > The inlining heuristics uses DECL_DECLARED_INLINE_P (whether a function > > has been explicitly marked inline; that can be inline keyword, or for C++ > > also constexpr keyword or defining a function inside of a class def

[PATCH] Report the section name in case of section type conflicts

2024-11-14 Thread Florian Weimer
The section name might give the user a hint of what is going on. I tried to include the flags as well, but there didn't seem to be consensus about including the internal section flags in the diagnostics: [RFC PATCH] More detailed diagnostics for section type conflicts

Re: [PATCH 08/11] c: c++: flag to disable fetch_op handling fenv exceptions

2024-11-14 Thread Joseph Myers
On Thu, 14 Nov 2024, mmalcom...@nvidia.com wrote: > N.b. I would appreciate any feedback about how one should handle such a > situation when working with C11 _Atomic types. They have the same > problem that they require libatomic and sometimes libatomic is not > available. Is this just something

Re: [PATCH] [RFC] Single iteration peeling for gaps is sufficient with loop masking

2024-11-14 Thread Richard Sandiford
Richard Biener writes: >> Am 14.11.2024 um 17:38 schrieb Richard Sandiford : >> >> Richard Biener writes: >>> When we do loop masking via mask or length a single scalar iteration >>> should be sufficient to avoid excess accesses. This fixes the last >>> known FAILs with --param vect-force-slp=

[PATCH 2/3] AArch64: Add FULLY_PIPELINED_FMA to tune baseline

2024-11-14 Thread Wilco Dijkstra
Add FULLY_PIPELINED_FMA to tune baseline - this is a generic feature that is already enabled for some cores, but benchmarking it shows it is faster on all modern cores (SPECFP improves ~0.17% on Neoverse V1 and 0.04% on Neoverse N1). Passes regress & bootstrap, OK for commit? gcc/ChangeLog:

Re: [PATCH] Introduce feeble_inline attribute [PR93008]

2024-11-14 Thread Jan Hubicka
> > +/* Nonzero if inlining should prefer inlining this function. > > + Shorthand for DECL_DECLARED_INLINE_P && !DECL_FEEBLE_INLINE_P. */ > > +#define DECL_OPTIMIZABLE_INLINE_P(NODE) \ > > + (DECL_DECLARED_INLINE_P (NODE) && !DECL_FEEBLE_INLINE_P (NODE)) > > We have 10 bits left, but it would

Re: [PATCH] Introduce feeble_inline attribute [PR93008]

2024-11-14 Thread Jan Hubicka
> Hi! > > The inlining heuristics uses DECL_DECLARED_INLINE_P (whether a function > has been explicitly marked inline; that can be inline keyword, or for C++ > also constexpr keyword or defining a function inside of a class definition) > heavily to increase desirability of inlining a function etc.

[PATCH 1/3] AArch64: Add baseline tune

2024-11-14 Thread Wilco Dijkstra
Cleanup the extra tune defines by introducing AARCH64_EXTRA_TUNE_BASE as a common base supported by all modern cores. Initially set it to AARCH64_EXTRA_TUNE_CHEAP_SHIFT_EXTEND. No change in generated code. Passes regress & bootstrap, OK for commit? gcc/ChangeLog: * config/aarch64/aarc

[committed] libstdc++: Fix get<0> constraint for lvalue ranges::subrange (LWG 3589)

2024-11-14 Thread Jonathan Wakely
Approved at the October 2021 plenary. libstdc++-v3/ChangeLog: * include/bits/ranges_util.h (subrange::begin): Fix constraint, as per LWG 3589. * testsuite/std/ranges/subrange/lwg3589.cc: New test. --- Tested x86_64-linux. Pushed to trunk. libstdc++-v3/include/bits/ranges_ut

[PATCH 3/3] AArch64: Add SVE vector cost to baseline tuning

2024-11-14 Thread Wilco Dijkstra
Add AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS and AARCH64_EXTRA_TUNE_MATCHED_VECTOR_THROUGHPUT to the baseline tuning since all modern cores use it. Fix the neoverse512tvb tuning to be like Neoverse V1/V2. gcc/ChangeLog: * config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNE_BASE

Re: [PATCH] libstdc++: Make equal and is_permutation short-circuit (LWG 3560)

2024-11-14 Thread Jonathan Wakely
On Thu, 14 Nov 2024 at 17:13, Jonathan Wakely wrote: > > We already implement short-circuiting for random access iterators, but > we also need to do so for ranges::equal and ranges::is_permutation when > given sized ranges that are not random access ranges (e.g. std::list). > > libstdc++-v3/Change

Re: [PATCH] c: Add _Decimal64x support

2024-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2024 at 12:22:37PM +, Joseph Myers wrote: > On Thu, 14 Nov 2024, Jakub Jelinek wrote: > > > --- gcc/ginclude/float.h.jj 2024-10-25 10:00:29.472767800 +0200 > > +++ gcc/ginclude/float.h2024-11-13 17:50:46.625592746 +0100 > > @@ -515,51 +515,63 @@ see the files COPYING3 and C

[PATCH] c, v2: Add _Decimal64x support

2024-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2024 at 05:10:53PM +, Joseph Myers wrote: > On Thu, 14 Nov 2024, Jakub Jelinek wrote: > > > Should DEC64X_SUBNORMAL_MIN/DEC64X_TRUE_MIN macros be defined at all and if > > yes, under the same conditions as the rest? > > C23 only talks about *_TRUE_MIN, not about *_SUBNORMAL_MIN

[PATCH] c: Introduce -Wmissing-parameter-name

2024-11-14 Thread Florian Weimer
Empirically, omitted parameter names are difficult to catch in code review. With this change, projects can build with -Werror=missing-parameter-name, to avoid this unnecessary incompatibility with older GCC versions. The existing -pedantic-errors option is too broad for that because it also flags

Re: [PATCH] testsuite: arm: Prune incremental link warning

2024-11-14 Thread Torbjorn SVENSSON
On 2024-11-14 16:53, Christophe Lyon wrote: On Sun, 10 Nov 2024 at 17:44, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? -- When the feature "needs_status_wrapper" in dejagnu is used, the resulting gcc_tg.o file is a regular object file and thus the following warning will be e

[PATCH] ranger, v2: Handle nonnull_if_nonzero attribute [PR117023]

2024-11-14 Thread Jakub Jelinek
On Thu, Nov 14, 2024 at 10:05:05AM -0500, Andrew MacLeod wrote: > The inferred range mechanism is also initialized using cfun, so again > introducing a use of cfun shouldnt be an issue. > > Something like this ought to work I think? Thanks. Seems to work in quick smoke testing and I've added a t

[pushed] libstdc++: stdc++.h and

2024-11-14 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk as obvious. -- 8< -- r13-3036 moved #include into the new freestanding section, but also moved it from a C++20 section to a C++23 section. This patch moves it back. Incidentally, I'm curious why a few headers were removed from the hosted section (i

Re: [PATCH] libstdc++: Implement LWG 3563 changes to keys_view and values_view

2024-11-14 Thread Jonathan Wakely
On Thu, 14 Nov 2024 at 16:18, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps > 14/13? IIRC alias CTAD didn't work correctly in 12 so it's not worth > backporting there. OK for trunk and 13/14, thanks. > > -- >8 -- > > This LWG issue corrects the

Re: [wwwdocs][committed] projects/gomp/: Update for OpenMP 6.0 spec release

2024-11-14 Thread Tobias Burnus
Hi all, maybe doing parallel work doesn't work well. The previously attached diff is obviously not mine but an older one, commit before mine. The patch itself was also committed+pushed, but seemingly only after extracting the commit. See attachment for (hopefully) the proper patch. Tobias B

[pushed] c++: module dialect tweak

2024-11-14 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- Coroutines have been enabled by -std=c++20 since GCC 11. gcc/cp/ChangeLog: * module.cc (module_state_config::get_dialect): Expect coroutines in C++20. --- gcc/cp/module.cc | 5 +++-- 1 file changed, 3 insertions(+), 2 dele

[PATCH] libstdc++: Make equal and is_permutation short-circuit (LWG 3560)

2024-11-14 Thread Jonathan Wakely
We already implement short-circuiting for random access iterators, but we also need to do so for ranges::equal and ranges::is_permutation when given sized ranges that are not random access ranges (e.g. std::list). libstdc++-v3/ChangeLog: * include/bits/ranges_algo.h (__is_permutation_fn::

Re: [PATCH] c: Add _Decimal64x support

2024-11-14 Thread Joseph Myers
On Thu, 14 Nov 2024, Jakub Jelinek wrote: > Should DEC64X_SUBNORMAL_MIN/DEC64X_TRUE_MIN macros be defined at all and if > yes, under the same conditions as the rest? > C23 only talks about *_TRUE_MIN, not about *_SUBNORMAL_MIN. So there should be no DEC64X_SUBNORMAL_MIN (but should be DEC64X_TRUE

[PATCH v3 0/8] SMALL code model fixes, optimization fixes, LTO and minimal C++ enablement

2024-11-14 Thread Evgeny Karpov
Thursday, November 14, 2024 Richard Sandiford wrote: > Ah, ok, great! Could you add yourself to MAINTAINERS in that case? Sure, I will do that. Regards, Evgeny

Re: [PATCH v3] testsuite: arm: Use effective-target for attr-neon* tests

2024-11-14 Thread Torbjorn SVENSSON
On 2024-11-14 16:16, Christophe Lyon wrote: Hi Torbjörn, On Sun, 10 Nov 2024 at 10:09, Torbjörn SVENSSON wrote: Changes since v1: - Changed from arm_neon to arm_arch_v7a for the required effective target. Changes since v2: - Added arm_libc_fp_abi as a required effective target. - Remove

Re: [PATCH] testsuite: arm: Use effective-target for pr68674.c test

2024-11-14 Thread Torbjorn SVENSSON
On 2024-11-14 16:26, Christophe Lyon wrote: On Fri, 8 Nov 2024 at 18:54, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? Can you describe what problem you are trying to fix? I'm guessing it's similar to your other patch for attr-neon* tests? And that the best / easiest course

contrib: Add 2 further ignored commits

2024-11-14 Thread Jeff Law
I goof'd and double-reverted a change. Add those to the ignore list, leaving the final reversion as-is. Jeff commit c924a03ae1c176bf1ead2aa31a96acaa7f22e719 Author: Jeff Law Date: Thu Nov 14 09:43:37 2024 -0700 contrib: Add 2 further ignored commits I goof'd and doub

Re: [PATCH] [RFC] Single iteration peeling for gaps is sufficient with loop masking

2024-11-14 Thread Richard Biener
> Am 14.11.2024 um 17:38 schrieb Richard Sandiford : > > Richard Biener writes: >> When we do loop masking via mask or length a single scalar iteration >> should be sufficient to avoid excess accesses. This fixes the last >> known FAILs with --param vect-force-slp=1. >> >> Bootstrap and reg

[pushed] c++: fix namespace alias export

2024-11-14 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- This affected std::views in module std. gcc/cp/ChangeLog: * name-lookup.cc (do_namespace_alias): set_originating_module after pushdecl. gcc/testsuite/ChangeLog: * g++.dg/modules/namespace-7_a.C: New test.

Re: [PATCH] [RFC] Single iteration peeling for gaps is sufficient with loop masking

2024-11-14 Thread Richard Sandiford
Richard Biener writes: > When we do loop masking via mask or length a single scalar iteration > should be sufficient to avoid excess accesses. This fixes the last > known FAILs with --param vect-force-slp=1. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > Do we know of a case w

[PATCH] libstdc++: Implement LWG 3563 changes to keys_view and values_view

2024-11-14 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 14/13? IIRC alias CTAD didn't work correctly in 12 so it's not worth backporting there. -- >8 -- This LWG issue corrects the definition of these alias templates to make them eligible for alias CTAD. libstdc++-v3/ChangeLog:

Re: [PATCH v3 0/8] SMALL code model fixes, optimization fixes, LTO and minimal C++ enablement

2024-11-14 Thread Richard Sandiford
Evgeny Karpov writes: > Tuesday, November 12, 2024 > Richard Sandiford wrote: >> I can't remember if I've asked this before, but would you like commit >> access? If so, please follow the process on >> https://gcc.gnu.org/gitwrite.html >> listing me as sponsor. > > Thank you, I already have an a

Re: [PATCH v4 00/23] aarch64: Add support for Guarded Control Stack extension

2024-11-14 Thread Richard Sandiford
Yury Khrustalev writes: > This patch series adds support for the Guarded Control Stack extension [1]. > > GCS marking for binaries is specified in [2]. > ACLE intrinsics are discussed in [3]. > > Regression tested on AArch64 and no regressions have been found. > Applies to 5a674367c6d in trunk. >

Re: [PATCH] testsuite: arm: Use effective-target for unsigned-extend-1.c

2024-11-14 Thread Torbjorn SVENSSON
On 2024-11-14 16:32, Christophe Lyon wrote: On Fri, 8 Nov 2024 at 19:49, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? -- A long time ago, this test forced -march=armv6. With -marm, the generated assembler is: foo: sub r0, r0, #48 cmp r0, #9

[wwwdocs][committed] projects/gomp/: Update for OpenMP 6.0 spec release

2024-11-14 Thread Tobias Burnus
Updated https://gcc.gnu.org/projects/gomp/ as now OpenMP 6.0 has been released: https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-6-0.pdf Which is under spec releases and changing TR13 to OpenMP 6.0 Tobias commit cb607960883c6528a1c43d4e59b80a626dc6a386 Author: Tobias Burnus

[PATCH v2 1/2] aarch64: Use standard names for saturating arithmetic

2024-11-14 Thread Akram Ahmad
This renames the existing {s,u}q{add,sub} instructions to use the standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and IFN_SAT_SUB. The NEON intrinsics for saturating arithmetic and their corresponding builtins are changed to use these standard names too. Using the standard names for
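
For readers unfamiliar with the operations involved, the following is a minimal sketch of the scalar semantics of unsigned saturating addition and subtraction, the operations that the standard names usadd3 and ussub3 describe. It is not taken from the patch; the function names sat_add_unsigned and sat_sub_unsigned are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: unsigned saturating add.
// If the truncated sum is smaller than an operand, the addition wrapped,
// so the result is clamped to the maximum value of T.
template <typename T>
inline T sat_add_unsigned(T x, T y) {
    static_assert(std::is_unsigned<T>::value, "unsigned types only");
    T sum = static_cast<T>(x + y);  // truncate back to T after promotion
    return sum < x ? std::numeric_limits<T>::max() : sum;
}

// Hypothetical helper: unsigned saturating subtract.
// A result that would be negative is clamped to zero.
template <typename T>
inline T sat_sub_unsigned(T x, T y) {
    static_assert(std::is_unsigned<T>::value, "unsigned types only");
    return x < y ? T(0) : static_cast<T>(x - y);
}
```

With 8-bit operands, adding 200 and 100 clamps to 255 rather than wrapping to 44, and subtracting 20 from 10 clamps to 0.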

[PATCH] Introduce feeble_inline attribute [PR93008]

2024-11-14 Thread Jakub Jelinek
Hi! The inlining heuristics use DECL_DECLARED_INLINE_P (whether a function has been explicitly marked inline; that can be inline keyword, or for C++ also constexpr keyword or defining a function inside of a class definition) heavily to increase desirability of inlining a function etc. In most cas

Re: [PATCH 09/11] doc: Mention floating point atomic fetch_add etc in docs

2024-11-14 Thread Sandra Loosemore
On 11/14/24 07:51, Gerald Pfeifer wrote: On Thu, 14 Nov 2024, mmalcom...@nvidia.com wrote: gcc/ChangeLog: * doc/extend.texi: Document ability to use floating point atomic fetch_add/fetch_sub/add_fetch/sub_fetch builtins. +Moreover, the @samp{__atomic_fetch_add}, @samp{__atomi
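
As background for the fetch_add discussion, here is a rough sketch of how a floating-point fetch-and-add can be emulated in portable C++ with a compare-exchange loop. It is an illustration only, not the implementation of the builtins being documented; the helper name fetch_add_double is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomically add delta to target and return the
// value that was stored before the addition (fetch_add semantics).
inline double fetch_add_double(std::atomic<double>& target, double delta) {
    double observed = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak reloads the current value into
    // observed, so the sum is recomputed from fresh data on each retry.
    while (!target.compare_exchange_weak(observed, observed + delta,
                                         std::memory_order_seq_cst,
                                         std::memory_order_relaxed)) {
    }
    return observed;
}
```

The loop retries only when another thread changed the value between the load and the exchange (or on a spurious failure), which is why the weak form of compare-exchange is sufficient here.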

[PATCH v3 0/8] SMALL code model fixes, optimization fixes, LTO and minimal C++ enablement

2024-11-14 Thread Evgeny Karpov
Tuesday, November 12, 2024 Richard Sandiford wrote: > Thanks for the update. As usual, I can only comment on the GCC internals, > without much knowledge of how mingw actually behaves. On that basis, > I've replied to two of the patches individually, but the series otherwise > looks good to me.

[PATCH v2 2/2] aarch64: Use standard names for SVE saturating arithmetic

2024-11-14 Thread Akram Ahmad
Rename the existing SVE unpredicated saturating arithmetic instructions to use standard names which are used by IFN_SAT_ADD and IFN_SAT_SUB. gcc/ChangeLog: * config/aarch64/aarch64-sve.md: Rename insns gcc/testsuite/ChangeLog: * gcc/testsuite/gcc.target/aarch64/sve/saturating_ar

[PATCH v2 0/2] aarch64: Use standard names for saturating arithmetic

2024-11-14 Thread Akram Ahmad
Hi all, This patch series introduces standard names for scalar, Adv. SIMD, and SVE saturating arithmetic instructions in the aarch64 backend. Additional tests are added for scalar saturating arithmetic, as well as to test that the auto-vectorizer correctly inserts NEON instructions or scalar inst

Re: [PATCH] testsuite: arm: Prune incremental link warning

2024-11-14 Thread Christophe Lyon
On Sun, 10 Nov 2024 at 17:44, Torbjörn SVENSSON wrote: > > Ok for trunk and releases/gcc-14? > > -- > > When the feature "needs_status_wrapper" in dejagnu is used, the > resulting gcc_tg.o file is a regular object file and thus the following > warning will be emitted if doing an incremental link:

[committed] libstdc++: Add missing constraint to operator+ for std::move_iterator

2024-11-14 Thread Jonathan Wakely
This constraint was added by the One Ranges proposal (P0896R4) and then fixed by LWG 3293, but it was missing from libstdc++. libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (operator+): Add constraint to move_iterator operator. * testsuite/24_iterators/move_iterator

Re: [PATCH] testsuite: arm: Use effective-target for unsigned-extend-1.c

2024-11-14 Thread Christophe Lyon
On Fri, 8 Nov 2024 at 19:49, Torbjörn SVENSSON wrote: > > Ok for trunk and releases/gcc-14? > > -- > > A long time ago, this test forced -march=armv6. > > With -marm, the generated assembler is: > foo: > sub r0, r0, #48 > cmp r0, #9 > movhi r0, #0 > movls

[committed] libstdc++: Use requires-clause for __normal_iterator constructor

2024-11-14 Thread Jonathan Wakely
This is a very minor throughput optimization, to avoid instantiating std::enable_if and std::is_convertible when concepts are available. libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (__normal_iterator): Replace enable_if constraint with requires-clause. --- Tested x86_64
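
The general technique can be sketched as follows, assuming a simplified wrapper class rather than the real __normal_iterator. The class name wrapped_iter and the exact form of the constraint are assumptions made for illustration; the committed libstdc++ code may be written differently.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical, simplified stand-in for an iterator wrapper.
template <typename Ptr>
class wrapped_iter {
    Ptr ptr_;

public:
    explicit wrapped_iter(Ptr p) : ptr_(p) {}

#if __cpp_concepts >= 201907L
    // With concepts: the constraint is a requires-clause, so no
    // std::enable_if specialization has to be instantiated.
    template <typename Other>
        requires std::is_convertible_v<Other, Ptr>
    wrapped_iter(const wrapped_iter<Other>& other) : ptr_(other.base()) {}
#else
    // Without concepts: the same constraint expressed through SFINAE
    // on a defaulted template parameter.
    template <typename Other,
              typename = typename std::enable_if<
                  std::is_convertible<Other, Ptr>::value>::type>
    wrapped_iter(const wrapped_iter<Other>& other) : ptr_(other.base()) {}
#endif

    Ptr base() const { return ptr_; }
};
```

Both branches accept the same conversions, for example from a wrapper of int* to a wrapper of const int*, and reject the reverse; only the compile-time cost differs.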
