Re: [PATCH] vxworks: libstdc++: include ioLib.h for dup()

2025-05-08 Thread Jonathan Wakely
On Fri, 9 May 2025, 03:29 Alexandre Oliva, wrote: > > vxworks's dup function is not declared in unistd.h, but c++23/print.cc > expects to be able to call it if unistd.h is available. On vxworks, > the function is only declared in ioLib.h, so arrange to include it. > > Tested with gcc-14 targetin

Re: [PATCH v5 05/10] libstdc++: Implement layout_left from mdspan.

2025-05-08 Thread Tomasz Kaminski
On Wed, May 7, 2025 at 11:37 AM Luc Grosheintz wrote: > > On 5/6/25 2:47 PM, Tomasz Kaminski wrote: > > On Tue, May 6, 2025 at 1:39 PM Luc Grosheintz > > wrote: > > > >> > >> On 5/6/25 11:28 AM, Tomasz Kaminski wrote: > >>> For better reference, here is illustration of the design I was thinking

Re: [PATCH] ctf: emit CTF_K_ARRAY for GNU vector types

2025-05-08 Thread Indu
On 2025-05-01 2:34 p.m., Bruce McCulloch wrote: Currently, there is a check in gen_ctf_array_type that prevents GNU vectors generated by the vector attribute from being emitted (e.g. typedef int v8si __attribute__ ((vector_size (32)));). Because this check happens in dwarf2ctf.cc, this prevents G
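
For readers unfamiliar with the GNU vector extension mentioned above, here is a minimal sketch of the kind of type involved. The typedef is the one quoted in the message; the helper `sum_lanes` is a hypothetical name used only for illustration, and the code assumes a 4-byte `int` and a compiler that supports `vector_size` (GCC or Clang).

```c
#include <stddef.h>

/* GNU vector type quoted in the message: eight ints in a 32-byte vector.  */
typedef int v8si __attribute__ ((vector_size (32)));

/* Hypothetical helper: add up all lanes using ordinary subscripting,
   which the GNU vector extension allows on vector objects.  The vector
   is passed by pointer to avoid target-specific ABI warnings.  */
int
sum_lanes (const v8si *v)
{
  int total = 0;
  for (size_t i = 0; i < sizeof (*v) / sizeof ((*v)[0]); i++)
    total += (*v)[i];
  return total;
}
```

Such a type behaves much like a fixed-size array of its element type, which is presumably why emitting it as an array kind is being discussed.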

[PATCH v2] MIPS: Fix the issue with the '-fpatchable-function-entry=' feature.

2025-05-08 Thread Lulu Cheng
From: ChengLulu PR target/99217 gcc/ChangeLog: * config/mips/mips.cc (mips_start_function_definition): Implements the functionality of '-fpatchable-function-entry='. (mips_print_patchable_function_entry): Define empty function. (TARGET_ASM_PRINT_PATCHABLE

[PATCH] match: Don't allow following statements that can throw internally [PR119903]

2025-05-08 Thread Andrew Pinski
This removes the ability to follow statements that can throw internally. This was suggested in bug report as a way to solve the issue here. The overhead is not that high since without non-call exceptions turned on, there is an early exit for non-calls. PR tree-optimization/119903 gcc/Chan

[PATCH] [testsuite] [ppc] pr110071 requires power6 for shrink-wrapping

2025-05-08 Thread Alexandre Oliva
The test's expectation of shrink-wrapping is only met starting at power6. At earlier CPUs, the register allocator prefers to preserve an incoming argument around a call in a call-saved register, rather than in a stack slot, and that prevents shrink-wrapping. Tested with gcc-14 targeting ppc-vx7

[PATCH] add explicit ABI and align options to pr88233.c

2025-05-08 Thread Alexandre Oliva
We've observed failures of this test on powerpc configurations that default to different calling conventions and alignment requirements. Both settings are needed for the original expectations to be met. The test was later modified to have different expectations for big and little endian code gen

[PATCH] [testsuite] [ppc] adjust vsx-builtin-7.c xxpermdi/rldic counts

2025-05-08 Thread Alexandre Oliva
xxpermdi (and rldic) instruction counts are slightly lower than expected, because icf turns insert_di_0_v2 into a insert_di_0 tail call. Adjust. Tested with gcc-14 targeting ppc-vx7r2 and ppc64-vx7r2. Also tested with trunk on ppc64le-linux-gnu, and with gcc-14 targeting powerpc-elf. Ok to ins

[PATCH] [testsuite] [ppc] add -mpowerpc-gfxopt or -mcmpb to copysign tests

2025-05-08 Thread Alexandre Oliva
When it comes to ifn_copysign on ppc, for SFmode and DFmode, the conditions are quite elaborate. It takes hard_float in addition to any of -mcmpb, vsx vectors for the mode, or -mpowerpc-gfxopt with fast-math (-ffinite-math-only and -fno-signed-zeros). A number of ifn_copysign tests add custom o

[PATCH] match: Remove (ne (cmp) 0) and (eq (cmp) 1) patterns

2025-05-08 Thread Andrew Pinski
These patterns are not needed any more. There were already 2 patterns which did `(ne bool_var 0)` into `bool_var` and `(eq bool_var 1)` into `bool_var`. Just they were after the pattern that did `(cmp (cond @0 @1 @2) @3)` simplification but that pattern is now after the ones. Also these patterns wi

Re: [PATCH] [testsuite] [ppc] require float128 available for copysign

2025-05-08 Thread Alexandre Oliva
On Apr 11, 2025, Alexandre Oliva wrote: > for gcc/testsuite/ChangeLog > * lib/target-supports.exp (check_effective_target_ifn_copysign): > Require float128 on ppc. I hereby withdraw this patch, it was based on a misunderstanding. -- Alexandre Oliva, happy hacker https:

[PATCH] [vxworks] wrap base/b_NULL.h to override NULL

2025-05-08 Thread Alexandre Oliva
Some versions of vxworks define NULL to __nullptr in C++, assuming C++11, which breaks at least a number of analyzer tests that get exercised in C++98 mode. Wrap the header that defines NULL so that, after including it, we override the NULL definition with the one provided by stddef.h. That req

[PATCH] [testsuite] [vxworks] skip macros from implicitly-included vxConfig.h

2025-05-08 Thread Alexandre Oliva
On vxworks, vxConfig.h is implicitly included, and it defines multiple macros in the namespace reserved for the implementation. g++.dg/modules/macro-5_a.H tests that macros from the command-line do not make the module output, but it can't tell them from macros from implicitly-included headers, s

[PATCH] libstdc++-v3: testsuite: lengthen stop_request wait_until timeout

2025-05-08 Thread Alexandre Oliva
30_threads/condition_variable_any/stop_token/wait_on.cc's test_wait_until occasionally fails on vxworks under very high load, in a way that suggests wait_until times out before the main thread requests it to stop. Extend the timeouts to make more room for the stop request. Tested with gcc-14 ta

[PATCH] [testsuite] [ppc] expect vectorization in gen-vect-11c.c

2025-05-08 Thread Alexandre Oliva
The first loop in main gets stores "vectorized" on powerpc into full-word stores, even without any vector instruction support, so the test's expectation of no loop vectorization is not met. Tested with gcc-14 targeting ppc-vx7r2 and ppc64-vx7r2. Also tested with trunk on ppc64le-linux-gnu, and

[PATCH] [testsuite] [ppc] disable strict align for block-cmp-[14].c

2025-05-08 Thread Alexandre Oliva
The expected memcmp inline expansion assumes -mno-strict-align, so make it explicit in case strict-align is enabled by default. Tested with gcc-14 targeting ppc-vx7r2 and ppc64-vx7r2. Also tested with trunk on ppc64le-linux-gnu, and with gcc-14 targeting powerpc-elf. Ok to install? for gcc/t

[PATCH] vxworks: undefine TARGET_FORTIFY_SOURCE_DEFAULT_LEVEL

2025-05-08 Thread Alexandre Oliva
config.gcc arranges for vxworks 7r2+ targets to include linux.h, because of the similarity, but linux.h defines TARGET_FORTIFY_SOURCE_DEFAULT_LEVEL to a function declared in linux-protos.h, and defined in linux.cc, neither of which vxworks targets include. Undefine it in vxworks.h. Tested with

[PATCH] [testsuite] [vxworks] add -gno-strict-dwarf to pr111409.c

2025-05-08 Thread Alexandre Oliva
The expected macro debug information is not issued with -gstrict-dwarf, and ports such as vxworks default to that. Allow non-strict dwarf for the test. Tested with gcc-14 targeting ppc-vx7r2 and ppc64-vx7r2. Also tested with trunk on ppc64le-linux-gnu, and with gcc-14 targeting powerpc-elf. Ok

[PATCH] libstdc++-v3: testsuite: increase future/members/poll timing tolerance

2025-05-08 Thread Alexandre Oliva
In 30_threads/future/members/poll.c, despite the calibration and the large tolerance, wait_until_sys_min has occasionally come up to almost 320 times as long as ready. Tolerate that much measurement noise. Tested with gcc-14 targeting ppc-vx7r2 and ppc64-vx7r2. Also tested with trunk on ppc64l

[PATCH] libstdc++-v3: no -latomic on vxworks

2025-05-08 Thread Alexandre Oliva
libatomic is disabled on vxworks because it's part of libc, and not very granular there, so a separately-built libatomic often triggers link errors over duplicate definitions. So, don't link with -latomic, but keep atomic tests enabled. Unfortunately, some fence and flag primitives that are dec

[PATCH] [testsuite] [analyzer] [vxworks] define __STDC_WANT_LIB_EXT1__ to 1

2025-05-08 Thread Alexandre Oliva
vxworks' headers use #if instead of #ifdef to test for __STDC_WANT_LIB_EXT1__, so the definition in the analyzer test strotok-cppreference.c catches a bug there, but not something it's meant to catch or that we could fix in GCC, so amend the definition to sidestep the libc bug. Tested with gcc-1
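
The difference between the two preprocessor tests can be sketched as follows. `WANT_EXT_DEMO` is a hypothetical stand-in for the feature-request macro, and the assumption that the original test definition was one that `#ifdef` accepts but `#if` does not (for example an empty definition, which makes `#if` a hard error) is mine, not stated in the excerpt. Defining the macro to 1 satisfies both forms.

```c
/* WANT_EXT_DEMO is a hypothetical stand-in for a feature-request macro.
   Defining it to 1 works for headers that test it with either form.  */
#define WANT_EXT_DEMO 1

/* #ifdef only asks whether the macro is defined at all.  */
#ifdef WANT_EXT_DEMO
# define SEEN_BY_IFDEF 1
#else
# define SEEN_BY_IFDEF 0
#endif

/* #if evaluates the expansion as an integer expression, so an empty
   definition would be rejected here, while 1 evaluates as true.  */
#if WANT_EXT_DEMO
# define SEEN_BY_IF 1
#else
# define SEEN_BY_IF 0
#endif

int seen_by_ifdef (void) { return SEEN_BY_IFDEF; }
int seen_by_if (void) { return SEEN_BY_IF; }
```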

[PATCH] [testsuite] [vxworks] netinet includes atomic, reqs c++11

2025-05-08 Thread Alexandre Oliva
On vxworks, the included netinet/in.h header indirectly includes <atomic>, which fails on C++ <11. Skip the test. Tested with gcc-14 targeting ppc-vx7r2 and ppc64-vx7r2. Also tested with trunk on ppc64le-linux-gnu, and with gcc-14 targeting powerpc-elf. Ok to install? for gcc/testsuite/ChangeLog

[PATCH] vxworks: libstdc++: include ioLib.h for dup()

2025-05-08 Thread Alexandre Oliva
vxworks's dup function is not declared in unistd.h, but c++23/print.cc expects to be able to call it if unistd.h is available. On vxworks, the function is only declared in ioLib.h, so arrange to include it. Tested with gcc-14 targeting ppc-vx7r2 and ppc64-vx7r2. Also tested with trunk on ppc64

[PATCH] vxworks: libgcc: include string.h for memset

2025-05-08 Thread Alexandre Oliva
gthr-vxworks-thread.c calls memset in __gthread_cond_signal, but it fails to include <string.h>, where this function is declared, and GCC 14 rejects calls of undeclared functions. Include the required header. Tested with gcc-14 targeting ppc-vx7r2 and ppc64-vx7r2. Also tested with trunk on ppc64le-linux

[AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-08 Thread Kugan Vivekanandarajah
Add support for autoprofiledbootstrap in aarch64. This is similar to what is done for i386. Added gcc/config/aarch64/gcc-auto-profile for aarch64 profile creation. How to run: configure --with-build-config=bootstrap-lto make autoprofiledbootstrap Regression tested on aarch64-linux-gnu with no ne

[AUTOFDO] Merge profiles of clones before annotating

2025-05-08 Thread Kugan Vivekanandarajah
This patch adds support for merging profiles from multiple clones. That is, when optimized binaries have clones such as IPA-CP clones or SRA clones, the generated gcov will have profiled them separately. Currently we pick one and ignore the rest. This patch fixes this by merging the profiles. Regression

[AUTOFDO] Fix annotated profile for de-duplicated call

2025-05-08 Thread Kugan Vivekanandarajah
This patch fixes wrong annotation of profiles when a call statement is de-duplicated, i.e., when the same stmt may execute from more than one path (by jumping to the same statement). Thus, the profile we get will be for multiple paths and would make the annotated profile wrong. As a fix, we don't ann

Re: [PATCH v2] RISC-V: Fix missing implied Zicsr from Zve32x

2025-05-08 Thread Nelson Chu
I think this should be sent to gcc-patches@gcc.gnu.org rather than binut...@sourceware.org, so redirect it to the right place. Nelson On Wed, Apr 30, 2025 at 10:30 AM Jerry Zhang Jian < jerry.zhangj...@sifive.com> wrote: > The Zve32x extension depends on the Zicsr extension. > Currently, enablin

RE: [PATCH v1 0/5] Add testcases for another case of vec_duplicate + vadd.vv combine

2025-05-08 Thread Li, Pan2
> OK, understood. I think that's expected given the fine granularity of the > tests. IMHO nothing that should block progress. Thanks Robin, then we can move to other vx/vf insns. Pan -Original Message- From: Robin Dapp Sent: Thursday, May 8, 2025 11:44 PM To: Li, Pan2 ; Robin Dapp ;

[pushed: r16-487] diagnostics: convert HTML output test plugin to 'experimental-html' sink [PR116792]

2025-05-08 Thread David Malcolm
In r15-3752-g48261bd26df624 I added a test plugin that overrode the regular output, instead emitting diagnostics in crude HTML form. In r15-4760-g0b73e9382ab51c I added support for multiple kinds of diagnostic output simultaneously, adding -fdiagnostics-add-output=DIAGNOSTICS-OUTPUT-SPEC -fdiagn

PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

2025-05-08 Thread Michael Meissner
This patch has been submitted previously, but it was not responded to. We had optimizations for splat of a vector extract for the other vector types, but we missed having one for V2DI and V2DF. This patch adds a combiner insn to do this optimization. In looking at the source, we had similar opti

PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2025-05-08 Thread Michael Meissner
This patch was previously submitted during the GCC 15 time frame. The multibuff.c benchmark attached to the PR target/117251 compiled for Power10 PowerPC that implement SHA3 has a slowdown in the current trunk and GCC 14 compared to GCC 11 - GCC 13, due to excessive amounts of spilling. The main fu

PR target/108958 -- use mtvsrdd to zero extend GPR DImode to VSX TImode

2025-05-08 Thread Michael Meissner
This is an old patch that has been submitted off and on, and I'm resubmitting it again. Previously GCC would zero extend a DImode GPR value to TImode by first zero extending the DImode value into a GPR TImode value, and then do a MTVSRDD to move this value to a VSX register. This patch does the

Fix PR 118541, do not generate unordered fp cmoves for IEEE compares

2025-05-08 Thread Michael Meissner
This has been posted previously. This patch includes fixing some typos that Bernhard Reutner-Fischer suggested. In bug PR target/118541 on power9, power10, and power11 systems, for the function: extern double __ieee754_acos (double); double __acospi (double x) {

Re: [GCC16,RFC,V2 08/14] aarch64: memtag: implement target hooks

2025-05-08 Thread Indu Bhagat
On 5/1/25 11:48 AM, Richard Sandiford wrote: Indu Bhagat writes: MEMTAG sanitizer, which is based on the HWASAN sanitizer, will invoke the target-specific hooks to create a random tag, add tag to memory address, and finally tag and untag memory. Implement the target hooks to emit MTE instructi

Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-05-08 Thread Richard Sandiford
Dhruv Chawla writes: > This patch modifies Advanced SIMD assembly generation to emit an LDR > instruction when a vector is created using a load to the first element with > the > other elements being zero. > > This is similar to what *aarch64_combinez already does. > > Example: > > uint8x16_t foo(

[to-be-committed][V2][RISC-V] Synthesize more efficient IOR/XOR sequences

2025-05-08 Thread Jeff Law
Bah! I hand-edited the patch to fix some missing HOST_WIDE_INT_UC macros I saw and botched it. While I was at it, I fixed various lint issues. No functional changes though. -- So mvconst_internal's primary benefit is in constant synthesis not impacting the combine budget in terms of the nu

[PATCH, committed] Fortran: parsing issue with DO CONCURRENT; ENDDO on same line [PR120179]

2025-05-08 Thread Harald Anlauf
Dear all, the attached patch fixes a 15/16 regression for parsing DO CONCURRENT when there was another statement following on the same line after a semicolon, because gfc_match_eos was called twice instead of just once. The patch was OK'ed by Jerry in the PR, regtested and pushed to mainline so

[to-be-committed][RISC-V] Synthesize more efficient IOR/XOR sequences

2025-05-08 Thread Jeff Law
This is Shreya's next packet of work -- infrastructure for removing mvconst_internal which would ultimately make Vineet happier :-) -- So mvconst_internal's primary benefit is in constant synthesis not impacting the combine budget in terms of the number of instructions it is willing to comb

Re: [PATCH 2/3] gimple-fold: Return early for GIMPLE_COND with true/false

2025-05-08 Thread Andrew Pinski
On Wed, Apr 23, 2025 at 2:03 AM Richard Biener wrote: > > On Wed, Apr 23, 2025 at 5:59 AM Andrew Pinski > wrote: > > > > To speed up things slightly so not needing to call all the way through > > to match and simplify, we should return early for true/false on GIMPLE_COND. > > I think we'd still

[PATCH RFC] libstdc++: run testsuite with -Wabi

2025-05-08 Thread Jason Merrill
Tested x86_64-pc-linux-gnu. Does this make sense for trunk? -- 8< -- I added this locally to check whether the PR120012 fix affects libstdc++ (it doesn't) but it seems generally useful to catch whether compiler ABI changes have library impact. libstdc++-v3/ChangeLog: * testsuite/lib/li

[pushed] c++: adjust PR99599/CWG2369 workaround

2025-05-08 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- This tweak to CWG2369 has gotten more discussion lately in CWG, including in P3606. In those discussions, it occurred to me that having the check depend on whether a class has been instantiated yet is unstable, that it should only check for

[PATCH 0/9] AArch64: CMPBR support

2025-05-08 Thread Karl Meakin
This patch series adds support for the CMPBR extension. It includes the new `+cmpbr` option and rules to generate the new instructions when lowering conditional branches. Karl Meakin (9): AArch64: place branch instruction rules together AArch64: reformat branch instruction rules AArch64: ren

[PATCH 6/9] AArch64: recognize `+cmpbr` option

2025-05-08 Thread Karl Meakin
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural extension. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (cmpbr): New option. * config/aarch64/aarch64.h (TARGET_CMPBR): New macro. * doc/invoke.texi (cmpbr): New option. --- gcc/config

[PATCH 8/9] AArch64: rules for CMPBR instructions

2025-05-08 Thread Karl Meakin
Add rules for lowering `cbranch4` to CBB/CBH/CB when CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Emit CMPBR instructions if possible. (BRANCH_LEN_P_1Kib): New constant. (BRANCH_LEN_N_1Kib): Likewise. (cbranch4): New ex

[PATCH 9/9] AArch64: make rules for CBZ/TBZ higher priority

2025-05-08 Thread Karl Meakin
Move the rules for CBZ/TBZ to be above the rules for CBB/CBH/CB. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz1): Move above rules for CBB/CBH/CB. (*aarch64_tbz1): Likewise. gcc/

[PATCH 7/9] AArch64: precommit test for CMPBR instructions

2025-05-08 Thread Karl Meakin
Commit the test file `cmpbr.c` before rules for generating the new instructions are added, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * gcc.target/aarch64/cmpbr.c: New test. --- gcc/testsuite/gcc.target/aarch64/cmpbr.c | 1378

[PATCH 3/9] AArch64: rename branch instruction rules

2025-05-08 Thread Karl Meakin
Give the `define_insn` rules used in lowering `cbranch4` to RTL more descriptive and consistent names: from now on, each rule is named after the AArch64 instruction that it generates. Also add comments to document each rule. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Rename to

[PATCH 4/9] AArch64: add constants for branch displacements

2025-05-08 Thread Karl Meakin
Extract the hardcoded values for the minimum PC-relative displacements into named constants and document them. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant. (BRANCH_LEN_N_128MiB): Likewise. (BRANCH_LEN_P_1MiB): Likewise. (BRANCH_LE

[PATCH 5/9] AArch64: make `far_branch` attribute a boolean

2025-05-08 Thread Karl Meakin
The `far_branch` attribute only ever takes the values 0 or 1, so make it a `no/yes` valued string attribute instead. gcc/ChangeLog: * config/aarch64/aarch64.md (far_branch): Replace 0/1 with no/yes. (aarch64_bcond): Handle rename. (aarch64_cb1): Likewise. (

[PATCH 1/9] AArch64: place branch instruction rules together

2025-05-08 Thread Karl Meakin
The rules for conditional branches were spread throughout `aarch64.md`. Group them together so it is easier to understand how `cbranch4` is lowered to RTL. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Move. (*compare_condjump): Likewise. (aarch64_cb1): Likewise.

[PATCH 2/9] AArch64: reformat branch instruction rules

2025-05-08 Thread Karl Meakin
Make the formatting of the RTL templates in the rules for branch instructions more consistent with each other. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Reformat. (cbranchcc4): Likewise. (condjump): Likewise. (*compare_condjump): Likewise. (aar

[PATCH] gimple-fold: Don't replace `{true/false} != false` with `true/false` inside GIMPLE_COND

2025-05-08 Thread Andrew Pinski
This is like the patch where we don't want to replace `bool_name != 0` with `bool_name`, but instead for INTEGER_CST. The only difference is that there are a few different forms for always true/always false; only handle it if it was in the canonical form. A few new helpers are added for the can

Re: [PATCH] libstdc++: Update rows in C++17 status table

2025-05-08 Thread Jonathan Wakely
On Thu, 8 May 2025 at 18:57, Björn Schäpers wrote: > > Am 08.05.2025 um 15:50 schrieb Jonathan Wakely: > > Document that std::to_chars and std::from_chars are complete, mentioning > > the libraries used for floating-point types. > > > > libstdc++-v3/ChangeLog: > > > > * doc/xml/manual/status

RE: [PATCH ]RISCV :Added MIPS P8700 Subtarget

2025-05-08 Thread Palmer Dabbelt
On Thu, 08 May 2025 08:53:18 PDT (-0700), ukala...@mips.com wrote: Hi All, We have a couple of patch series that enable the P8700 tune for the RISCV core to upstream for GCC mainline. It would be good to hear your feedback on the patches. It's kind of hard to read because your patch is get

Re: [PATCH v1 0/5] Add testcases for another case of vec_duplicate + vadd.vv combine

2025-05-08 Thread Robin Dapp
it's just a vector cost model issue and some loops are not profitable to vectorize? Yes. For example, when gpr2vr is 1, int8_t cannot vectorize while uint8_t can. OK, understood. I think that's expected given the fine granularity of the tests. IMHO nothing that should block progress. -- R

Re: [PATCH] libstdc++: Update rows in C++17 status table

2025-05-08 Thread Björn Schäpers
Am 08.05.2025 um 15:50 schrieb Jonathan Wakely: Document that std::to_chars and std::from_chars are complete, mentioning the libraries used for floating-point types. libstdc++-v3/ChangeLog: * doc/xml/manual/status_cxx2017.xml: Update status for std::to_chars and std::from_chars.

Re: [PATCH] libstdc++: Use scope guard for deallocating nodes in deque.

2025-05-08 Thread Jonathan Wakely
On Fri, 18 Apr 2025 at 10:03, Tomasz Kamiński wrote: > > This patch adds a _Guard_nodes scope guard nested to the _Deque_base, > that deallocates the range of nodes, and replaces __try/__catch block > with approparietly constructed guard object. "appropriately" > > libstdc++-v3/ChangeLog: > >

Re: [PATCH] libstdc++: Use _Padding_sink in __formatter_chrono to produce padded output.

2025-05-08 Thread Jonathan Wakely
On Wed, 7 May 2025 at 12:00, Tomasz Kamiński wrote: > > Formatting code is extracted to _M_format_to function, that produced output > to specified iterator. This function is now invoked either with __fc.out() > directly (if width is not specified) or _Padding_sink::out(). > > This avoid formatting

Re: [PATCH v2] libstdc++: Provide ability to query _Sink_iter if writes are discarded.

2025-05-08 Thread Jonathan Wakely
On Tue, 6 May 2025 at 13:30, Tomasz Kamiński wrote: > > This patch provides _M_discarding functions for _Sink_iter and _Sink function > that returns true, if any further writes to the _Sink_iter and underlying > _Sink, > will be discarded, and thus can be omitted. > > Currently only the _Padding_s

[PATCH] vect: Improve vectorization for small-trip-count loops using subvectors

2025-05-08 Thread Pengfei Li
This patch improves the auto-vectorization for loops with known small trip counts by enabling the use of subvectors - bit fields of original wider vectors. A subvector must have the same vector element type as the original vector and enough bits for all vector elements to be processed in the loop.

[PATCH v2] match.pd: Fold (x + y) >> 1 into IFN_AVG_FLOOR (x, y) for vectors

2025-05-08 Thread Pengfei Li
This patch folds vector expressions of the form (x + y) >> 1 into IFN_AVG_FLOOR (x, y), reducing instruction count on platforms that support averaging operations. For example, it can help improve the codegen on AArch64 from: add v0.4s, v0.4s, v31.4s ushr v0.4s, v0.4s, 1 to:
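
As a rough illustration of the source pattern being discussed, the sketch below shows a loop of the form `(x + y) >> 1` next to an overflow-free scalar floor average. The function names `avg_loop` and `avg_floor` are hypothetical, and the exact conditions under which the patch applies the fold are not visible in this excerpt, so none are assumed here. The two computations agree whenever `x + y` does not wrap around.

```c
#include <stdint.h>

/* Hypothetical reference: floor ((x + y) / 2) computed without
   overflowing, using the identity x + y == 2 * (x & y) + (x ^ y).  */
uint32_t
avg_floor (uint32_t x, uint32_t y)
{
  return (x & y) + ((x ^ y) >> 1);
}

/* Loop containing the (x + y) >> 1 pattern.  When the loop is
   vectorized, each iteration group becomes a vector add followed by a
   vector shift, which is the shape the message describes.  */
void
avg_loop (uint32_t *restrict out, const uint32_t *restrict a,
          const uint32_t *restrict b, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = (a[i] + b[i]) >> 1;
}
```

Note that the plain `(a[i] + b[i]) >> 1` form wraps when the 32-bit sum overflows, whereas `avg_floor` does not.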

Re: [PATCH] Fix tree-ssa/pr31261.c testcase after r16-400 [PR120168]

2025-05-08 Thread Richard Biener
> Am 08.05.2025 um 18:19 schrieb Andrew Pinski : > > After r16-400-g5e363ffefaceb9, on targets where char is unsigned by > default, tree-ssa/pr31261.c testcase started to fail: > FAIL: gcc.dg/tree-ssa/pr31261.c scan-tree-dump-times original "return > (char) -(unsigned char) c

[PATCH] Fix tree-ssa/pr31261.c testcase after r16-400 [PR120168]

2025-05-08 Thread Andrew Pinski
After r16-400-g5e363ffefaceb9, on targets where char is unsigned by default, tree-ssa/pr31261.c testcase started to fail: FAIL: gcc.dg/tree-ssa/pr31261.c scan-tree-dump-times original "return (char) -(unsigned char) c & 31;" 1 This is because the casts are no longer needed as both

Re: [PATCH 2/2] gensupport: validate compact constraint modifiers

2025-05-08 Thread Richard Sandiford
Richard Earnshaw writes: > For constraints there are operand modifiers and constraint qualifiers. > Operand modifiers apply to all alternatives and must appear, in > traditional syntax before the first alternative. Constraint > qualifiers, on the other hand must appear in each alternative to whic

RE: [PATCH ]RISCV :Added MIPS P8700 Subtarget

2025-05-08 Thread Umesh Kalappa
Hi All, We have a couple of patch series that enable the P8700 tune for the RISCV core to upstream for GCC mainline. It would be good to hear your feedback on the patches. Thank you in advance ~U -Original Message- From: Umesh Kalappa Sent: 03 May 2025 11:27 To: Jeff Law ; gcc-pat

RE: [PATCH v1 0/5] Add testcases for another case of vec_duplicate + vadd.vv combine

2025-05-08 Thread Li, Pan2
> it's just a vector cost model issue and some loops are not profitable > to vectorize? Yes. For example, when gpr2vr is 1, int8_t cannot vectorize while uint8_t can. +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zvl128b -mabi=lp64d --param=gpr2vr-cost=1" } */ + +#include "vx_binary.h

Re: [PATCH] emit-rtl: Add extra checks for paradoxical hardware subregs [PR119966]

2025-05-08 Thread Richard Sandiford
Dimitar Dimitrov writes: > On Tue, May 06, 2025 at 01:17:40PM +0100, Richard Sandiford wrote: >> Dimitar Dimitrov writes: >> > After r16-160-ge6f89d78c1a752, late_combine2 started transforming the >> > following RTL for pru-unknown-elf: >> > >> > (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0

Re: [PATCH v2] asf: Fix calling of emit_move_insn on registers of different modes [PR119884]

2025-05-08 Thread Richard Sandiford
Konstantinos Eleftheriou writes: > During the base register initialization, when we are eliminating the load > instruction, we were calling `emit_move_insn` on registers of the same > size but of different mode in some cases, causing an ICE. > > This patch uses `lowpart_subreg` for the base regist

[PATCH] libstdc++: Update C++23 status table

2025-05-08 Thread Jonathan Wakely
This should have been updated for the GCC 15.1 release. libstdc++-v3/ChangeLog: * doc/xml/manual/status_cxx2023.xml: Update status of proposals implemented after GCC 14.2 release. * doc/html/manual/status.html: Regenerate. --- libstdc++-v3/doc/html/manual/status.html

Re: [PATCH] libstdc++: Update rows in C++17 status table

2025-05-08 Thread Jonathan Wakely
On Thu, 8 May 2025 at 14:59, Jakub Jelinek wrote: > > On Thu, May 08, 2025 at 02:50:27PM +0100, Jonathan Wakely wrote: > > Document that std::to_chars and std::from_chars are complete, mentioning > > the libraries used for floating-point types. > > > > libstdc++-v3/ChangeLog: > > > > * doc/x

[PATCH] libstdc++: Make dg-require-namedlocale work for more targets [PR65909]

2025-05-08 Thread Jonathan Wakely
As noted in the PR, some embedded targets do not support command-line arguments, which means that the dg-require-namedlocale check always fails. Use Sandra's suggestion of hardcoding the argument into the executable instead of passing it as a command-line argument. Realistically, those embedded ta

Re: [PATCH] libstdc++: Update rows in C++17 status table

2025-05-08 Thread Jakub Jelinek
On Thu, May 08, 2025 at 02:50:27PM +0100, Jonathan Wakely wrote: > Document that std::to_chars and std::from_chars are complete, mentioning > the libraries used for floating-point types. > > libstdc++-v3/ChangeLog: > > * doc/xml/manual/status_cxx2017.xml: Update status for > std::to_c

Re: [PATCH] testsuite: g++.dg/cpp2a/decomp2.C requires tls_runtime

2025-05-08 Thread Jakub Jelinek
On Thu, May 08, 2025 at 03:07:29PM +0200, Christophe Lyon wrote: > Ping? > > Le jeu. 17 avr. 2025, 11:21, Christophe Lyon a > écrit : > > > Since this test is a 'dg-do run', it requires tls_runtime rather than > > just tls. > > > > This makes the test UNSUPPORTED on targets such as arm-non-eabi,

[PATCH] libstdc++: Update rows in C++17 status table

2025-05-08 Thread Jonathan Wakely
Document that std::to_chars and std::from_chars are complete, mentioning the libraries used for floating-point types. libstdc++-v3/ChangeLog: * doc/xml/manual/status_cxx2017.xml: Update status for std::to_chars and std::from_chars. * doc/html/manual/*: Regenerate. --- Pat

Re: [PATCH] testsuite: g++.dg/cpp2a/constinit16.C requires tls

2025-05-08 Thread Jakub Jelinek
On Thu, May 08, 2025 at 03:07:50PM +0200, Christophe Lyon wrote: > Ping? > > Le jeu. 17 avr. 2025, 11:21, Christophe Lyon a > écrit : > > > This test is 'dg-do compile', so require tls instead of tls_runtime. > > > > This enables it on targets such as arm-none-eabi configured with > > --enable-t

Re: [RFC PATCH 0/2] Add target_clones profile option support

2025-05-08 Thread Yangyu Chen
> On 8 May 2025, at 18:36, Richard Sandiford wrote: > > Yangyu Chen writes: >>> On 6 May 2025, at 17:49, Alfie Richards wrote: >>> >>> On 06/05/2025 09:36, Yangyu Chen wrote: > On 6 May 2025, at 16:01, Alfie Richards wrote: > > Additionally, I think ideally the file can expres

[PATCH v1 3/5] RISC-V: Add testcases for vec_duplicate + vadd.vv combine case 1 with GR2VR cost 0

2025-05-08 Thread pan2 . li
From: Pan Li Add asm dump check for vec_duplicate + vadd.vv combine case 1 to vadd.vx. The late-combine will take action when GR2VR cost is 0, because the vmv and the vadd.vx will consume the same cost of GR2VR. Aka: Before: L1: vmv.v.x vadd.vv J L1 After: L1: vadd.vx J L1 The b
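
A loop of the following shape is the sort of source that typically gives rise to a scalar broadcast followed by a vector add. This is only a sketch: `add_scalar` is a hypothetical name, it is not taken from the testcases in the series, and which instructions a compiler actually emits depends on the target, options and cost model.

```c
#include <stdint.h>

/* Add the same scalar X to every element of IN.  A vectorizer may
   either broadcast X into a vector register and use a vector-vector
   add, or use a vector-scalar add directly if the target has one.  */
void
add_scalar (int32_t *restrict out, const int32_t *restrict in,
            int32_t x, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = in[i] + x;
}
```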

[PATCH 3/8] RISC-V: Generate extension table in documentation from riscv-ext.def

2025-05-08 Thread Kito Cheng
Automatically build the ISA extension reference table in invoke.texi from the unified riscv-ext.def metadata, ensuring documentation stays in sync with extension definitions and reducing manual maintenance. gcc/ChangeLog: * doc/invoke.texi: Replace hand‑written extension table with

Re: [PATCH] testsuite: arm: Fix unsigned-extend-2.c [PR116445]

2025-05-08 Thread Christophe Lyon
Ping? Le ven. 11 avr. 2025, 18:36, Christophe Lyon a écrit : > The test was designed to pass with thumb2, but code generation changed > with the introduction of Low Overhead Loops, so the test can fail if > one overrides the flags when running the testsuite. > > In addition, useless subtract / e

Re: [PATCH] testsuite: g++.dg/cpp2a/decomp2.C requires tls_runtime

2025-05-08 Thread Christophe Lyon
Ping? Le jeu. 17 avr. 2025, 11:21, Christophe Lyon a écrit : > Since this test is a 'dg-do run', it requires tls_runtime rather than > just tls. > > This makes the test UNSUPPORTED on targets such as arm-non-eabi, > instead of FAIL/UNRESOLVED because __aeabi_read_tp is not provided > (e.g. when

Re: [PATCH] testsuite: g++.dg/cpp2a/constinit16.C requires tls

2025-05-08 Thread Christophe Lyon
Ping? On Thu, 17 Apr 2025, 11:21, Christophe Lyon wrote: > This test is 'dg-do compile', so require tls instead of tls_runtime. > > This enables it on targets such as arm-none-eabi configured with > --enable-threads=no. > > gcc/testsuite/ChangeLog: > > * g++.dg/cpp2a/constinit16.C: R
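
As a rough sketch of the distinction being made, a compile-only test that touches thread-local storage only needs the `tls` effective target, while a `dg-do run` test would ask for `tls_runtime`. The body below is hypothetical and written in C for brevity; it is not the content of constinit16.C.

```c
/* { dg-do compile } */
/* Compile-only: the target must accept TLS syntax, but nothing is
   linked or executed, so tls (not tls_runtime) is sufficient.  */
/* { dg-require-effective-target tls } */

/* Hypothetical test body.  */
static _Thread_local int counter;

int
bump (void)
{
  return ++counter;
}
```

A run test with the same body would swap the first directive for `dg-do run` and the requirement for `tls_runtime`.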

[14.x PATCH] c: Allow bool and enum null pointer constants [PR112556]

2025-05-08 Thread Sam James
From: Joseph Myers As reported in bug 112556, GCC wrongly rejects conversion of null pointer constants with bool or enum type to pointers in convert_for_assignment (assignment, initialization, argument passing, return). Fix the code there to allow BOOLEAN_TYPE and ENUMERAL_TYPE; it already allow

[PATCH] tree-optimization/119960 - failed external SLP promotion

2025-05-08 Thread Richard Biener
The following addresses a too conservative sanity check of SLP nodes we want to promote external. The issue lies in code generation for such external which relies on get_later_stmt to figure an insert location. But get_later_stmt relies on the ability to totally order stmts, specifically implemen

Re: [PATCH v1 0/5] Add testcases for another case of vec_duplicate + vadd.vv combine

2025-05-08 Thread Robin Dapp
This patch series would like to add the testcases for this. However, some test results is not that tidy, and we need more tuning for the vector cost model. The test adjustments LGTM but what do you mean by not tidy? I see you're scanning just for the presence of "vx" instead of an exact numbe

[PATCH] tree-optimization/116352 - amend previous fix

2025-05-08 Thread Richard Biener
The previous fix restricted external vector builds to defs from the same basic-block. That turns out to be too restrictive, so we have to mitigate the original issue in a different way, which is restricting it to the original case where all defs are in the same basic-block. Bootstrapped and tested on x86

[PATCH][v2] tree-optimization/120043 - bogus conditional store elimination

2025-05-08 Thread Richard Biener
The following fixes conditional store elimination to properly check for conditional stores to readonly memory which we can obviously not store to unconditionally. The tree_could_trap_p predicate used is only considering rvalues and the chosen approach mimics that of loop store motion. Bootstrappe

Re: [RFC PATCH 0/5] aarch64: Support for user-defined aarch64 tuning parameters in JSON

2025-05-08 Thread Richard Sandiford
Kyrylo Tkachov writes: > In Hi Richard, > >> On 6 May 2025, at 12:34, Richard Sandiford wrote: >> >> writes: >>> From: Soumya AR >>> >>> Hi, >>> >>> This RFC and subsequent patch series introduces support for printing and >>> parsing >>> of aarch64 tuning parameters in the form of JSON. >>

Re: [RFC PATCH 0/2] Add target_clones profile option support

2025-05-08 Thread Richard Sandiford
Yangyu Chen writes: >> On 6 May 2025, at 17:49, Alfie Richards wrote: >> >> On 06/05/2025 09:36, Yangyu Chen wrote: On 6 May 2025, at 16:01, Alfie Richards wrote: Hello, I like this idea. I have a couple thoughts to add. On 05/05/2025 09:46, Yangyu Chen wro

Re: [PATCH] AArch64: Optimize SVE loads/stores with ptrue predicates to unpredicated instructions.

2025-05-08 Thread Richard Sandiford
Sorry for the slow review. Jennifer Schmitz writes: > SVE loads and stores where the predicate is all-true can be optimized to > unpredicated instructions. For example, > svuint8_t foo (uint8_t *x) > { > return svld1 (svptrue_b8 (), x); > } > was compiled to: > foo: > ptrue p3.b, all >

Re: [PATCH] testsuite: Skip pr119160 for RISC-V backend.

2025-05-08 Thread Richard Biener
On Thu, May 8, 2025 at 10:02 AM Jiawei wrote: > > RISC-V backend don't support '-mgeneral-regs-only' option, skip it. > https://godbolt.org/z/38M8vPW74 The test should instead use /* { dg-additional-options "-mgeneral-regs-only" { target { x86_64-*-* i?86-*-* } } } */ OK with that change. Rich

Re: [PATCH] testsuite: Skip pr119160 for RISC-V backend.

2025-05-08 Thread Konstantinos Eleftheriou
Hi, This should be restricted to arm/aarch64 and x86. So it should be: /* { dg-additional-options "-mgeneral-regs-only" { target { x86_64-*-* i?86-*-* aarch64*-*-* arm*-*-* } } } */ Konstantinos On Thu, May 8, 2025 at 11:36 AM jiawei wrote: > > > On 2025/5/8 16:25, Richard Biener wrote: > > On Thu,
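
For context, a minimal sketch of how such a target-restricted option sits in a test file is shown below. The directive line is the one quoted above; the `dg-do compile` line and the function body are assumptions for illustration and are not the real contents of pr119160.c.

```c
/* { dg-do compile } */
/* Add -mgeneral-regs-only only on targets that accept the option;
   other targets compile the test without it instead of failing on an
   unrecognized flag.  */
/* { dg-additional-options "-mgeneral-regs-only" { target { x86_64-*-* i?86-*-* aarch64*-*-* arm*-*-* } } } */

/* Hypothetical test body.  */
int
add_one (int x)
{
  return x + 1;
}
```

The selector keeps one test file usable on every backend, rather than skipping the whole test on targets that lack the flag.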

Re: [PATCH 00/13] arm: Remove iWMMXT code generation

2025-05-08 Thread Richard Earnshaw (lists)
On 08/05/2025 10:21, Kyrylo Tkachov wrote: > Hi Richard, > >> On 7 May 2025, at 18:15, Richard Earnshaw wrote: >> >> >> The header file for the Arm implementation of mmintrin.h was changed in >> GCC-15 >> to disable access to the intrinsics. This patch removes the internal code >> as well. >> >

[PATCH 2/2] gensupport: validate compact constraint modifiers

2025-05-08 Thread Richard Earnshaw
For constraints there are operand modifiers and constraint qualifiers. Operand modifiers apply to all alternatives and must appear, in traditional syntax, before the first alternative. Constraint qualifiers, on the other hand, must appear in each alternative to which they apply. There's no easy way

[PATCH 1/2] aarch64: Fix up commutative and early-clobber markers on compact insns

2025-05-08 Thread Richard Earnshaw
For constraints there are operand modifiers and constraint qualifiers. Operand modifiers apply to all alternatives and must appear, in traditional syntax, before the first alternative. Constraint qualifiers, on the other hand, must appear in each alternative to which they apply. There's no easy way

Re: [PATCH 00/13] arm: Remove iWMMXT code generation

2025-05-08 Thread Kyrylo Tkachov
Hi Richard, > On 7 May 2025, at 18:15, Richard Earnshaw wrote: > > > The header file for the Arm implementation of mmintrin.h was changed in GCC-15 > to disable access to the intrinsics. This patch removes the internal code > as well. > > We still allow -mcpu/-march options for the wmmx cpus,

Re: [PATCH] testsuite: Limit option '-mgeneral-regs-only' backends in pr119160.

2025-05-08 Thread Richard Biener
On Thu, May 8, 2025 at 11:04 AM Jiawei wrote: > > Limit option '-mgeneral-regs-only' to those in supported backends. > > Version log: > https://patchwork.sourceware.org/project/gcc/patch/20250508080102.1340059-1-jia...@iscas.ac.cn/ OK. > gcc/testsuite/ChangeLog: > > * gcc.dg/pr11

[PATCH] testsuite: Limit option '-mgeneral-regs-only' backends in pr119160.

2025-05-08 Thread Jiawei
Limit option '-mgeneral-regs-only' to those in supported backends. Version log: https://patchwork.sourceware.org/project/gcc/patch/20250508080102.1340059-1-jia...@iscas.ac.cn/ gcc/testsuite/ChangeLog: * gcc.dg/pr119160.c: Limit backends. --- gcc/testsuite/gcc.dg/pr119160.c | 3

Re: [PATCH 7/8] AArch64: precommit test for CMPBR instructions

2025-05-08 Thread Richard Earnshaw (lists)
On 07/05/2025 18:21, Richard Sandiford wrote: > Richard Earnshaw writes: >> On 07/05/2025 17:28, Richard Earnshaw (lists) wrote: >>> On 07/05/2025 16:54, Richard Sandiford wrote: Richard Earnshaw writes: > On 07/05/2025 13:57, Richard Sandiford wrote: >> Kyrylo Tkachov writes: >
