On 2025/11/10 at 6:34 PM, Xi Ruoyao wrote:
On Mon, 2025-11-10 at 17:49 +0800, mengqinggang wrote:
-(define_insn "atomic_fetch_"
+(define_expand "atomic_fetch_"
+ [(match_operand:GPR 0 "register_operand") ;; old value at mem
+ (any_atomic:GPR (match_operand:GPR 1 "memory_operand") ;
Pushed to r16-5148...r16-5160
On 2025/10/28 at 4:20 PM, Lulu Cheng wrote:
1. Fixed ICE issue caused by passing invalid string to attribute target,
for example: "__attribute__ ((target ("arch")))"
2. Add the option of extended instructions of la64v1.1 to the attribute
support list.
3. Add FMV support f
On Tue, 11 Nov 2025, Jakub Jelinek wrote:
> Hi!
>
> Since r11-2238-ge443d8213864ac337c29092d4767224f280d2062 the C++ FE
> emits clobbers like *_1 = {CLOBBER}; where *_1 MEM_REF has some scalar
> type like int for -flifetime-dse={1,2} and most of the compiler manages
> to cope with that.
> If we a
On Tue, Nov 11, 2025 at 1:53 AM Andrew Pinski
wrote:
>
> So factor_out_operators will factor out some expressions but in the case
> of BIT_FIELD_REF and BIT_INSERT_EXPR, this is only allowed for operand 0, as the
> other operands need to be constant.
>
> Bootstrapped and tested on x86_64-linux-gnu.
O
On 2025/11/10 at 6:34 PM, Xi Ruoyao wrote:
On Mon, 2025-11-10 at 17:49 +0800, mengqinggang wrote:
-(define_insn "atomic_fetch_"
+(define_expand "atomic_fetch_"
+ [(match_operand:GPR 0 "register_operand") ;; old value at mem
+ (any_atomic:GPR (match_operand:GPR 1 "memory_operand") ;
Hi!
Since r11-2238-ge443d8213864ac337c29092d4767224f280d2062 the C++ FE
emits clobbers like *_1 = {CLOBBER}; where *_1 MEM_REF has some scalar
type like int for -flifetime-dse={1,2} and most of the compiler manages
to cope with that.
If we are very unlucky, we trigger an ICE while trying to regimp
On 2025/11/8 at 12:29 PM, Xi Ruoyao wrote:
As [1] says, we cannot mix up lock-free and locking atomics for one
object. For example assume atom = (0, 0) initially, if we have a
locking "atomic" xor running on T0 and a lock-free store running on T1
concurrently:
T0| T1
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
-- >8 --
The ICE in the PR is because we're attempting to create a binding for an
imported declaration. This is problematic because if there are
duplicates we'll stream via a tt_entity, but won't enable deduplication
on the relevan
Just realized I sent this only to Harald... added his test case, patch
should be fine now.
Best,
Chris
Begin forwarded message:
Date: Mon, 10 Nov 2025 22:07:51 +0100
From: Christopher Albert
To: Harald Anlauf
Subject: [PATCH v3] fortran: Fix ICE and self-assignment bugs with
recursive allocata
On Sat, 2025-11-08 at 00:12 +0100, Martin Jambor wrote:
[...snip...]
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-from-cst-agg-3.c
> b/gcc/testsuite/gcc.dg/tree-ssa/vrp-from-cst-agg-3.c
> new file mode 100644
> index 000..d45928e0a25
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-
More infrastructure on the way to eliminating the define_insn_and_split
for zero-extensions.
Exposing the shift-pair approach in the expander may change the order in
which operands appear in later RTL. In the case of packw detection
order matters. It shouldn't, it's an IOR after all, but it
I found that the title might be misleading, so I’ll fix it in v2.
The problem actually also occurs when the input is a preprocessed file
(e.g. .i or .ii).
$ riscv64-unknown-elf-gcc x.i
cc1: error: /home/scratch/build/install/riscv64-unknown-elf/usr/local/include:
Permission denied
cc1: error: /hom
This fixes a permission error that occurs when cross-compiling with
-save-temps and a relocated toolchain, where the original build path
exists but is inaccessible.
The issue only happened when:
- Building the toolchain at /home/scratch/build/
- Installing it to another location like /home/user/rv
> On 10 Nov 2025, at 01:39, Jason Merrill wrote:
>
> On 11/10/25 4:17 PM, Jakub Jelinek wrote:
>> On Mon, Nov 10, 2025 at 03:48:01PM +0530, Jason Merrill wrote:
>>> On 11/10/25 12:06 PM, Jakub Jelinek wrote:
Trivial relocation was voted out of C++26, the following patch
removes it (n
When a major version program suffix is specified, along with
--with-gcc-major-version-only, GCC tries to install $TRIPLE-gcc-tmp into
the destination BINDIR and link it to TRIPLE-gcc-SUFFIX. However, this
executable is installed in the previous step, thus leaving the gcc-tmp
unmodified.
This is bec
Some SVE features in the toolchain need to be enabled when either of two
different kernel HWCAPS (and corresponding cpuinfo strings) are enabled
(one for non-streaming mode and one for streaming mode).
Add support for using "|" to separate alternative lists of required
features.
Bootstrapped and
On Mon, Nov 10, 2025 at 3:35 PM Hu, Lin1 wrote:
>
> Hi,
>
> The AMX intrinsics previously used string concatenation with the '#'
> operator to construct register names, which prevented their use with
> C++ template non-type parameters. This patch converts all AMX intrinsics
> to use inline assembl
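To illustrate the limitation described in the quoted message: the preprocessor's '#' operator stringifies the spelling of its argument, not its value, so anything that is not a literal at preprocessing time ends up in the string by name. The sketch below is plain C and is not taken from the AMX headers; TMM_NAME and TILE_INDEX are hypothetical names, and the enum constant merely stands in for a C++ template non-type parameter.

```c
#include <string.h>

/* Hypothetical macro (not from the real AMX headers): builds a register
   name by stringifying its argument with '#' and relying on adjacent
   string-literal concatenation. */
#define TMM_NAME(n) "tmm" #n

/* Stands in for a named constant such as a template non-type parameter.
   It is not a macro, so the preprocessor never sees its value. */
enum { TILE_INDEX = 3 };

/* Returns 1 when a literal argument yields the intended register name. */
int literal_name_ok(void)
{
  return strcmp(TMM_NAME(3), "tmm3") == 0;
}

/* Returns 1 when a named constant is stringified by name, not by value,
   which is why such a macro cannot build a valid register name from it. */
int named_constant_is_not_expanded(void)
{
  return strcmp(TMM_NAME(TILE_INDEX), "tmmTILE_INDEX") == 0;
}
```

Both functions return 1, which shows that only a literal argument gives a usable register name under this scheme.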
So factor_out_operators will factor out some expressions but in the case
of BIT_FIELD_REF and BIT_INSERT_EXPR, this is only allowed for operand 0, as the
other operands need to be constant.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/122629
gcc/ChangeLog:
* tre
> diff --git a/gcc/hierarchical_discriminator.h
> b/gcc/hierarchical_discriminator.h
> new file mode 100644
> index 000..dd3cb1b0ae7
> --- /dev/null
> +++ b/gcc/hierarchical_discriminator.h
> @@ -0,0 +1,75 @@
> +/* Copyright The GNU Toolchain Authors
> +
> +This file is part of GCC.
> +
>
Ping please? It would be great to tie up this loose end. Thanks!
https://gcc.gnu.org/pipermail/gcc-patches/2025-August/692261.html
-Lewis
On Sun, Oct 12, 2025 at 9:44 AM Lewis Hyatt wrote:
>
> Hello-
>
> https://gcc.gnu.org/pipermail/gcc-patches/2025-August/692261.html
>
> Can I ping this one p
Hi Joseph,
On Sun, Nov 02, 2025 at 07:45:36PM +0100, Alejandro Colomar wrote:
> No functional change intended.
>
> gcc/c-family/ChangeLog:
>
> * c-warn.cc (warn_parms_array_mismatch): Reduce scope of local
> variable.
>
> Signed-off-by: Alejandro Colomar
Ping. I think these 2 are
Bootstrapped and regtested on x86_64-pc-linux-gnu, and also ran dg.exp
with -fmodules enabled. Pushing as obvious.
-- >8 --
I'd missed a STRIP_TEMPLATE when attempting to query whether DECL was
an imported entity.
PR c++/122628
gcc/cp/ChangeLog:
* module.cc (instantiating_tu_l
> Hi,
>
> I've implemented hierarchical discriminators for AutoFDO.
> This helps AutoFDO profile accuracy in two ways:
> - Loop iterations are now uniquely identifiable in profile data.
> - It distinguishes which iteration of an unrolled loop executed hotly, and so on.
>
> The discriminator in AutoFDO is ext
On Mon, Nov 10, 2025 at 10:28 AM Karl Meakin via Sourceware Forge
wrote:
>
> Hi gcc-patches mailing list,
> Karl Meakin has requested that the following
> forgejo pull request
> be published on the mailing list.
>
> Created on: 2025-09-30 16:40:31+00:00
> Latest update: 2025-11-10 18:26:02+00:00
On Mon, Nov 10, 2025 at 07:28:20PM +, Joseph Myers wrote:
> On Fri, 7 Nov 2025, Alejandro Colomar wrote:
>
> > PR c/122591
> >
> > gcc/c-family/ChangeLog:
> >
> > * c-common.cc (c_countof_type): Convert return value to size_t.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/c
On Mon, Nov 10, 2025 at 09:29:49PM +, Joseph Myers wrote:
> On Sat, 8 Nov 2025, Alejandro Colomar wrote:
>
> > Store the 'rid' value in a local variable, and pass it to functions that
> > handle various keywords. This simplifies the code, and removes some
> > wrappers.
> >
> > No functional
On Mon, 10 Nov 2025, Qing Zhao wrote:
> > struct { int a; char b[] __attribute__ ((counted_by (a))); } *x;
>
> In such a case, how is the above x usually initialized?
> Since the anonymous structure has a FAM field, we usually need to use malloc
> to initialize it. Is this correct?
> Or is there
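A minimal sketch of the malloc-based initialization asked about above, assuming a named variant of the quoted struct (struct buf, buf_alloc and the COUNTED_BY wrapper macro are hypothetical names introduced here for illustration). The attribute is applied only when the compiler reports support for it; the count field is set before the flexible array member is touched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Use counted_by only where the compiler supports it (assumption: older
   compilers simply get a plain flexible array member). */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical named variant of the struct quoted above. */
struct buf
{
  int a;                    /* number of elements in b */
  char b[] COUNTED_BY (a);  /* flexible array member (FAM) */
};

/* Allocate a struct buf with room for n chars, zero-filled.
   Returns NULL on a negative count or allocation failure. */
struct buf *
buf_alloc (int n)
{
  if (n < 0)
    return NULL;
  struct buf *x = malloc (sizeof *x + (size_t) n * sizeof x->b[0]);
  if (x == NULL)
    return NULL;
  x->a = n;                 /* set the count before accessing b */
  memset (x->b, 0, (size_t) n);
  return x;
}
```

The caller owns the returned object and releases it with free.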
Hi!
On Mon, Nov 10, 2025 at 11:25:29PM +0530, Surya Kumari Jangala wrote:
> I believe the global variable rs6000_cpu can be used, at least in some
> places in this patch wherever TARGET_FUTURE is being used. In other places
> too, perhaps we can avoid this variable? The main issue is it is not
>
On Mon, 10 Nov 2025, Qing Zhao wrote:
> Jakub and Joseph,
>
> Could you please review this patch?
>
> Sid has reviewed and okayed this version.
>
> Let me know if you have more comments and suggestions.
The C front-end changes are OK.
--
Joseph S. Myers
On Sat, 8 Nov 2025, Alejandro Colomar wrote:
> Store the 'rid' value in a local variable, and pass it to functions that
> handle various keywords. This simplifies the code, and removes some
> wrappers.
>
> No functional change intended.
>
> gcc/c/ChangeLog:
>
> * c-parser.cc (c_parser_si
Hi, Joseph,
Thanks a lot for your comments.
I do have some questions below, since I am not sure how to
come up with good test cases for these situations.
> On Nov 10, 2025, at 14:47, Joseph Myers wrote:
>
> On Fri, 7 Nov 2025, Qing Zhao wrote:
>
>> + gcc_assert (TYPE_NAME (outmost_struct
On 10.11.25 at 21:41, Harald Anlauf wrote:
Hi Chris!
Hmm, this works for scalar instances, but does not get the bounds
right for arrays.
Example: add to test_self_assign of finalizer_self_assign.f90:
block
type(node_t), allocatable :: b(:), c(:)
allocate (b(5:5))
b = (b)
Hi Chris!
On 10.11.25 at 00:45, Christopher Albert wrote:
Thanks, Harald!
On Sun, 9 Nov 2025 22:57:29 +0100
Harald Anlauf wrote:
On 08.11.25 at 18:03, Jerry D wrote:
On 11/7/25 8:30 PM, Christopher Albert wrote:
Derived types with recursive allocatable components and FINAL
procedures tri
gcc/analyzer/ChangeLog
PR other/122243
* analyzer.opt.urls: Regenerated.
gcc/c-family/ChangeLog
PR other/122243
* c.opt.urls: Regenerated.
gcc/cobol/ChangeLog
PR other/122243
* lang.opt.urls: Regenerated.
gcc/ChangeLog
PR other/122243
On Fri, 7 Nov 2025, Qing Zhao wrote:
> + gcc_assert (TYPE_NAME (outmost_struct_type) != NULL);
> + /* If the type of the containing structure is an anonymous struct/union,
> + get the first outer named structure/union type. */
> + while (TYPE_NAME (type) == NULL_TREE)
I'm not sure that T
On Fri, 7 Nov 2025, Alejandro Colomar wrote:
> PR c/122591
>
> gcc/c-family/ChangeLog:
>
> * c-common.cc (c_countof_type): Convert return value to size_t.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/countof-compile.c (type): Test return type of _Countof.
>
> Reported-by: Sam J
Hi!
I know P2758R5 didn't make it into C++26, but on IRC Ville said it would be
useful anyway, so here is a quick attempt at implementing it.
Not adding anything on the libstdc++ side, because I don't know where
experimental stuff like that should go, whether it would be in the
implementation nam
From: Karl Meakin
The `movcc` expander was not used anywhere. Delete
it.
gcc/ChangeLog:
* config/aarch64/aarch64.md (movcc): Delete.
---
gcc/config/aarch64/aarch64.md | 19 ---
1 file changed, 19 deletions(-)
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aar
From: Karl Meakin
Deduplicate the checks against `ccmode` by extracting to a new
predicate.
gcc/ChangeLog:
* config/aarch64/aarch64.md (movcc): Use new predicate.
(movcc): Likewise.
(cc): Likewise.
* config/aarch64/predicates.md (aarch64_comparison_operator_cc):
From: Karl Meakin
The checks for `code == UNEQ || code == LTGT` are unnecessary, because
they are already excluded by `aarch64_comparison_operator`.
gcc/ChangeLog:
* config/aarch64/aarch64.md (mov): Delete
redundant check.
(movcc): Likewise.
(cc): Likewise.
---
g
Hi gcc-patches mailing list,
Karl Meakin has requested that the following forgejo
pull request
be published on the mailing list.
Created on: 2025-09-30 16:40:31+00:00
Latest update: 2025-11-10 18:26:02+00:00
Changes: 5 changed files, 59 additions, 59 deletions
Head revision: karmea01/gcc-TEST re
From: Karl Meakin
Apply the same fix from bc11cbff9e648fdda2798bfa2d7151d5cd164b87
("aarch64: Fix condition accepted by movcc") to `MOVcc`.
Fixes ICEs when compiling code such as `cmpbr-4.c` and `cmpbr-5.c` with
`+cmpbr`.
gcc/ChangeLog:
* config/aarch64/aarch64.md (movcc): Accept MODE_
From: Karl Meakin
The bodies of `movcc` and `movcc` are identical, so merge
them by using a new mode iterator that combines `ALLI` and `GPF`.
gcc/ChangeLog:
* config/aarch64/aarch64.md (movcc): Merge with ...
(movcc): ... this.
* config/aarch64/iterators.md (ALLI_GPF): N
On 11/9/25 1:23 PM, Mark Wielaard wrote:
Commit a1fe2cfa8965 ("fortran: [PR121628]") regenerated libgfortran
Makefile.in and aclocal.m4 files with automake 1.15 instead of 1.15.1.
Run autoreconf version 2.69 with automake 1.15.1 inside libgfortran.
Thanks for taking care of that. I was off by
Hi Mike,
On 08/11/25 2:53 am, Michael Meissner wrote:
> I originally made a more complicated patch (V5) on September 22nd, 2025 that
> tried to do infrastructure cleanup as well as adding -mcpu=future. This patch
> is a more limited patch in that it just adds the -mcpu=future patch, and it
> doe
The iteration count profitability check is irrelevant for uncounted
loops given that, even at runtime, the number of iterations is unknown
at the start of loop execution.
Likewise, the test for skipping the vectorized version of the loop is
based on whether the number of iterations that will be ru
Alias list pruning for uncounted loops was found to cause as-yet
unresolved issues in the execution of SpecV6. Disable this
while a reduced testcase is developed and a solution implemented.
Test derived from "omp_get_partition_place_nums" from libgomp "icv.c":
unsigned len = 8;
v
In `vect_do_peeling' and `vect_transform_loop', there are several bits
of logic reliant on niters that need to be handled differently in the
case of uncounted loops.
Firstly, when we peel the loop, adding a prolog, we subtract the
prolog peeling factor from the original number of iterations for th
Given the present requirement for loops, early-break or otherwise, to
have a known iteration count, there is currently no need for
single-exit loops to reset induction variables and accumulators prior
to entering the exit loop.
For multiple-exit uncounted loops, there are provisions in the code
f
For multi-exit loops where the IV value used outside of the loop
may differ depending on which exit is taken, with it being updated
prior to some exits, as per the example given below, we find that its
value is not properly reset upon entering the epilog loop if the IV is
updated prior to the ex
In its current implementation, the loop vectorizer requires the main
exit be the counting IV exit. With uncounted loops we no longer need
to have any counting IV exits. Furthermore, it is possible to have
reached this stage with malformed loops with no exits at all.
Consequently, we need an appro
Default types:
--
While the primary exit condition for loops is no longer tied to some
upper limit in the number of executed iterations, similar limits are
still required for vectorization. One example of this is with prolog
peeling. The prolog will always have an IV exit associated
Given the current reliance of masking on niters and the fact this is
undetermined for uncounted loops, we circumvent this limitation by
disabling the use of partial vectors when vectorizing loops with an
unknown upper bound.
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_2): Disable
At present, there are several places in the code where, while
analyzing the outer loop in a nested pair of loops, when we're
traversing bbs in a loop for some analysis purpose, the analysis
crosses the inter-loop boundary into the inner loop, thus drawing
erroneous conclusions on the outer loop bas
This patch series extends the GCC vectorizer so that it can
vectorize uncounted loops, as per the following example:
while (str[i] != 0)
str[i] ^= 0x20;
Though this implementation has been demonstrated not to cause any
regressions, either in the GCC testsuite or in performance, th
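For reference, the uncounted-loop example quoted above can be written as a complete function. This is only a sketch for illustration: the function name toggle_case_bit, the explicit index increment and the returned count are assumptions added here, not part of the patch series. The point is that the trip count depends on where the terminating NUL sits, so it is unknown when the loop starts.

```c
#include <assert.h>
#include <stddef.h>

/* Flip bit 0x20 in every byte of STR up to the terminating NUL and
   return the number of bytes visited.  Each byte is tested before it
   is modified, so the exit depends only on the original data. */
size_t
toggle_case_bit (char *str)
{
  assert (str != NULL);
  size_t i = 0;
  while (str[i] != 0)
    {
      str[i] ^= 0x20;   /* for ASCII letters this swaps upper/lower case */
      i++;              /* assumed increment; the quoted loop omits it */
    }
  return i;
}
```

Calling it on a writable buffer holding "abC" leaves "ABc" in the buffer and returns 3.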
At present we reject uncounted loops outright when doing initial loop
analysis in `vect_analyze_loop_form'.
We have the following gating condition that causes rejection of a
given loop:
if (integer_zerop (info->assumptions)
|| !info->number_of_iterations
|| chrec_contains_undetermine
For uncounted loops, given that no information can be derived from the
max number of iterations, we wish to make no distinction between the
"main" exit and any additional exits.
That is, an epilogue is required across all exits to establish when
the exit condition was met within the final vectorize
Splitting a CONST_INT address into base and offset can be beneficial
when accessing multiple addresses in the same UBYTE region. The base
constant load can be shared among those accesses.
There is no regression for single accesses per UBYTE memory region.
The transformation by TARGET_ADDR_SPACE_L
Fix the following warning:
.../gcc/gcc/config/pru/pru.h:249:43: warning: narrowing conversion of ‘-1’
from ‘int’ to ‘unsigned int’ [-Wnarrowing]
249 | /* ALL_REGS */ { ~0,~0, ~0, ~0, ~0} \
Pushed to trunk as r16-5124-gc6fce499ba17f3.
gcc/ChangeLog:
* c
So I was trying to untangle our define_insn_and_split situation for
zero-extensions and stumbled over some code we need to adjust &
simplify in the RISC-V backend. I probably should have caught this earlier.
riscv_extend_to_xmode_reg is just a poor implementation of
convert_modes; we can repl
On Mon, Nov 03, 2025 at 01:45:13PM +, Alfie Richards wrote:
> Changes the "sve2-sm4", "sve2-sha3", "sve2-bitperm", and "sve2-aes"
> to be aliases which imply both "sve2" and the new option "sve-sm4",
> "sve-sha3", "sve-bitperm", or "sve-aes" respectively.
If we want to treat "+sve2-bitperm" as
On 10/11/2025 10:44, Tobias Burnus wrote:
Andrew Stubbs wrote:
Subject: [PATCH v3] openmp, nvptx: ompx_gnu_managed_mem_alloc
This adds support for using CUDA Managed Memory with omp_alloc. AMD support
will be added in a future patch.
There is one new predefined allocator, "ompx_gnu_managed
On 10/11/2025 16:13, Christopher Bazley wrote:
On 10/11/2025 14:59, Christopher Bazley wrote:
On 07/11/2025 13:57, Richard Biener wrote:
On Wed, 5 Nov 2025, Christopher Bazley wrote:
On 28/10/2025 13:29, Richard Biener wrote:
On Tue, 28 Oct 2025, Christopher Bazley wrote:
+/* Materializ
Hi All,
Ping again for this.
KR,
Alfie
On 24/10/2025 22:11, Alfie Richards wrote:
Hi All,
Embarrassingly, I had run the regression tests for V1, but apparently not
looked at the results.
Sorry about that, here's a not totally broken version.
Regression tested (properly) for AArch64.
Okay f
On 11/10/25 03:55, Richard Biener wrote:
On Fri, 7 Nov 2025, Andrew MacLeod wrote:
On 11/7/25 13:28, Richard Biener wrote:
On 07.11.2025 at 15:46, Andrew MacLeod wrote:
On 11/7/25 08:29, Richard Biener wrote:
When feeding non-SSA names to range_on_edge we degrade to a
non-contextual quer
On 04/11/2025 12:27, Tejas Belagod wrote:
Hi,
Thanks for all the reviews so far. Here is v4 of the patch series:
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/696741.html
This incorporates review comments from Tamar, Jason and Jakub.
I'll be doing Tamar's EQ -> NE optimisation su
On 10/11/2025 10:06, [email protected] wrote:
Points to highlight:
- I have not got a testing environment for gcn/rtems/nvptx targets.
I have made the changes that should allow them to build and checked
that they do indeed build, but would appreciate relevant maintainers
performing t
On 10/11/2025 14:59, Christopher Bazley wrote:
On 07/11/2025 13:57, Richard Biener wrote:
On Wed, 5 Nov 2025, Christopher Bazley wrote:
On 28/10/2025 13:29, Richard Biener wrote:
On Tue, 28 Oct 2025, Christopher Bazley wrote:
+/* Materialize length number INDEX for a group of scalar stmts
Jakub and Joseph,
Could you please review this patch?
Sid has reviewed and okayed this version.
Let me know if you have more comments and suggestions.
Thanks a lot.
Qing
> On Oct 31, 2025, at 09:45, Qing Zhao wrote:
>
> Hi,
>
> this is the 5th version of the patch.
> compared to the
On 07/11/2025 13:57, Richard Biener wrote:
On Wed, 5 Nov 2025, Christopher Bazley wrote:
On 28/10/2025 13:29, Richard Biener wrote:
On Tue, 28 Oct 2025, Christopher Bazley wrote:
+/* Materialize length number INDEX for a group of scalar stmts in SLP_NODE
that
+ operate on NVECTORS vectors
On 10/11/2025 14:52, Richard Earnshaw via Sourceware Forge wrote:
> Hi gcc-patches mailing list,
> Richard Earnshaw has requested that the following
> forgejo pull request
> be published on the mailing list.
>
> Created on: 2025-11-10 14:51:54+00:00
> Latest update: 2025-11-10 14:52:36+00:00
> C
Hi gcc-patches mailing list,
Richard Earnshaw has requested that the following
forgejo pull request
be published on the mailing list.
Created on: 2025-11-10 14:51:54+00:00
Latest update: 2025-11-10 14:52:36+00:00
Changes: 1 changed files, 3 additions, 3 deletions
Head revision: rearnsha/gcc-TEST
From: Richard Earnshaw
The define_expand patterns for movdfcc, movsfcc and movhfcc had overly
tight constraints that could cause the compiler to reject these
patterns when re-ordering the operands could lead to a successful
match. Relax the initial predicate test and rely on the test after
arm
On Sat, 8 Nov 2025, Martin Jambor wrote:
> Hi,
>
> this patch adds the ability to infer ranges from loads from global
> constant static aggregates which have static initializers. Even when
> the load has one or more ARRAY_REFs with an unknown index and thus we
> do not know the particular consta
Hi Tobias,
On Tue, Nov 4, 2025 at 9:10 PM Tobias Burnus wrote:
> If you go for that route, I think we want to have a sorry
> for the FIXME issues in expr-1 and those in expr-3. And for
> the code in 'conv_dummy_value', I think a comment would be good
> why that's called for conditional expr, poss
On 11/10/25 6:14 PM, Nathaniel Shead wrote:
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
OK.
-- >8 --
In PR c++/100134, tsubst_friend_function was adjusted to ensure that
instantiating a friend function in an unopened namespace still correctly
marked the namespace as purvi
On 08/11/2025 18:54, Kito Cheng wrote:
I am inclined to set the default ABI from march if mabi is not given, rather than from mcpu.
One reason is that I don't want to introduce behavior incompatible with clang, and
the rule for inferring the ABI from march is already in clang.
makes sense (didn't know of that b
Hi,
a gentle ping for this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698696.html
Best regards,
Josef
On Thu, Nov 6, 2025 at 11:47 PM Jakub Jelinek wrote:
>
> On Thu, Nov 06, 2025 at 10:49:01PM +0800, Kito Cheng wrote:
> > This patch implements _BitInt support for RISC-V target by defining the
> > type layout and ABI requirements. The limb mode selection is based on
> > the bit width, using appro
Instruments those things that needed to be instrumented
in order to develop predicated tails for basic block
SLP (superword-level parallelism).
---
gcc/tree-vect-loop.cc | 10 ++
gcc/tree-vect-slp.cc | 6 ++
gcc/tree-vect-stmts.cc | 31 ++-
3 files chang
New tests verify that GCC can generate predicated vector-length
specific code for AArch64 if the specified vector length is
shorter than, equal to, or longer than the number of elements to
be processed (including if the specified length is sufficient but
the minimum length would not be); other test
vect_create_constant_vectors is updated to pad with zeros
between the end of a group and the end of a vector of the type
chosen for the SLP node, when used for BB SLP. This function
calls gimple_build_vector, which also has to be updated for
SVE vector types (by using the lower bound as the number
Moved existing code to determine the partial vector
style for a load or store into a new function
that will be reused for BB SLP with predicated tails.
---
gcc/tree-vect-stmts.cc | 71 ++
1 file changed, 58 insertions(+), 13 deletions(-)
diff --git a/gcc/tr
This enables use of a predicate mask or length limit for
vectorization of basic blocks in cases where previously only the
equivalent rolled (i.e. loop) form of some source code would have
been vectorized. Predication is only used for groups whose size
is not neatly divisible into vectors of lengths
Calls to vect_(get|record)_loop_(mask|len) are replaced
with calls to new wrappers that have an extra (SLP node)
parameter and which can operate on any vec_info, not just
a loop_vec_info. These wrappers pass calls through to the
original functions (and ignore the SLP node) when invoked
with a loop_
Update all callers to pass a pointer to the vectorizer state
into this helper function. Its value is temporarily unused
but will be required for BB SLP with predicated tails.
---
gcc/tree-vect-loop.cc | 18 +++---
gcc/tree-vect-slp.cc | 2 +-
gcc/tree-vect-stmts.cc | 56 +++
To decide whether to create a new SLP instance for BB SLP,
vect_analyze_slp_instance will need the minimum number of lanes
in the SLP tree, which must not be less than the group size
(otherwise "unrolling" is required). All usage of max_nunits
is therefore replaced with a new class that encapsulate
For basic block superword-level parallelism, modify the
definition of the recently-introduced wrapper functions,
vect_record_(len|mask), to simply set one of two flags
to indicate that a mask or length should be used for a
given SLP node. The passed-in vec_info is ignored.
Likewise, implement vect
GCC already supports fully-predicated vectorisation for loops, both
using "traditional" loop vectorisation and loop-aware SLP
(superword-level parallelism). For example, GCC can vectorise:
void
foo (char *x)
{
for (int i = 0; i < 6; i += 2)
{
x[i] += 1;
x[i + 1] += 2;
}
}
fr
On Mon, 10 Nov 2025, Richard Biener wrote:
> On Mon, 10 Nov 2025, Tamar Christina wrote:
>
> > First issue is that there's a latent bug exposed by this patch in that
> > this example
> >
> > integer (8) b, c
> > integer d
> > c = 10
> > d = 2
> > call e ((/ (b, b = j, c, d) /), 0_8, c,
On Mon, 10 Nov 2025, Tamar Christina wrote:
> First issue is that there's a latent bug exposed by this patch in that
> this example
>
> integer (8) b, c
> integer d
> c = 10
> d = 2
> call e ((/ (b, b = j, c, d) /), 0_8, c, d + 0_8)
> contains
> subroutine e (a, f, g, h)
> integer
> Ah I see what you were trying to say above, the issue isn't that we can't
> calculate
> the right number of vector iterations, It's that for partial vectors the
> number of
> iterations are scalar iterations because
> vect_set_loop_condition_partial_vectors
> requires the number of scalar it
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
-- >8 --
In PR c++/100134, tsubst_friend_function was adjusted to ensure that
instantiating a friend function in an unopened namespace still correctly
marked the namespace as purview. This adjusts the fix to also apply
to nested na
On Mon, 10 Nov 2025, Tamar Christina wrote:
> Consider this simple loop
>
> long long arr[1024];
> long long *f()
> {
> int i;
> for (i = 0; i < 1024; i++)
> if (arr[i] == 42)
> break;
> return arr + i;
> }
>
> where today we generate this at -O3:
>
> .L2:
> ad
On 11/10/25 4:17 PM, Jakub Jelinek wrote:
On Mon, Nov 10, 2025 at 03:48:01PM +0530, Jason Merrill wrote:
On 11/10/25 12:06 PM, Jakub Jelinek wrote:
Trivial relocation was voted out of C++26, the following patch
removes it (note, the libstdc++ part was still waiting for patch review
and so doesn
Ah, yes -- apologies.
There are two patches I built on top of. I would very much appreciate
target maintainer attention to both of these as well.
I split them out into independent patches and forgot to mention them in
the email (plus they didn't get properly sent due to mail server problems)
From: Matthew Malcomson
Apologies for the re-send: There is a flaky bug me and my colleagues are
having w.r.t. emails having incorrect headers and getting rejected from
gcc-patches mailing list.
Re-sending including Cc's to target maintainers.
>8 --- 8< ---
From: Matthew Malcomson
Apologies for the re-send: There is a flaky bug me and my colleagues are
having w.r.t. emails having incorrect headers and getting rejected from
gcc-patches mailing list.
Re-sending including Cc's to target maintainers.
>8 --- 8< ---
On Mon, Nov 10, 2025 at 03:48:01PM +0530, Jason Merrill wrote:
> On 11/10/25 12:06 PM, Jakub Jelinek wrote:
> > Trivial relocation was voted out of C++26, the following patch
> > removes it (note, the libstdc++ part was still waiting for patch review
> > and so doesn't need to be removed).
> >
> >
I don't seem to be able to apply your patches. Did I miss a prerequisite?
Specifically, the hunks in gomp_team_barrier_wait_end and
gomp_team_barrier_wait_cancel_end have context that does not match mainline.
Andrew
On 10/11/2025 10:06, [email protected] wrote:
From: Matthew Malcomson