date:20250715

Re: [PATCH] varasm: Reject non-constant vector types [PR121091]

2025-07-15 Thread Andrew Pinski

On Tue, Jul 15, 2025 at 12:18 PM Andrew Pinski  wrote:
>
> The switch conversion code asks initializer_constant_valid_p
> whether the expression is a valid initializer constant, and for
> vector types which have a non-constant number of elements
> initializer_constant_valid_p would report it as valid.  But such a
> value should be rejected as not being valid for an initializer.
>
> Built and tested for aarch64-linux-gnu.

Whoops, I didn't see a regression until now. This actually regresses PR
116595, so I have to hold back on this for now.
I will debug and figure out how to solve both at the same time.
initializer_constant_valid_p should be rejecting it, but at the same
time something else needs to change in the front-end, I think.

Thanks,
Andrew Pinski

>
> PR tree-optimization/121091
>
> gcc/ChangeLog:
>
> * varasm.cc (initializer_constant_valid_p_1): Reject vector types which
> have a non-constant number of elements.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/sve/pr121091-1.c: New test.
>
> Signed-off-by: Andrew Pinski 
> ---
>  .../gcc.target/aarch64/sve/pr121091-1.c   | 25 +++
>  gcc/varasm.cc |  5 
>  2 files changed, 30 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr121091-1.c
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr121091-1.c b/gcc/testsuite/gcc.target/aarch64/sve/pr121091-1.c
> new file mode 100644
> index 000..ea2e5ce6b6a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr121091-1.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* PR tree-optimization/121091 */
> +/* Switch conversion would convert this
> +   into static constant array but since
> +   svbool_t is a VLA type, it can't be
> +   stored in a constant pool. */
> +
> +#include "arm_sve.h"
> +
> +svbool_t e(int mode, svbool_t pg) {
> +switch (mode) {
> +case 0:
> +pg = svptrue_pat_b16(SV_VL6);
> +break;
> +case 1:
> +pg = svpfalse_b();
> +break;
> +case 2:
> +pg = svptrue_pat_b16(SV_VL2);
> +break;
> +}
> +  return pg;
> +}
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 10c1d2e3137..36b7cd8812d 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -5001,6 +5001,11 @@ initializer_constant_valid_p_1 (tree value, tree endtype, tree *cache)
>  {
>tree ret;
>
> +  /* Variable sized vectors are never valid for initializers.  */
> +  if (VECTOR_TYPE_P (endtype)
> +  && !TYPE_VECTOR_SUBPARTS (endtype).is_constant ())
> +return NULL_TREE;
> +
>switch (TREE_CODE (value))
>  {
>  case CONSTRUCTOR:
> --
> 2.43.0
>

[pushed-15] [PATCH] PR modula2/120389 ICE if assigning a constant char to an integer array

2025-07-15 Thread Gaius Mulley



This patch fixes an ICE which occurs if a constant char is assigned
into an integer array.  The fix is to introduce type checking in
M2GenGCC.mod:CodeXIndr.

gcc/m2/ChangeLog:

PR modula2/120389
* gm2-compiler/M2GenGCC.mod (CodeXIndr): Check to see that
the type of left is assignment compatible with the type of
right.

gcc/testsuite/ChangeLog:

PR modula2/120389
* gm2/iso/fail/badarray3.mod: New test.

(cherry picked from commit 895a8abad245365940939911e3d0de850522791e)

Signed-off-by: Gaius Mulley 
---
 gcc/m2/gm2-compiler/M2GenGCC.mod | 8 
 gcc/testsuite/gm2/iso/fail/badarray3.mod | 7 +++
 2 files changed, 15 insertions(+)
 create mode 100644 gcc/testsuite/gm2/iso/fail/badarray3.mod

diff --git a/gcc/m2/gm2-compiler/M2GenGCC.mod b/gcc/m2/gm2-compiler/M2GenGCC.mod
index bc1d588fce6..2dfa54a 100644
--- a/gcc/m2/gm2-compiler/M2GenGCC.mod
+++ b/gcc/m2/gm2-compiler/M2GenGCC.mod
@@ -8229,6 +8229,14 @@ BEGIN
type := SkipType (type) ;
DeclareConstant (rightpos, right) ;
DeclareConstructor (rightpos, quad, right) ;
+   IF StrictTypeChecking AND
+  (NOT AssignmentTypeCompatible (xindrpos, "", GetType (left), right))
+   THEN
+  MetaErrorT2 (tokenno,
+   'assignment check caught mismatch between {%1Ead} and {%2ad}',
+   left, right) ;
+  SubQuad (quad)
+   END ;
IF IsProcType(SkipType(type))
THEN
   BuildAssignmentStatement (location, BuildIndirect (location, Mod2Gcc (left), GetPointerType ()), Mod2Gcc (right))
diff --git a/gcc/testsuite/gm2/iso/fail/badarray3.mod b/gcc/testsuite/gm2/iso/fail/badarray3.mod
new file mode 100644
index 000..be53d21e74f
--- /dev/null
+++ b/gcc/testsuite/gm2/iso/fail/badarray3.mod
@@ -0,0 +1,7 @@
+MODULE badarray3 ;
+
+VAR
+   x: ARRAY [1..5] OF INTEGER ;
+BEGIN
+   x[1] := 'c';
+END badarray3.
-- 
2.20.1

Re: [PATCH v2 1/1] contrib: add bpf-vmtest-tool to test BPF programs

2025-07-15 Thread Jose E. Marchesi

> Hi Jose,
> On 15/07/25 22:55, Jose E. Marchesi wrote:
>> Hi Piyush.
>> This form of the script looks generally good to me.
>> May be a good time to move to the second stage of the project, which
>> if I am not mistaken consists in creating some dejagnu infrastructure so we
>> can get testsuites running the script on the test sources.
>
> I’m working on a draft patch to discuss the approach before fully
> implementing it. Understanding how DejaGnu and the GCC testsuite work
> internally took some time. I have some doubts about the best places to
> make edits for certain functionalities, and I’ll detail them in the
> draft patch.

Yes, it is often better to ask sooner rather than later.  People on this
list know about both dejagnu and the GCC testsuite infrastructure.

>> At this point it would be good if this and future series would be
>> available in some public branch somewhere until the stuff is ready to go
>> in the main gcc.git repository.  Do you have access to some suitable
>> forge or similar?  Otherwise you may want to use forge.sourceware.org.
>
> I’ve been using my GitHub repository:
> https://github.com/PiyushRaj927/gcc to push patches, will this work?
> I can keep a dedicated branch there for accepted patches. If
> preferred, I can also set up a repository on forge.sourceware.org

That works just fine, thanks :)
I assume the branch is ebpf-compiletests.

As we discussed, it would be good to have a page in the wiki to document
this effort, something like https://gcc.gnu.org/wiki/BPFRunTimeTests.

I can create it and point to your github repo and branch, for starters.

If you create an account in the wiki then I will give you write access
so you can update the page.

Sounds good to you?

[pushed] c++: don't mark void exprs as read [PR44677]

2025-07-15 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

In Jakub's patch for PR44677 he added code to prevent mark_exp_read on
e.g. (void)++i from marking i as read, but it seems to me that we can
generalize that to avoid looking any farther into any void expression;
you can't read a void value, and an explicit cast will have already called
mark_exp_read on its operand in convert_to_void.

For testing I added an assert to catch places where we were trying to mark
void expressions as read, and fixed a few that it found.  But there were
several other places (such as check_return_expr) where we could have a void
expression but always calling mark_exp_read makes sense, so I dropped the
assert from the final commit.

PR c++/44677

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_fold) [CLEANUP_POINT_EXPR]: Don't force rvalue.
[COMPOUND_EXPR]: Likewise.
* cvt.cc (convert_to_void): Call mark_exp_read later.
* expr.cc (mark_use): Turn off read_p for any void argument.
(mark_exp_read): Return early for void argument.
---
 gcc/cp/cp-gimplify.cc |  5 +++--
 gcc/cp/cvt.cc | 13 ++---
 gcc/cp/expr.cc| 37 ++---
 3 files changed, 15 insertions(+), 40 deletions(-)

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index addbc29d104..d54fe347a6c 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -3028,7 +3028,7 @@ cp_fold (tree x, fold_flags_t flags)
 case CLEANUP_POINT_EXPR:
   /* Strip CLEANUP_POINT_EXPR if the expression doesn't have side
 effects.  */
-  r = cp_fold_rvalue (TREE_OPERAND (x, 0), flags);
+  r = cp_fold (TREE_OPERAND (x, 0), flags);
   if (!TREE_SIDE_EFFECTS (r))
x = r;
   break;
@@ -3223,7 +3223,8 @@ cp_fold (tree x, fold_flags_t flags)
  && (VAR_P (op0) || TREE_CODE (op0) == PARM_DECL)
  && !DECL_READ_P (op0))
clear_decl_read = true;
-  op1 = cp_fold_rvalue (TREE_OPERAND (x, 1), flags);
+  op1 = cp_fold_maybe_rvalue (TREE_OPERAND (x, 1),
+ code != COMPOUND_EXPR, flags);
   if (clear_decl_read)
DECL_READ_P (op0) = 0;
 
diff --git a/gcc/cp/cvt.cc b/gcc/cp/cvt.cc
index f663a6d08c8..55be12db951 100644
--- a/gcc/cp/cvt.cc
+++ b/gcc/cp/cvt.cc
@@ -1186,13 +1186,6 @@ convert_to_void (tree expr, impl_conv_void implicit, tsubst_flags_t complain)
 
   expr = maybe_undo_parenthesized_ref (expr);
 
-  expr = mark_discarded_use (expr);
-  if (implicit == ICV_CAST)
-/* An explicit cast to void avoids all -Wunused-but-set* warnings.  */
-mark_exp_read (expr);
-
-  if (!TREE_TYPE (expr))
-return expr;
   if (invalid_nonstatic_memfn_p (loc, expr, complain))
 return error_mark_node;
   if (TREE_CODE (expr) == PSEUDO_DTOR_EXPR)
@@ -1209,6 +1202,12 @@ convert_to_void (tree expr, impl_conv_void implicit, tsubst_flags_t complain)
 
   if (VOID_TYPE_P (TREE_TYPE (expr)))
 return expr;
+
+  expr = mark_discarded_use (expr);
+  if (implicit == ICV_CAST)
+/* An explicit cast to void avoids all -Wunused-but-set* warnings.  */
+mark_exp_read (expr);
+
   switch (TREE_CODE (expr))
 {
 case COND_EXPR:
diff --git a/gcc/cp/expr.cc b/gcc/cp/expr.cc
index 8b5a098ecb3..32dc3eee78f 100644
--- a/gcc/cp/expr.cc
+++ b/gcc/cp/expr.cc
@@ -102,6 +102,9 @@ mark_use (tree expr, bool rvalue_p, bool read_p,
   if (reject_builtin && reject_gcc_builtin (expr, loc))
 return error_mark_node;
 
+  if (TREE_TYPE (expr) && VOID_TYPE_P (TREE_TYPE (expr)))
+read_p = false;
+
   if (read_p)
 mark_exp_read (expr);
 
@@ -213,25 +216,6 @@ mark_use (tree expr, bool rvalue_p, bool read_p,
}
   gcc_fallthrough ();
 CASE_CONVERT:
-  if (VOID_TYPE_P (TREE_TYPE (expr)))
-   switch (TREE_CODE (TREE_OPERAND (expr, 0)))
- {
- case PREINCREMENT_EXPR:
- case PREDECREMENT_EXPR:
- case POSTINCREMENT_EXPR:
- case POSTDECREMENT_EXPR:
-   tree op0;
-   op0 = TREE_OPERAND (TREE_OPERAND (expr, 0), 0);
-   STRIP_ANY_LOCATION_WRAPPER (op0);
-   if ((VAR_P (op0) || TREE_CODE (op0) == PARM_DECL)
-   && !DECL_READ_P (op0)
-   && (VAR_P (op0) ? warn_unused_but_set_variable
-   : warn_unused_but_set_parameter) > 1)
- read_p = false;
-   break;
- default:
-   break;
- }
   recurse_op[0] = true;
   break;
 
@@ -371,6 +355,9 @@ mark_exp_read (tree exp)
   if (exp == NULL)
 return;
 
+  if (TREE_TYPE (exp) && VOID_TYPE_P (TREE_TYPE (exp)))
+return;
+
   switch (TREE_CODE (exp))
 {
 case VAR_DECL:
@@ -381,18 +368,6 @@ mark_exp_read (tree exp)
   DECL_READ_P (exp) = 1;
   break;
 CASE_CONVERT:
-  if (VOID_TYPE_P (TREE_TYPE (exp)))
-   switch (TREE_CODE (TREE_OPERAND (exp, 0)))
- {
- case PREINCREMENT_EXPR:
- case PREDECREMENT_EXPR:
- case POSTINCREMEN

[PATCH] varasm: Reject non-constant vector types [PR121091]

2025-07-15 Thread Andrew Pinski

The switch conversion code asks initializer_constant_valid_p
whether the expression is a valid initializer constant, and for
vector types which have a non-constant number of elements
initializer_constant_valid_p would report it as valid.  But such a
value should be rejected as not being valid for an initializer.

Built and tested for aarch64-linux-gnu.

PR tree-optimization/121091

gcc/ChangeLog:

* varasm.cc (initializer_constant_valid_p_1): Reject vector types which
have a non-constant number of elements.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/pr121091-1.c: New test.

Signed-off-by: Andrew Pinski 
---
 .../gcc.target/aarch64/sve/pr121091-1.c   | 25 +++
 gcc/varasm.cc |  5 
 2 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr121091-1.c

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr121091-1.c b/gcc/testsuite/gcc.target/aarch64/sve/pr121091-1.c
new file mode 100644
index 000..ea2e5ce6b6a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr121091-1.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* PR tree-optimization/121091 */
+/* Switch conversion would convert this
+   into static constant array but since
+   svbool_t is a VLA type, it can't be
+   stored in a constant pool. */
+
+#include "arm_sve.h"
+
+svbool_t e(int mode, svbool_t pg) {
+switch (mode) {
+case 0:
+pg = svptrue_pat_b16(SV_VL6);
+break;
+case 1:
+pg = svpfalse_b();
+break;
+case 2:
+pg = svptrue_pat_b16(SV_VL2);
+break;
+}
+  return pg;
+}
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 10c1d2e3137..36b7cd8812d 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -5001,6 +5001,11 @@ initializer_constant_valid_p_1 (tree value, tree endtype, tree *cache)
 {
   tree ret;
 
+  /* Variable sized vectors are never valid for initializers.  */
+  if (VECTOR_TYPE_P (endtype)
+  && !TYPE_VECTOR_SUBPARTS (endtype).is_constant ())
+return NULL_TREE;
+
   switch (TREE_CODE (value))
 {
 case CONSTRUCTOR:
-- 
2.43.0
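
A note on the check the patch adds: the sketch below models it outside of GCC, under loud assumptions. `PolyCount` and `VectorType` are hypothetical stand-ins invented for illustration (they are not GCC's `poly_uint64` or tree types); the only thing taken from the patch is the shape of the test, namely "a vector whose element count is not a compile-time constant is not a valid static initializer".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a polynomial element count: the value is
// base + per_unit * N, where N is only known at run time.
struct PolyCount {
  std::uint64_t base;
  std::uint64_t per_unit;
  // The count is a compile-time constant only if it has no runtime term.
  constexpr bool is_constant() const { return per_unit == 0; }
};

// Hypothetical stand-in for a vector type node.
struct VectorType {
  PolyCount subparts;  // number of elements in the vector
};

// Same shape as the added check: reject vectors whose number of
// elements is not a compile-time constant.
constexpr bool valid_for_static_initializer(const VectorType &type) {
  return type.subparts.is_constant();
}

// A fixed 4-element vector versus a length-agnostic (SVE-like) vector.
constexpr VectorType fixed_v4{{4, 0}};
constexpr VectorType scalable{{16, 16}};

static_assert(valid_for_static_initializer(fixed_v4),
              "fixed-size vectors can be static initializers");
static_assert(!valid_for_static_initializer(scalable),
              "variable-length vectors are rejected");

// Runtime version of the same checks, for use from a test driver.
inline void demo_initializer_check() {
  assert(valid_for_static_initializer(fixed_v4));
  assert(!valid_for_static_initializer(scalable));
}
```

In this model, switch conversion would simply skip building a constant array whenever `valid_for_static_initializer` returns false.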

[committed v2] libstdc++: Constrain std::swap using concepts in C++20

2025-07-15 Thread Jonathan Wakely

This is a minor compile-time optimization for C++20.

libstdc++-v3/ChangeLog:

* include/bits/move.h (swap): Replace enable_if with concepts
when available, and with __enable_if_t alias otherwise.
---

Here's what I pushed.

Tested x86_64-linux.

 libstdc++-v3/include/bits/move.h | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/bits/move.h b/libstdc++-v3/include/bits/move.h
index 085ca074fc88..061e6b4de3d8 100644
--- a/libstdc++-v3/include/bits/move.h
+++ b/libstdc++-v3/include/bits/move.h
@@ -215,14 +215,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @return   Nothing.
   */
   template
-_GLIBCXX20_CONSTEXPR
-inline
-#if __cplusplus >= 201103L
-typename enable_if<__and_<__not_<__is_tuple_like<_Tp>>,
- is_move_constructible<_Tp>,
- is_move_assignable<_Tp>>::value>::type
+#if __glibcxx_concepts // >= C++20
+requires (! __is_tuple_like<_Tp>::value)
+  && is_move_constructible_v<_Tp>
+  && is_move_assignable_v<_Tp>
+constexpr void
+#elif __cplusplus >= 201103L
+_GLIBCXX20_CONSTEXPR inline
+__enable_if_t<__and_<__not_<__is_tuple_like<_Tp>>,
+is_move_constructible<_Tp>,
+is_move_assignable<_Tp>>::value>
 #else
-void
+inline void
 #endif
 swap(_Tp& __a, _Tp& __b)
 _GLIBCXX_NOEXCEPT_IF(__and_,
@@ -241,12 +245,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // DR 809. std::swap should be overloaded for array types.
   /// Swap the contents of two arrays.
   template
-_GLIBCXX20_CONSTEXPR
-inline
-#if __cplusplus >= 201103L
-typename enable_if<__is_swappable<_Tp>::value>::type
+#if __glibcxx_concepts // >= C++20
+requires is_swappable_v<_Tp>
+constexpr void
+#elif __cplusplus >= 201103L
+_GLIBCXX20_CONSTEXPR inline
+__enable_if_t<__is_swappable<_Tp>::value>
 #else
-void
+inline void
 #endif
 swap(_Tp (&__a)[_Nm], _Tp (&__b)[_Nm])
 _GLIBCXX_NOEXCEPT_IF(__is_nothrow_swappable<_Tp>::value)
-- 
2.50.1
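
For readers who want to see the two constraint styles side by side, here is a minimal sketch, not the libstdc++ code. `my_swap`, `can_my_swap` and `Pinned` are hypothetical names invented for illustration, and the tuple-like exclusion from the real patch is left out. It assumes that `__cpp_concepts` is a good enough switch between the two paths.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Same structure as the patch: one function body, two ways of spelling
// the constraint depending on what the compiler supports.
#if defined(__cpp_concepts)
template <typename T>
  requires std::is_move_constructible_v<T> && std::is_move_assignable_v<T>
constexpr void
#else
template <typename T>
inline typename std::enable_if<std::is_move_constructible<T>::value
                               && std::is_move_assignable<T>::value>::type
#endif
my_swap(T &a, T &b)
{
  T tmp = std::move(a);
  a = std::move(b);
  b = std::move(tmp);
}

// Detects whether my_swap(T&, T&) is callable; works for both paths.
template <typename T, typename = void>
struct can_my_swap : std::false_type {};

template <typename T>
struct can_my_swap<T, decltype(my_swap(std::declval<T &>(),
                                       std::declval<T &>()))>
    : std::true_type {};

// A type that can be neither move-constructed nor move-assigned.
struct Pinned {
  Pinned() = default;
  Pinned(Pinned &&) = delete;
  Pinned &operator=(Pinned &&) = delete;
};

static_assert(can_my_swap<int>::value, "int satisfies the constraint");
static_assert(!can_my_swap<Pinned>::value, "Pinned is constrained away");

// Runtime check that the swap itself behaves as expected.
inline void demo_my_swap() {
  int a = 1, b = 2;
  my_swap(a, b);
  assert(a == 2 && b == 1);
}
```

Either spelling removes the overload for `Pinned`; the requires-clause just lets the compiler check the conditions directly instead of instantiating `enable_if`.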

[PATCH] libstdc++/ranges: Prefer using offset-based _CachedPosition

2025-07-15 Thread Patrick Palka

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

-- >8 --

The offset-based partial specialization of _CachedPosition for
random-access iterators is currently only selected if the offset type is
smaller than the iterator type.  Before r12-1018-g46ed811bcb4b86 this
made sense since the main partial specialization only stored the
iterator (incorrectly).  After that bugfix, the main partial
specialization now effectively stores a std::optional, so the
size constraint is inaccurate, and it must invalidate itself upon
copy/move unlike the offset-based partial specialization.  So I think
we should just always prefer the offset-based _CachedPosition for a
random-access iterator, even if the offset type happens to be larger
than the iterator type.

libstdc++-v3/ChangeLog:

* include/std/ranges (__detail::_CachedPosition): Remove
additional size constraint on the offset-based partial
specialization.
---
 libstdc++-v3/include/std/ranges | 2 --
 1 file changed, 2 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index efe62969d657..2970b2cbe00b 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -1585,8 +1585,6 @@ namespace views::__adaptor
   };
 
 template
-  requires (sizeof(range_difference_t<_Range>)
-   <= sizeof(iterator_t<_Range>))
   struct _CachedPosition<_Range>
   {
   private:
-- 
2.50.1.271.gd30e120486
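
To illustrate why an offset-based cache does not need to invalidate itself on copy, here is a simplified sketch. `OffsetCache` is a hypothetical class written for this note, not libstdc++'s `_CachedPosition`; it assumes a container with random-access iterators and uses -1 as the "nothing cached" marker.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cache that remembers a position as a distance from
// begin() rather than as an iterator.
template <typename Range>
class OffsetCache {
  std::ptrdiff_t offset_ = -1;  // -1 means "nothing cached"

public:
  bool has_value() const { return offset_ >= 0; }

  void set(const Range &r, typename Range::const_iterator it) {
    offset_ = it - r.begin();
  }

  typename Range::const_iterator get(const Range &r) const {
    assert(has_value());
    return r.begin() + offset_;
  }
};

// Copy both the range and the cache, destroy the original contents, and
// check that the copied cache still names the right element of the copy.
inline bool offset_survives_copy() {
  std::vector<int> original{10, 20, 30, 40};
  OffsetCache<std::vector<int>> cache;
  cache.set(original, original.begin() + 2);

  std::vector<int> copy = original;
  OffsetCache<std::vector<int>> copied_cache = cache;
  original.clear();  // a cached iterator into `original` would now dangle

  return *copied_cache.get(copy) == 30;
}
```

A cache that stored the iterator itself would still point into `original` after the copy, which is why that variant has to reset itself on copy and move.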

Re: [PATCH, V3] Add -mcpu=future to the PowerPC

2025-07-15 Thread Michael Meissner

On Tue, Jul 15, 2025 at 08:11:05AM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Jul 01, 2025 at 12:14:32PM -0400, Michael Meissner wrote:
> > This patch adds the support that can be used in developing GCC support
> > for potential future PowerPC processors.
> > 
> > These changes are being added so that hardware designers can evaluate
> > potential new features to be added to the PowerPC processors in the
> > future.  It may be these features will be incorporated into real
> > hardware using a different name in the future. Or it may be these
> > features will not be incorporated into actual PowerPC hardware in the
> > future.
> 
> None of this belongs in the commit message.  In the future please post
> stuff exactly as you want it to be committed!

The patch includes comments regarding previous patches, which aren't
part of the commit message.  If I eliminate those comments, you will
then say things like 'where did this come from'.

I can try to separate things so there is the initial commentary meant
for the review to address concerns from previous patches.

> > * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): If
> > -mcpu=future, define _ARCH_FUTURE.
> > * config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): New macro.
> > (POWERPC_MASKS): Add OPTION_MASK_FUTURE.
> > (future cpu): Add 'future' cpu.
> > * config/rs6000/rs6000-tables.opt: Regenerate.
> > * config/rs6000/rs6000.opt (-mfuture-internal): New internal option to
> > indicate the user used -mcpu=future.
> 
> NAK, as said before.
> 
> If the user wants to see what rs6000_cpu is set to, the user should just
> look at rs6000_cpu, don't go via more indirect roundabout ways.
> 
> > +/* At present, we are not providing a unique processor option for -mcpu=future.
> 
> What does this comment mean even?

We have essentially two choices, and I feel you have not liked either.
I don't see a clear way forward.

The way I did it before was to have a separate processor enumeration
for cpu=future.  This involves adding switch statements and such for
every place we test for cpu == power10 or power11.  And it also means
that power10.md has lots of edits to add the future case.

Or we can do it the way I did in this patch, to not create a new cpu
enumeration at the present time, and just set it to power11.  Some day
in the future, sure we will move things around, but at the current
time, we are not adding unique scheduling for cpu=future.

It really doesn't matter to me, as I've done both approaches.  Do you
want me to go back to having a separate future cpu enumeration, which
involves all of the changes we've seen in the past, or did you want the
more minimal changes that I just did?

> > +   Until we define these changes, just use the power11 defaults.  */
> > +RS6000_CPU ("future", PROCESSOR_POWER11, MASK_POWERPC64 | FUTURE_MASKS_SERVER)
> 
> It isn't talking about this code for sure.
> 
> > +;; Users should not use -mfuture-internal, but we need to use a bit to identify
> > +;; when the user changes the default cpu via #pragma GCC target("cpu=future")
> > +;; and then resets it later.
> 
> What does that mean?  It hints at a bug elsewhere, but not more than
> hints unfortunately.
> 
> 
> Segher

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Re: [PATCH] arm: avoid gcc_s dependency

2025-07-15 Thread Sam James

Pierre Ossman  writes:

> On 14/07/2025 22:24, Sam James wrote:
>> A patch rebased against trunk would also be appreciated. See
>> https://gcc.gnu.org/contribute.html for the needed format.
>> 
>
> The patch applies cleanly against trunk, so nothing is needed there.
>
> I was hoping to get the interest of someone more familiar with gcc, to
> not have to figure out everything surrounding test cases and ChangeLog
> conventions. :)

The best way to get someone's attention is a correctly submitted patch
;)

Otherwise it's essentially just a ping on the bug.

>
> Regards,

[PATCH] aarch64: small compile time improvement, disable hardreg PRE if !TARGET_FP8 [PR121095]

2025-07-15 Thread Andrew Pinski

r15-6789-ge7f98d9603808b added a new RTL pass for hardreg PRE of the
FPM_REGNUM hard register, but this pass does nothing if no FPM_REGNUM
register can be present.
So let's set HARDREG_PRE_REGNOS to contain only zeros if !TARGET_FP8.
Now the pass will only run if there is a possibility of having the FPM register.

Built and tested for aarch64-linux-gnu.

PR target/121095
gcc/ChangeLog:

* config/aarch64/aarch64.h (HARDREG_PRE_REGNOS): Don't include FPM_REGNUM
if !TARGET_FP8.

Signed-off-by: Andrew Pinski 
---
 gcc/config/aarch64/aarch64.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 096c853af7f..d128ed726f0 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -1661,8 +1661,8 @@ enum class aarch64_tristate_mode : int { NO, YES, MAYBE };
 int (aarch64_local_sme_state::ANY) }
 
 /* Zero terminated list of regnos for which hardreg PRE should be
-   applied.  */
-#define HARDREG_PRE_REGNOS { FPM_REGNUM, 0 }
+   applied.  Only enable FPM reg if FP8 is enabled. */
+#define HARDREG_PRE_REGNOS { TARGET_FP8 ? FPM_REGNUM : 0, 0 }
 
 #endif
 
-- 
2.43.0
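
The effect of the macro change can be sketched as follows, assuming (as the comment in aarch64.h says) that the list is zero terminated and that a consumer stops at the first zero. `kFpmRegnum`, `count_regnos` and `regnos_for` are hypothetical names for illustration, and the register number is made up.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical register number; the real FPM_REGNUM is target-defined.
constexpr unsigned kFpmRegnum = 87;

// Count the entries of a zero-terminated register list, the way a
// consumer that stops at the first zero would see it.
inline std::size_t count_regnos(const unsigned *list) {
  std::size_t n = 0;
  while (list[n] != 0)
    ++n;
  return n;
}

// Same shape as the new macro: { TARGET_FP8 ? FPM_REGNUM : 0, 0 }.
inline std::size_t regnos_for(bool target_fp8) {
  const unsigned list[] = {target_fp8 ? kFpmRegnum : 0u, 0u};
  return count_regnos(list);
}

inline void demo_hardreg_list() {
  assert(regnos_for(true) == 1);   // FP8 enabled: one register to process
  assert(regnos_for(false) == 0);  // FP8 disabled: the list is empty
}
```

With FP8 disabled the first element is already the terminator, so a pass that walks the list has nothing to do.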

Re: [COMMITTED] Fix tree.cc compilation on SPARC

2025-07-15 Thread Andrew Pinski

On Tue, Jul 15, 2025 at 12:52 AM Rainer Orth
 wrote:
>
> commit 4d7baa94a48c27030c8ffcfaf3dd187be09903a9
> Author: Andrew Pinski 
> Date:   Sun Jul 13 11:56:03 2025 -0700
>
> tree: Add include to tm_p.h to tree.cc [PR120866]
>
> broke SPARC bootstrap:
>
> In file included from ./tm_p.h:4,
>  from /vol/gcc/src/hg/master/local/gcc/tree.cc:35:
> /vol/gcc/src/hg/master/local/gcc/config/sparc/sparc-protos.h:46:47: error: use of enum ‘memmodel’ without previous declaration
>46 | extern void sparc_emit_membar_for_model (enum memmodel, int, int);
>   |
>
> Fixed by including memmodel.h.
>
> Bootstrapped without regressions on sparc-sun-solaris2.11 and
> i386-pc-solaris2.11.
>
> Committed to trunk.

Thanks for fixing this. I had noticed that memmodel.h was almost
always included right before tm_p.h too, but I didn't look into why.

Thanks,
Andrew Pinski

>
> Rainer
>
> --
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>
>
> 2025-07-15  Rainer Orth  
>
> gcc:
> * tree.cc: Include memmodel.h.
>

Re: [PATCH] libstdc++/ranges: Prefer using offset-based _CachedPosition

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 9:51 PM Patrick Palka  wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?
>
> -- >8 --
>
> The offset-based partial specialization of _CachedPosition for
> random-access iterators is currently only selected if the offset type is
> smaller than the iterator type.  Before r12-1018-g46ed811bcb4b86 this
> made sense since the main partial specialization only stored the
> iterator (incorrectly).  After that bugfix, the main partial
> specialization now effectively stores a std::optional, so the
> size constraint is inaccurate,

optional is still smaller than 2 * sizeof(Iterator), so the constraint
could be updated to sizeof(range::difference) <= 2 * sizeof(Iterator) or
<= sizeof(non_propagating_cache).

However, a range::difference_t larger than the iterator should be
unusual, as the iterator must be able to represent as many positions as
the difference type. So except for cases like repeat, they should be of
similar size. I do not think it is worth optimizing for that.

I think it is easier for users if the caching rules are simplified and
do not depend on size.

> and it must invalidate itself upon
> copy/move unlike the offset-based partial specialization.  So I think
> we should just always prefer the offset-based _CachedPosition for a
> random-access iterator, even if the offset type happens to be larger
> than the iterator type.
>

> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (__detail::_CachedPosition): Remove
> additional size constraint on the offset-based partial
> specialization.
> ---
>
Given the above LGTM.

>  libstdc++-v3/include/std/ranges | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> index efe62969d657..2970b2cbe00b 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -1585,8 +1585,6 @@ namespace views::__adaptor
>};
>
>  template
> -  requires (sizeof(range_difference_t<_Range>)
> -   <= sizeof(iterator_t<_Range>))
>struct _CachedPosition<_Range>
>{
>private:
> --
> 2.50.1.271.gd30e120486
>
>

Re: [PATCH] libstdc++: Define __promoted_t alias for C++17 without fold expressions [PR121097]

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 10:36 PM Jonathan Wakely  wrote:

> Define a fallback implementation of the __promoted_t alias for
> hypothetical C++17 compilers which don't define __cpp_fold_expressions.
> Without this, <cmath> can't be compiled by such compilers, because the
> 3-arg form of std::hypot uses the alias.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/121097
> * include/ext/type_traits.h (__promoted_t): Define fallback
> implementation for the 3-arg case that std::hypot uses.
> ---
>
> I tested this by bodging the header so that this definition is used
> unconditionally, and it seemed to work.
>
I do not understand why we are defining the __promoted_t alias instead
of just changing std::hypot to use __promote_3, which is also provided
when __promoted_t is defined, like fma does.

>
>  libstdc++-v3/include/ext/type_traits.h | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/libstdc++-v3/include/ext/type_traits.h
> b/libstdc++-v3/include/ext/type_traits.h
> index 061555231468..ca5bc4e0d23e 100644
> --- a/libstdc++-v3/include/ext/type_traits.h
> +++ b/libstdc++-v3/include/ext/type_traits.h
> @@ -269,6 +269,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  {
>typedef __typeof__(_Tp2() + _Up2() + _Vp2() + _Wp2()) __type;
>  };
> +
> +#ifdef __glibcxx_hypot // C++ >= 17 && HOSTED
> +  // __promoted_t is needed by the 3-arg std::hypot so define this in case
> +  // any C++17 compiler doesn't define __cpp_fold_expressions (PR 121097).
> +  template
> +using __promoted_t = typename __promote_3<_Tp1, _Tp2, _Tp3>::__type;
> +#endif
> +
>  #endif
>
>  _GLIBCXX_END_NAMESPACE_VERSION
> --
> 2.50.1
>
>
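
To make the two forms of the alias concrete, here is a standalone sketch. `promote` and `promoted_t` are simplified, hypothetical versions written for this note (integers promote to double, floating-point types are left alone); they are not the libstdc++ definitions.

```cpp
#include <cassert>
#include <type_traits>

// Simplified promotion rule: integral types become double, everything
// else (here: floating-point types) is kept as is.
template <typename T, bool = std::is_integral<T>::value>
struct promote { using type = double; };

template <typename T>
struct promote<T, false> { using type = T; };

#if defined(__cpp_fold_expressions)
// Variadic form: the type of the sum of all promoted arguments.
template <typename... Ts>
using promoted_t = decltype((typename promote<Ts>::type(0) + ...));
#else
// Fallback for exactly three arguments, which is all that a 3-arg
// hypot-style function needs.
template <typename T1, typename T2, typename T3>
using promoted_t = decltype(typename promote<T1>::type(0)
                            + typename promote<T2>::type(0)
                            + typename promote<T3>::type(0));
#endif

static_assert(std::is_same<promoted_t<int, int, int>, double>::value,
              "integers promote to double");
static_assert(std::is_same<promoted_t<float, float, float>, float>::value,
              "float stays float");
static_assert(
    std::is_same<promoted_t<int, float, long double>, long double>::value,
    "the widest floating-point type wins");

// Runtime mirror of one of the checks, for use from a test driver.
inline void demo_promoted() {
  assert((std::is_same<promoted_t<float, double, int>, double>::value));
}
```

Both branches give the same result for three arguments, so a caller written against `promoted_t<A, B, C>` does not need to know which one is active.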

[PATCH] Add -mfentry -fno-pic only on gnu/x86 targets

2025-07-15 Thread H.J. Lu

Since only gnu/x86 targets support -mfentry, add -mfentry -fno-pic only
on gnu/x86 targets.

PR target/120881
PR testsuite/121078
* gcc.dg/20021014-1.c (dg-additional-options): Add -mfentry
-fno-pic only on gnu/x86 targets.
* gcc.dg/aru-2.c (dg-additional-options): Likewise.
* gcc.dg/nest.c (dg-additional-options): Likewise.
* gcc.dg/pr32450.c (dg-additional-options): Likewise.
* gcc.dg/pr43643.c (dg-additional-options): Likewise.
* gcc.target/i386/pr104447.c (dg-additional-options): Likewise.
* gcc.target/i386/pr113122-3.c (dg-additional-options): Likewise.
* gcc.target/i386/pr119386-1.c (dg-additional-options): Add
-mfentry only on gnu targets.
* gcc.target/i386/pr119386-2.c (dg-additional-options): Likewise.

Signed-off-by: H.J. Lu 
---
 gcc/testsuite/gcc.dg/20021014-1.c  | 2 +-
 gcc/testsuite/gcc.dg/aru-2.c   | 2 +-
 gcc/testsuite/gcc.dg/nest.c| 2 +-
 gcc/testsuite/gcc.dg/pr32450.c | 3 ++-
 gcc/testsuite/gcc.dg/pr43643.c | 2 +-
 gcc/testsuite/gcc.target/i386/pr104447.c   | 3 ++-
 gcc/testsuite/gcc.target/i386/pr113122-3.c | 3 ++-
 gcc/testsuite/gcc.target/i386/pr119386-1.c | 4 ++--
 gcc/testsuite/gcc.target/i386/pr119386-2.c | 4 ++--
 9 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/20021014-1.c b/gcc/testsuite/gcc.dg/20021014-1.c
index f5f6fcf3625..ee5d4597aa3 100644
--- a/gcc/testsuite/gcc.dg/20021014-1.c
+++ b/gcc/testsuite/gcc.dg/20021014-1.c
@@ -2,7 +2,7 @@
 /* { dg-require-profiling "-p" } */
 /* { dg-options "-O2 -p" } */
 /* { dg-options "-O2 -p -static" { target hppa*-*-hpux* } } */
-/* { dg-additional-options "-mfentry -fno-pic" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-mfentry -fno-pic" { target i?86-*-gnu* x86_64-*-gnu* } } */
 /* { dg-error "profiler" "No profiler support" { target xstormy16-*-* } 0 } */
 /* { dg-message "" "consider using `-pg' instead of `-p' with gprof(1)" { target *-*-freebsd* } 0 } */
 
diff --git a/gcc/testsuite/gcc.dg/aru-2.c b/gcc/testsuite/gcc.dg/aru-2.c
index 102ece17726..61898de032a 100644
--- a/gcc/testsuite/gcc.dg/aru-2.c
+++ b/gcc/testsuite/gcc.dg/aru-2.c
@@ -1,7 +1,7 @@
 /* { dg-do run } */
 /* { dg-require-profiling "-pg" } */
 /* { dg-options "-O2 -pg" } */
-/* { dg-additional-options "-mfentry -fno-pic" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-mfentry -fno-pic" { target i?86-*-gnu* x86_64-*-gnu* } } */
 
 static int __attribute__((noinline))
 bar (int x)
diff --git a/gcc/testsuite/gcc.dg/nest.c b/gcc/testsuite/gcc.dg/nest.c
index 9221ed1c8f8..2dce65edb88 100644
--- a/gcc/testsuite/gcc.dg/nest.c
+++ b/gcc/testsuite/gcc.dg/nest.c
@@ -3,7 +3,7 @@
 /* { dg-require-profiling "-pg" } */
 /* { dg-options "-O2 -pg" } */
 /* { dg-options "-O2 -pg -static" { target hppa*-*-hpux* } } */
-/* { dg-additional-options "-mfentry -fno-pic" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-mfentry -fno-pic" { target i?86-*-gnu* x86_64-*-gnu* } } */
 /* { dg-error "profiler" "No profiler support" { target xstormy16-*-* } 0 } */
 
 extern void abort (void);
diff --git a/gcc/testsuite/gcc.dg/pr32450.c b/gcc/testsuite/gcc.dg/pr32450.c
index 4aaeb7dd654..0af262f5c67 100644
--- a/gcc/testsuite/gcc.dg/pr32450.c
+++ b/gcc/testsuite/gcc.dg/pr32450.c
@@ -3,7 +3,8 @@
 /* { dg-do run } */
 /* { dg-require-profiling "-pg" } */
 /* { dg-options "-O2 -pg" } */
-/* { dg-options "-O2 -pg -mtune=core2 -mfentry -fno-pic" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -pg -mtune=core2" { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-additional-options "-mfentry -fno-pic" { target i?86-*-gnu* x86_64-*-gnu* } } */
 /* { dg-options "-O2 -pg -static" { target hppa*-*-hpux* } } */
 
 extern void abort (void);
diff --git a/gcc/testsuite/gcc.dg/pr43643.c b/gcc/testsuite/gcc.dg/pr43643.c
index a62586dc719..41c00c8a2af 100644
--- a/gcc/testsuite/gcc.dg/pr43643.c
+++ b/gcc/testsuite/gcc.dg/pr43643.c
@@ -4,7 +4,7 @@
 /* { dg-require-profiling "-pg" } */
 /* { dg-options "-O2 -pg" } */
 /* { dg-options "-O2 -pg -static" { target hppa*-*-hpux* } } */
-/* { dg-additional-options "-mfentry -fno-pic" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-mfentry -fno-pic" { target i?86-*-gnu* x86_64-*-gnu* } } */
 
 extern char *strdup (const char *);
 
diff --git a/gcc/testsuite/gcc.target/i386/pr104447.c b/gcc/testsuite/gcc.target/i386/pr104447.c
index f58170db7ec..145ba90ac0c 100644
--- a/gcc/testsuite/gcc.target/i386/pr104447.c
+++ b/gcc/testsuite/gcc.target/i386/pr104447.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-profiling "-pg" } */
-/* { dg-options "-O2 -pg -mfentry -fno-pic" } */
+/* { dg-options "-O2 -pg" } */
+/* { dg-additional-options "-mfentry -fno-pic" { target *-*-gnu* } } */
 
 int
 bar (int x)
diff --git a/gcc/testsuite/gcc.target/i386/pr113122-3.c 
b/gcc/testsuite/gcc.target

Re: [PATCH] libgcc/Makefile.in: Delete `MACHMODE_H` def

2025-07-15 Thread Andrew Pinski

On Tue, Jul 15, 2025 at 2:51 PM John Ericson
 wrote:
>
> This dates back to the creation of top-level `libgcc` in
> fa9585134f6f58fa0d3da3ca4ad5493855aea2dc. I strongly suspect that this
> does nothing.

So looking into this further, MACHMODE_H used to be part of LIBGCC_DEPS
because of TM_H, and r0-78222-gfa9585134f6f58 moved away from including
tm.h in libgcc. It was copied over unused.

>
> (For context, my overall goal here is hoping libgcc can depend on
> fewer/no stuff that is generated by `gcc/Makefile`. This is me trying to
> pluck some low-hanging fruit -- this is the only direct mention of
> `insn-modes.h` in libgcc.)

There are only a few things left:
- GCC_CFLAGS: `$(CFLAGS_FOR_TARGET) $(INTERNAL_CFLAGS) $(T_CFLAGS)
  $(LOOSE_WARN) $(C_LOOSE_WARN)`; I am not sure how to remove that.
- INHIBIT_LIBC_CFLAGS: looks complicated to remove.
- TARGET_SYSTEM_ROOT: this one should be easy, I think; just `$CC
  -print-sysroot` should print it out ...
- NO_PIE_CFLAGS: check if __PIE__ is defined by the compiler and, if so,
  add -fno-PIE to the CFLAGS.

Thanks,
Andrew Pinski

> ---
>  libgcc/Makefile.in | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
> index 0719fd0615d..f7b48dceb06 100644
> --- a/libgcc/Makefile.in
> +++ b/libgcc/Makefile.in
> @@ -193,7 +193,6 @@ AWK = @AWK@
>  GCC_FOR_TARGET = $(CC)
>  LIPO = @LIPO@
>  LIPO_FOR_TARGET = $(LIPO)
> -MACHMODE_H = machmode.h mode-classes.def insn-modes.h
>  NM = @NM@
>  NM_FOR_TARGET = $(NM)
>  RANLIB_FOR_TARGET = $(RANLIB)
> @@ -220,7 +219,6 @@ export INSTALL_DATA
>  export LIB1ASMSRC
>  export LIBGCC2_CFLAGS
>  export LIPO_FOR_TARGET
> -export MACHMODE_H
>  export NM_FOR_TARGET
>  export STRIP_FOR_TARGET
>  export RANLIB_FOR_TARGET
> --
> 2.47.2
>

Re: [PATCH v2] libstdc++: Conditionalize LWG 3569 changes to join_view

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 6:13 PM Patrick Palka  wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for trunk only
> (since it impacts ABI)?
>
> Changes in v2:
>
>  - Condition on forward_iterator instead of default_initializable.
>
> -- >8 --
>
> LWG 3569 adjusted join_view's iterator specification to handle
> non-default-constructible iterators by wrapping the corresponding data member
> in std::optional, and we followed suit in r13-2649-g7aa80c82ecf3a3.
>
> But this wrapping is unnecessary for iterators that are already
> default-constructible.  Rather than unconditionally using std::optional
> here, which introduces time/space overhead, this patch conditionalizes
> our LWG 3569 changes on the iterator in question being non-forward (and
> thus non-default-constructible).  We check forwardness instead of
> default-constructibility in order to accommodate input-only iterators
> whose default constructor might be underconstrained.
>
I would rephrase that sentence a bit, to:
We check forwardness instead of default-constructibility in order to
accommodate input-only iterators that satisfy but do not model
default_initializable, e.g. whose default constructor is underconstrained.

Outside of that this LGTM.
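
For readers following along, the storage choice discussed above can be sketched outside of libstdc++ roughly as follows. This is only an illustration under stated assumptions: `inner_slot`, `no_default` and the `Forward` flag are hypothetical names, and a plain `bool` parameter stands in for the `forward_iterator<_Inner_iter>` check used in the patch. It needs C++17 for `if constexpr`.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical helper, not part of libstdc++: stores an Inner either
// directly or inside std::optional, chosen at compile time.
template<typename Inner, bool Forward>
class inner_slot
{
  // Forward == true:  plain Inner, value-initialized.
  // Forward == false: std::optional<Inner>, initially empty.
  std::conditional_t<Forward, Inner, std::optional<Inner>> m_inner
    = decltype(m_inner)();

public:
  // Uniform access, in the spirit of _M_get_inner() in the patch.
  Inner&
  get()
  {
    if constexpr (Forward)
      return m_inner;
    else
      return *m_inner;
  }

  void
  set(Inner value)
  { m_inner = std::move(value); }

  void
  reset()
  {
    if constexpr (Forward)
      m_inner = Inner();   // no "empty" state, so fall back to a fresh value
    else
      m_inner.reset();
  }
};

// A type without a default constructor, standing in for an input-only
// iterator that cannot be stored directly.
struct no_default
{
  explicit no_default(int v) : value(v) { }
  int value;
};

static_assert(sizeof(inner_slot<int, true>) == sizeof(int),
              "direct storage adds no space overhead");
static_assert(sizeof(inner_slot<int, false>) > sizeof(int),
              "std::optional needs room for its engaged flag");

// Small self-check exercising both storage modes.
inline void
check_inner_slot()
{
  inner_slot<int, true> direct;
  assert(direct.get() == 0);

  inner_slot<no_default, false> wrapped;
  wrapped.set(no_default(3));
  assert(wrapped.get().value == 3);
}
```

The two static assertions are the point of the exercise: the optional wrapper costs space, which is why it is only worth paying for when the type cannot be stored directly.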

>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (join_view::_Iterator::_M_satisfy):
> Adjust to handle non-std::optional _M_inner as per before LWG 3569.
> (join_view::_Iterator::_M_get_inner): New.
> (join_view::_Iterator::_M_inner): Don't wrap in std::optional if
> the iterator is forward.  Initialize.
> (join_view::_Iterator::operator*): Use _M_get_inner instead
> of *_M_inner.
> (join_view::_Iterator::operator++): Likewise.
> (join_view::_Iterator::iter_move): Likewise.
> (join_view::_Iterator::iter_swap): Likewise.
> ---
>  libstdc++-v3/include/std/ranges | 49 +
>  1 file changed, 37 insertions(+), 12 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> index efe62969d657..c9dc25ee52ef 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -2971,7 +2971,12 @@ namespace views::__adaptor
>   }
>
> if constexpr (_S_ref_is_glvalue)
> - _M_inner.reset();
> + {
> +   if constexpr (forward_iterator<_Inner_iter>)
> + _M_inner = _Inner_iter();
> +   else
> + _M_inner.reset();
> + }
>   }
>
>   static constexpr auto
> @@ -3011,6 +3016,24 @@ namespace views::__adaptor
>   return *_M_parent->_M_outer;
>   }
>
> + constexpr _Inner_iter&
> + _M_get_inner()
> + {
> +   if constexpr (forward_iterator<_Inner_iter>)
> + return _M_inner;
> +   else
> + return *_M_inner;
> + }
> +
> + constexpr const _Inner_iter&
> + _M_get_inner() const
> + {
> +   if constexpr (forward_iterator<_Inner_iter>)
> + return _M_inner;
> +   else
> + return *_M_inner;
> + }
> +
>   constexpr
>   _Iterator(_Parent* __parent, _Outer_iter __outer) requires
> forward_range<_Base>
> : _M_outer(std::move(__outer)), _M_parent(__parent)
> @@ -3024,7 +3047,9 @@ namespace views::__adaptor
>   [[no_unique_address]]
> __detail::__maybe_present_t, _Outer_iter>
> _M_outer
>   = decltype(_M_outer)();
> - optional<_Inner_iter> _M_inner;
> + __conditional_t,
> + _Inner_iter, optional<_Inner_iter>> _M_inner
> +   = decltype(_M_inner)();
>   _Parent* _M_parent = nullptr;
>
> public:
> @@ -3048,7 +3073,7 @@ namespace views::__adaptor
>
>   constexpr decltype(auto)
>   operator*() const
> - { return **_M_inner; }
> + { return *_M_get_inner(); }
>
>   // _GLIBCXX_RESOLVE_LIB_DEFECTS
>   // 3500. join_view::iterator::operator->() is bogus
> @@ -3056,7 +3081,7 @@ namespace views::__adaptor
>   operator->() const
> requires __detail::__has_arrow<_Inner_iter>
>   && copyable<_Inner_iter>
> - { return *_M_inner; }
> + { return _M_get_inner(); }
>
>   constexpr _Iterator&
>   operator++()
> @@ -3067,7 +3092,7 @@ namespace views::__adaptor
>   else
> return *_M_parent->_M_inner;
> }();
> -   if (++*_M_inner == ranges::end(__inner_range))
> +   if (++_M_get_inner() == ranges::end(__inner_range))
>   {
> ++_M_get_outer();
> _M_satisfy();
> @@ -3097,9 +3122,9 @@ namespace views::__adaptor
>   {
> if (_M_outer == ranges::end(_M_parent->_M_base))
>   _M_inner = ranges::end(__detail::__as_lvalue(*--_M_outer));
> -

Re: [PATCH] libstdc++: Define std::nothrow as inline variable

2025-07-15 Thread Tomasz Kaminski

As a side note, I think the linkage model that is used for class-scope
static constexpr variables matches what we want to achieve here, i.e. we
could have:
struct X {};
struct S {
  static constexpr X x{};
};

And then, in the source file:
const X S::x;

Maybe we could achieve a similar result for inline constexpr variables, by
putting some compiler-specific attributes on them?
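
A self-contained version of that class-scope pattern, collapsed into a single file, might look like the sketch below. `X`, `S` and `address_of_x` are illustrative names only, and nothing here is taken from libsupc++. Note that since C++17 the out-of-class redeclaration is redundant and deprecated, so a compiler may warn about it.

```cpp
#include <cassert>

struct X { };

struct S
{
  // In-class declaration with an initializer; usable in constant expressions.
  static constexpr X x{};
};

// What would live in the source file.  Before C++17 this is the definition
// of the member; since C++17 the member is implicitly inline and this line
// is a redundant (deprecated) redeclaration.
const X S::x;

// Taking the address odr-uses the member, so a definition must exist.
inline const X*
address_of_x()
{ return &S::x; }
```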


On Tue, Jul 15, 2025 at 1:37 PM Jonathan Wakely  wrote:

> On Tue, 15 Jul 2025 at 12:06, Tomasz Kaminski  wrote:
> >
> >
> >
> > On Tue, Jul 15, 2025 at 12:37 PM Jonathan Wakely 
> wrote:
> >>
> >> On Tue, 15 Jul 2025 at 08:10, Tomasz Kaminski 
> wrote:
> >> >
> >> >
> >> >
> >> > On Mon, Jul 14, 2025 at 10:43 PM Jonathan Wakely 
> wrote:
> >> >>
> >> >> This makes it possible to use `new(std::nothrow) X` without linking
> to
> >> >> libsupc++ or libstdc++.
> >> >>
> >> >> To ensure we still export the symbol from the library we need to
> >> >> suppress the inline variable in libsupc++/new_handler.cc which is
> done
> >> >> by defining a macro.
> >> >>
> >> >> libstdc++-v3/ChangeLog:
> >> >>
> >> >> * libsupc++/new (nothrow): Define as inline variable.
> >> >> * libsupc++/new_handler.cc (_GLIBCXX_DEFINE_NOTHROW_OBJ):
> >> >> Define.
> >> >> ---
> >> >>
> >> >> Tested powerpc64le-linux.
> >> >>
> >> >>  libstdc++-v3/libsupc++/new| 4 
> >> >>  libstdc++-v3/libsupc++/new_handler.cc | 2 ++
> >> >>  2 files changed, 6 insertions(+)
> >> >>
> >> >> diff --git a/libstdc++-v3/libsupc++/new b/libstdc++-v3/libsupc++/new
> >> >> index fb36dae25a6d..85d28ff40769 100644
> >> >> --- a/libstdc++-v3/libsupc++/new
> >> >> +++ b/libstdc++-v3/libsupc++/new
> >> >> @@ -125,7 +125,11 @@ namespace std
> >> >>  #endif
> >> >>};
> >> >>
> >> >> +#if defined __cpp_inline_variables && ! defined
> _GLIBCXX_DEFINE_NOTHROW_OBJ
> >> >> +  inline constexpr nothrow_t nothrow{};
> >> >> +#else
> >> >>extern const nothrow_t nothrow;
> >> >> +#endif
> >> >
> >> > If you move the variable definition before the include, this would become:
> >> > +#ifndef _GLIBCXX_DEFINE_NOTHROW_OBJ
> >> > +# ifdef __cpp_inline_variables
> >> > +  inline constexpr nothrow_t nothrow{};
> >> > +# else
> >> >   extern const nothrow_t nothrow;
> >> > +# endif
> >> > +#endif
> >> >
> >> > GCC and Clang also accept the version where we always have the extern declaration:
> >> > +#if defined __cpp_inline_variables && ! defined
> _GLIBCXX_DEFINE_NOTHROW_OBJ
> >> > +  inline constexpr nothrow_t nothrow{};
> >> > +#endif
> >> >extern const nothrow_t nothrow;
> >> >
> >> >
> >> >>
> >> >>/** If you write your own error handler to be called by @c new,
> it must
> >> >> *  be of this type.  */
> >> >> diff --git a/libstdc++-v3/libsupc++/new_handler.cc
> b/libstdc++-v3/libsupc++/new_handler.cc
> >> >> index 7cd3e5a69fde..96dfb796c64a 100644
> >> >> --- a/libstdc++-v3/libsupc++/new_handler.cc
> >> >> +++ b/libstdc++-v3/libsupc++/new_handler.cc
> >> >> @@ -23,6 +23,8 @@
> >> >>  // see the files COPYING3 and COPYING.RUNTIME respectively.  If
> not, see
> >> >>  // .
> >> >>
> >> >> +#define _GLIBCXX_DEFINE_NOTHROW_OBJ 1
> >> >
> >> > Could we also move the definition of nothrow here (before including new),
> >> > so entities in header new always see its definition when it is constexpr?
> >> > I do not think there is any (even benign) ODR violation possibility here,
> >> > but we can avoid the risk that way.
> >>
> >> We would need to also move the definition of the nothrow_t type there,
> >> otherwise you can't declare the variable.
> >>
> >> And wouldn't it have internal linkage if it's defined as constexpr
> >> const in new_handler.cc?
> >> It only has external linkage because the extern declaration in <new>
> >> was already seen.
> >>
> >> So in the <new> header we would need:
> >>
> >> #ifndef _GLIBCXX_DEFINED_NOTHROW_OBJ // defined by new_handler.cc
> >>   struct nothrow_t
> >>   {
> >> #if __cplusplus >= 201103L
> >> explicit nothrow_t() = default;
> >> #endif
> >>   };
> >>
> >> #if defined __cpp_inline_variables
> >>   // Define as an inline variable when possible, so that std::nothrow
> can
> >>   // be used without linking to libsupc++.
> >>   inline constexpr nothrow_t nothrow{};
> >> #endif
> >>   extern const nothrow_t nothrow;
> >> #endif
> >>
> >>
> >> And then in new_handler.cc:
> >>
> >> namespace std
> >> {
> >>   struct nothrow_t
> >>   {
> >> explicit nothrow_t() = default;
> >>   };
> >>
> >> #ifdef __cpp_inline_variables
> >>   // purposely match when header defines it as inline+constexpr
> >>   constexpr
> >> #endif
> >>   extern const nothrow_t nothrow{};
> >> }
> >>
> >> // Prevent <new> from defining it again:
> >> #define _GLIBCXX_DEFINED_NOTHROW_OBJ 1
> >>
> >> #include "new"
> >>
> >>
> >> But then I get multiple definition errors when linking libstdc++, so
> >> something is wrong here.
> >
> > We could define the nothrow in header if
> _GLIBCXX_DEFINE_NOTHROW_OBJ_IN_NEW is defined:

[PATCH v3] gcc-16/changes.html: Add --enable-x86-64-mfentry

2025-07-15 Thread H.J. Lu

On Tue, Jul 15, 2025 at 2:04 PM Gerald Pfeifer  wrote:
>
> On Tue, 15 Jul 2025, Uros Bizjak wrote:
> > LGTM for content, but let's ask Gerald to proofread the entry.
>
> Happy to!
>
> +  The --enable-x86-64-mfentry configure option is
> +  added to enable -mfentry for x86-64 by default to use
> +  __fentry__, instead of mcount for
> +  profiling.  This option is enabled by default for glibc targets.
> +  
>
> This feels a bit complex to parse. How about something like
>
> +  The new --enable-x86-64-mfentry configure option
> +  makes -mfentry use __fentry__ instead
> +  of mcount for profiling on x86-64.  This option is
> +  enabled by default for glibc targets.
> +  
>
> This replaces "option is added to enable" by "new option makes", fixes
> grammar/word order and drops what feels like an extra "by default".

-mfentry is the option to enable __fentry__.   How about this v3 patch?

> If the latter is wrong, add "by default" before "for profiling on x86-64".
>
>
> What do you think? (If you like it, go ahead and push this new version.)
>
> Gerald



-- 
H.J.
From c8741974d0fb720bb27bcae65c5e116d101ce011 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 14 Jul 2025 20:32:11 +0800
Subject: [PATCH v3] gcc-16/changes.html: Add --enable-x86-64-mfentry

Signed-off-by: H.J. Lu 
---
 htdocs/gcc-16/changes.html | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/htdocs/gcc-16/changes.html b/htdocs/gcc-16/changes.html
index cc6fe204..a700c59c 100644
--- a/htdocs/gcc-16/changes.html
+++ b/htdocs/gcc-16/changes.html
@@ -118,6 +118,13 @@ for general information.
 
 
 
+  The new --enable-x86-64-mfentry configure option
+  enables -mfentry automatically to use
+  __fentry__, instead of mcount, for
+  profiling on x86-64.  This option is enabled by default for
+  glibc targets.
+  
+
 AMD GPU (GCN)
 
 
-- 
2.50.1

Re: [PATCH] libstdc++: Constrain std::swap using concepts in C++20

2025-07-15 Thread Tomasz Kaminski

On Mon, Jul 14, 2025 at 10:38 PM Jonathan Wakely  wrote:

> This is a minor compile-time optimization for C++20.
>
Please mention that you also replaced _GLIBCXX20_CONSTEXPR with
constexpr under __glibcxx_concepts (that is, C++20 or later).
Otherwise LGTM.
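
As an illustration of the shape of this change (not the libstdc++ code itself), a function can switch between a requires-clause and an enable_if return type depending on whether concepts are available. `demo::swap_values` and `demo::check_swap` are hypothetical names, and the tuple-like exclusion from the real patch is left out to keep the sketch short.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace demo
{
  // Constrained with a requires-clause when concepts are available,
  // otherwise with an enable_if return type.
  template<typename T>
#if __cpp_concepts >= 201907L
    requires std::is_move_constructible_v<T> && std::is_move_assignable_v<T>
    constexpr void
#else
    inline typename std::enable_if<std::is_move_constructible<T>::value
				   && std::is_move_assignable<T>::value>::type
#endif
    swap_values(T& a, T& b)
    {
      T tmp = std::move(a);
      a = std::move(b);
      b = std::move(tmp);
    }

  // Small self-check; works the same way with either form of constraint.
  inline void
  check_swap()
  {
    int x = 1, y = 2;
    swap_values(x, y);
    assert(x == 2 && y == 1);
  }
}
```

With the requires-clause the compiler checks the constraint directly rather than instantiating enable_if and the helper traits, which is where the compile-time saving mentioned in the patch description would come from.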

>
> libstdc++-v3/ChangeLog:
>
> * include/bits/move.h (swap): Replace enable_if with concepts
> when available, and with __enable_if_t alias otherwise.
> ---
>
> Tested powerpc64le-linux.
>
>  libstdc++-v3/include/bits/move.h | 30 ++
>  1 file changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/move.h
> b/libstdc++-v3/include/bits/move.h
> index 085ca074fc88..9c70263b526f 100644
> --- a/libstdc++-v3/include/bits/move.h
> +++ b/libstdc++-v3/include/bits/move.h
> @@ -215,14 +215,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> *  @return   Nothing.
>*/
>template
> -_GLIBCXX20_CONSTEXPR
> -inline
> -#if __cplusplus >= 201103L
> -typename enable_if<__and_<__not_<__is_tuple_like<_Tp>>,
> - is_move_constructible<_Tp>,
> - is_move_assignable<_Tp>>::value>::type
> +#if __glibcxx_concepts // >= C++20
> +requires (! __is_tuple_like<_Tp>::value)
> +  && is_move_constructible_v<_Tp>
> +  && is_move_assignable_v<_Tp>
> +// swap is constexpr when __cpp_lib_constexpr_algorithms >= 201806L
> +constexpr void
> +#elif __cplusplus >= 201103L
> +inline __enable_if_t<__and_<__not_<__is_tuple_like<_Tp>>,
> +   is_move_constructible<_Tp>,
> +   is_move_assignable<_Tp>>::value>
>  #else
> -void
> +inline void
>  #endif
>  swap(_Tp& __a, _Tp& __b)
>  _GLIBCXX_NOEXCEPT_IF(__and_,
> @@ -241,12 +245,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>// DR 809. std::swap should be overloaded for array types.
>/// Swap the contents of two arrays.
>template
> -_GLIBCXX20_CONSTEXPR
> -inline
> -#if __cplusplus >= 201103L
> -typename enable_if<__is_swappable<_Tp>::value>::type
> +#if __glibcxx_concepts // >= C++20
> +requires is_swappable_v<_Tp>
> +// swap is constexpr when __cpp_lib_constexpr_algorithms >= 201806L
> +constexpr void
> +#elif __cplusplus >= 201103L
> +inline __enable_if_t<__is_swappable<_Tp>::value>
>  #else
> -void
> +inline void
>  #endif
>  swap(_Tp (&__a)[_Nm], _Tp (&__b)[_Nm])
>  _GLIBCXX_NOEXCEPT_IF(__is_nothrow_swappable<_Tp>::value)
> --
> 2.50.1
>
>

Re: [PATCH 1/2] libstdc++: Add missing initializers for __maybe_present_t members [PR119962]

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 5:51 AM Patrick Palka  wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps
> 15?  Not sure if this corner case is worth backporting any further.
>
> Can we just use direct-list-initialization via {} instead of '= T()'
> here?  I wasn't sure so I went with the latter to more closely mirror
> the standard.
>
default_initializable (
https://eel.is/c++draft/concept.default.init#concept:default_initializable)
implies that T{} is well-formed; however, there is no semantic
requirement that T() and T{} are the same.

The only case where {} could break and "= T()" would work is an aggregate
that has members with an explicit default constructor, but then T{} would
be ill-formed. There is also a difference in the lifetime of reference
members with default initializers (binding to a temporary), but it does
not matter for members of the class, as they go away with the constructors.

There is also the pre-C++20 case where classes with a deleted default
constructor were initializable via {} but not via = T(). But again,
default_initializable requires both.

So, as far as I can tell, if we require T to be default_initializable, we
can use {} for initialization, but given that aggregate
initialization/definition changes in every standard (you can now mix base
and designated init in C++26), I would just use = T().
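
To make the "= T()" form concrete, here is a minimal sketch of a conditionally present member that is value-initialized through a default member initializer. `empty_slot`, `maybe_present`, `point` and `holder` are hypothetical stand-ins, not the libstdc++ definitions; the sketch assumes C++14 or later.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the library's empty placeholder type and
// conditional-member alias.
struct empty_slot { };

template<bool Present, typename T>
  using maybe_present = std::conditional_t<Present, T, empty_slot>;

// An aggregate without default member initializers.
struct point { int x; int y; };

template<typename T, bool Present>
  struct holder
  {
    // Value-initialization zeroes fundamental types and aggregates
    // such as point; without the initializer they would be left
    // uninitialized by default-initialization of holder.
    maybe_present<Present, T> m_value = decltype(m_value)();
    bool m_flag = false;
  };

// Default-initializing a constexpr object is accepted here because
// every member has a default member initializer.
constexpr holder<int, true> h_int;
constexpr holder<point, true> h_point;
constexpr holder<int, false> h_absent;

static_assert(h_int.m_value == 0, "int member is zeroed");
static_assert(h_point.m_value.x == 0 && h_point.m_value.y == 0,
	      "aggregate member is zeroed");
static_assert(!h_absent.m_flag, "absent member does not get in the way");

// Run-time counterpart of the checks above.
inline void
check_holders()
{
  holder<point, true> h;
  assert(h.m_value.x == 0 && h.m_value.y == 0);
}
```

This mirrors the shape of the new tests in the patch, which default-initialize a constexpr iterator object.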



> -- >8 --
>
> Data members of type __maybe_present_t where the conditionally present
> type might be an aggregate or fundamental type need to be explicitly
> value-initialized (rather than implicitly default-initialized) to ensure
> that default-initialization of the containing class results in a
> completely initialized object.
>
> PR libstdc++/119962
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (join_view::_Iterator::_M_outer): Initialize.
> (lazy_split_view::_OuterIter::_M_current): Initialize.
> (join_with_view::_Iterator::_M_outer_it): Initialize.
> * testsuite/std/ranges/adaptors/join.cc (test15): New test.
> * testsuite/std/ranges/adaptors/join_with/1.cc (test05): New test.
> * testsuite/std/ranges/adaptors/lazy_split.cc (test13): New test.
> ---
>  libstdc++-v3/include/std/ranges  | 9 ++---
>  libstdc++-v3/testsuite/std/ranges/adaptors/join.cc   | 8 
>  .../testsuite/std/ranges/adaptors/join_with/1.cc | 8 
>  libstdc++-v3/testsuite/std/ranges/adaptors/lazy_split.cc | 8 
>  4 files changed, 30 insertions(+), 3 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> index 3a6710bd0ae1..efe62969d657 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -3022,7 +3022,8 @@ namespace views::__adaptor
>   { _M_satisfy(); }
>
>   [[no_unique_address]]
> -   __detail::__maybe_present_t, _Outer_iter>
> _M_outer;
> +   __detail::__maybe_present_t, _Outer_iter>
> _M_outer
> + = decltype(_M_outer)();
>   optional<_Inner_iter> _M_inner;
>   _Parent* _M_parent = nullptr;
>
> @@ -3376,7 +3377,8 @@ namespace views::__adaptor
>
>   [[no_unique_address]]
> __detail::__maybe_present_t,
> -   iterator_t<_Base>> _M_current;
> +   iterator_t<_Base>> _M_current
> + = decltype(_M_current)();
>   bool _M_trailing_empty = false;
>
> public:
> @@ -7400,7 +7402,8 @@ namespace views::__adaptor
>
>  _Parent* _M_parent = nullptr;
>  [[no_unique_address]]
> -  __detail::__maybe_present_t, _OuterIter>
> _M_outer_it;
> +  __detail::__maybe_present_t, _OuterIter>
> _M_outer_it
> +   = decltype(_M_outer_it)();
>  variant<_PatternIter, _InnerIter> _M_inner_it;
>
>  constexpr _OuterIter&
> diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc
> b/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc
> index 2861115c22a0..a9395b489919 100644
> --- a/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc
> @@ -233,6 +233,13 @@ test14()
>VERIFY( ranges::equal(v | views::join, (int[]){1, 2, 3}) );
>  }
>
> +void
> +test15()
> +{
> +  // PR libstdc++/119962 - __maybe_present_t misses initialization
> +  constexpr
> decltype(views::join(views::single(views::single(0))).begin()) it;
> +}
> +
>  int
>  main()
>  {
> @@ -250,4 +257,5 @@ main()
>test12();
>test13();
>test14();
> +  test15();
>  }
> diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/join_with/1.cc
> b/libstdc++-v3/testsuite/std/ranges/adaptors/join_with/1.cc
> index 8ab30a5277da..4d55c9d3be78 100644
> --- a/libstdc++-v3/testsuite/std/ranges/adaptors/join_with/1.cc
> +++ b/libstdc++-v3/testsuite/std/ranges/adaptors/join_with/1.cc
> @@ -94,6 +94,13 @@ test04()
>return true;
>  }
>
> +void
> +test05()
> +{
> +  // PR libstdc++/119962 - __maybe_present_t misses initial

Re: [committed] libstdc++: Add comments to deleted std::swap overloads for LWG 2766

2025-07-15 Thread Tomasz Kaminski

On Mon, Jul 14, 2025 at 10:58 PM Jonathan Wakely  wrote:

> We pre-emptively implemented part of LWG 2766, which still hasn't been
> approved. Add comments to the deleted swap overloads saying why they're
> there, because the standard doesn't require them.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/stl_pair.h (swap): Add comment to deleted
> overload.
> * include/bits/unique_ptr.h (swap): Likewise.
> * include/std/array (swap): Likewise.
> * include/std/optional (swap): Likewise.
> * include/std/tuple (swap): Likewise.
> * include/std/variant (swap): Likewise.
> * testsuite/23_containers/array/tuple_interface/get_neg.cc:
> Adjust dg-error line numbers.
> ---
>
> Tested powerpc64le-linux, pushed to trunk.
>
Only tangentially related, but it looks like we are adding more and more
use cases for having a conditional = delete(expr)
that can be followed by a body.

>
>  libstdc++-v3/include/bits/stl_pair.h| 2 ++
>  libstdc++-v3/include/bits/unique_ptr.h  | 2 ++
>  libstdc++-v3/include/std/array  | 2 ++
>  libstdc++-v3/include/std/optional   | 2 ++
>  libstdc++-v3/include/std/tuple  | 4 +++-
>  libstdc++-v3/include/std/variant| 2 ++
>  .../23_containers/array/tuple_interface/get_neg.cc  | 6 +++---
>  7 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/stl_pair.h
> b/libstdc++-v3/include/bits/stl_pair.h
> index 8c57712b4617..393f6a016196 100644
> --- a/libstdc++-v3/include/bits/stl_pair.h
> +++ b/libstdc++-v3/include/bits/stl_pair.h
> @@ -1132,6 +1132,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  #endif // C++23
>
>  #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
> +  // _GLIBCXX_RESOLVE_LIB_DEFECTS
> +  // 2766. Swapping non-swappable types
>template
>  typename enable_if,
>__is_swappable<_T2>>::value>::type
> diff --git a/libstdc++-v3/include/bits/unique_ptr.h
> b/libstdc++-v3/include/bits/unique_ptr.h
> index 6ae46a93800c..d76ad63ba7bf 100644
> --- a/libstdc++-v3/include/bits/unique_ptr.h
> +++ b/libstdc++-v3/include/bits/unique_ptr.h
> @@ -832,6 +832,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  { __x.swap(__y); }
>
>  #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
> +  // _GLIBCXX_RESOLVE_LIB_DEFECTS
> +  // 2766. Swapping non-swappable types
>template
>  typename enable_if::value>::type
>  swap(unique_ptr<_Tp, _Dp>&,
> diff --git a/libstdc++-v3/include/std/array
> b/libstdc++-v3/include/std/array
> index fdcf0b073762..12f010921db1 100644
> --- a/libstdc++-v3/include/std/array
> +++ b/libstdc++-v3/include/std/array
> @@ -381,6 +381,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  { __one.swap(__two); }
>
>  #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
> +  // _GLIBCXX_RESOLVE_LIB_DEFECTS
> +  // 2766. Swapping non-swappable types
>template
>  __enable_if_t::_Is_swappable::value>
>  swap(array<_Tp, _Nm>&, array<_Tp, _Nm>&) = delete;
> diff --git a/libstdc++-v3/include/std/optional
> b/libstdc++-v3/include/std/optional
> index cc7af5bbd7d2..e5051d72c828 100644
> --- a/libstdc++-v3/include/std/optional
> +++ b/libstdc++-v3/include/std/optional
> @@ -1740,6 +1740,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  noexcept(noexcept(__lhs.swap(__rhs)))
>  { __lhs.swap(__rhs); }
>
> +  // _GLIBCXX_RESOLVE_LIB_DEFECTS
> +  // 2766. Swapping non-swappable types
>template
>  enable_if_t && is_swappable_v<_Tp>)>
>  swap(optional<_Tp>&, optional<_Tp>&) = delete;
> diff --git a/libstdc++-v3/include/std/tuple
> b/libstdc++-v3/include/std/tuple
> index b39ce710984c..2e6499eab22d 100644
> --- a/libstdc++-v3/include/std/tuple
> +++ b/libstdc++-v3/include/std/tuple
> @@ -2835,6 +2835,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  { __x.swap(__y); }
>
>  #if __cpp_lib_ranges_zip // >= C++23
> +  /// Exchange the values of two const tuples (if const elements can be
> swapped)
>template
>  requires (is_swappable_v && ...)
>  constexpr void
> @@ -2844,7 +2845,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  #endif // C++23
>
>  #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
> -  /// Exchange the values of two const tuples (if const elements can be
> swapped)
> +  // _GLIBCXX_RESOLVE_LIB_DEFECTS
> +  // 2766. Swapping non-swappable types
>template
>  _GLIBCXX20_CONSTEXPR
>  typename enable_if...>::value>::type
> diff --git a/libstdc++-v3/include/std/variant
> b/libstdc++-v3/include/std/variant
> index ec46ff1dabb5..2f44f9700283 100644
> --- a/libstdc++-v3/include/std/variant
> +++ b/libstdc++-v3/include/std/variant
> @@ -1387,6 +1387,8 @@ namespace __detail::__variant
>  noexcept(noexcept(__lhs.swap(__rhs)))
>  { __lhs.swap(__rhs); }
>
> +  // _GLIBCXX_RESOLVE_LIB_

[PATCH] expand: Allow fixed-point arithmetic for RDIV_EXPR.

2025-07-15 Thread Robin Dapp


Hi,

r16-2175-g5aa21765236730 introduced an assert for floating-point modes
when expanding an RDIV_EXPR but forgot fixed-point modes.  This patch
adds ALL_FIXED_POINT_MODE_P to the assert.

Bootstrap and regtest running on x86, aarch64, and power10.  Regtested
on rv64gcv.  Regtest on arm is running; I still needed to set it up.

Regards
Robin

PR middle-end/121065

gcc/ChangeLog:

* cfgexpand.cc (expand_debug_expr): Allow fixed-point modes for
RDIV_EXPR.
* optabs-tree.cc (optab_for_tree_code): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/arm/pr121065.c: New test.
---
gcc/cfgexpand.cc|  3 ++-
gcc/optabs-tree.cc  |  3 ++-
gcc/testsuite/gcc.target/arm/pr121065.c | 11 +++
3 files changed, 15 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/arm/pr121065.c

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index a656ccebf17..8a55f4f472a 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -5358,7 +5358,8 @@ expand_debug_expr (tree exp)
  return simplify_gen_binary (MULT, mode, op0, op1);

case RDIV_EXPR:
-  gcc_assert (FLOAT_MODE_P (mode));
+  gcc_assert (FLOAT_MODE_P (mode)
+ || ALL_FIXED_POINT_MODE_P (mode));
  /* Fall through.  */
case TRUNC_DIV_EXPR:
case EXACT_DIV_EXPR:
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index 9308a6dfd65..0de74c7966a 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -82,7 +82,8 @@ optab_for_tree_code (enum tree_code code, const_tree type,
return unknown_optab;
  /* FALLTHRU */
case RDIV_EXPR:
-  gcc_assert (FLOAT_TYPE_P (type));
+  gcc_assert (FLOAT_TYPE_P (type)
+ || ALL_FIXED_POINT_MODE_P (TYPE_MODE (type)));
  /* FALLTHRU */
case TRUNC_DIV_EXPR:
case EXACT_DIV_EXPR:
diff --git a/gcc/testsuite/gcc.target/arm/pr121065.c 
b/gcc/testsuite/gcc.target/arm/pr121065.c
new file mode 100644
index 000..dfc6059a46d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr121065.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mcpu=cortex-m55" } */
+
+_Accum sa;
+char c;
+
+void
+div_csa ()
+{
+  c /= sa;
+}
--
2.50.0

Re: [PATCH] libstdc++: Make ranges::advance(it, n, bound) follow standard more strictly

2025-07-15 Thread Jonathan Wakely

On Tue, 15 Jul 2025 at 09:25, Tomasz Kaminski  wrote:
>
>
>
> On Mon, Jul 14, 2025 at 10:43 PM Jonathan Wakely  wrote:
>>
>> The standard specifies some of the effects of ranges::advance in terms
>> of "Equivalent to:" and it's observable that our current implementation
>> deviates from the precise specification in the standard.  This was
>> causing some failures in the libc++ testsuite.
>>
>> For the sized_sentinel_for case I optimized our implementation to
>> avoid redundant calls when we have already checked that there's nothing
>> to do.  We were eliding `advance(i, bound)` when the iterator already
>> equals the sentinel, and eliding `advance(i, n)` when `n` is zero. In
>> both cases, removing the seemingly redundant calls is not equivalent to
>> the spec because `i = std::move(bound)` or `i += 0` operations can be
>> observed by program-defined iterators. This patch inlines the observable
>> side effects of advance(i, bound) or advance(i, 0) without actually
>> calling those functions.
>>
>> For the non-sized sentinel case, `if (i == bound || n == 0)` is
>> different from `if (n == 0 || i == bound)` for the case where n is zero
>> and a program-defined iterator observes the number of comparisons.
>> This patch changes it to do `n == 0` first. I don't think this is
>> required by the standard, as this condition is not "Equivalent to:" any
>> observable sequence of operations, but testing `n == 0` first is
>> probably cheaper anyway.
>>
>> libstdc++-v3/ChangeLog:
>>
>> * include/bits/ranges_base.h (ranges::advance(i, n, bound)):
>> Ensure that observable side effects on iterators match what is
>> specified in the standard.
>> ---
>>
>> For most iterator types, I assume the compiler will inline these extra
>> redundant operations, so they'll only make a difference for iterators
>> that do actually observe the number of operations.
>>
>> Tested powerpc64le-linux.
>
> LGTM
>>
>>
>>  libstdc++-v3/include/bits/ranges_base.h | 18 +++---
>>  1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/libstdc++-v3/include/bits/ranges_base.h 
>> b/libstdc++-v3/include/bits/ranges_base.h
>> index 0251e5d0928a..4a0b9e2f5ec3 100644
>> --- a/libstdc++-v3/include/bits/ranges_base.h
>> +++ b/libstdc++-v3/include/bits/ranges_base.h
>> @@ -882,7 +882,14 @@ namespace ranges
>> const auto __diff = __bound - __it;
>
> Would you mind changing this to iter_difference_t<_It>? It is required to be
> exactly that type,

Yes, if that helps with reading the code, let's be explicit about the
type. I've made that change.

> but I think it would help me with understanding (see comment below)
>>
>>
>> if (__diff == 0)
>> - return __n;
>> + {
>> +   // inline any possible side effects of advance(it, bound)
>> +   if constexpr (assignable_from<_It&, _Sent>)
>> + __it = std::move(__bound);
>> +   else if constexpr (random_access_iterator<_It>)
>> + __it += 0;
>> +   return __n;
>> + }
>> else if (__diff > 0 ? __n >= __diff : __n <= __diff)
>
> I was wondering if we should check precondition __glibcxx_assert((__n < 0) == 
> (__diff < 0)),
> but then I realized that we never enter this branch, if __n and __diff 
> disagree on direction.

Right. The standard says to check |n| >= |diff| and if we did that,
then we would need to check that the directions agree. But that would
mean abs(n) >= abs(diff) which would be three comparisons and we'd
also need to check the directions agree. The way I implemented it only
does two, and implicitly checks the directions agree. If they don't
agree then we take the next branch and fail the assertion there.
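
As a standalone illustration of that two-comparison check (a sketch only: `reaches_bound` is a hypothetical name, and `std::ptrdiff_t` stands in for `iter_difference_t<_It>`):

```cpp
#include <cassert>
#include <cstddef>

// True when advancing by n would reach or pass a bound that is diff
// steps away.  Only the comparison that matches the sign of diff is
// evaluated, so a mismatch in direction simply yields false.
// (diff == 0 is handled separately in the real code.)
constexpr bool
reaches_bound(std::ptrdiff_t n, std::ptrdiff_t diff)
{ return diff > 0 ? n >= diff : n <= diff; }

// Same direction: equivalent to |n| >= |diff|.
static_assert(reaches_bound(5, 3), "forward, far enough");
static_assert(!reaches_bound(2, 3), "forward, not far enough");
static_assert(reaches_bound(-5, -3), "backward, far enough");
static_assert(!reaches_bound(-2, -3), "backward, not far enough");

// Opposite directions: never true, whatever the magnitudes.
static_assert(!reaches_bound(-5, 3), "n backward, bound forward");
static_assert(!reaches_bound(5, -3), "n forward, bound backward");
```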

It's not necessarily better code the way I did the comparisons, at
least not with GCC:
https://godbolt.org/z/P6cc7s3WP
But it avoids needing two assertions, so that should be less code for
the entire function.

If assertions are disabled then we take the "wrong" branch and do
advance(i, n) instead of advance(i, bound), but that case is UB anyway
so there is no "wrong" outcome. However, now that I think about it,
maybe advance(i, bound) would be safer for the UB case, if we assume
that the sentinel is the correct bound. Advancing by an incorrect
value of n (in the wrong direction) might take us out of bounds,
whereas advance(i, bound) will not, assuming a correct bound.

> And both __n and __diff are of type iter_difference_t, so there is no chance of
> signed->unsigned promotion.

Yeah.

Re: [PATCH 3/7] aarch64: Handle DImode BCAX operations

2025-07-15 Thread Kyrylo Tkachov



> On 8 Jul 2025, at 17:43, Richard Sandiford  wrote:
>
> Kyrylo Tkachov  writes:
>> Thanks for your comments, do you mean something like the following?
>
> Yeah, the patch LGTM, thanks.

So it turned out that doing this in the EOR3 pattern in patch 4/7 caused 
wrong-code in 531.deepsjeng_r:
  [(set (match_operand:DI 0 "register_operand")
(xor:DI
 (xor:DI
  (match_operand:DI 2 "register_operand")
  (match_operand:DI 3 "register_operand"))
 (match_operand:DI 1 "register_operand")))]
  "TARGET_SHA3"
  {@ [ cons: =0, 1, 2 , 3  ; attrs: type ]
 [ w   , w, w , w  ; crypto_sha3 ] eor3\t%0.16b, %1.16b, %2.16b, %3.16b
 [ &r  , r, r0, r0 ; multiple] #
  }
  "&& REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
  [(set (match_dup 4) (xor:DI (match_dup 2) (match_dup 3)))
   (set (match_dup 0) (xor:DI (match_dup 4) (match_dup 1)))]
  {
if (reload_completed)
  operands[4] = operands[0];
else if (can_create_pseudo_p ())
  operands[4] = gen_reg_rtx (DImode);
else
  FAIL;
  }

Using just “=&r,r,r,r” rather than “=&r,r,r0,r0” worked fine.
I think I’ll need to dive deeper into the details, but if something stands out 
it’d be great.
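
For reference, here is a small model of the post-reload split, where the temporary (operand 4) reuses the destination register. It is not a diagnosis of the deepsjeng failure. It only illustrates, under the assumption that registers can be modelled as array slots, why the destination must not overlap operand 1 while overlapping operand 2 or 3 is harmless. All names are hypothetical.

```cpp
#include <array>
#include <cstdint>

using Regs = std::array<std::uint64_t, 4>;

// Models the two-instruction split with operands[4] == operands[0]:
//   dst = op2 ^ op3;   dst = dst ^ op1;
// d, a, b and c are register indices for operands 0, 1, 2 and 3.
inline std::uint64_t split_eor3(Regs regs, int d, int a, int b, int c)
{
  regs[d] = regs[b] ^ regs[c];   // first XOR overwrites the destination
  regs[d] = regs[d] ^ regs[a];   // second XOR still needs operand 1 intact
  return regs[d];
}

// The value a single three-input XOR would produce.
inline std::uint64_t reference_eor3(const Regs& regs, int a, int b, int c)
{
  return regs[a] ^ regs[b] ^ regs[c];
}
```

With the destination sharing a register with operand 2 the split still matches the reference, but sharing with operand 1 loses that input before it is read.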

Thanks,
Kyrill


>
> Richard
>
>> Or do you mean to have separate alternatives with each one individually 
>> tying one of operands 2 or 3 to r0?
>>
>> Kyrill
>>
>>
>>>
>>> Thanks,
>>> Tamar
>>>
 Thanks,
 Richard
>>
>>
>> From 9e67b44d9ff111b0f280d7f3fe2c197aa7dabc94 Mon Sep 17 00:00:00 2001
>> From: Kyrylo Tkachov 
>> Date: Thu, 3 Jul 2025 09:45:02 -0700
>> Subject: [PATCH 3/7] aarch64: Handle DImode BCAX operations
>>
>> To handle DImode BCAX operations we want to do them on the SIMD side only if
>> the incoming arguments don't require a cross-bank move.
>> This means we need to split back the combination to separate GP BIC+EOR
>> instructions if the operands are expected to be in GP regs through reload.
>> The split happens pre-reload if we already know that the destination will be
>> a GP reg.  Otherwise if reload decides to use the "=r,r" alternative we ensure
>> operand 0 is early-clobber.
>> This scheme is similar to how we handle the BSL operations elsewhere in
>> aarch64-simd.md.
>>
>> Thus, for the functions:
>> uint64_t bcax_d_gp (uint64_t a, uint64_t b, uint64_t c) { return BCAX (a, b, c); }
>> uint64x1_t bcax_d (uint64x1_t a, uint64x1_t b, uint64x1_t c) { return BCAX (a, b, c); }
>>
>> we now generate the desired:
>> bcax_d_gp:
>>bic x1, x1, x2
>>eor x0, x1, x0
>>ret
>>
>> bcax_d:
>>bcax v0.16b, v0.16b, v1.16b, v2.16b
>>ret
>>
>> When the inputs are in SIMD regs we use BCAX and when they are in GP regs we
>> don't force them to SIMD with extra moves.
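
As a sanity check on the split described in the commit message, the sketch below spells out BCAX using the macro definition from bcax_d.c and the two GP steps it is split into. The helper names bic and eor are illustrative only and model the instructions as plain functions.

```cpp
#include <cstdint>

// Same definition as the BCAX macro in the bcax_d.c test: x ^ (y & ~z).
constexpr std::uint64_t bcax(std::uint64_t x, std::uint64_t y, std::uint64_t z)
{
  return x ^ (y & ~z);
}

// The GP-register sequence, one function per instruction:
// BIC computes y & ~z, EOR folds in x.
constexpr std::uint64_t bic(std::uint64_t y, std::uint64_t z) { return y & ~z; }
constexpr std::uint64_t eor(std::uint64_t t, std::uint64_t x) { return t ^ x; }

static_assert(eor(bic(0xF0F0, 0x00FF), 0x1234) == bcax(0x1234, 0xF0F0, 0x00FF),
              "BIC followed by EOR matches BCAX");
```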
>>
>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>
>> Signed-off-by: Kyrylo Tkachov 
>>
>> gcc/
>>
>> * config/aarch64/aarch64-simd.md (*bcaxqdi4): New
>> define_insn_and_split.
>>
>> gcc/testsuite/
>>
>> * gcc.target/aarch64/simd/bcax_d.c: Add tests for DImode arguments.
>> ---
>> gcc/config/aarch64/aarch64-simd.md| 29 +++
>> .../gcc.target/aarch64/simd/bcax_d.c  |  6 +++-
>> 2 files changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/config/aarch64/aarch64-simd.md 
>> b/gcc/config/aarch64/aarch64-simd.md
>> index 4493e55603d..270cb2ff3a1 100644
>> --- a/gcc/config/aarch64/aarch64-simd.md
>> +++ b/gcc/config/aarch64/aarch64-simd.md
>> @@ -9252,6 +9252,35 @@
>>   [(set_attr "type" "crypto_sha3")]
>> )
>>
>> +(define_insn_and_split "*bcaxqdi4"
>> +  [(set (match_operand:DI 0 "register_operand")
>> + (xor:DI
>> +   (and:DI
>> + (not:DI (match_operand:DI 3 "register_operand"))
>> + (match_operand:DI 2 "register_operand"))
>> +   (match_operand:DI 1 "register_operand")))]
>> +  "TARGET_SHA3"
>> +  {@ [ cons: =0, 1, 2 , 3  ; attrs: type ]
>> + [ w   , w, w , w  ; crypto_sha3 ] bcax\t%0.16b, %1.16b, %2.16b, %3.16b
>> + [ &r  , r, r0, r0 ; multiple] #
>> +  }
>> +  "&& REG_P (operands[0]) && GP_REGNUM_P (REGNO (operands[0]))"
>> +  [(set (match_dup 4)
>> + (and:DI (not:DI (match_dup 3))
>> + (match_dup 2)))
>> +   (set (match_dup 0)
>> + (xor:DI (match_dup 4)
>> + (match_dup 1)))]
>> +  {
>> +if (reload_completed)
>> +  operands[4] = operands[0];
>> +else if (can_create_pseudo_p ())
>> +  operands[4] = gen_reg_rtx (DImode);
>> +else
>> +  FAIL;
>> +  }
>> +)
>> +
>> ;; SM3
>>
>> (define_insn "aarch64_sm3ss1qv4si"
>> diff --git a/gcc/testsuite/gcc.target/aarch64/simd/bcax_d.c 
>> b/gcc/testsuite/gcc.target/aarch64/simd/bcax_d.c
>> index d68f0e102bf..a7640c3f6f1 100644
>> --- a/gcc/testsuite/gcc.target/aarch64/simd/bcax_d.c
>> +++ b/gcc/testsuite/gcc.target/aarch64/simd/bcax_d.c
>> @@ -7,9 +7,13 @@
>>
>> #define BCAX(x,y,z)  ((x) ^ ((y) & ~(z)))
>>
>> +/* When the inputs come from GP regs don't form a BCAX.  */
>> +uint64_t bcax_d_gp (uint64_t a, uint64_t b, uint64_t c) {

Re: [PATCH 0/1] [RFC][AutoFDO]: Source filename tracking in GCOV

2025-07-15 Thread Dhruv Chawla


On 08/07/25 18:01, Jan Hubicka wrote:




Hi Honza,


On 8 Jul 2025, at 2:26 am, Jan Hubicka  wrote:



Hi,
as discussed also on the autofdo pull request, LLVM solves the same
problem using -funique-internal-linkage-names
https://reviews.llvm.org/D73307

All non-public functions get their symbols renamed from
.__uniq.


How is  __uniq. added to static symbols 
in the profile?


The patch does three things
  1) extends ipa-visibility pass to rename all non-public function
 symbols adding the __uniq suffix.
 This skips those marked as used so asm statements can work.
  2) makes dwarf2out always add the DW_AT_linkage_name attribute to
     DW_TAG_inlined_subroutine DIEs
  3) extends auto-profile to accept profiles with unique names
 when building without unique names and vice versa.

I think it is pretty much what LLVM does except that I compute hash
based on object file name while LLVM uses filename of the outer
translation unit (which is easy to change, I just wanted to have
something functional to see how it works in practice).
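
A rough sketch of the renaming idea, assuming a suffix of the form ".__uniq." followed by a decimal hash of a file name. The helper names and the choice of FNV-1a are assumptions made for illustration. This is neither the hash used by the patch nor the one LLVM uses.

```cpp
#include <cstdint>
#include <string>

// FNV-1a, chosen only because it is short and deterministic; it is an
// assumption of this sketch, not the hash either compiler uses.
inline std::uint64_t fnv1a(const std::string& text)
{
  std::uint64_t hash = 14695981039346656037ull;
  for (unsigned char ch : text)
    {
      hash ^= ch;
      hash *= 1099511628211ull;
    }
  return hash;
}

// Hypothetical helper: append a ".__uniq." suffix plus a hash derived from a
// file name to the name of a function with internal linkage.
inline std::string unique_internal_name(const std::string& symbol,
                                        const std::string& file_name)
{
  return symbol + ".__uniq." + std::to_string(fnv1a(file_name));
}
```

Whether file_name is the object file or the outer translation unit is exactly the difference between the two approaches mentioned above; the sketch works the same either way.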

There is a comment on the pull request comment I added
https://github.com/google/autofdo/pull/244#issuecomment-3046121191
So it seems that llvm folks are not that happy with uniq suffixes since
it breaks asm statements in Linux kernel.  I originally thought renaming
is done in dwarf only but indeed renaming all static symbols is quite
radical.

Their proposal
https://discourse.llvm.org/t/rfc-keep-globalvalue-guids-stable/84801
seems to be equivalent to what we have as profile_id.  It is 64bit
identifier of a function that should be stable across builds and (modulo
conflicts) unique within the translated program.  Currently it is assigned
only to functions that may be used as indirect call targets and is used
by normal FDO for resolving cross-unit indirect calls.

One option would be to use profile IDs in auto-profiles too.  I guess
they can be streamed to dwarf via an extension as 64bit IDs. But it is
not clear to me that it is what LLVM folks work on and if it will
eventually get upstreamed.

If we want to finish your solution (adding file names in create_gcov). I
think we need to solve the following:
  1) extend dwarf2out to add DW_AT_linkage_name attributes for all
 function symbols.  This is easy to do.
  2) verify that create_gcov can safely determine symbols with public
 or static linkage (even inlined ones).  There is DW_AT_public
 attribute
 and stream file names only for public linkage symbols
  3) instead of streaming filename of file containing the symbol
 stream filename of the corresponding translation unit.

I would say that the advantage of profile id is probably shorter gcov
files, advantage of streaming filename:symbol_name pairs is that the
profile info is easier to read.  What do you think?


Hi Honza,

Is there a way to do the renaming only for symbols that are duplicated?
Either way, I think your solution is better because it reuses all the
infrastructure that already exists and it also avoids having to modify the
GCOV format.

Are you planning on committing your changes for this?



Honza



--
Regards,
Dhruv

Re: [PATCH v4 0/6] Implement mdspan.

2025-07-15 Thread Luc Grosheintz





On 7/14/25 08:57, Tomasz Kaminski wrote:

Hi Luc,

While running the libc++ test on libstdc++ we have found the following
issue in our implementation.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121061
Would you be interested in looking into fixing this?


Yes, I'll take care of this. Do you know if it's the only issue?
If it's not clear it might make sense that I learn how to run their
tests on our code.



Also, libc++ makes the default constructor of mdspan conditionally noexcept
as an extension
(the standard does not require it
https://eel.is/c++draft/views.multidim#mdspan.mdspan.cons).
We could do the same. Instead of writing a big conditional noexcept
specification, I would suggest
defaulting the constructor on first declaration:
   constexpr
   mdspan()
   requires (rank_dynamic() > 0)
   && is_default_constructible_v<data_handle_type>
   && is_default_constructible_v<mapping_type>
   && is_default_constructible_v<accessor_type>
   = default;

And then having default member initializers:
 private:
   [[no_unique_address]] accessor_type _M_accessor = accessor_type();
   [[no_unique_address]] mapping_type _M_mapping = mapping_type();
   [[no_unique_address]] data_handle_type _M_handle = data_handle_type();
We do not want to use "{}" as the semantics are a bit different.
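
To illustrate why defaulting on first declaration gives a conditional noexcept for free, here is a small stand-alone sketch. Holder, Quiet and Loud are hypothetical stand-ins, and the requires-clause from the suggestion above is left out so the example also compiles as C++17.

```cpp
#include <type_traits>

struct Quiet { Quiet() noexcept {} };        // non-throwing default constructor
struct Loud  { Loud() noexcept(false) {} };  // potentially throwing

// Hypothetical stand-in for a class with mdspan's three members.
template<typename Accessor, typename Mapping, typename Handle>
class Holder
{
public:
  Holder() = default;   // exception specification is deduced from the members

private:
  Accessor accessor_ = Accessor();
  Mapping  mapping_  = Mapping();
  Handle   handle_   = Handle();
};

static_assert(
  std::is_nothrow_default_constructible<Holder<Quiet, Quiet, int*>>::value,
  "all member initializers are non-throwing");
static_assert(
  !std::is_nothrow_default_constructible<Holder<Quiet, Loud, int*>>::value,
  "one potentially throwing member initializer is enough");
```

The deduced exception specification takes the default member initializers into account, so no hand-written noexcept(...) expression is needed.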



Yes, I think it makes sense if those two implementations behave
the same.

Re: [PATCH] asf: Fix calling of emit_move_insn on registers of different modes [PR119884]

2025-07-15 Thread Konstantinos Eleftheriou

Hi Richard, thanks for the review!

We've tested the pre-patch version on PowerPC and there was indeed an
issue, which is now fixed with the patched version.

Konstantinos

On Fri, Jul 4, 2025 at 9:34 AM Richard Sandiford
 wrote:
>
> Konstantinos Eleftheriou  writes:
> > On Wed, May 7, 2025 at 11:29 AM Richard Sandiford
> >  wrote:
> >> But I thought the code was allowing multiple stores to be forwarded to
> >> a single (wider) load.  E.g. 4 individual byte stores at address X, X+1,
> >> X+2 and X+3 could be forwarded to a 4-byte load at address X.  And the code
> >> I mentioned is handling the least significant byte by zero-extending it.
> >>
> >> For big-endian targets, the least significant byte should come from
> >> address X+3 rather than address X.  The byte at address X (i.e. the
> >> byte with the equal offset) should instead go in the most significant
> >> byte, typically using a shift left.
> > Hi, I'm attaching a patch that we prepared for this. It would be of
> > great help if someone could test it on a big-endian target, preferably
> > one with BITS_BIG_ENDIAN == 0 as we were having issues with that in
> > the past.
> >
> > From 278f83b834a97541fe0a2d2bbad84aca34601fed Mon Sep 17 00:00:00 2001
> > From: Konstantinos Eleftheriou 
> > Date: Tue, 3 Jun 2025 09:16:17 +0200
> > Subject: [PATCH] asf: Fix offset check in base reg initialization for
> >  big-endian targets
> >
> > During the base register initialization, in the case that we are
> > eliminating the load instruction, we are using `offset == 0` in order
> > to find the store instruction that has the same offset as the load. This
> > would not work on big-endian targets where byte 0 would be the MS byte.
> >
> > This patch updates the condition to take into account the target's 
> > endianness.
> >
> > We are, also, removing the adjustment of the starting position for the
> > bitfield insertion, when BYTES_BIG_ENDIAN != BITS_BIG_ENDIAN. This is
> > supposed to be handled inside `store_bit_field` and it's not needed anymore
> > after the offset fix.
>
> Yeah.  In particular BITS_BIG_ENDIAN describes how bits are measured
> by insv and extv.  Its effect is localised as much as possible and so it
> doesn't affect how store_bit_field measures bits.  store_bit_field
> instead always measures in memory order (BITS_PER_UNIT == second byte
> in memory order, etc.)
>
> Out of curiosity, did you try this on a little-endian POWER system?
> That has BITS_BIG_ENDIAN==1, BYTES_BIG_ENDIAN==0, so I would have
> expected something to have gone wrong with the pre-patch code.  It would
> be good to check if you could (there are some machines in the compile farm).
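
The byte placement discussed in the quoted text can be sketched as follows. This is only an illustration of which store ends up in which part of the loaded value; value_shift and forward_bytes are hypothetical helpers, not code from avoid-store-forwarding.cc, and the shift here is counted from the least significant bit of the value rather than in the memory order used by store_bit_field.

```cpp
#include <cstddef>
#include <cstdint>

// Shift (in bits, counted from the least significant bit) at which a store of
// store_size bytes at byte offset `offset` lands inside a load of load_size
// bytes.
constexpr unsigned value_shift(std::size_t offset, std::size_t store_size,
                               std::size_t load_size, bool bytes_big_endian)
{
  return static_cast<unsigned>(
      8 * (bytes_big_endian ? load_size - store_size - offset : offset));
}

// Combine four single-byte stores at X, X+1, X+2 and X+3 into the value a
// 4-byte load at X would see.
inline std::uint32_t forward_bytes(const std::uint8_t (&bytes)[4],
                                   bool bytes_big_endian)
{
  std::uint32_t value = 0;
  for (std::size_t offset = 0; offset < 4; ++offset)
    value |= static_cast<std::uint32_t>(bytes[offset])
             << value_shift(offset, 1, 4, bytes_big_endian);
  return value;
}
```

On a big-endian target the store with shift 0 is the one at offset 3, which is why a plain `offset == 0` test picks the wrong store there.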
>
> > gcc/ChangeLog:
> >
> >   * avoid-store-forwarding.cc (generate_bit_insert_sequence):
> >   Remove adjustment of bitfield insertion's starting position
> >   when BYTES_BIG_ENDIAN != BITS_BIG_ENDIAN.
> >   * avoid-store-forwarding.cc (process_store_forwarding):
> >   Update offset check in base reg initialization to take
> >   into account the target's endianness.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/aarch64/avoid-store-forwarding-be.c: New test.
> > ---
> >  gcc/avoid-store-forwarding.cc | 18 ---
> >  .../aarch64/avoid-store-forwarding-be.c   | 23 +++
> >  2 files changed, 28 insertions(+), 13 deletions(-)
> >  create mode 100644 
> > gcc/testsuite/gcc.target/aarch64/avoid-store-forwarding-be.c
> >
> > diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc
> > index 6825d0426ecc..457d1d0c200c 100644
> > --- a/gcc/avoid-store-forwarding.cc
> > +++ b/gcc/avoid-store-forwarding.cc
> > @@ -119,17 +119,6 @@ generate_bit_insert_sequence (store_fwd_info 
> > *store_info, rtx dest)
> >unsigned HOST_WIDE_INT bitsize = store_size * BITS_PER_UNIT;
> >unsigned HOST_WIDE_INT start = store_info->offset * BITS_PER_UNIT;
> >
> > -  /* Adjust START for machines with BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
> > - Given that the bytes will be reversed in this case, we need to
> > - calculate the starting position from the end of the destination
> > - register.  */
> > -  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
> > -{
> > -  unsigned HOST_WIDE_INT load_mode_bitsize
> > - = (GET_MODE_BITSIZE (GET_MODE (dest))).to_constant ();
> > -  start = load_mode_bitsize - bitsize - start;
> > -}
> > -
> >rtx mov_reg = store_info->mov_reg;
> >store_bit_field (dest, bitsize, start, 0, 0, GET_MODE (mov_reg), mov_reg,
> >  false, false);
> > @@ -248,11 +237,14 @@ process_store_forwarding (vec 
> > &stores, rtx_insn *load_insn,
> >  {
> >it->mov_reg = gen_reg_rtx (GET_MODE (it->store_mem));
> >rtx_insn *insns = NULL;
> > -  const bool has_zero_offset = it->offset == 0;
> > +  HOST_WIDE_INT store_size = MEM_SIZE (it->store_mem).to_constant ();
> > +  const bool has_base_offset = BYTES_BIG_ENDIAN
> > +

Re: [PATCH] libstdc++: Define std::nothrow as inline variable

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 12:37 PM Jonathan Wakely  wrote:

> On Tue, 15 Jul 2025 at 08:10, Tomasz Kaminski  wrote:
> >
> >
> >
> > On Mon, Jul 14, 2025 at 10:43 PM Jonathan Wakely 
> wrote:
> >>
> >> This makes it possible to use `new(std::nothrow) X` without linking to
> >> libsupc++ or libstdc++.
> >>
> >> To ensure we still export the symbol from the library we need to
> >> suppress the inline variable in libsupc++/new_handler.cc which is done
> >> by defining a macro.
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >> * libsupc++/new (nothrow): Define as inline variable.
> >> * libsupc++/new_handler.cc (_GLIBCXX_DEFINE_NOTHROW_OBJ):
> >> Define.
> >> ---
> >>
> >> Tested powerpc64le-linux.
> >>
> >>  libstdc++-v3/libsupc++/new| 4 
> >>  libstdc++-v3/libsupc++/new_handler.cc | 2 ++
> >>  2 files changed, 6 insertions(+)
> >>
> >> diff --git a/libstdc++-v3/libsupc++/new b/libstdc++-v3/libsupc++/new
> >> index fb36dae25a6d..85d28ff40769 100644
> >> --- a/libstdc++-v3/libsupc++/new
> >> +++ b/libstdc++-v3/libsupc++/new
> >> @@ -125,7 +125,11 @@ namespace std
> >>  #endif
> >>};
> >>
> >> +#if defined __cpp_inline_variables && ! defined
> _GLIBCXX_DEFINE_NOTHROW_OBJ
> >> +  inline constexpr nothrow_t nothrow{};
> >> +#else
> >>extern const nothrow_t nothrow;
> >> +#endif
> >
> > If you move variable definition before include, this would become:
> > +#ifndef _GLIBCXX_DEFINE_NOTHROW_OBJ
> > +# ifdef __cpp_inline_variables
> > +  inline constexpr nothrow_t nothrow{};
> > + # else
> >   extern const nothrow_t nothrow;
> > +# endif
> > +#endif
> >
> > GCC and clang also accept a version where we always have the extern
> > declaration:
> > +#if defined __cpp_inline_variables && ! defined
> _GLIBCXX_DEFINE_NOTHROW_OBJ
> > +  inline constexpr nothrow_t nothrow{};
> > +#endif
> >extern const nothrow_t nothrow;
> >
> >
> >>
> >>/** If you write your own error handler to be called by @c new, it
> must
> >> *  be of this type.  */
> >> diff --git a/libstdc++-v3/libsupc++/new_handler.cc
> b/libstdc++-v3/libsupc++/new_handler.cc
> >> index 7cd3e5a69fde..96dfb796c64a 100644
> >> --- a/libstdc++-v3/libsupc++/new_handler.cc
> >> +++ b/libstdc++-v3/libsupc++/new_handler.cc
> >> @@ -23,6 +23,8 @@
> >>  // see the files COPYING3 and COPYING.RUNTIME respectively.  If not,
> see
> >>  // .
> >>
> >> +#define _GLIBCXX_DEFINE_NOTHROW_OBJ 1
> >
> > Could we also move the definition of nothrow here (before including new),
> > so entities in header new always see its definition when it is constexpr?
> > I do not think there is any (even benign) ODR violation possibility here, but
> > we can avoid the risk that way.
>
> We would need to also move the definition of the nothrow_t type there,
> otherwise you can't declare the variable.
>
> And wouldn't it have internal linkage if it's defined as constexpr
> const in new_handler.cc?
> It only has external linkage because the extern declaration in <new>
> was already seen.
>
> So in the <new> header we would need:
>
> #ifndef _GLIBCXX_DEFINED_NOTHROW_OBJ // defined by new_handler.cc
>   struct nothrow_t
>   {
> #if __cplusplus >= 201103L
> explicit nothrow_t() = default;
> #endif
>   };
>
> #if defined __cpp_inline_variables
>   // Define as an inline variable when possible, so that std::nothrow can
>   // be used without linking to libsupc++.
>   inline constexpr nothrow_t nothrow{};
> #endif
>   extern const nothrow_t nothrow;
> #endif
>
>
> And then in new_handler.cc:
>
> namespace std
> {
>   struct nothrow_t
>   {
> explicit nothrow_t() = default;
>   };
>
> #ifdef __cpp_inline_variables
>   // purposely match when header defines it as inline+constexpr
>   constexpr
> #endif
>   extern const nothrow_t nothrow{};
> }
>
> // Prevent <new> from defining it again:
> #define _GLIBCXX_DEFINED_NOTHROW_OBJ 1
>
> #include "new"
>
>
> But then I get multiple definition errors when linking libstdc++, so
> something is wrong here.
>
We could define nothrow in the header if _GLIBCXX_DEFINE_NOTHROW_OBJ_IN_NEW
is defined:
#ifdef _GLIBCXX_DEFINE_NOTHROW_OBJ_IN_NEW
  extern _GLIBCXX17_CONSTEXPR const nothrow_t nothrow = nothrow_t();
#elif defined __cpp_inline_variables
  // Define as an inline variable when possible, so that std::nothrow can
  // be used without linking to libsupc++.
  inline constexpr nothrow_t nothrow{};
#else
  extern const nothrow_t nothrow;
#endif

And then in new_handler.cc, do:
// Instruct <new> to provide the definition of nothrow.
#define _GLIBCXX_DEFINE_NOTHROW_OBJ_IN_NEW 1
#include "new"

>
> If all you're trying to do is ensure that 'constexpr' is seen
> consistently, I don't think it's worth it. It's already not going to
> be constexpr when combining C++98 and C++17 objects, because it can
> never be constexpr in C++98.
>
That's true, so you can ship the original as is.

Regards,
Tomasz

Re: [PATCH v2] RISC-V: Vector-scalar widening multiply-(subtract-)accumulate [PR119100]

2025-07-15 Thread Paul-Antoine Arras


On 14/07/2025 14:13, Jeff Law wrote:
Paul-Antoine's patches don't have the leading "a" and "b" component 
typically seen in a patch from git diff.  I wonder if that's why pre- 
commit testing isn't picking them up properly.


It's something I noticed when adding them to my system.


Yes, I have this slight alteration in my git-diff because it makes some 
operations easier. However I already had it in previous patches and it 
apparently did not prevent CI from applying them. So not sure exactly 
what is going on here.

--
PA

Re: [PATCH] libstdc++: Define std::nothrow as inline variable

2025-07-15 Thread Jonathan Wakely

On Tue, 15 Jul 2025 at 08:10, Tomasz Kaminski  wrote:
>
>
>
> On Mon, Jul 14, 2025 at 10:43 PM Jonathan Wakely  wrote:
>>
>> This makes it possible to use `new(std::nothrow) X` without linking to
>> libsupc++ or libstdc++.
>>
>> To ensure we still export the symbol from the library we need to
>> suppress the inline variable in libsupc++/new_handler.cc which is done
>> by defining a macro.
>>
>> libstdc++-v3/ChangeLog:
>>
>> * libsupc++/new (nothrow): Define as inline variable.
>> * libsupc++/new_handler.cc (_GLIBCXX_DEFINE_NOTHROW_OBJ):
>> Define.
>> ---
>>
>> Tested powerpc64le-linux.
>>
>>  libstdc++-v3/libsupc++/new| 4 
>>  libstdc++-v3/libsupc++/new_handler.cc | 2 ++
>>  2 files changed, 6 insertions(+)
>>
>> diff --git a/libstdc++-v3/libsupc++/new b/libstdc++-v3/libsupc++/new
>> index fb36dae25a6d..85d28ff40769 100644
>> --- a/libstdc++-v3/libsupc++/new
>> +++ b/libstdc++-v3/libsupc++/new
>> @@ -125,7 +125,11 @@ namespace std
>>  #endif
>>};
>>
>> +#if defined __cpp_inline_variables && ! defined _GLIBCXX_DEFINE_NOTHROW_OBJ
>> +  inline constexpr nothrow_t nothrow{};
>> +#else
>>extern const nothrow_t nothrow;
>> +#endif
>
> If you move variable definition before include, this would become:
> +#ifndef _GLIBCXX_DEFINE_NOTHROW_OBJ
> +# ifdef __cpp_inline_variables
> +  inline constexpr nothrow_t nothrow{};
> + # else
>   extern const nothrow_t nothrow;
> +# endif
> +#endif
>
> GCC and clang also accept a version where we always have the extern
> declaration:
> +#if defined __cpp_inline_variables && ! defined _GLIBCXX_DEFINE_NOTHROW_OBJ
> +  inline constexpr nothrow_t nothrow{};
> +#endif
>extern const nothrow_t nothrow;
>
>
>>
>>/** If you write your own error handler to be called by @c new, it must
>> *  be of this type.  */
>> diff --git a/libstdc++-v3/libsupc++/new_handler.cc 
>> b/libstdc++-v3/libsupc++/new_handler.cc
>> index 7cd3e5a69fde..96dfb796c64a 100644
>> --- a/libstdc++-v3/libsupc++/new_handler.cc
>> +++ b/libstdc++-v3/libsupc++/new_handler.cc
>> @@ -23,6 +23,8 @@
>>  // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>>  // .
>>
>> +#define _GLIBCXX_DEFINE_NOTHROW_OBJ 1
>
> Could we also move the definition of nothrow here (before including new),
> so entities in header new always see its definition when it is constexpr?
> I do not think there is any (even benign) ODR violation possibility here, but
> we can avoid the risk that way.

We would need to also move the definition of the nothrow_t type there,
otherwise you can't declare the variable.

And wouldn't it have internal linkage if it's defined as constexpr
const in new_handler.cc?
It only has external linkage because the extern declaration in <new>
was already seen.

So in the <new> header we would need:

#ifndef _GLIBCXX_DEFINED_NOTHROW_OBJ // defined by new_handler.cc
  struct nothrow_t
  {
#if __cplusplus >= 201103L
explicit nothrow_t() = default;
#endif
  };

#if defined __cpp_inline_variables
  // Define as an inline variable when possible, so that std::nothrow can
  // be used without linking to libsupc++.
  inline constexpr nothrow_t nothrow{};
#endif
  extern const nothrow_t nothrow;
#endif


And then in new_handler.cc:

namespace std
{
  struct nothrow_t
  {
explicit nothrow_t() = default;
  };

#ifdef __cpp_inline_variables
  // purposely match when header defines it as inline+constexpr
  constexpr
#endif
  extern const nothrow_t nothrow{};
}

// Prevent <new> from defining it again:
#define _GLIBCXX_DEFINED_NOTHROW_OBJ 1

#include "new"


But then I get multiple definition errors when linking libstdc++, so
something is wrong here.

If all you're trying to do is ensure that 'constexpr' is seen
consistently, I don't think it's worth it. It's already not going to
be constexpr when combining C++98 and C++17 objects, because it can
never be constexpr in C++98.
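
For context, this is the kind of expression the patch is about. The sketch only shows ordinary use of std::nothrow; X and make_x are made-up names, and nothing here depends on whether the variable is defined inline or in the library.

```cpp
#include <new>

struct X { int value = 42; };

// Returns nullptr instead of throwing std::bad_alloc if allocation fails.
inline X* make_x()
{
  return new (std::nothrow) X;
}
```

With the inline variable, the reference to std::nothrow in make_x no longer needs a symbol from the library, although the allocation function itself still has to come from somewhere.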

Re: [PATCH v4 0/6] Implement mdspan.

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 1:35 PM Jonathan Wakely  wrote:

> OK here are the details for all of the failing tests ...
>
> std/containers/views/mdspan/aligned_accessor/access.pass.cpp
>
> std/containers/views/mdspan/aligned_accessor/ctor.conversion.from.default_accessor.pass.cpp
> std/containers/views/mdspan/aligned_accessor/ctor.conversion.pass.cpp
> std/containers/views/mdspan/aligned_accessor/ctor.default.pass.cpp
> std/containers/views/mdspan/aligned_accessor/offset.pass.cpp
>
> std/containers/views/mdspan/aligned_accessor/operator.conversion.to.default_accessor.pass.cpp
> std/containers/views/mdspan/aligned_accessor/types.pass.cpp
>
> We don't support aligned_accessor yet.
>
> std/containers/views/mdspan/extents/ctor_from_array.pass.cpp
> std/containers/views/mdspan/extents/ctor_from_span.pass.cpp
>
> These are both the same issue as Bug 121061:
> static_assert(!std::is_constructible_v,
> std::span>);
>
> std/containers/views/mdspan/extents/dims.pass.cpp
>
> We don't support std::dims
>
>
> std/containers/views/mdspan/layout_stride/is_exhaustive_corner_case.pass.cpp
>
> Somebody needs to analyze this one.
>
This is most likely related to us implementing
https://cplusplus.github.io/LWG/issue4266.

>
> std/containers/views/mdspan/layout_stride/properties.pass.cpp
>
> This looks similar, is_always_exhaustive() is giving the wrong answer
> in some case.
>
> std/containers/views/mdspan/mdspan/ctor.default.pass.cpp
>
> This is checking for a noexcept default ctor.
>
> std/containers/views/mdspan/mdspan/ctor.dh_array.pass.cpp
> std/containers/views/mdspan/mdspan/ctor.dh_span.pass.cpp
>
> These are Bug 121061
>
> std/containers/views/mdspan/mdspan/index_operator.pass.cpp
>
> Dunno what's happening here:
>
> std/containers/views/mdspan/mdspan/index_operator.pass.cpp:178:46:
> in 'constexpr' expansion of
> 'check_operator_constraints nt, 18446744073709551615>, std::layout_left,
> std::default_accessor >, std::array true>, 1> >(std::mdspan , std::layout_left,
> std::default_accessor >(((int*)(& data)),
> construct_mapping >((s
> td::layout_left(), std::layout_left()), std::extents 18446744073709551615>(1))), std::array true>, 1>{std::__array_traits<
> IntConfig, 1>::_Type{IntConfig true, true>{0}}})'
> /home/jwakely/gcc/16/include/c++/16.0.0/mdspan:1234:25: error: no
> match for 'operator[]' (operand types are 'const std::mdspan std::extents 44073709551615>, std::layout_left, std::default_accessor >' and
> 'std::span, 1>')
>
From docs:
// IntConfig has configurable conversion properties: convert from const&,
convert from non-const, no-throw-ctor from const&, no-throw-ctor from
non-const
This is again `_S_int_cast` not forwarding properly as noted in 121061.

> 1234 | { return (*this)[span rank()>(__indices)]; }
>  |  ~~~^
>
>
> std/containers/views/mdspan/mdspan/properties.pass.cpp
>
> Many many failures like this:
>
> std/containers/views/mdspan/mdspan/properties.pass.cpp:146:17: error:
> static assertion failed
>  146 |   static_assert(noexcept(MDS::is_always_unique()));
>  | ^
>
These are not noexcept in the standard, so clang most likely made them
conditionally noexcept.
We could do the same.
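
A minimal sketch of what "conditionally noexcept" could look like for one of these static members, forwarding the exception specification of the mapping. View, NoexceptMapping and PlainMapping are hypothetical names, not the libstdc++ or libc++ implementation.

```cpp
// Two hypothetical mappings: one declares the query noexcept, one does not.
struct NoexceptMapping
{
  static constexpr bool is_always_unique() noexcept { return true; }
};

struct PlainMapping
{
  static constexpr bool is_always_unique() { return true; }
};

// Hypothetical wrapper: noexcept exactly when the mapping's function is.
template<typename Mapping>
struct View
{
  static constexpr bool is_always_unique()
    noexcept(noexcept(Mapping::is_always_unique()))
  { return Mapping::is_always_unique(); }
};

static_assert(noexcept(View<NoexceptMapping>::is_always_unique()),
              "noexcept is propagated from the mapping");
static_assert(!noexcept(View<PlainMapping>::is_always_unique()),
              "no noexcept on the mapping means none on the wrapper");
```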

> std/containers/views/mdspan/mdspan/properties.pass.cpp:146:17: note:
> 'false' evaluates to false
> std/containers/views/mdspan/mdspan/properties.pass.cpp:147:17: error:
> static assertion failed
>  147 |   static_assert(noexcept(MDS::is_always_exhaustive()));
>  | ^
> std/containers/views/mdspan/mdspan/properties.pass.cpp:147:17: note:
> 'false' evaluates to false
> std/containers/views/mdspan/mdspan/properties.pass.cpp:148:17: error:
> static assertion failed
>  148 |   static_assert(noexcept(MDS::is_always_strided()));
>  | ^~
>
>

Re: [PATCH] c++, libstdc++, v5: Implement C++26 P3068R5 - constexpr exceptions [PR117785]

2025-07-15 Thread Jonathan Wakely

On Thu, 10 Jul 2025 at 09:06, Jakub Jelinek wrote:
>
> On Wed, Jul 09, 2025 at 06:45:41PM -0400, Jason Merrill wrote:
> > > + && reduced_constant_expression_p (val))
> >
> > And a value doesn't need to be constant to be printable, we should be able
> > to print it unconditionally.
>
> Sure, the question is if printing non-constant value is better for users.
> The change to do unconditionally the %qE results in
> /usr/src/gcc/gcc/testsuite/g++.dg/cpp26/constexpr-eh12.C:71:49: error: 
> uncaught exception '(E*)(& heap )'
> while previously it was
> /usr/src/gcc/gcc/testsuite/g++.dg/cpp26/constexpr-eh12.C:71:49: error: 
> uncaught exception of type 'E*'
> I've kept the conditional for now but if you really want that change, can 
> remove it
> in the 2 spots and tweak constexpr-eh12.C test's expectations.
>
> > Is there a reason not to add it to heap_vars here?
>
> In the earlier patch I had:
>
> > > +case CXA_ALLOCATE_EXCEPTION:
> ...
> > > + tree var = cxa_allocate_exception (loc, type, size_zero_node);
> > > + ctx->global->heap_vars.safe_push (var);
> > > + ctx->global->put_value (var, NULL_TREE);
> vs.
> > > +case CXA_BAD_CAST:
> > > +case CXA_BAD_TYPEID:
> > > +case CXA_THROW_BAD_ARRAY_NEW_LENGTH:
> ...
> > > + tree var = cxa_allocate_exception (loc, type, size_one_node);
> > > + tree ctor
> > > +   = build_special_member_call (var, complete_ctor_identifier,
> > > +NULL, type, LOOKUP_NORMAL,
> > > +ctx->quiet ? tf_none
> > > +: tf_warning_or_error);
> > > + if (ctor == error_mark_node)
> > > +   {
> > > + *non_constant_p = true;
> > > + return call;
> > > +   }
> > > + ctx->global->heap_vars.safe_push (var);
>
> and so it wasn't added to heap_vars if build_special_member_call failed.
> But thinking about it now, we really don't care about what is in heap_vars
> if *non_constant_p, so I've made the change.
>
> The rest incorporated into the following version of the patch, passes
> GXX_TESTSUITE_STDS=98,11,14,17,20,23,26 make check-g++ 
> RUNTESTFLAGS="--target_board=unix/-fpie dg.exp='constexpr-eh* 
> constexpr-asm-5.C static_assert1.C feat-cxx26.C constexpr-throw.C 
> constexpr-84192.C constexpr-dynamic*.C consteval34.C constexpr-new27.C 
> consteval-memfn1.C constexpr-typeid5.C constexpr-ellipsis*'"
> so far.
>
> 2025-07-10  Jakub Jelinek  
>
> PR c++/117785
> gcc/c-family/
> * c-cppbuiltin.cc (c_cpp_builtins): Predefine
> __cpp_constexpr_exceptions=202411L for C++26.
> gcc/cp/
> * constexpr.cc: Implement C++26 P3068R5 - constexpr exceptions.
> (class constexpr_global_ctx): Add caught_exceptions and
> uncaught_exceptions members.
> (constexpr_global_ctx::constexpr_global_ctx): Initialize
> uncaught_exceptions.
> (returns, breaks, continues, switches): Move earlier.
> (throws): New function.
> (exception_what_str, diagnose_std_terminate,
> diagnose_uncaught_exception): New functions.
> (enum cxa_builtin): New type.
> (cxx_cxa_builtin_fn_p, cxx_eval_cxa_builtin_fn): New functions.
> (cxx_eval_builtin_function_call): Add jump_target argument.  Call
> cxx_eval_cxa_builtin_fn for __builtin_eh_ptr_adjust_ref.  Adjust
> cxx_eval_constant_expression calls, if it results in jmp_target,
> set *jump_target to it and return.
> (cxx_bind_parameters_in_call): Add jump_target argument.  Pass
> it through to cxx_eval_constant_expression.  If it sets *jump_target,
> break.
> (fold_operand): Adjust cxx_eval_constant_expression caller.
> (cxx_eval_assert): Likewise.  If it set jmp_target, return true.
> (cxx_eval_internal_function): Add jump_target argument.  Pass it
> through to cxx_eval_constant_expression.  Return early if
> *jump_target after recursing on args.
> (cxx_eval_dynamic_cast_fn): Likewise.  Don't set reference_p for
> C++26 with -fexceptions.
> (cxx_eval_thunk_call): Add jump_target argument.  Pass it through
> to cxx_eval_constant_expression.
> (cxx_set_object_constness): Likewise.  Don't set TREE_READONLY if
> throws (jump_target).
> (cxx_eval_call_expression): Add jump_target argument.  Pass it
> through to cxx_eval_internal_function, cxx_eval_builtin_function_call,
> cxx_eval_thunk_call, cxx_eval_dynamic_cast_fn and
> cxx_set_object_constness.  Pass it through also
> cxx_eval_constant_expression on arguments, cxx_bind_parameters_in_call
> and cxx_fold_indirect_ref and for those cases return early if
> *jump_target.  Call cxx_eval_cxa_builtin_fn for cxx_cxa_builtin_fn_p
> functions.  For cxx_eval_constant_expression on body, pass address of
> cleared jmp_target automatic variable, if it throws propagate

Re: [PATCH v4 0/6] Implement mdspan.

2025-07-15 Thread Jonathan Wakely

On Tue, 15 Jul 2025 at 12:44, Tomasz Kaminski  wrote:
>
>
>
> On Tue, Jul 15, 2025 at 1:35 PM Jonathan Wakely  wrote:
>>
>> OK here are the details for all of the failing tests ...
>>
>> std/containers/views/mdspan/aligned_accessor/access.pass.cpp
>> std/containers/views/mdspan/aligned_accessor/ctor.conversion.from.default_accessor.pass.cpp
>> std/containers/views/mdspan/aligned_accessor/ctor.conversion.pass.cpp
>> std/containers/views/mdspan/aligned_accessor/ctor.default.pass.cpp
>> std/containers/views/mdspan/aligned_accessor/offset.pass.cpp
>> std/containers/views/mdspan/aligned_accessor/operator.conversion.to.default_accessor.pass.cpp
>> std/containers/views/mdspan/aligned_accessor/types.pass.cpp
>>
>> We don't support aligned_accessor yet.
>>
>> std/containers/views/mdspan/extents/ctor_from_array.pass.cpp
>> std/containers/views/mdspan/extents/ctor_from_span.pass.cpp
>>
>> These are both the same issue as Bug 121061:
>> static_assert(!std::is_constructible_v,
>> std::span>);
>>
>> std/containers/views/mdspan/extents/dims.pass.cpp
>>
>> We don't support std::dims
>>
>> std/containers/views/mdspan/layout_stride/is_exhaustive_corner_case.pass.cpp
>>
>> Somebody needs to analyze this one.
>
> This is most likely related to us implementing 
> https://cplusplus.github.io/LWG/issue4266.
>>
>>
>> std/containers/views/mdspan/layout_stride/properties.pass.cpp
>>
>> This looks similar, is_always_exhaustive() is giving the wrong answer
>> in some case.
>>
>> std/containers/views/mdspan/mdspan/ctor.default.pass.cpp
>>
>> This is checking for a noexcept default ctor.
>>
>> std/containers/views/mdspan/mdspan/ctor.dh_array.pass.cpp
>> std/containers/views/mdspan/mdspan/ctor.dh_span.pass.cpp
>>
>> These are Bug 121061
>>
>> std/containers/views/mdspan/mdspan/index_operator.pass.cpp
>>
>> Dunno what's happening here:
>>
>> std/containers/views/mdspan/mdspan/index_operator.pass.cpp:178:46:
>> in 'constexpr' expansion of
>> 'check_operator_constraints> nt, 18446744073709551615>, std::layout_left,
>> std::default_accessor >, std::array> true>, 1> >(std::mdspan> , std::layout_left,
>> std::default_accessor >(((int*)(& data)),
>> construct_mapping >((s
>> td::layout_left(), std::layout_left()), std::extents> 18446744073709551615>(1))), std::array> true>, 1>{std::__array_traits<
>> IntConfig, 1>::_Type{IntConfig> true, true>{0}}})'
>> /home/jwakely/gcc/16/include/c++/16.0.0/mdspan:1234:25: error: no
>> match for 'operator[]' (operand types are 'const std::mdspan> std::extents> 44073709551615>, std::layout_left, std::default_accessor >' and
>> 'std::span, 1>')
>
> From docs:
> // IntConfig has configurable conversion properties: convert from const&, 
> convert from non-const, no-throw-ctor from const&, no-throw-ctor from 
> non-const
> This is again `_S_int_cast` not forwarding properly as noted in 121061.
>>
>> 1234 | { return (*this)[span(__indices)]; 
>> }
>>  |  ~~~^
>>
>>
>> std/containers/views/mdspan/mdspan/properties.pass.cpp
>>
>> Many many failures like this:
>>
>> std/containers/views/mdspan/mdspan/properties.pass.cpp:146:17: error:
>> static assertion failed
>>  146 |   static_assert(noexcept(MDS::is_always_unique()));
>>  | ^
>
> These are not noexcept in the standard, so clang most likely made them
> conditionally noexcept.
> We could do the same.
>
>> std/containers/views/mdspan/mdspan/properties.pass.cpp:146:17: note:
>> 'false' evaluates to false
>> std/containers/views/mdspan/mdspan/properties.pass.cpp:147:17: error:
>> static assertion failed
>>  147 |   static_assert(noexcept(MDS::is_always_exhaustive()));
>>  | ^
>> std/containers/views/mdspan/mdspan/properties.pass.cpp:147:17: note:
>> 'false' evaluates to false
>> std/containers/views/mdspan/mdspan/properties.pass.cpp:148:17: error:
>> static assertion failed
>>  148 |   static_assert(noexcept(MDS::is_always_strided()));
>>  | ^~
>>

Thanks for looking into them, so it sounds like the failures are all
expected, and no conformance problems except for the one already in
bugzilla - nice work!

Re: [PATCH v4 0/6] Implement mdspan.

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 12:56 PM Luc Grosheintz 
wrote:

>
>
> On 7/14/25 08:57, Tomasz Kaminski wrote:
> > Hi Luc,
> >
> > While running the libc++ test on libstdc++ we have found the following
> > issue in our implementation.
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121061
> > Would you be interested in looking into fixing this?
>
> Yes, I'll take care of this. Do you know if it's the only issue?
> If it's not clear it might make sense that I learn how to run their
> tests on our code.
>
> >
> > Also, libc++ makes the default constructor of mdspan conditionally
> noexcept
> > as extension
> > (standard does not require it
> > https://eel.is/c++draft/views.multidim#mdspan.mdspan.cons).
> > We could do the same. Instead of writing a big conditional noexcept
> > specification, I would suggest
> > defaulting the constructor on first declaration:
> >constexpr
> >mdspan()
> >requires (rank_dynamic() > 0)
> >&& is_default_constructible_v
> >&& is_default_constructible_v
> >&& is_default_constructible_v
> >= default;
> >
> > And then having default member initializers:
> >  private:
> >[[no_unique_address]] accessor_type _M_accessor = accessor_type();
> >[[no_unique_address]] mapping_type _M_mapping = mapping_type();
> >[[no_unique_address]] data_handle_type _M_handle =
> data_handle_type();
> > We do not want to use "{}" as the semantics are a bit different.
> >
>
> Yes, I think it makes sense if those two implementations behave
> the same.
>
It seems that clang defines mdspan::is_(always_)?(exhaustive|unique|strided)
with conditional noexcept.
Given that the standard-provided layouts define these functions as noexcept, it
makes sense to propagate that for the common cases.
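
To illustrate the propagation idea, here is a minimal sketch. It does not use
the real libstdc++ or libc++ code: demo::view, demo::nothrow_mapping and
demo::plain_mapping are hypothetical names, and only is_always_unique is
shown. The wrapper is noexcept exactly when the mapping's function is.

```cpp
namespace demo
{
  // Hypothetical mapping whose query is noexcept, as the standard layouts are.
  struct nothrow_mapping
  {
    static constexpr bool is_always_unique() noexcept { return true; }
  };

  // Hypothetical user-defined mapping that makes no noexcept promise.
  struct plain_mapping
  {
    static bool is_always_unique() { return true; }
  };

  // Stand-in for mdspan: forwards the query and propagates its noexcept-ness.
  template<typename Mapping>
    struct view
    {
      static constexpr bool
      is_always_unique() noexcept(noexcept(Mapping::is_always_unique()))
      { return Mapping::is_always_unique(); }
    };
}
```

With this shape a test such as static_assert(noexcept(MDS::is_always_unique()))
would hold for the standard layouts without promising anything for
user-provided mappings.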

Re: [PATCH] c, c++: Extend -Wunused-but-set-* warnings [PR44677]

2025-07-15 Thread Jakub Jelinek

On Mon, Jul 14, 2025 at 11:58:32PM -0400, Jason Merrill wrote:
> Coming back to this comment, it still seems to me that we can generalize
> this and ignore anything cast to void here, as in the below; after something
> has been cast to void, it can no longer be read.  I don't get any
> regressions from this simplification, either.
> 
> We might generalize to anything of void type, but I haven't tested that.

We do want to warn for b and not for a, e.g. with -Wunused-but-set-variable:
void
foo ()
{
  int a, b;
  a = 1;
  b = 2;
  (void) a;
}
Though, from what I can see, we warn correctly even with your patch.
I admit I don't know any longer what was the reason for all the special
cases in mark_exp_read and whether over the years some of them might be
unneeded.  We have some test coverage, but only limited (40KB of tests
in c-c++-common & g++.dg); a full bootstrap/regtest does of course test it
further.

Could it be committed separately though, in case there is some regression
that bisection can find whether it is this simplification or the original
patch?

> commit adcf4220b73a9b7f44a35728f60aa5b351ef51d8
> Author: Jason Merrill 
> Date:   Mon Jul 14 18:29:17 2025 -0400
> 
> void
> 
> diff --git a/gcc/cp/expr.cc b/gcc/cp/expr.cc
> index 8b5a098ecb3..e4a7cfd7bec 100644
> --- a/gcc/cp/expr.cc
> +++ b/gcc/cp/expr.cc
> @@ -214,24 +214,7 @@ mark_use (tree expr, bool rvalue_p, bool read_p,
>gcc_fallthrough ();
>  CASE_CONVERT:
>if (VOID_TYPE_P (TREE_TYPE (expr)))
> - switch (TREE_CODE (TREE_OPERAND (expr, 0)))
> -   {
> -   case PREINCREMENT_EXPR:
> -   case PREDECREMENT_EXPR:
> -   case POSTINCREMENT_EXPR:
> -   case POSTDECREMENT_EXPR:
> - tree op0;
> - op0 = TREE_OPERAND (TREE_OPERAND (expr, 0), 0);
> - STRIP_ANY_LOCATION_WRAPPER (op0);
> - if ((VAR_P (op0) || TREE_CODE (op0) == PARM_DECL)
> - && !DECL_READ_P (op0)
> - && (VAR_P (op0) ? warn_unused_but_set_variable
> - : warn_unused_but_set_parameter) > 1)
> -   read_p = false;
> - break;
> -   default:
> - break;
> -   }
> + read_p = false;
>recurse_op[0] = true;
>break;
>  
> @@ -382,16 +365,7 @@ mark_exp_read (tree exp)
>break;
>  CASE_CONVERT:
>if (VOID_TYPE_P (TREE_TYPE (exp)))
> - switch (TREE_CODE (TREE_OPERAND (exp, 0)))
> -   {
> -   case PREINCREMENT_EXPR:
> -   case PREDECREMENT_EXPR:
> -   case POSTINCREMENT_EXPR:
> -   case POSTDECREMENT_EXPR:
> - return;
> -   default:
> - break;
> -   }
> + return;
>/* FALLTHRU */
>  case ARRAY_REF:
>  case COMPONENT_REF:


Jakub

Re: [PATCH] libstdc++: Constrain std::swap using concepts in C++20

2025-07-15 Thread Jonathan Wakely

On Tue, 15 Jul 2025 at 08:29, Tomasz Kaminski  wrote:
>
>
>
> On Mon, Jul 14, 2025 at 10:38 PM Jonathan Wakely  wrote:
>>
>> This is a minor compile-time optimization for C++20.
>
> Please mention that you also replaced  _GLIBCXX20_CONSTEXPR, with
> constexpr under __glibcxx_concepts (that is >= c++ 20).

I've just discovered that the last non-Clang-based version of Intel
icc did not define __cpp_concepts in its C++20 mode:
https://godbolt.org/z/vbTzjxoro
And so __glibcxx_concepts won't be defined for that compiler.

Sigh. This means std::swap will no longer be constexpr in C++20 mode,
so I might put the _GLIBCXX20_CONSTEXPR back in the #elif __cplusplus
>= 201103L group.
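
A rough sketch of the fallback being described, under assumptions: this is
not the libstdc++ source, demo::swap_values is a hypothetical name, and
DEMO_CXX20_CONSTEXPR is a stand-in for _GLIBCXX20_CONSTEXPR. The point is
only that the non-concepts branch keeps a macro that still expands to
constexpr in C++20 mode, so a C++20 compiler without __cpp_concepts does not
lose constexpr.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for _GLIBCXX20_CONSTEXPR.
#if __cplusplus >= 202002L
# define DEMO_CXX20_CONSTEXPR constexpr
#else
# define DEMO_CXX20_CONSTEXPR
#endif

namespace demo
{
#if defined(__cpp_concepts) && __cpp_concepts >= 201907L
  // Concepts are available: constrain with a requires-clause, plain constexpr.
  template<typename T>
    requires std::is_move_constructible<T>::value
      && std::is_move_assignable<T>::value
    constexpr void
#else
  // No concepts (needs C++11): constrain with enable_if; the macro keeps the
  // function constexpr when the compiler is in C++20 mode anyway.
  template<typename T>
    DEMO_CXX20_CONSTEXPR
    typename std::enable_if<std::is_move_constructible<T>::value
			      && std::is_move_assignable<T>::value>::type
#endif
    swap_values(T& a, T& b)
    noexcept(std::is_nothrow_move_constructible<T>::value
	       && std::is_nothrow_move_assignable<T>::value)
    {
      T tmp = std::move(a);
      a = std::move(b);
      b = std::move(tmp);
    }
}
```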

Re: [PATCH] aarch64: Enable selective LDAPUR generation for cores with RCPC2

2025-07-15 Thread Richard Sandiford

Soumya AR  writes:
> One additional change with this patch is that I had to update ldapr-sext.c 
> too.
>
> During the combine pass, cases of UNSPECV_LDAP (with an offset) + sign_extend
> transform to LDAPUR + SEXT, and later to LDAPURS (with address folding).
>
> The aarch64 tests run with -moverride=tune=none, which clears all tuning flags
> including the AVOID_LDAPUR flag, enabling LDAPUR. (Since the generic arch 
> tuning
> adjustments are applied before the tune override.)
>
> This breaks ldapr-sext.c, in all instances where an offset is used. (Even 
> though
> there is no explicit offset in the testcase, since it uses global variables,
> there is an implicit offset due to the alignment padding in bss.)
>
> As a result, the following code:
>
> add   x0, x0, 2
> ldapursh  x0, [x0]
>
> becomes:
>
> ldapursh  x0, [x0, 2]
>
> I can just add -moverride=tune=avoid_ldapur to ldapr-sext.c to maintain the
> original behavior but since we're extending LDAPURS to fold offsets anyway,
> I think it makes sense to check that alongside as well.
> Let me know if that works.

Yeah, sounds good to me.

> [...]
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index abbb97768f5..4ee539e1dcd 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -18760,6 +18760,8 @@ aarch64_adjust_generic_arch_tuning (struct 
> tune_params &current_tune)
>if (TARGET_SVE2)
>  current_tune.extra_tuning_flags
>&= ~AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS;
> +  if (!AARCH64_HAVE_ISA(V8_8A))
> +  aarch64_tune_params.extra_tuning_flags |= 
> AARCH64_EXTRA_TUNE_AVOID_LDAPUR;

Sorry for the formatting nit, but the last line above should be
indented by 2 columns fewer.

> diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
> index 36b0dbd1f57..57a8e410c20 100644
> --- a/gcc/config/aarch64/atomics.md
> +++ b/gcc/config/aarch64/atomics.md
> @@ -679,13 +679,16 @@
>  )
>  
>  (define_insn "aarch64_atomic_load_rcpc"
> -  [(set (match_operand:ALLI 0 "register_operand" "=r")
> +  [(set (match_operand:ALLI 0 "register_operand")
>  (unspec_volatile:ALLI
> -  [(match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
> +  [(match_operand:ALLI 1 "aarch64_rcpc_memory_operand")
> (match_operand:SI 2 "const_int_operand")] ;; model
>UNSPECV_LDAP))]
>"TARGET_RCPC"
> -  "ldapr\t%0, %1"
> +  {@ [ cons: =0 , 1   ; attrs: enable_ldapur ]
> + [ r   , Q   ; any   ] ldapr\t%0, %1
> + [ r   , Ust ; yes   ] ldapur\t%0, %1
> +  }
>  )
>  
>  (define_insn "aarch64_atomic_load"
> @@ -705,21 +708,24 @@
>  )
>  
>  (define_insn "*aarch64_atomic_load_rcpc_zext"
> -  [(set (match_operand:SD_HSDI 0 "register_operand" "=r")
> +  [(set (match_operand:SD_HSDI 0 "register_operand")
>  (zero_extend:SD_HSDI
>(unspec_volatile:ALLX
> -[(match_operand:ALLX 1 "aarch64_sync_memory_operand" "Q")
> +[(match_operand:ALLX 1 "aarch64_rcpc_memory_operand")
>   (match_operand:SI 2 "const_int_operand")]   ;; model
> UNSPECV_LDAP)))]
>"TARGET_RCPC && ( > )"
> -  "ldapr\t%w0, %1"
> +  {@ [ cons: =0 , 1   ; attrs: enable_ldapur ]
> + [ r   , Q   ; any  ] ldapr\t%w0, %1
> + [ r   , Ust ; yes  ] ldapur\t%w0, 
> %1
> +  }
>  )

In both of the patterns above, it would be good to keep the "[", ",", ";" and
"]" lined up.

OK with those changes, thanks.

Richard

Re: [PATCH v4 0/6] Implement mdspan.

2025-07-15 Thread Jonathan Wakely

On Tue, 15 Jul 2025 at 12:26, Jonathan Wakely  wrote:
>
> On Tue, 15 Jul 2025 at 11:57, Luc Grosheintz  wrote:
> >
> >
> >
> > On 7/14/25 08:57, Tomasz Kaminski wrote:
> > > Hi Luc,
> > >
> > > While running the libc++ test on libstdc++ we have found the following
> > > issue in our implementation.
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121061
> > > Would you be interested in looking into fixing this?
> >
> > Yes, I'll take care of this. Do you know if it's the only issue?
> > If it's not clear it might make sense that I learn how to run their
> > tests on our code.
>
> It's very non-trivial and requires some changes to their config. I'm
> working on making it easier.
>
> There are quite a few failures but I haven't analyzed them yet:
>
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/aligned_accessor/access.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/aligned_accessor/ctor.conversion.from.default_accessor.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/aligned_accessor/ctor.conversion.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/aligned_accessor/ctor.default.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/aligned_accessor/offset.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/aligned_accessor/operator.conversion.to.default_accessor.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/aligned_accessor/types.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/extents/ctor_from_array.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/extents/ctor_from_span.pass.cpp
>   stdlib-libstdc++.cfg.in :: std/containers/views/mdspan/extents/dims.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/layout_stride/is_exhaustive_corner_case.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/layout_stride/properties.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/mdspan/ctor.default.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/mdspan/ctor.dh_array.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/mdspan/ctor.dh_span.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/mdspan/index_operator.pass.cpp
>   stdlib-libstdc++.cfg.in ::
> std/containers/views/mdspan/mdspan/properties.pass.cpp

At least the aligned_accessor ones are probably easily explained ;-)


>
>
>
> >
> > >
> > > Also, libc++ makes the default constructor of mdspan conditionally 
> > > noexcept
> > > as extension
> > > (standard does not require it
> > > https://eel.is/c++draft/views.multidim#mdspan.mdspan.cons).
> > > We could do the same. Instead of writing a big conditional noexcept
> > > specification, I would suggest
> > > defaulting the constructor on first declaration:
> > >constexpr
> > >mdspan()
> > >requires (rank_dynamic() > 0)
> > >&& is_default_constructible_v
> > >&& is_default_constructible_v
> > >&& is_default_constructible_v
> > >= default;
> > >
> > > And then having default member initializers:
> > >  private:
> > >[[no_unique_address]] accessor_type _M_accessor = accessor_type();
> > >[[no_unique_address]] mapping_type _M_mapping = mapping_type();
> > >[[no_unique_address]] data_handle_type _M_handle = 
> > > data_handle_type();
> > > We do not want to use "{}" as the semantics are a bit different.
> > >
> >
> > Yes, I think it makes sense if those two implementations behave
> > the same.
> >

Re: [PATCH v4 0/6] Implement mdspan.

2025-07-15 Thread Jonathan Wakely

On Tue, 15 Jul 2025 at 11:57, Luc Grosheintz  wrote:
>
>
>
> On 7/14/25 08:57, Tomasz Kaminski wrote:
> > Hi Luc,
> >
> > While running the libc++ test on libstdc++ we have found the following
> > issue in our implementation.
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121061
> > Would you be interested in looking into fixing this?
>
> Yes, I'll take care of this. Do you know if it's the only issue?
> If it's not clear it might make sense that I learn how to run their
> tests on our code.

It's very non-trivial and requires some changes to their config. I'm
working on making it easier.

There are quite a few failures but I haven't analyzed them yet:

  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/aligned_accessor/access.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/aligned_accessor/ctor.conversion.from.default_accessor.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/aligned_accessor/ctor.conversion.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/aligned_accessor/ctor.default.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/aligned_accessor/offset.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/aligned_accessor/operator.conversion.to.default_accessor.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/aligned_accessor/types.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/extents/ctor_from_array.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/extents/ctor_from_span.pass.cpp
  stdlib-libstdc++.cfg.in :: std/containers/views/mdspan/extents/dims.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/layout_stride/is_exhaustive_corner_case.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/layout_stride/properties.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/mdspan/ctor.default.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/mdspan/ctor.dh_array.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/mdspan/ctor.dh_span.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/mdspan/index_operator.pass.cpp
  stdlib-libstdc++.cfg.in ::
std/containers/views/mdspan/mdspan/properties.pass.cpp





>
> >
> > Also, libc++ makes the default constructor of mdspan conditionally noexcept
> > as extension
> > (standard does not require it
> > https://eel.is/c++draft/views.multidim#mdspan.mdspan.cons).
> > We could do the same. Instead of writing a big conditional noexcept
> > specification, I would suggest
> > defaulting the constructor on first declaration:
> >constexpr
> >mdspan()
> >requires (rank_dynamic() > 0)
> >&& is_default_constructible_v
> >&& is_default_constructible_v
> >&& is_default_constructible_v
> >= default;
> >
> > And then having default member initializers:
> >  private:
> >[[no_unique_address]] accessor_type _M_accessor = accessor_type();
> >[[no_unique_address]] mapping_type _M_mapping = mapping_type();
> >[[no_unique_address]] data_handle_type _M_handle = 
> > data_handle_type();
> > We do not want to use "{}" as the semantics are a bit different.
> >
>
> Yes, I think it makes sense if those two implementations behave
> the same.
>

Re: [PATCH] libstdc++: Make ranges::advance(it, n, bound) follow standard more strictly

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 11:56 AM Jonathan Wakely  wrote:

> On Tue, 15 Jul 2025 at 09:25, Tomasz Kaminski  wrote:
> >
> >
> >
> > On Mon, Jul 14, 2025 at 10:43 PM Jonathan Wakely 
> wrote:
> >>
> >> The standard specifies some of the effects of ranges::advance in terms
> >> of "Equivalent to:" and it's observable that our current implementation
> >> deviates from the precise specification in the standard.  This was
> >> causing some failures in the libc++ testsuite.
> >>
> >> For the sized_sentinel_for case I optimized our implementation to
> >> avoid redundant calls when we have already checked that there's nothing
> >> to do.  We were eliding `advance(i, bound)` when the iterator already
> >> equals the sentinel, and eliding `advance(i, n)` when `n` is zero. In
> >> both cases, removing the seemingly redundant calls is not equivalent to
> >> the spec because `i = std::move(bound)` or `i += 0` operations can be
> >> observed by program-defined iterators. This patch inlines the observable
> >> side effects of advance(i, bound) or advance(i, 0) without actually
> >> calling those functions.
> >>
> >> For the non-sized sentinel case, `if (i == bound || n == 0)` is
> >> different from `if (n == 0 || i == bound)` for the case where n is zero
> >> and a program-defined iterator observes the number of comparisons.
> >> This patch changes it to do `n == 0` first. I don't think this is
> >> required by the standard, as this condition is not "Equivalent to:" any
> >> observable sequence of operations, but testing `n == 0` first is
> >> probably cheaper anyway.
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >> * include/bits/ranges_base.h (ranges::advance(i, n, bound)):
> >> Ensure that observable side effects on iterators match what is
> >> specified in the standard.
> >> ---
> >>
> >> For most iterator types, I assume the compiler will inline these extra
> >> redundant operations, so they'll only make a difference for iterators
> >> that do actually observe the number of operations.
> >>
> >> Tested powerpc64le-linux.
> >
> > LGTM
> >>
> >>
> >>  libstdc++-v3/include/bits/ranges_base.h | 18 +++---
> >>  1 file changed, 15 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/libstdc++-v3/include/bits/ranges_base.h
> b/libstdc++-v3/include/bits/ranges_base.h
> >> index 0251e5d0928a..4a0b9e2f5ec3 100644
> >> --- a/libstdc++-v3/include/bits/ranges_base.h
> >> +++ b/libstdc++-v3/include/bits/ranges_base.h
> >> @@ -882,7 +882,14 @@ namespace ranges
> >> const auto __diff = __bound - __it;
> >
> > Would you mind changing this to iter_difference_t<_It>? It is required to
> > be exactly that type,
>
> Yes, if that helps with reading the code, let's be explicit about the
> type. I've made that change.
>
> > but I think it would help me with understanding (see comment below)
> >>
> >>
> >> if (__diff == 0)
> >> - return __n;
> >> + {
> >> +   // inline any possible side effects of advance(it,
> bound)
> >> +   if constexpr (assignable_from<_It&, _Sent>)
> >> + __it = std::move(__bound);
> >> +   else if constexpr (random_access_iterator<_It>)
> >> + __it += 0;
> >> +   return __n;
> >> + }
> >> else if (__diff > 0 ? __n >= __diff : __n <= __diff)
> >
> > I was wondering if we should check precondition __glibcxx_assert((__n <
> 0) == (__diff < 0)),
> > but then I realized that we never enter this branch, if __n and __diff
> disagree on direction.
>
> Right. The standard says to check |n| >= |diff| and if we did that,
> then we would need to check that the directions agree. But that would
> mean abs(n) >= abs(diff) which would be three comparisons and we'd
> also need to check the directions agree. The way I implemented it only
> does two, and implicitly checks the directions agree. If they don't
> agree then we take the next branch and fail the assertion there.
>
> It's not necessarily better code the way I did the comparisons, at
> least not with GCC:
> https://godbolt.org/z/P6cc7s3WP
> But it avoids needing two assertions, so that should be less code for
> the entire function.
>
> If assertions are disabled then we take the "wrong" branch and do
> advance(i, n) instead of advance(i, bound), but that case is UB anyway
> so there is no "wrong" outcome. However, now that I think about it,
> maybe advance(i, bound) would be safer for the UB case, if we assume
> that the sentinel is the correct bound. Advancing by an incorrect
> value of n (in the wrong direction) might take us out of bounds,
> whereas advance(i, bound) will not, assuming a correct bound.
>
I am not sure: if the directions disagree, betting on the sentinel being
correct is no more likely to be right than betting on n being correct. So I
am fine with leaving this as is.

>
> > And both __n and __diff are of type iter_difference_t, so there is no chance
> > of signed->unsigned promotion.
>
> Yeah.
>
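
As a small illustration of the comparison discussed above, the sketch below
contrasts the two-comparison form from the patch with a direct transcription
of "|n| >= |diff| and the directions agree". The names demo::reaches_bound,
demo::reaches_bound_abs and demo::abs_val are hypothetical, std::ptrdiff_t
stands in for iter_difference_t<_It>, and diff is assumed to be non-zero
because the patch handles diff == 0 separately.

```cpp
#include <cstddef>

namespace demo
{
  // Form used in the patch: two comparisons, and the direction check is
  // implicit (a mismatch in sign always yields false).
  constexpr bool
  reaches_bound(std::ptrdiff_t n, std::ptrdiff_t diff)
  { return diff > 0 ? n >= diff : n <= diff; }

  constexpr std::ptrdiff_t
  abs_val(std::ptrdiff_t v)
  { return v < 0 ? -v : v; }

  // Literal form: compare magnitudes, then check the directions agree.
  constexpr bool
  reaches_bound_abs(std::ptrdiff_t n, std::ptrdiff_t diff)
  { return abs_val(n) >= abs_val(diff) && ((n < 0) == (diff < 0)); }
}
```

When the signs disagree both functions return false, which corresponds to
falling through to the next branch where the assertion fires.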

Re: [PATCH v4 0/6] Implement mdspan.

2025-07-15 Thread Jonathan Wakely

OK here are the details for all of the failing tests ...

std/containers/views/mdspan/aligned_accessor/access.pass.cpp
std/containers/views/mdspan/aligned_accessor/ctor.conversion.from.default_accessor.pass.cpp
std/containers/views/mdspan/aligned_accessor/ctor.conversion.pass.cpp
std/containers/views/mdspan/aligned_accessor/ctor.default.pass.cpp
std/containers/views/mdspan/aligned_accessor/offset.pass.cpp
std/containers/views/mdspan/aligned_accessor/operator.conversion.to.default_accessor.pass.cpp
std/containers/views/mdspan/aligned_accessor/types.pass.cpp

We don't support aligned_accessor yet.

std/containers/views/mdspan/extents/ctor_from_array.pass.cpp
std/containers/views/mdspan/extents/ctor_from_span.pass.cpp

These are both the same issue as Bug 121061:
static_assert(!std::is_constructible_v,
std::span>);

std/containers/views/mdspan/extents/dims.pass.cpp

We don't support std::dims

std/containers/views/mdspan/layout_stride/is_exhaustive_corner_case.pass.cpp

Somebody needs to analyze this one.

std/containers/views/mdspan/layout_stride/properties.pass.cpp

This looks similar, is_always_exhaustive() is giving the wrong answer
in some case.

std/containers/views/mdspan/mdspan/ctor.default.pass.cpp

This is checking for a noexcept default ctor.

std/containers/views/mdspan/mdspan/ctor.dh_array.pass.cpp
std/containers/views/mdspan/mdspan/ctor.dh_span.pass.cpp

These are Bug 121061

std/containers/views/mdspan/mdspan/index_operator.pass.cpp

Dunno what's happening here:

std/containers/views/mdspan/mdspan/index_operator.pass.cpp:178:46:
in 'constexpr' expansion of
'check_operator_constraints, std::layout_left,
std::default_accessor >, std::array, 1> >(std::mdspan, std::layout_left,
std::default_accessor >(((int*)(& data)),
construct_mapping >((s
td::layout_left(), std::layout_left()), std::extents(1))), std::array, 1>{std::__array_traits<
IntConfig, 1>::_Type{IntConfig{0}}})'
/home/jwakely/gcc/16/include/c++/16.0.0/mdspan:1234:25: error: no
match for 'operator[]' (operand types are 'const std::mdspan, std::layout_left, std::default_accessor >' and
'std::span, 1>')
1234 | { return (*this)[span(__indices)]; }
 |  ~~~^


std/containers/views/mdspan/mdspan/properties.pass.cpp

Many many failures like this:

std/containers/views/mdspan/mdspan/properties.pass.cpp:146:17: error:
static assertion failed
 146 |   static_assert(noexcept(MDS::is_always_unique()));
 | ^
std/containers/views/mdspan/mdspan/properties.pass.cpp:146:17: note:
'false' evaluates to false
std/containers/views/mdspan/mdspan/properties.pass.cpp:147:17: error:
static assertion failed
 147 |   static_assert(noexcept(MDS::is_always_exhaustive()));
 | ^
std/containers/views/mdspan/mdspan/properties.pass.cpp:147:17: note:
'false' evaluates to false
std/containers/views/mdspan/mdspan/properties.pass.cpp:148:17: error:
static assertion failed
 148 |   static_assert(noexcept(MDS::is_always_strided()));
 | ^~

Re: [PATCH] libstdc++: Define std::nothrow as inline variable

2025-07-15 Thread Jonathan Wakely

On Tue, 15 Jul 2025 at 12:06, Tomasz Kaminski  wrote:
>
>
>
> On Tue, Jul 15, 2025 at 12:37 PM Jonathan Wakely  wrote:
>>
>> On Tue, 15 Jul 2025 at 08:10, Tomasz Kaminski  wrote:
>> >
>> >
>> >
>> > On Mon, Jul 14, 2025 at 10:43 PM Jonathan Wakely  
>> > wrote:
>> >>
>> >> This makes it possible to use `new(std::nothrow) X` without linking to
>> >> libsupc++ or libstdc++.
>> >>
>> >> To ensure we still export the symbol from the library we need to
>> >> suppress the inline variable in libsupc++/new_handler.cc which is done
>> >> by defining a macro.
>> >>
>> >> libstdc++-v3/ChangeLog:
>> >>
>> >> * libsupc++/new (nothrow): Define as inline variable.
>> >> * libsupc++/new_handler.cc (_GLIBCXX_DEFINE_NOTHROW_OBJ):
>> >> Define.
>> >> ---
>> >>
>> >> Tested powerpc64le-linux.
>> >>
>> >>  libstdc++-v3/libsupc++/new| 4 
>> >>  libstdc++-v3/libsupc++/new_handler.cc | 2 ++
>> >>  2 files changed, 6 insertions(+)
>> >>
>> >> diff --git a/libstdc++-v3/libsupc++/new b/libstdc++-v3/libsupc++/new
>> >> index fb36dae25a6d..85d28ff40769 100644
>> >> --- a/libstdc++-v3/libsupc++/new
>> >> +++ b/libstdc++-v3/libsupc++/new
>> >> @@ -125,7 +125,11 @@ namespace std
>> >>  #endif
>> >>};
>> >>
>> >> +#if defined __cpp_inline_variables && ! defined 
>> >> _GLIBCXX_DEFINE_NOTHROW_OBJ
>> >> +  inline constexpr nothrow_t nothrow{};
>> >> +#else
>> >>extern const nothrow_t nothrow;
>> >> +#endif
>> >
>> > If you move variable definition before include, this would become:
>> > +#ifndef _GLIBCXX_DEFINE_NOTHROW_OBJ
>> > +# ifdef __cpp_inline_variables
>> > +  inline constexpr nothrow_t nothrow{};
>> > + # else
>> >   extern const nothrow_t nothrow;
>> > +# endif
>> > +#endif
>> >
>> > GCC and clang also accept the following, i.e. a version where we always
>> > have the extern declaration:
>> > +#if defined __cpp_inline_variables && ! defined 
>> > _GLIBCXX_DEFINE_NOTHROW_OBJ
>> > +  inline constexpr nothrow_t nothrow{};
>> > +#endif
>> >extern const nothrow_t nothrow;
>> >
>> >
>> >>
>> >>/** If you write your own error handler to be called by @c new, it must
>> >> *  be of this type.  */
>> >> diff --git a/libstdc++-v3/libsupc++/new_handler.cc 
>> >> b/libstdc++-v3/libsupc++/new_handler.cc
>> >> index 7cd3e5a69fde..96dfb796c64a 100644
>> >> --- a/libstdc++-v3/libsupc++/new_handler.cc
>> >> +++ b/libstdc++-v3/libsupc++/new_handler.cc
>> >> @@ -23,6 +23,8 @@
>> >>  // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>> >>  // .
>> >>
>> >> +#define _GLIBCXX_DEFINE_NOTHROW_OBJ 1
>> >
>> > Could we also move the definition of nothrow here (before including new),
>> > so entities in header new always see its definition when it is constexpr?
>> > I do not think there is any (even benign) ODR violation possibility here,
>> > but we can avoid the risk that way.
>>
>> We would need to also move the definition of the nothrow_t type there,
>> otherwise you can't declare the variable.
>>
>> And wouldn't it have internal linkage if it's defined as constexpr
>> const in new_handler.cc?
>> It only has external linkage because the extern declaration in <new>
>> was already seen.
>>
>> So in the <new> header we would need:
>>
>> #ifndef _GLIBCXX_DEFINED_NOTHROW_OBJ // defined by new_handler.cc
>>   struct nothrow_t
>>   {
>> #if __cplusplus >= 201103L
>> explicit nothrow_t() = default;
>> #endif
>>   };
>>
>> #if defined __cpp_inline_variables
>>   // Define as an inline variable when possible, so that std::nothrow can
>>   // be used without linking to libsupc++.
>>   inline constexpr nothrow_t nothrow{};
>> #endif
>>   extern const nothrow_t nothrow;
>> #endif
>>
>>
>> And then in new_handler.cc:
>>
>> namespace std
>> {
>>   struct nothrow_t
>>   {
>> explicit nothrow_t() = default;
>>   };
>>
>> #ifdef __cpp_inline_variables
>>   // purposely match when header defines it as inline+constexpr
>>   constexpr
>> #endif
>>   extern const nothrow_t nothrow{};
>> }
>>
>> // Prevent <new> from defining it again:
>> #define _GLIBCXX_DEFINED_NOTHROW_OBJ 1
>>
>> #include "new"
>>
>>
>> But then I get multiple definition errors when linking libstdc++, so
>> something is wrong here.
>
> We could define the nothrow in header if _GLIBCXX_DEFINE_NOTHROW_OBJ_IN_NEW 
> is defined:
> #ifdef _GLIBCXX_DEFINE_NOTHROW_OBJ_IN_NEW
>   extern _GLIBCXX17_CONSTEXPR const nothrow_t nothrow = nothrow_t();
> #elifdef __cpp_inline_variables
>   // Define as an inline variable when possible, so that std::nothrow can
>   // be used without linking to libsupc++.
>   inline constexpr nothrow_t nothrow{};
> #else
>   extern const nothrow_t nothrow;
> #endif
>
> And then in new_handler.cc, do:
> // instruct <new> to provide definition of nothrow.
> #define _GLIBCXX_DEFINE_NOTHROW_OBJ_IN_NEW 1
> #include "new"

Ah yes, that's much cleaner and avoid repeating the type. I'll see if
that works as expected.


>>
>>
>> If all you're trying to do is ensure that 'constexpr' is seen
>> consistently, I don't thi

Re: [PATCH 1/2] libstdc++: Add missing initializers for __maybe_present_t members [PR119962]

2025-07-15 Thread Jonathan Wakely

On Tue, 15 Jul 2025 at 08:26, Tomasz Kaminski  wrote:
>
>
>
> On Tue, Jul 15, 2025 at 5:51 AM Patrick Palka  wrote:
>>
>> Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps
>> 15?  Not sure if this corner case is worth backporting any further.
>>
>> Can we just use direct-list-initialization via {} instead of '= T()'
>> here?  I wasn't sure so I went with the latter to more closely mirror
>> the standard.
>
> default_initializable
> (https://eel.is/c++draft/concept.default.init#concept:default_initializable)
> implies that T{} is well-formed; however, there is no semantic
> requirement that T() and T{} are the same.
>
> The only case where {} could break and "= T()" would work is an aggregate
> that has members with an explicit default constructor, but it would be
> ill-formed. There is also a difference in the lifetime of reference members
> with default initializers (binding to a temporary), but it does not matter
> for members of the class, as they go away with the constructors.
>
> There is also the pre-C++20 case, where classes with a deleted default
> constructor were initializable via {} but not via = T(). But again,
> default_initializable requires both.
>
> So, as far as I can tell, if we require T to be default_initializable, we can
> use {} for initialization, but given that aggregate initialization/definition
> changes with every standard (you can now mix base and designated init in
> C++26), I would just use = T().

Yes, that's my thinking too.

OK for trunk and gcc-15.
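
As a small illustration of the "= T()" form preferred above, here is a minimal sketch. The names maybe_present_sketch and default_initialized_value are hypothetical and are not the libstdc++ types touched by the patch; the sketch only shows that a default member initializer written as "= T()" value-initializes the member even when the enclosing object is merely default-initialized.

```cpp
// Hypothetical stand-in for a class with a data member that needs an
// initializer; this is not the libstdc++ __maybe_present_t machinery.
template<typename T>
struct maybe_present_sketch
{
  // Value-initialize the member: scalars become zero, pointers become
  // null, class types run their default constructor.
  T value = T();
};

// Helper (also hypothetical) used to observe the member after the
// enclosing object has been default-initialized without braces.
template<typename T>
T default_initialized_value()
{
  maybe_present_sketch<T> m;  // default-initialization
  return m.value;             // well-defined thanks to "= T()"
}
```

Without the initializer, reading m.value for a scalar T would read an indeterminate value, which is the kind of problem a missing initializer can cause.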

RE: [PATCH 1/1] aarch64: Adapt unwinder to linux's SME signal behaviour

2025-07-15 Thread Tamar Christina

> -Original Message-
> From: Richard Sandiford 
> Sent: Tuesday, July 15, 2025 9:41 AM
> To: Tamar Christina 
> Cc: Yury Khrustalev ; gcc-patches@gcc.gnu.org; Mark
> Rutland 
> Subject: Re: [PATCH 1/1] aarch64: Adapt unwinder to linux's SME signal 
> behaviour
> 
> Tamar Christina  writes:
> > One question I did have not directly related to the unwinder changes,
> > But the ABI mentions that if any of the reserved bytes in TPIDR2_EL0
> > Block are non-zero that TPIDR2_EL0 must be left unchanged [1].
> 
> The full requirement is:
> 
>   If TPIDR2_EL0 is nonnull and if any reserved byte in the first 16 bytes
>   of the TPIDR2 block has a nonzero value, the thread must do one of the
>   following:
> 
>   * leave TPIDR2_EL0 unchanged;
>   * abort in some platform-defined manner; or
>   * handle the nonzero reserved bytes of the TPIDR2 block in accordance
> with future versions of the AAPCS64.
> 
> So...
> 
> > If that's the case what does it mean for the lazy save scheme, since it
> > requires that TPIDR2_EL0 be 0 at the end of the save? I see we don't
> > actually check that in e.g. __arm_za_disable, we just unconditionally
> > zero TPIDR2_EL0.
> 
> ...we take the second or third option (aborting or handling in line
> with a future PCS version).  Currently it's always the aborting option,
> since there has been no addition to TPIDR2 blocks since their original
> definition.
> 
> __arm_za_disable is documented as saving in the same way as __arm_tpidr2_save,
> and __arm_tpidr2_save is documented as doing:
> 
>   * If any of the reserved bytes in the first 16 bytes of the TPIDR2
> block are nonzero, the subroutine either:
> * aborts in some platform-defined manner; or
> * handles the nonzero reserved bytes of the TPIDR2 block in
>   accordance with future versions of the AAPCS64.
> 

Ah I see, it's handled in the call to __libgcc_arm_tpidr2_save.

Thanks, all clear now!

Tamar

> Richard
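
The reserved-byte rule quoted above can be sketched as follows. This is only an illustration and not the libgcc implementation: the struct layout (an 8-byte buffer pointer, a 2-byte slice count, then 6 reserved bytes) is an assumption based on the AAPCS64 description of the TPIDR2 block, and the names tpidr2_block_sketch, save_action and classify_block are hypothetical. Other cases that the real routine handles are deliberately left out.

```cpp
#include <cstdint>

// Assumed layout of the first 16 bytes of a TPIDR2 block.
struct tpidr2_block_sketch
{
  std::uint64_t za_save_buffer;      // bytes 0-7
  std::uint16_t num_za_save_slices;  // bytes 8-9
  std::uint8_t reserved[6];          // bytes 10-15
};
static_assert(sizeof(tpidr2_block_sketch) == 16,
              "sketch assumes a 16-byte header");

enum class save_action
{
  nothing_to_save,          // TPIDR2_EL0 is null
  commit_lazy_save,         // reserved bytes are all zero
  abort_or_future_handling  // some reserved byte is nonzero
};

// Models only the rule discussed in the thread: a nonzero reserved
// byte means "abort, or handle per a future AAPCS64 version".
inline save_action classify_block(const tpidr2_block_sketch *blk)
{
  if (!blk)
    return save_action::nothing_to_save;
  for (std::uint8_t byte : blk->reserved)
    if (byte != 0)
      return save_action::abort_or_future_handling;
  return save_action::commit_lazy_save;
}
```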

Re: [PATCH] aarch64: fixup: Implement sme2+faminmax extension.

2025-07-15 Thread Richard Sandiford

Alfie Richards  writes:
> Hi all,
>
> This is a minor fixup to the previous patch I committed, addressing
> Spencer's comments.
>
> Bootstrapped and reg tested for Aarch64.
>
> Thanks,
> Alfie
>
> -- >8 --
>
> Fixup to the SME2+FAMINMAX intrinsics commit.
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-sme.md (@aarch64_sme_):
>   Change gating and comment.

OK, thanks.

Richard

> ---
>  gcc/config/aarch64/aarch64-sme.md | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-sme.md 
> b/gcc/config/aarch64/aarch64-sme.md
> index bfe368e80b5..6b3f4390943 100644
> --- a/gcc/config/aarch64/aarch64-sme.md
> +++ b/gcc/config/aarch64/aarch64-sme.md
> @@ -1269,8 +1269,8 @@ (define_insn "*aarch64_sme_single__plus"
>  ;;  Absolute minimum/maximum
>  ;; -
>  ;; Includes:
> -;; - svamin (SME2+faminmax)
> -;; - svamin (SME2+faminmax)
> +;; - FAMIN (SME2+FAMINMAX)
> +;; - FAMAX (SME2+FAMINMAX)
>  ;; -
>  
>  (define_insn "@aarch64_sme_"
> @@ -1278,7 +1278,7 @@ (define_insn "@aarch64_sme_"
>   (unspec:SVE_Fx24 [(match_operand:SVE_Fx24 1 "register_operand" "%0")
> (match_operand:SVE_Fx24 2 "register_operand" 
> "Uw")]
>FAMINMAX_UNS))]
> -  "TARGET_SME2 && TARGET_FAMINMAX"
> +  "TARGET_STREAMING_SME2 && TARGET_FAMINMAX"
>"\t%0, %1, %2"
>  )

Re: [PATCH] asf: Fix calling of emit_move_insn on registers of different modes [PR119884]

2025-07-15 Thread Philipp Tomsich

Applied to trunk (with fixups to the ChangeLog and after checking
again that the subreg_size_lowpart_offset change is included). Thanks!

--Philipp.


On Tue, 15 Jul 2025 at 15:43, Richard Sandiford
 wrote:
>
> Konstantinos Eleftheriou  writes:
> > Hi Richard, thanks for the review!
> >
> > We've tested the pre-patch version on PowerPC and there was indeed an
> > issue, which is now fixed with the patched version.
>
> Thanks for the extra testing.  The patch is ok for trunk and for
> any necessary backports with the subreg_size_lowpart_offset change
> mentioned below.
>
> It doesn't look like you have commit access yet.  If you'd like it,
> please follow the instructions at https://gcc.gnu.org/gitwrite.html
> (I'll sponsor).
>
> Richard
>
> > Konstantinos
> >
> > On Fri, Jul 4, 2025 at 9:34 AM Richard Sandiford
> >  wrote:
> >>
> >> Konstantinos Eleftheriou  writes:
> >> > On Wed, May 7, 2025 at 11:29 AM Richard Sandiford
> >> >  wrote:
> >> >> But I thought the code was allowing multiple stores to be forwarded to
> >> >> a single (wider) load.  E.g. 4 individual byte stores at address X, X+1,
> >> >> X+2 and X+3 could be forwarded to a 4-byte load at address X.  And the 
> >> >> code
> >> >> I mentioned is handling the least significant byte by zero-extending it.
> >> >>
> >> >> For big-endian targets, the least significant byte should come from
> >> >> address X+3 rather than address X.  The byte at address X (i.e. the
> >> >> byte with the equal offset) should instead go in the most significant
> >> >> byte, typically using a shift left.
> >> > Hi, I'm attaching a patch that we prepared for this. It would be of
> >> > great help if someone could test it on a big-endian target, preferably
> >> > one with BITS_BIG_ENDIAN == 0 as we were having issues with that in
> >> > the past.
> >> >
> >> > From 278f83b834a97541fe0a2d2bbad84aca34601fed Mon Sep 17 00:00:00 2001
> >> > From: Konstantinos Eleftheriou 
> >> > Date: Tue, 3 Jun 2025 09:16:17 +0200
> >> > Subject: [PATCH] asf: Fix offset check in base reg initialization for
> >> >  big-endian targets
> >> >
> >> > During the base register initialization, in the case that we are
> >> > eliminating the load instruction, we are using `offset == 0` in order
> >> > to find the store instruction that has the same offset as the load. This
> >> > would not work on big-endian targets where byte 0 would be the MS byte.
> >> >
> >> > This patch updates the condition to take into account the target's 
> >> > endianness.
> >> >
> >> > We are also removing the adjustment of the starting position for the
> >> > bitfield insertion, when BYTES_BIG_ENDIAN != BITS_BIG_ENDIAN. This is
> >> > supposed to be handled inside `store_bit_field` and it's not needed
> >> > anymore after the offset fix.
> >>
> >> Yeah.  In particular BITS_BIG_ENDIAN describes how bits are measured
> >> by insv and extv.  Its effect is localised as much as possible and so it
> >> doesn't affect how store_bit_field measures bits.  store_bit_field
> >> instead always measures in memory order (BITS_PER_UNIT == second byte
> >> in memory order, etc.)
> >>
> >> Out of curiosity, did you try this on a little-endian POWER system?
> >> That has BITS_BIG_ENDIAN==1, BYTES_BIG_ENDIAN==0, so I would have
> >> expected something to have gone wrong with the pre-patch code.  It would
> >> be good to check if you could (there are some machines in the compile 
> >> farm).
> >>
> >> > gcc/ChangeLog:
> >> >
> >> >   * avoid-store-forwarding.cc (generate_bit_insert_sequence):
> >> >   Remove adjustment of bitfield insertion's starting position
> >> >   when BYTES_BIG_ENDIAN != BITS_BIG_ENDIAN.
> >> >   * avoid-store-forwarding.cc (process_store_forwarding):
> >> >   Update offset check in base reg initialization to take
> >> >   into account the target's endianness.
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> >
> >> >   * gcc.target/aarch64/avoid-store-forwarding-be.c: New test.
> >> > ---
> >> >  gcc/avoid-store-forwarding.cc | 18 ---
> >> >  .../aarch64/avoid-store-forwarding-be.c   | 23 +++
> >> >  2 files changed, 28 insertions(+), 13 deletions(-)
> >> >  create mode 100644 
> >> > gcc/testsuite/gcc.target/aarch64/avoid-store-forwarding-be.c
> >> >
> >> > diff --git a/gcc/avoid-store-forwarding.cc 
> >> > b/gcc/avoid-store-forwarding.cc
> >> > index 6825d0426ecc..457d1d0c200c 100644
> >> > --- a/gcc/avoid-store-forwarding.cc
> >> > +++ b/gcc/avoid-store-forwarding.cc
> >> > @@ -119,17 +119,6 @@ generate_bit_insert_sequence (store_fwd_info 
> >> > *store_info, rtx dest)
> >> >unsigned HOST_WIDE_INT bitsize = store_size * BITS_PER_UNIT;
> >> >unsigned HOST_WIDE_INT start = store_info->offset * BITS_PER_UNIT;
> >> >
> >> > -  /* Adjust START for machines with BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
> >> > - Given that the bytes will be reversed in this case, we need to
> >> > - calculate the starting position from the end of the de

Re: [PATCH] aarch64: Enable selective LDAPUR generation for cores with RCPC2

2025-07-15 Thread Soumya AR



> On 15 Jul 2025, at 3:24 PM, Richard Sandiford  
> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> Soumya AR  writes:
>> One additional change with this patch is that I had to update ldapr-sext.c 
>> too.
>> 
>> During the combine pass, cases of UNSPECV_LDAP (with an offset) + sign_extend
>> transform to LDAPUR + SEXT, and later to LDAPURS (with address folding).
>> 
>> The aarch64 tests run with -moverride=tune=none, which clears all tuning 
>> flags
>> including the AVOID_LDAPUR flag, enabling LDAPUR. (Since the generic arch 
>> tuning
>> adjustments are applied before the tune override.)
>> 
>> This breaks ldapr-sext.c, in all instances where an offset is used. (Even 
>> though
>> there is no explicit offset in the testcase, since it uses global variables,
>> there is an implicit offset due to the alignment padding in bss.)
>> 
>> As a result, the following code:
>> 
>> add   x0, x0, 2
>> ldapursh  x0, [x0]
>> 
>> becomes:
>> 
>> ldapursh  x0, [x0, 2]
>> 
>> I can just add -moverride=tune=avoid_ldapur to ldapr-sext.c to maintain the
>> original behavior but since we're extending LDAPURS to fold offsets anyway,
>> I think it makes sense to check that alongside as well.
>> Let me know if that works.
> 
> Yeah, sounds good to me.
> 
>> [...]
>> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
>> index abbb97768f5..4ee539e1dcd 100644
>> --- a/gcc/config/aarch64/aarch64.cc
>> +++ b/gcc/config/aarch64/aarch64.cc
>> @@ -18760,6 +18760,8 @@ aarch64_adjust_generic_arch_tuning (struct tune_params &current_tune)
>>   if (TARGET_SVE2)
>> current_tune.extra_tuning_flags
>>   &= ~AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS;
>> +  if (!AARCH64_HAVE_ISA(V8_8A))
>> +  aarch64_tune_params.extra_tuning_flags |= 
>> AARCH64_EXTRA_TUNE_AVOID_LDAPUR;
> 
> Sorry for the formatting nit, but the last line above should be
> indented by 2 columns fewer.
> 
>> diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
>> index 36b0dbd1f57..57a8e410c20 100644
>> --- a/gcc/config/aarch64/atomics.md
>> +++ b/gcc/config/aarch64/atomics.md
>> @@ -679,13 +679,16 @@
>> )
>> 
>> (define_insn "aarch64_atomic_load_rcpc"
>> -  [(set (match_operand:ALLI 0 "register_operand" "=r")
>> +  [(set (match_operand:ALLI 0 "register_operand")
>> (unspec_volatile:ALLI
>> -  [(match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
>> +  [(match_operand:ALLI 1 "aarch64_rcpc_memory_operand")
>>(match_operand:SI 2 "const_int_operand")] ;; model
>>   UNSPECV_LDAP))]
>>   "TARGET_RCPC"
>> -  "ldapr\t%0, %1"
>> +  {@ [ cons: =0 , 1   ; attrs: enable_ldapur ]
>> + [ r   , Q   ; any   ] ldapr\t%0, %1
>> + [ r   , Ust ; yes   ] ldapur\t%0, %1
>> +  }
>> )
>> 
>> (define_insn "aarch64_atomic_load"
>> @@ -705,21 +708,24 @@
>> )
>> 
>> (define_insn "*aarch64_atomic_load_rcpc_zext"
>> -  [(set (match_operand:SD_HSDI 0 "register_operand" "=r")
>> +  [(set (match_operand:SD_HSDI 0 "register_operand")
>> (zero_extend:SD_HSDI
>>   (unspec_volatile:ALLX
>> -[(match_operand:ALLX 1 "aarch64_sync_memory_operand" "Q")
>> +[(match_operand:ALLX 1 "aarch64_rcpc_memory_operand")
>>  (match_operand:SI 2 "const_int_operand")]   ;; model
>>UNSPECV_LDAP)))]
>>   "TARGET_RCPC && ( > )"
>> -  "ldapr\t%w0, %1"
>> +  {@ [ cons: =0 , 1   ; attrs: enable_ldapur ]
>> + [ r   , Q   ; any  ] ldapr\t%w0, %1
>> + [ r   , Ust ; yes  ] ldapur\t%w0, %1
>> +  }
>> )
> 
> In both of the patterns above, it would be good to keep the "[", ",", ";" and
> "]" lined up.

Fixed the formatting, thanks for your help!
Committed at: 
https://gcc.gnu.org/cgit/gcc/commit/?id=6b76dfad9b2c80a43b2e775d0027ba4b636d6022

> OK with those changes, thanks.
> 
> Richard
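
For readers who want a source-level picture of the offset folding discussed above, here is a hedged sketch. The struct and function names are hypothetical. The assumption is that, on an AArch64 target with RCPC2 and with LDAPUR not avoided by tuning, an acquire load from a member at a nonzero offset may be emitted as a single LDAPUR-family instruction with an immediate offset instead of an ADD followed by the load; the actual instruction selection depends on the compiler version, -march and tuning flags.

```cpp
#include <atomic>
#include <cstdint>

// Two adjacent 16-bit atomics; "second" sits at a nonzero offset
// from the start of the object, like the [x0, 2] case in the thread.
struct counters_sketch
{
  std::atomic<std::int16_t> first;
  std::atomic<std::int16_t> second;
};

// Acquire load of a halfword that is then sign-extended to 64 bits.
// This is the source shape that corresponds to a sign-extending
// load-acquire with an offset.
inline std::int64_t load_second(const counters_sketch &c)
{
  return c.second.load(std::memory_order_acquire);
}
```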

Re: [PATCH] openmp, fortran: Fix ICE when the procedure name cannot be found in declare variant directives [PR104428]

2025-07-15 Thread Tobias Burnus


Kwok Cheung Yeung wrote:
This patch fixes an ICE due to a null-pointer dereference when finding 
the symbol for the procedure name in a declare variant directive, 
which occurs because the result of gfc_find_sym_tree is being 
dereferenced unconditionally. The result is now checked, and the 
symbol is set to NULL if it can't be found, resulting in a subsequent 
error.


If the symbol is implicitly typed, then the error will also occur. I 
think this makes sense as implicit variables are of type integer or 
real, which cannot be used to specify the variant procedure.


The latter sounds wrong – but fortunately, there seems to be no issue, if I 
look at:

program p
  interface
function x()
end function x
  end interface
  print *, foo()
  !$omp do
  do i = 1, 1
print *, foo()
  end do
contains
  function foo()
!$omp declare variant(x) match(construct={do})
  end
end

which compiles just fine. (The functions should be implicitly real.)


Tested on x86_64 host, okay for trunk?


LGTM, including backporting. (At least backporting to GCC 15 seems to 
make sense; it is marked as [13/14/15/16 Regression], but was reported 
against GCC 12 in Feb 2022.)


Minor comments:


gcc/fortran/

PR fortran/104428
* trans-openmp.cc (gfc_trans_omp_declare_variant): Check that proc_st
is non-NULL before dereferencing.  Add line number to error message.


Thanks for using the location data for the error message!



--- a/gcc/fortran/trans-openmp.cc
…
- variant_proc_sym = proc_st->n.sym;
+ variant_proc_sym = proc_st ? proc_st->n.sym : NULL;
}
if (variant_proc_sym == NULL)
{
- gfc_error ("Cannot find symbol %qs", variant_proc_name);
+ gfc_error ("Cannot find symbol %qs at %L", variant_proc_name,
+



--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/pr104428.f90
@@ -0,0 +1,16 @@
+! { dg-do compile }
+! { dg-options "-fopenmp" }


This shouldn't be needed for gomp/

…


+!$omp declare variant(y) match(construct={do}) ! { dg-error "Cannot find symbol .y." }


GCC is run such that 'y' works – i.e. not printing some fancy quotation 
marks like ‘…’ or “…” or «…» (and contrary to "(1)", also no \\ escaping 
is required).


Thanks for the patch!

Tobias

Re: ACCESS_WITH_SIZE for pointers Re: [PATCH] tree-optimization/120929: Limit MEM_REF handling to .ACCESS_WITH_SIZE

2025-07-15 Thread Qing Zhao



> On Jul 15, 2025, at 02:32, Richard Biener  wrote:
> 
> On Mon, Jul 14, 2025 at 10:58 PM Qing Zhao  wrote:
>> 
>> 
>>> On Jul 7, 2025, at 13:07, Qing Zhao  wrote:
>>> 
>>> As I mentioned in the latest email I replied to the thread, the original
>>> implementation of counted_by for pointers was implemented without the
>>> additional indirection.
>>> But that implementation showed a fundamental bug during testing, so I
>>> changed the implementation to the current one.
>>>
>>> I need to spend a little more time to find the details of that fundamental
>>> bug with the original implementation.
>>>
>>> If the current bug is urgent to fix and you are not comfortable with
>>> the simple patch Sid provided, then I am okay to back it out now and then
>>> push it back with the fix to this current bug at a later time after
>>> everyone is comfortable with the current implementation.
>>> 
>>> Thanks a lot!
>>> 
>>> Qing
>> 
>> 
>> Hi,  this is an update on the above fundamental issue I mentioned 
>> previously. (I finally located this issue and recorded it here)
>> 
>> 1. Based on the previous discussion on how to resolve PR120929, we agreed
>> on the following solution:
>> 
>> struct S {
>>  int n;
>>  int *p __attribute__((counted_by(n)));
>> } *f;
>> 
>> when generating a call to .ACCESS_WITH_SIZE for f->p, instead of generating
>> *.ACCESS_WITH_SIZE (&f->p, &f->n,...)
>> 
>> We should generate
>> .ACCESS_WITH_SIZE (f->p, &f->n,...)
>> 
>> i.e.,
>> the return type and the type of the first argument of the call is the
>>   original pointer type in this version,
>>   instead of the pointer to the original pointer type in the 7th version;
>> 
>> 2. I implemented this new .ACCESS_WITH_SIZE generation for pointers in my 
>> local workspace. It looked fine in the beginning,
>> However, during testing, I finally located the _fundamental issue_ with this 
>> design.
>> 
>> This issue can be shown clearly with the following simple testing case:
>> (Note, the numbers on the left in the following testing case is the line #)
>> 
>> $ cat t1.c
>>  1 struct annotated {
>>  2   int b;
>>  3   int *c __attribute__ ((counted_by (b)));
>>  4 } *p_array_annotated;
>>  5
>>  6 void __attribute__((__noinline__)) setup (int annotated_count)
>>  7 {
>>  8   p_array_annotated
>>  9 = (struct annotated *) __builtin_malloc (sizeof (struct annotated));
>> 10   p_array_annotated->c = (int *) __builtin_malloc (annotated_count * sizeof (int));
>> 11   p_array_annotated->c[2] = 10;
>> 12   p_array_annotated->b = annotated_count;
> 
> But isn't this bogus since you access c[2] while it's counted_by value
> is still uninitialized?

Sorry for my mistake in the test case.

If we swap lines 11 and 12 in the above test case, so that the counted_by
value is initialized before p_array_annotated->c is used, the same issue
remains:

  1 struct annotated {
  2   int b;
  3   int *c __attribute__ ((counted_by (b)));
  4 } *p_array_annotated;
  5
  6 void __attribute__((__noinline__)) setup (int annotated_count)
  7 {
  8   p_array_annotated
  9 = (struct annotated *) __builtin_malloc (sizeof (struct annotated));
 10   p_array_annotated->c = (int *) __builtin_malloc (annotated_count * sizeof (int));
 11   p_array_annotated->b = annotated_count;
 12   p_array_annotated->c[2] = 10;
 13   return;
 14 }
 15   
 16 int main(int argc, char *argv[])
 17 {
 18   setup (10);
 19   return 0;
 20 }


> I'd say by using counted_by you now invoke UB here.
The UB arises from the way we generate the call to .ACCESS_WITH_SIZE. For the
updated test case above, the gimple dump is:

  1 __attribute__((noinline))
  2 void setup (int annotated_count)
  3 {
  4   int * D.2969;
  5   
  6   _1 = __builtin_malloc (16);
  7   p_array_annotated = _1;
  8   _2 = (long unsigned int) annotated_count;
  9   _3 = _2 * 4;
 10   p_array_annotated.0_4 = p_array_annotated;
 11   _5 = p_array_annotated.0_4->c;
 12   p_array_annotated.1_6 = p_array_annotated;
 13   _7 = &p_array_annotated.1_6->b;
 14   D.2969 = .ACCESS_WITH_SIZE (_5, _7, 0B, 4);
 15   _8 = __builtin_malloc (_3);
 16   D.2969 = _8;
 17   p_array_annotated.2_9 = p_array_annotated;
 18   p_array_annotated.2_9->b = annotated_count;
 19   p_array_annotated.3_11 = p_array_annotated;
 20   _12 = p_array_annotated.3_11->c;
 21   p_array_annotated.4_13 = p_array_annotated;
 22   _14 = &p_array_annotated.4_13->b;
 23   _10 = .ACCESS_WITH_SIZE (_12, _14, 0B, 4);
 24   _15 = _10 + 8;
 25   *_15 = 10;
 26   return;
 27 }
 28 

In the above, for the source code at line 10:
10   p_array_annotated->c = (int *) __builtin_malloc (annotated_count * sizeof (int));

The ILs are:

 11   _5 = p_array_annotated.0_4->c;
 13   _7 = &p_array_annotated.1_6->b;
 14   D.2969 = .ACCESS_WITH_SIZE (_5, _7, 0B, 4);
 15   _8 = __builtin_malloc (_3);
 16   D.2969 = _8;
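
One way to read the ILs above is that the store lands in the temporary D.2969 rather than in the pointer field itself. The following sketch illustrates that reading in ordinary C++. It is an assumption-laden analogy, not GCC code: access_by_value and access_by_address are hypothetical stand-ins for the two call shapes (passing f->p versus passing &f->p), and the count argument is ignored.

```cpp
struct annotated
{
  int b;
  int *c;
};

// Stand-in for ".ACCESS_WITH_SIZE (f->p, &f->n, ...)": the pointer
// value is passed and returned, so the result is a copy.
inline int *access_by_value(int *ptr, int * /*count*/)
{
  return ptr;
}

// Stand-in for "*.ACCESS_WITH_SIZE (&f->p, &f->n, ...)": the address
// of the pointer field is passed and returned.
inline int **access_by_address(int **ptr, int * /*count*/)
{
  return ptr;
}

// Mirrors "D.2969 = .ACCESS_WITH_SIZE (_5, ...); D.2969 = _8;".
// Only the temporary is updated; the field keeps its old value.
inline bool store_via_value(annotated &s, int *fresh)
{
  int *tmp = access_by_value(s.c, &s.b);
  tmp = fresh;
  (void) tmp;
  return s.c == fresh;
}

// With the extra indirection the store reaches the field.
inline bool store_via_address(annotated &s, int *fresh)
{
  *access_by_address(&s.c, &s.b) = fresh;
  return s.c == fresh;
}
```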

Note, at line 11, the value of the pointer “p_array_annotated->c” is assigned 
to _5, and this _5
is passed as the first argument to the call to .ACCES

Re: [PATCH] asf: Fix calling of emit_move_insn on registers of different modes [PR119884]

2025-07-15 Thread Richard Sandiford

Konstantinos Eleftheriou  writes:
> Hi Richard, thanks for the review!
>
> We've tested the pre-patch version on PowerPC and there was indeed an
> issue, which is now fixed with the patched version.

Thanks for the extra testing.  The patch is ok for trunk and for
any necessary backports with the subreg_size_lowpart_offset change
mentioned below.

It doesn't look like you have commit access yet.  If you'd like it,
please follow the instructions at https://gcc.gnu.org/gitwrite.html
(I'll sponsor).

Richard

> Konstantinos
>
> On Fri, Jul 4, 2025 at 9:34 AM Richard Sandiford
>  wrote:
>>
>> Konstantinos Eleftheriou  writes:
>> > On Wed, May 7, 2025 at 11:29 AM Richard Sandiford
>> >  wrote:
>> >> But I thought the code was allowing multiple stores to be forwarded to
>> >> a single (wider) load.  E.g. 4 individual byte stores at address X, X+1,
>> >> X+2 and X+3 could be forwarded to a 4-byte load at address X.  And the 
>> >> code
>> >> I mentioned is handling the least significant byte by zero-extending it.
>> >>
>> >> For big-endian targets, the least significant byte should come from
>> >> address X+3 rather than address X.  The byte at address X (i.e. the
>> >> byte with the equal offset) should instead go in the most significant
>> >> byte, typically using a shift left.
>> > Hi, I'm attaching a patch that we prepared for this. It would be of
>> > great help if someone could test it on a big-endian target, preferably
>> > one with BITS_BIG_ENDIAN == 0 as we were having issues with that in
>> > the past.
>> >
>> > From 278f83b834a97541fe0a2d2bbad84aca34601fed Mon Sep 17 00:00:00 2001
>> > From: Konstantinos Eleftheriou 
>> > Date: Tue, 3 Jun 2025 09:16:17 +0200
>> > Subject: [PATCH] asf: Fix offset check in base reg initialization for
>> >  big-endian targets
>> >
>> > During the base register initialization, in the case that we are
>> > eliminating the load instruction, we are using `offset == 0` in order
>> > to find the store instruction that has the same offset as the load. This
>> > would not work on big-endian targets where byte 0 would be the MS byte.
>> >
>> > This patch updates the condition to take into account the target's 
>> > endianness.
>> >
>> > We are, also, removing the adjustment of the starting position for the
>> > bitfield insertion, when BYTES_BIG_ENDIAN != BITS_BIG_ENDIAN. This is
>> > supposed to be handled inside `store_bit_field` and it's not needed anymore
>> > after the offset fix.
>>
>> Yeah.  In particular BITS_BIG_ENDIAN describes how bits are measured
>> by insv and extv.  Its effect is localised as much as possible and so it
>> doesn't affect how store_bit_field measures bits.  store_bit_field
>> instead always measures in memory order (BITS_PER_UNIT == second byte
>> in memory order, etc.)
>>
>> Out of curiosity, did you try this on a little-endian POWER system?
>> That has BITS_BIG_ENDIAN==1, BYTES_BIG_ENDIAN==0, so I would have
>> expected something to have gone wrong with the pre-patch code.  It would
>> be good to check if you could (there are some machines in the compile farm).
>>
>> > gcc/ChangeLog:
>> >
>> >   * avoid-store-forwarding.cc (generate_bit_insert_sequence):
>> >   Remove adjustment of bitfield insertion's starting position
>> >   when BYTES_BIG_ENDIAN != BITS_BIG_ENDIAN.
>> >   * avoid-store-forwarding.cc (process_store_forwarding):
>> >   Update offset check in base reg initialization to take
>> >   into account the target's endianness.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >   * gcc.target/aarch64/avoid-store-forwarding-be.c: New test.
>> > ---
>> >  gcc/avoid-store-forwarding.cc | 18 ---
>> >  .../aarch64/avoid-store-forwarding-be.c   | 23 +++
>> >  2 files changed, 28 insertions(+), 13 deletions(-)
>> >  create mode 100644 
>> > gcc/testsuite/gcc.target/aarch64/avoid-store-forwarding-be.c
>> >
>> > diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc
>> > index 6825d0426ecc..457d1d0c200c 100644
>> > --- a/gcc/avoid-store-forwarding.cc
>> > +++ b/gcc/avoid-store-forwarding.cc
>> > @@ -119,17 +119,6 @@ generate_bit_insert_sequence (store_fwd_info 
>> > *store_info, rtx dest)
>> >unsigned HOST_WIDE_INT bitsize = store_size * BITS_PER_UNIT;
>> >unsigned HOST_WIDE_INT start = store_info->offset * BITS_PER_UNIT;
>> >
>> > -  /* Adjust START for machines with BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN.
>> > - Given that the bytes will be reversed in this case, we need to
>> > - calculate the starting position from the end of the destination
>> > - register.  */
>> > -  if (BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN)
>> > -{
>> > -  unsigned HOST_WIDE_INT load_mode_bitsize
>> > - = (GET_MODE_BITSIZE (GET_MODE (dest))).to_constant ();
>> > -  start = load_mode_bitsize - bitsize - start;
>> > -}
>> > -
>> >rtx mov_reg = store_info->mov_reg;
>> >store_bit_field (dest, bitsize, start, 0, 0, GET_MODE (mov_reg), 
>> > mov_reg,
>
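
The byte placement Richard describes in the quoted review (four byte stores at X..X+3 forwarded to a 4-byte load at X) can be sketched as follows. This is an illustration only and not the code in avoid-store-forwarding.cc; lsb_shift and assemble_load are hypothetical helpers, and the sketch works at the level of the loaded value rather than with store_bit_field's own bit numbering.

```cpp
#include <cstdint>

// Shift (in bits, counted from the least significant bit of the
// loaded value) at which a store of store_size bytes located at byte
// offset `offset` from the load address ends up.
inline unsigned lsb_shift(unsigned load_size, unsigned store_size,
                          unsigned offset, bool bytes_big_endian)
{
  // Little-endian: the byte at offset 0 is the least significant.
  // Big-endian: the byte at offset 0 is the most significant, so the
  // position has to be measured from the other end.
  unsigned pos = bytes_big_endian ? load_size - store_size - offset
                                  : offset;
  return pos * 8;
}

// Rebuild the value a 4-byte load would see from four byte stores.
inline std::uint32_t assemble_load(const std::uint8_t (&bytes)[4],
                                   bool bytes_big_endian)
{
  std::uint32_t value = 0;
  for (unsigned i = 0; i < 4; ++i)
    value |= std::uint32_t(bytes[i]) << lsb_shift(4, 1, i, bytes_big_endian);
  return value;
}
```

With bytes 0x11, 0x22, 0x33, 0x44 stored at X..X+3, the least significant byte of the loaded value comes from X on a little-endian target and from X+3 on a big-endian one, which is why a plain `offset == 0` test only identifies the low part in the little-endian case.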

Re: [PATCH] arm: avoid gcc_s dependency

2025-07-15 Thread Pierre Ossman


On 14/07/2025 22:24, Sam James wrote:


A patch rebased against trunk would also be appreciated. See
https://gcc.gnu.org/contribute.html for the needed format.



The patch applies cleanly against trunk, so nothing is needed there.

I was hoping to get the interest of someone more familiar with gcc, to 
not have to figure out everything surrounding test cases and ChangeLog 
conventions. :)


Regards,
--
Pierre Ossman   Software Development
Cendio AB   https://cendio.com
Teknikringen 8  https://twitter.com/ThinLinc
583 30 Linköping    https://facebook.com/ThinLinc
Phone: +46-13-214600

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Re: [PATCH 1/1] aarch64: Adapt unwinder to linux's SME signal behaviour

2025-07-15 Thread Yury Khrustalev

Hi Tamar,

On Tue, Jul 15, 2025 at 07:00:02AM +, Tamar Christina wrote:
> Hi Yury,
> 
> Thanks for the clear comments!

Thanks goes to Richard!

> > -Original Message-
> > From: Yury Khrustalev 
> > Sent: Thursday, June 19, 2025 2:40 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Richard Sandiford ; Mark Rutland
> > 
> > Subject: [PATCH 1/1] aarch64: Adapt unwinder to linux's SME signal behaviour
> > 
> > From: Richard Sandiford 
> > 
> > SME uses a lazy save system to manage ZA.  The idea is that,
> > if a function with ZA state wants to call a "normal" function,
> > it can leave its state in ZA and instead set up a lazy save buffer.
> > If, unexpectedly, that normal function contains a nested use of ZA,
> > that nested use of ZA must commit the lazy save first.
> > 
> > This lazy save system uses a special system register called TPIDR2_EL0.
> > See:
> > 
> >   https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#66the-za-lazy-saving-scheme
> > 
> > for details.
> > 
> > The ABI specifies that, on entry to an exception handler, the following
> > things must be true:
> > 
> > * PSTATE.SM must be 0 (the processor must be in non-streaming mode)
> > 
> > * PSTATE.ZA must be 0 (ZA must be off)
> > 
> > * TPIDR2_EL0 must be 0 (there must be no uncommitted lazy save)
> > 
> > This is normally done by making _Unwind_RaiseException & friends
> > commit any lazy save before they unwind.  This also has the side
> > effect of ensuring that TPIDR2_EL0 is never left pointing to a
> > lazy save buffer that has been unwound.
> > 
> > However, things get more complicated with signals.  If:
> > 
> > (a) a signal is raised while ZA is dormant (that is, while there is an
> > uncommitted lazy save);
> > 
> > (b) the signal handler throws an exception; and
> > 
> > (c) that exception is caught outside the signal handler
> > 
> > something must ensure that the lazy save from (a) is committed.
> > 
> > This would be simple if the signal handler was entered with ZA and
> > TPIDR2_EL0 intact.  However, for various good reasons that are out
> > of scope here, this is not done.  Instead, Linux now clears both
> > TPIDR2_EL0 and PSTATE.ZA before entering a signal handler, see:
> > 
> >   https://lore.kernel.org/all/20250417190113.3778111-1-mark.rutl...@arm.com/
> > 
> > for details.
> > 
> > Therefore, it is the unwinder that must simulate a commit of the lazy
> > save from (a).  It can do this by reading the previous values of
> > TPIDR2_EL0 and ZA from the sigcontext.
> > 
> > The SME-related sigcontext structures were only added to linux's
> > asm/sigcontext.h relatively recently and we can't rely on GCC being
> > built against such recent kernel header files.  The patch therefore
> > defines the relevant macros if they are not already defined and provides
> > types that comply with the ABI layout of the corresponding linux types.
> > 
> > The patch includes some ugly casting in an attempt to support big-endian
> > ILP32, even though SME on big-endian ILP32 linux should never be a thing.
> > We can remove it if we also remove ILP32 support from GCC.
> > 
> 
> ILP32 is currently deprecated and we'll probably try to remove it this year.
> Perhaps a middle ground is to wrap them in __ILP32__ so we know to find
> them when we remove the support?

I could certainly add #if defined (__ILP32__) for the type casting bits, if
you think this is helpful for the future refactoring.

> OK with that change, but give others a day or two to object.
> 
> I assume this is something we'd want to backport?

Yes, we plan to backport it to GCC 15 and 14 (14 is the first version to
which SME support was added).

Thanks,
Yury

Re: [PATCH 2/2] libstdc++: Conditionalize LWG 3569 changes to join_view

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 3:55 PM Patrick Palka  wrote:

> On Tue, 15 Jul 2025, Tomasz Kaminski wrote:
>
> > On Tue, Jul 15, 2025 at 5:51 AM Patrick Palka  wrote:
> >   Tested on x86_64-pc-linux-gnu, does this look OK for trunk only
> >   (since it impacts ABI)?
> >
> > In theory an Iterator that meets all semantic requirements of the
> > input_iterator concept could provide a default constructor that is
> > unconstrained, but ill-formed when invoked. This can easily be done
> > accidentally, by having a default member initializer.
> >
> > #include <concepts>
> >
> > struct NoDefault
> > { NoDefault(int); };
> >
> > template<typename T>
> > struct Iterator {
> >   T x = T();
> > };
> >
> > static_assert(std::default_initializable<Iterator<NoDefault>>);
> >
> > Default member initializers are not in the immediate context, and checking
> > "std::default_initializable" is ill-formed. clang emits an error here:
> > https://godbolt.org/z/EafKn6h16
> >
> > You can, however, do this optimization for forward_iterator. The difference
> > here is that user-defined iterators provide an
> > iterator_category/iterator_concept that maps to forward_iterator_tag or
> > stronger, so we can check default_initializable.
>
> Good point...  But it seems this is not only an issue in join_view (with
> this patch), we already require elsewhere in <ranges> that the default
> ctor of an input iterator is properly constrained:
>
>   filter_view, transform_view, elements_view, stride_view, enumerate_view,
>   to_input_view (and perhaps iota_view also counts)
>
Could you please elaborate? I understand that for these views, the default
constructor requires that the iterator is default_initializable, but if you
never call this constructor, the view will function correctly for my
not-properly-constrained iterator.
However, in this case the standard does not impose a default_initializable
requirement on the iterator when we are incrementing it.


>
> And so these views would break similarly if the default ctor is
> underconstrained IIUC.  I don't see why we'd want to start caring about
> such iterators in join_view, if other fundamental views already don't?
>
> >
> >
> >
> >   -- >8 --
> >
> >   LWG 3569 adjusted join_view's iterator to handle adapting
> >   non-default-constructible (input) iterators by wrapping the
> >   corresponding data member with std::optional, and we followed suit in
> >   r13-2649-g7aa80c82ecf3a3.
> >
> >   But this wrapping is unnecessary for iterators that are already
> >   default-constructible.  Rather than unconditionally using std::optional
> >   here, which introduces time/space overhead, this patch conditionalizes
> >   our LWG 3569 changes on the iterator in question being
> >   non-default-constructible.
> >
> >
> >   libstdc++-v3/ChangeLog:
> >
> >   * include/std/ranges (join_view::_Iterator::_M_satisfy):
> >   Adjust to handle non-std::optional _M_inner as per before LWG 3569.
> >   (join_view::_Iterator::_M_get_inner): New.
> >   (join_view::_Iterator::_M_inner): Don't wrap in std::optional if
> >   the iterator is already default constructible.  Initialize.
> >   (join_view::_Iterator::operator*): Use _M_get_inner instead
> >   of *_M_inner.
> >   (join_view::_Iterator::operator++): Likewise.
> >   (join_view::_Iterator::iter_move): Likewise.
> >   (join_view::_Iterator::iter_swap): Likewise.
> >   ---
> >libstdc++-v3/include/std/ranges | 49
> +
> >1 file changed, 37 insertions(+), 12 deletions(-)
> >
> >   diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> >   index efe62969d657..799fa7611ce2 100644
> >   --- a/libstdc++-v3/include/std/ranges
> >   +++ b/libstdc++-v3/include/std/ranges
> >   @@ -2971,7 +2971,12 @@ namespace views::__adaptor
> > }
> >
> >   if constexpr (_S_ref_is_glvalue)
> >   - _M_inner.reset();
> >   + {
> >   +   if constexpr (default_initializable<_Inner_iter>)
> >   + _M_inner = _Inner_iter();
> >   +   else
> >   + _M_inner.reset();
> >   + }
> > }
> >
> > static constexpr auto
> >   @@ -3011,6 +3016,24 @@ namespace views::__adaptor
> > return *_M_parent->_M_outer;
> > }
> >
> >   + constexpr _Inner_iter&
> >   + _M_get_inner()
> >   + {
> >   +   if constexpr (default_initializable<_Inner_iter>)
> >   + return _M_inner;
> >   +   else
> >   + return *_M_inner;
> >   + }
> >   +
> >   + constexpr const _Inner_iter&
> >   + _M_get_inner() const
> >   + {
> >   +   if constexpr (

[PATCH] libstdc++: Define __promoted_t alias for C++17 without fold expressions [PR121097]

2025-07-15 Thread Jonathan Wakely

Define a fallback implementation of the __promoted_t alias for
hypothetical C++17 compilers which don't define __cpp_fold_expressions.
Without this, <cmath> can't be compiled by such compilers, because the
3-arg form of std::hypot uses the alias.

libstdc++-v3/ChangeLog:

PR libstdc++/121097
* include/ext/type_traits.h (__promoted_t): Define fallback
implementation for the 3-arg case that std::hypot uses.
---

I tested this by bodging the header so that this definition is used
unconditionally, and it seemed to work.

 libstdc++-v3/include/ext/type_traits.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/libstdc++-v3/include/ext/type_traits.h b/libstdc++-v3/include/ext/type_traits.h
index 061555231468..ca5bc4e0d23e 100644
--- a/libstdc++-v3/include/ext/type_traits.h
+++ b/libstdc++-v3/include/ext/type_traits.h
@@ -269,6 +269,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
   typedef __typeof__(_Tp2() + _Up2() + _Vp2() + _Wp2()) __type;
 };
+
+#ifdef __glibcxx_hypot // C++ >= 17 && HOSTED
+  // __promoted_t is needed by the 3-arg std::hypot so define this in case
+  // any C++17 compiler doesn't define __cpp_fold_expressions (PR 121097).
+  template<typename _Tp1, typename _Tp2, typename _Tp3>
+using __promoted_t = typename __promote_3<_Tp1, _Tp2, _Tp3>::__type;
+#endif
+
 #endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
-- 
2.50.1
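
As a rough illustration of the fallback idea, the sketch below defines a promotion alias with a fold expression when `__cpp_fold_expressions` is available and with a fixed three-argument form otherwise. `sketch_promoted_t` is a hypothetical name, and the promotion is simplified: it only takes the type of the sum, whereas the real `__promote` machinery in libstdc++ also maps integer types to `double`.

```cpp
#include <type_traits>
#include <utility>

#ifdef __cpp_fold_expressions
// Variadic form: needs fold expressions.
template<typename... Ts>
  using sketch_promoted_t = decltype((std::declval<Ts>() + ...));
#else
// Fallback: covers only the three-argument case, which is all a
// 3-arg hypot-style function would need.
template<typename T1, typename T2, typename T3>
  using sketch_promoted_t
    = decltype(std::declval<T1>() + std::declval<T2>() + std::declval<T3>());
#endif

// Both definitions agree for three floating-point arguments.
static_assert(std::is_same<sketch_promoted_t<float, float, double>,
			   double>::value, "mixed float/double promotes to double");
static_assert(std::is_same<sketch_promoted_t<float, double, long double>,
			   long double>::value, "long double wins");
```

Either branch yields the same alias for the three-argument use, so callers do not need to know which definition was selected.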

Re: [PATCH] RISC-V: Fix vsetvl merge rule.

2025-07-15 Thread 钟居哲

LGTM. Thanks for fixing my issue.



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2025-07-14 21:55
To: gcc-patches
CC: kito.ch...@gmail.com; juzhe.zh...@rivai.ai; jeffreya...@gmail.com; 
pan2...@intel.com; rdapp@gmail.com
Subject: [PATCH] RISC-V: Fix vsetvl merge rule.
Hi,
 
In PR120297 we fuse
  vsetvl e8,mf2,...
  vsetvl e64,m1,...
into
  vsetvl e64,m4,...
 
Individually, that's ok but we also change the new vsetvl's demand to
"SEW only" even though the first original one demanded SEW >= 8 and
ratio = 16.
 
As we forget the ratio after the merge we find that the vsetvl following
the merged one has ratio = 64 demand and we fuse into
  vsetvl e64,m1,..
which obviously doesn't have ratio = 16 any more.
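
The ratios quoted above follow from ratio = SEW / LMUL. A small worked check, with LMUL written as a fraction so that mf2 is 1/2 (`sew_lmul_ratio` is a hypothetical helper, not a function from riscv-vsetvl.cc):

```cpp
// SEW/LMUL ratio with LMUL = lmul_num / lmul_den, so mf2 is 1/2 and m4 is 4/1.
constexpr unsigned
sew_lmul_ratio (unsigned sew, unsigned lmul_num, unsigned lmul_den)
{
  return sew * lmul_den / lmul_num;
}

static_assert (sew_lmul_ratio (8, 1, 2) == 16, "e8,mf2 has ratio 16");
static_assert (sew_lmul_ratio (64, 4, 1) == 16, "e64,m4 keeps ratio 16");
static_assert (sew_lmul_ratio (64, 1, 1) == 64, "e64,m1 has ratio 64");
```

So the first fusion (to e64,m4) preserves the ratio of 16, while the second fusion (to e64,m1) changes it to 64.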
 
Regtested on rv64gcv_zvl512b.
 
Regards
Robin
 
PR target/120297
 
gcc/ChangeLog:
 
* config/riscv/riscv-vsetvl.def: Do not forget ratio demand of
previous vsetvl.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/pr120297.c: New test.
---
gcc/config/riscv/riscv-vsetvl.def |  6 +--
gcc/testsuite/gcc.target/riscv/rvv/pr120297.c | 50 +++
2 files changed, 53 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/pr120297.c
 
diff --git a/gcc/config/riscv/riscv-vsetvl.def b/gcc/config/riscv/riscv-vsetvl.def
index d7a5ada772d..0f999d2276d 100644
--- a/gcc/config/riscv/riscv-vsetvl.def
+++ b/gcc/config/riscv/riscv-vsetvl.def
@@ -79,7 +79,7 @@ DEF_SEW_LMUL_RULE (sew_only, sew_only, sew_only, sew_eq_p, 
sew_eq_p, nop)
DEF_SEW_LMUL_RULE (sew_only, ge_sew, sew_only,
   sew_ge_and_prev_sew_le_next_max_sew_p, sew_ge_p, nop)
DEF_SEW_LMUL_RULE (
-  sew_only, ratio_and_ge_sew, sew_lmul,
+  sew_only, ratio_and_ge_sew, ratio_and_ge_sew,
   sew_ge_and_prev_sew_le_next_max_sew_and_next_ratio_valid_for_prev_sew_p,
   always_false, modify_lmul_with_next_ratio)
@@ -104,9 +104,9 @@ DEF_SEW_LMUL_RULE (ratio_and_ge_sew, sew_lmul, sew_lmul,
DEF_SEW_LMUL_RULE (ratio_and_ge_sew, ratio_only, ratio_and_ge_sew, ratio_eq_p,
   ratio_eq_p, use_max_sew_and_lmul_with_prev_ratio)
DEF_SEW_LMUL_RULE (
-  ratio_and_ge_sew, sew_only, sew_only,
+  ratio_and_ge_sew, sew_only, ratio_and_ge_sew,
   sew_le_and_next_sew_le_prev_max_sew_and_prev_ratio_valid_for_next_sew_p,
-  always_false, use_next_sew_with_prev_ratio)
+  sew_eq_p, use_next_sew_with_prev_ratio)
DEF_SEW_LMUL_RULE (ratio_and_ge_sew, ge_sew, ratio_and_ge_sew,
   max_sew_overlap_and_prev_ratio_valid_for_next_sew_p,
   sew_ge_p, use_max_sew_and_lmul_with_prev_ratio)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/pr120297.c b/gcc/testsuite/gcc.target/riscv/rvv/pr120297.c
new file mode 100644
index 000..3d1845d0fe6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/pr120297.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-require-effective-target riscv_v_ok } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fwhole-program" } */
+
+unsigned a;
+short c;
+char d;
+unsigned long e;
+_Bool f[10][10];
+unsigned g[10];
+long long ak;
+char i = 7;
+long long t[10];
+short x[10][10][10][10];
+short y[10][10][10][10];
+
+void
+h (char i, long long t[], short x[][10][10][10], short y[][10][10][10],
+   _Bool aa)
+{
+  for (int j = 2; j < 8; j += 2)
+{
+  for (short k = 0; k < 10; k++)
+ {
+   for (int l = 3; l < 8; l += 2)
+ a = x[1][j][k][l];
+   c = x[c][1][1][c];
+ }
+  for (int k = 0; k < 10; k++)
+ {
+   f[2][k] |= (_Bool) t[c];
+   g[c] = t[c + 1];
+   d += y[j][1][k][k];
+   e = e > i ? e : i;
+ }
+}
+}
+
+int
+main ()
+{
+  t[c] = 1;
+  h (i, t, x, y, a);
+  for (int j = 0; j < 10; ++j)
+for (int k = 0; k < 10; ++k)
+  ak ^= f[j][k] + 238516665 + (ak >> 2);
+  ak ^= g[c] + 238516665 + (ak >> 2);
+  if (ak != 234635118ull)
+__builtin_abort ();
+}
-- 
2.50.0

Re: [PATCH v2] gcc-16/changes.html: Add --enable-x86-64-mfentry

2025-07-15 Thread Gerald Pfeifer

On Tue, 15 Jul 2025, Uros Bizjak wrote:
> LGTM for content, but let's ask Gerald to proofread the entry.

Happy to!

+  The --enable-x86-64-mfentry configure option is
+  added to enable -mfentry for x86-64 by default to use
+  __fentry__, instead of mcount for
+  profiling.  This option is enabled by default for glibc targets.
+  

This feels a bit complex to parse. How about something like

+  The new --enable-x86-64-mfentry configure option 
+  makes -mfentry use __fentry__ instead 
+  of mcount for profiling on x86-64.  This option is 
+  enabled by default for glibc targets.
+  

This replaces "option is added to enable" by "new option makes", fixes 
grammar/word order and drops what feels like an extra "by default".

If the latter is wrong, add "by default" before "for profiling on x86-64".


What do you think? (If you like it, go ahead and push this new version.)

Gerald

[PATCH] libgcc/Makefile.in: Delete `MACHMODE_H` def

2025-07-15 Thread John Ericson

This dates back to the creation of top-level `libgcc` in
fa9585134f6f58fa0d3da3ca4ad5493855aea2dc. I strongly suspect that this
does nothing.

(For context, my overall goal here is for libgcc to depend on fewer (ideally
no) files generated by `gcc/Makefile`. This is me trying to pluck some
low-hanging fruit -- this is the only direct mention of `insn-modes.h` in
libgcc.)
---
 libgcc/Makefile.in | 2 --
 1 file changed, 2 deletions(-)

diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index 0719fd0615d..f7b48dceb06 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -193,7 +193,6 @@ AWK = @AWK@
 GCC_FOR_TARGET = $(CC)
 LIPO = @LIPO@
 LIPO_FOR_TARGET = $(LIPO)
-MACHMODE_H = machmode.h mode-classes.def insn-modes.h
 NM = @NM@
 NM_FOR_TARGET = $(NM)
 RANLIB_FOR_TARGET = $(RANLIB)
@@ -220,7 +219,6 @@ export INSTALL_DATA
 export LIB1ASMSRC
 export LIBGCC2_CFLAGS
 export LIPO_FOR_TARGET
-export MACHMODE_H
 export NM_FOR_TARGET
 export STRIP_FOR_TARGET
 export RANLIB_FOR_TARGET
-- 
2.47.2

Re: [PATCH] fortran: Factor array descriptor references

2025-07-15 Thread Steve Kargl

On Sun, Jul 13, 2025 at 10:59:32AM +0200, Mikael Morin wrote:
> 
> Regression tested on x86_64-pc-linux-gnu.
> OK for master?
> 

Yes, with one observation below.

> diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
> index 1561936daf1..af62e17442b 100644
> --- a/gcc/fortran/trans-array.cc
> +++ b/gcc/fortran/trans-array.cc
> @@ -3437,6 +3437,148 @@ save_descriptor_data (tree descr, tree data)
>  }
>  
>  
> +/* Type of the DATA argument passed to walk_tree by substitute_subexpr_in_expr
> +   and used by maybe_substitute_expr.  */
> +
> +typedef struct 
> +{
> +  tree target, repl;
> +}
> +substitute_t;
> +
> +
> +/* Check if the expression in *TP is equal to the substitution target provided
> +   in DATA->TARGET and replace it with DATA->REPL in that case.   This is a
> +   callback function for use with walk_tree.  */
> +   
> +static tree
> +maybe_substitute_expr (tree *tp, int *walk_subtree, void *data)
> +{
> +  substitute_t *subst = (substitute_t *) data;
> +  if (*tp == subst->target)
> +{
> +  *tp = subst->repl;
> +  *walk_subtree = 0;
> +}
> +
> +  return NULL_TREE;
> +}


Can you explain why the above function always
returns NULL_TREE?  It would seem to me that
that function could be declared "static void"
or "static void *".  Is this simply to avoid 
a ...
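
For background on the question: as far as I understand it, the callback's signature is fixed by the walk_tree_fn typedef, and a non-null return value means "stop the walk and return this node", so a callback that wants to visit every node returns NULL_TREE. The sketch below models that protocol on a toy binary tree. `Node`, `toy_walk` and `toy_maybe_substitute` are hypothetical stand-ins, not GCC code.

```cpp
#include <cassert>

// Toy stand-in for a tree node; not GCC's `tree` type.
struct Node
{
  int value;
  Node *left;
  Node *right;
};

// Same shape as a walk_tree callback: a non-null result stops the walk.
using walk_fn = Node *(*) (Node **np, int *walk_subtrees, void *data);

// Pre-order walk that honours the callback protocol.
Node *
toy_walk (Node **np, walk_fn fn, void *data)
{
  assert (np);
  if (!*np)
    return nullptr;

  int walk_subtrees = 1;
  if (Node *found = fn (np, &walk_subtrees, data))
    return found;		// Callback asked to stop here.
  if (!walk_subtrees)
    return nullptr;		// Callback asked to skip this subtree only.

  if (Node *found = toy_walk (&(*np)->left, fn, data))
    return found;
  return toy_walk (&(*np)->right, fn, data);
}

struct Subst
{
  Node *target;
  Node *repl;
};

// Replace every occurrence of TARGET; always return null so the walk
// continues and later occurrences are replaced as well.
Node *
toy_maybe_substitute (Node **np, int *walk_subtrees, void *data)
{
  Subst *s = static_cast<Subst *> (data);
  if (*np == s->target)
    {
      *np = s->repl;
      *walk_subtrees = 0;
    }
  return nullptr;
}
```

If the callback returned the replaced node instead, the walk would stop at the first match and a second occurrence of the target elsewhere in the expression would be left untouched.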

> +
> +/* Substitute in EXPR any occurrence of TARGET with REPLACEMENT.  */
> +
> +static void
> +substitute_subexpr_in_expr (tree target, tree replacement, tree expr)
> +{
> +  substitute_t subst;
> +  subst.target = target;
> +  subst.repl = replacement;
> +
> +  walk_tree (&expr, maybe_substitute_expr, &subst, nullptr);

cast here?

-- 
Steve

Re: [PATCH v2 1/1] contrib: add bpf-vmtest-tool to test BPF programs

2025-07-15 Thread Jose E. Marchesi



> As we discussed, it would be good to have a page in the wiki to document
> this effort, something like https://gcc.gnu.org/wiki/BPFRunTimeTests.

The page now exists.

[pushed: r16-2274] libgdiagnostics: add diagnostic_message_buffer [PR120792]

2025-07-15 Thread David Malcolm

This patch extends libgdiagnostics to provide a way to capture
the pp tokens making up a message string, so that SARIF and
HTML sinks can retain information such as event IDs and URLs.
As well as richer output, this improves the round-tripping of such
information through sarif-replay.

This also allows diagnostic messages to be built up in pieces,
with a drop-in replacement for fprintf, which I've found useful when
attempting to port "ld" to use libgdiagnostics.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r16-2274-g2a521eee58da7c.

gcc/ChangeLog:
PR sarif-replay/120792
* auto-obstack.h: New file, based on material taken from
pretty-print.cc.
* diagnostic-digraphs.h
(diagnostics::digraphs::digraph::set_description): New.
(diagnostics::digraphs::node::set_label): New.
* doc/libgdiagnostics/topics/compatibility.rst: Add
LIBGDIAGNOSTICS_ABI_4.
* doc/libgdiagnostics/topics/diagnostics.rst
(diagnostic_finish_via_msg_buf): Document new entrypoint.
* doc/libgdiagnostics/topics/execution-paths.rst
(diagnostic_execution_path_add_event_via_msg_buf): Document new
entrypoint.
* doc/libgdiagnostics/topics/index.rst: Add message-buffers.rst.
* doc/libgdiagnostics/topics/message-buffers.rst: New file.
* doc/libgdiagnostics/topics/message-formatting.rst: Add note
about message buffers.
* doc/libgdiagnostics/topics/physical-locations.rst
(diagnostic_add_location_with_label_via_msg_buf): Add.
* doc/libgdiagnostics/tutorial/07-execution-paths.rst: Link to
next section.
* doc/libgdiagnostics/tutorial/08-message-buffers.rst: New file.
* doc/libgdiagnostics/tutorial/index.rst: Add
08-message-buffers.rst.
* libgdiagnostics++.h (libgdiagnostics::message_buffer): New
class.
(libgdiagnostics::execution_path::add_event_via_msg_buf): New.
(libgdiagnostics::diagnostic::add_location_with_label): New.
(libgdiagnostics::diagnostic::finish_via_msg_buf): New.
(libgdiagnostics::graph::set_description): New overload.
(libgdiagnostics::graph::add_edge): New overload.
(libgdiagnostics::node::set_label): New overload.
* libgdiagnostics-private.h
(private_diagnostic_execution_path_add_event_2): Drop decl.
(private_diagnostic_execution_path_add_event_3): New decl.
* libgdiagnostics.cc:  Include "pretty-print-format-impl.h",
"pretty-print-markup.h", and "auto-obstack.h".
(class copying_token_printer): New.
(struct diagnostic_message_buffer): New.
(class pp_element_message_buffer): New.
(libgdiagnostics_path_event::libgdiagnostics_path_event): Replace
params "gmsgid" and "args" with "msg_buf".
(libgdiagnostics_path_event::print_desc): Reimplement using
pp_element_message_buffer to replay m_msg_buf into "pp".
(libgdiagnostics_path_event::m_desc_uncolored): Drop field.
(libgdiagnostics_path_event::m_desc_colored): Drop field.
(libgdiagnostics_path_event::msg_buf): New field.
(diagnostic_execution_path::add_event_va): Reimplement.
(diagnostic_execution_path::add_event_via_msg_buf): New.
(diagnostic::add_location_with_label): New overload, using
msg_buf.
(diagnostic_manager::emit): Reimplement with...
(diagnostic_manager::emit_va): ...this.
(diagnostic_manager::emit_msg_buf): New.
(FAIL_IF_NULL): Rename "p" to "ptr_arg".
(diagnostic_finish_va): Update to use diagnostic_manager::emit_va.
(diagnostic_graph::add_node_with_id): Rename "id" to "node_id".
(diagnostic_graph_add_node): Likewise.
(diagnostic_graph_add_edge): Rename "id" to "edge_id".
(diagnostic_graph_get_node_by_id): Rename "id" to "node_id".
(diagnostic_graph_get_edge_by_id): Rename "id" to "edge_id".
(private_diagnostic_execution_path_add_event_2): Delete.
(diagnostic_message_buffer_new): New public entrypoint.
(diagnostic_message_buffer_release): Likewise.
(diagnostic_message_buffer_append_str): Likewise.
(diagnostic_message_buffer_append_text): Likewise.
(diagnostic_message_buffer_append_byte): Likewise.
(diagnostic_message_buffer_append_printf): Likewise.
(diagnostic_message_buffer_append_event_id): Likewise.
(diagnostic_message_buffer_begin_url): Likewise.
(diagnostic_message_buffer_end_url): Likewise.
(diagnostic_message_buffer_begin_quote): Likewise.
(diagnostic_message_buffer_end_quote): Likewise.
(diagnostic_message_buffer_begin_color): Likewise.
(diagnostic_message_buffer_end_color): Likewise.
(diagnostic_message_buffer_dump): Likewise.
(diagnostic_finish_via_msg_buf): Likewise.
(diagnostic_add_location_with_label_via_msg_buf): Likew

[pushed: r16-2275] spellcheck.{cc,h}: modernization

2025-07-15 Thread David Malcolm

No functional change intended.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r16-2275-g1f0d76d61b94ac.

gcc/ChangeLog:
* spellcheck.cc: Define INCLUDE_ALGORITHM.
(CASE_COST, BASE_COST): Convert to...
(case_cost, base_cost): ...these, in an anonymous namespace.
(get_edit_distance): Update for above.  Use std::min rather than
MIN.
(get_edit_distance_cutoff): Likewise.  Use std::max rather than
MAX.
(selftest::test_edit_distances): Update for BASE_COST renaming.
(selftest::get_old_cutoff): Likewise.  Use std::max.
(selftest::assert_not_suggested_for): Use nullptr.
(selftest::test_find_closest_string): Likewise.
* spellcheck.h: Replace TYPE with StringLikeType in templates,
and use CamelCase.
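
A condensed sketch of the style this ChangeLog describes: typed constants in an anonymous namespace instead of macros, and std::min instead of MIN. It is a simplified weighted Levenshtein distance (no transposition handling, unlike the real get_edit_distance), and `sketch_edit_distance` is a hypothetical name.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace {

using edit_distance_t = unsigned int;

/* Cost of a case transformation.  */
const edit_distance_t case_cost = 1;

/* Cost of another kind of edit.  */
const edit_distance_t base_cost = 2;

} // anonymous namespace

// Weighted edit distance from S to T, keeping only two rows of the matrix.
edit_distance_t
sketch_edit_distance (const std::string &s, const std::string &t)
{
  std::vector<edit_distance_t> prev (s.size () + 1), next (s.size () + 1);

  // First row: empty target, reached by deleting each source character.
  for (std::size_t j = 0; j <= s.size (); j++)
    prev[j] = static_cast<edit_distance_t> (j) * base_cost;

  for (std::size_t i = 0; i < t.size (); i++)
    {
      // First column: empty source, reached by inserting i + 1 characters.
      next[0] = static_cast<edit_distance_t> (i + 1) * base_cost;

      for (std::size_t j = 0; j < s.size (); j++)
	{
	  edit_distance_t cost;
	  if (s[j] == t[i])
	    cost = 0;
	  else if (std::tolower (static_cast<unsigned char> (s[j]))
		   == std::tolower (static_cast<unsigned char> (t[i])))
	    cost = case_cost;
	  else
	    cost = base_cost;

	  edit_distance_t deletion = next[j] + base_cost;
	  edit_distance_t insertion = prev[j + 1] + base_cost;
	  edit_distance_t substitution = prev[j] + cost;
	  next[j + 1] = std::min ({deletion, insertion, substitution});
	}
      std::swap (prev, next);
    }
  return prev[s.size ()];
}
```

With these costs, changing "foo" to "FOO" costs 3 (three case changes), while inserting three characters into an empty string costs 6.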

Signed-off-by: David Malcolm 
---
 gcc/spellcheck.cc | 107 --
 gcc/spellcheck.h  |  20 -
 2 files changed, 66 insertions(+), 61 deletions(-)

diff --git a/gcc/spellcheck.cc b/gcc/spellcheck.cc
index 9a0a92012065..3fb0f50f4fc2 100644
--- a/gcc/spellcheck.cc
+++ b/gcc/spellcheck.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+#define INCLUDE_ALGORITHM
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -25,11 +26,15 @@ along with GCC; see the file COPYING3.  If not see
 #include "spellcheck.h"
 #include "selftest.h"
 
+namespace {
+
 /* Cost of a case transformation.  */
-#define CASE_COST 1
+const edit_distance_t case_cost = 1;
 
 /* Cost of another kind of edit.  */
-#define BASE_COST 2
+const edit_distance_t base_cost = 2;
+
+} // anonymous namespace
 
 /* Get the edit distance between the two strings: the minimal
number of edits that are needed to change one string into another,
@@ -55,9 +60,9 @@ get_edit_distance (const char *s, int len_s,
 }
 
   if (len_s == 0)
-return BASE_COST * len_t;
+return base_cost * len_t;
   if (len_t == 0)
-return BASE_COST * len_s;
+return base_cost * len_s;
 
   /* We effectively build a matrix where each (i, j) contains the
  distance between the prefix strings s[0:j] and t[0:i].
@@ -75,7 +80,7 @@ get_edit_distance (const char *s, int len_s,
   /* The first row is for the case of an empty target string, which
  we can reach by deleting every character in the source string.  */
   for (int i = 0; i < len_s + 1; i++)
-v_one_ago[i] = i * BASE_COST;
+v_one_ago[i] = i * base_cost;
 
   /* Build successive rows.  */
   for (int i = 0; i < len_t; i++)
@@ -91,7 +96,7 @@ get_edit_distance (const char *s, int len_s,
   /* The initial column is for the case of an empty source string; we
 can reach prefixes of the target string of length i
 by inserting i characters.  */
-  v_next[0] = (i + 1) * BASE_COST;
+  v_next[0] = (i + 1) * base_cost;
 
   /* Build the rest of the row by considering neighbors to
 the north, west and northwest.  */
@@ -102,18 +107,18 @@ get_edit_distance (const char *s, int len_s,
  if (s[j] == t[i])
cost = 0;
  else if (TOLOWER (s[j]) == TOLOWER (t[i]))
-   cost = CASE_COST;
+   cost = case_cost;
  else
-   cost = BASE_COST;
- edit_distance_t deletion = v_next[j] + BASE_COST;
- edit_distance_t insertion= v_one_ago[j + 1] + BASE_COST;
+   cost = base_cost;
+ edit_distance_t deletion = v_next[j] + base_cost;
+ edit_distance_t insertion= v_one_ago[j + 1] + base_cost;
  edit_distance_t substitution = v_one_ago[j] + cost;
- edit_distance_t cheapest = MIN (deletion, insertion);
- cheapest = MIN (cheapest, substitution);
+ edit_distance_t cheapest = std::min (deletion, insertion);
+ cheapest = std::min (cheapest, substitution);
  if (i > 0 && j > 0 && s[j] == t[i - 1] && s[j - 1] == t[i])
{
- edit_distance_t transposition = v_two_ago[j - 1] + BASE_COST;
- cheapest = MIN (cheapest, transposition);
+ edit_distance_t transposition = v_two_ago[j - 1] + base_cost;
+ cheapest = std::min (cheapest, transposition);
}
  v_next[j + 1] = cheapest;
}
@@ -187,8 +192,8 @@ find_closest_string (const char *target,
 edit_distance_t
 get_edit_distance_cutoff (size_t goal_len, size_t candidate_len)
 {
-  size_t max_length = MAX (goal_len, candidate_len);
-  size_t min_length = MIN (goal_len, candidate_len);
+  size_t max_length = std::max (goal_len, candidate_len);
+  size_t min_length = std::min (goal_len, candidate_len);
 
   gcc_assert (max_length >= min_length);
 
@@ -200,11 +205,11 @@ get_edit_distance_cutoff (size_t goal_len, size_t candidate_len)
   /* If the lengths are close, then round down.  */
   if (max_length - min_length <=

Re: [PATCH 2/2] libstdc++: Conditionalize LWG 3569 changes to join_view

2025-07-15 Thread Tomasz Kaminski

On Tue, Jul 15, 2025 at 5:43 PM Patrick Palka  wrote:

> On Tue, 15 Jul 2025, Tomasz Kaminski wrote:
>
> >
> >
> > On Tue, Jul 15, 2025 at 3:55 PM Patrick Palka  wrote:
> >   On Tue, 15 Jul 2025, Tomasz Kaminski wrote:
> >
> >   > On Tue, Jul 15, 2025 at 5:51 AM Patrick Palka  wrote:
> >   >   Tested on x86_64-pc-linux-gnu, does this look OK for trunk only
> >   >   (since it impacts ABI)?
> >   >
> >   > In theory an Iterator that meets all semantic requirements of the input_iterator
> >   > concept, could provide a default constructor that is unconstrained, but ill-formed
> >   > when invoked. This can be easily done accidentally, by having a default member initializer.
> >   >
> >   > #include <concepts>
> >   >
> >   > struct NoDefault
> >   > { NoDefault(int); };
> >   >
> >   > template<typename T>
> >   > struct Iterator {
> >   >   T x = T();
> >   > };
> >   >
> >   > static_assert(std::default_initializable<Iterator<NoDefault>>);
> >   >
> >   > Default member initializers are not in immediate context, and checking "std::default_initializable"
> >   > is ill-formed. clang emits error here: https://godbolt.org/z/EafKn6h16
> >   >
> >   > You can, however, do this optimization for forward_iterator. The difference here is that user-defined
> >   > iterators provide an iterator_category/iterator_concept that maps to forward_iterator_tag or stronger,
> >   > so we can check default_initializable.
> >
> >   Good point...  But it seems this is not only an issue in join_view (with
> >   this patch), we already require elsewhere in <ranges> that the default
> >   ctor of an input iterator is properly constrained:
> >
> > filter_view, transform_view, elements_view, stride_view, enumerate_view,
> > to_input_view (and perhaps iota_view also counts)
> >
> > Could you please elaborate? I understand that for these views, the default constructor
> > requires that the iterator is default_initializable, but if you never call this constructor,
> > the view will function correctly for my not-properly-constrained iterator.
> >
> > However, in this case the standard does not impose a default_initializable requirement
> > on the iterator when we are incrementing it.
>
> I might be confused, is supporting such underconstrained iterators
> a QoI issue or a correctness issue?
>
> If it's QoI, note that before P2325R3 (approved June 2021) even C++20
> input iterators were required to satisfy default_initializable, so I
> reckon it's quite rare to see C++20 input iterator written after that
> with an underconstrained default ctor.  I'm not sure supporting them is
> worth the tradeoff of pessimizing join_view for input-only iterators.
>
> If it's a correctness issue, I definitely agree with your changes :)
>
I think that this is a correctness issue. LWG 3569 removed the
default_initializable requirement on the inner iterator for join_view, and
this has two aspects:
* we do not need to satisfy default_initializable (i.e.
  default_initializable returning true)
* we do not need to model default_initializable (the semantic requirements,
  in particular default_initializable not lying, i.e. when it is true, the
  constructor is well-formed)
So, I believe the standard now requires a not-properly-constrained
input_iterator to work, i.e. we cannot add a default_initializable check.
Adding a default_initializable check would, for me, mean partially reverting
the issue implementation.

The corresponding wording is here:
https://eel.is/c++draft/res.on.requirements#2
> If the validity or meaning of a program depends on whether a sequence of
> template arguments models a concept, and the concept is satisfied but not
> modeled, the program is ill-formed, no diagnostic required.

That means if we say requires or Constraints, you cannot have a lying
concept.

However, despite lacking explicit wording, we believe that implementations
are allowed
to promote concepts, i.e. check forward_iterator even if functions require
only input_iterator.

I saw your updated patch, I will check it tomorrow.

Hope this makes sense,
Tomasz

>
> >
> >
> >   And so these views would break similarly if the default ctor is
> >   underconstrained IIUC.  I don't see why we'd want to start caring about
> >   such iterators in join_view, if other fundamental views already don't?
> >
> >   >
> >   >
> >   >
> >   >   -- >8 --
> >   >
> >   >   LWG 3569 adjusted join_view's iterator to handle adapting
> >   >   non-default-constructible (input) iterators by wrapping the
> >   >   corresponding data member with std::optional, and we followed suit in
> >   >   r13-2649-g7aa80c82ecf3a3.
> >   >
> >   >   But this wrapping is unnecessary for iterators that are already
> >   >   default-constructible.  Rather than uncondit

Re: [PATCH] c, c++: Extend -Wunused-but-set-* warnings [PR44677]

2025-07-15 Thread Jason Merrill


On 7/15/25 9:04 AM, Jakub Jelinek wrote:

On Tue, Jul 15, 2025 at 08:21:50AM -0400, Jason Merrill wrote:

Given the above that seems rather unlikely, but I suppose it's fine if you
want to do it that way.  The patch is OK either way.


Committed just the v2 patch.  I can test your patch next with other patches,
or do you want to test/commit it yourself?


I can do it, thanks.

Jason

Re: [PATCH 1/2] aarch64: Fix predication of FP8 FDOT insns [PR120986]

2025-07-15 Thread Kyrylo Tkachov

Hi Alex,

> On 15 Jul 2025, at 14:59, Alex Coplan  wrote:
> 
> Hi,
> 
> The predication of the SVE2 FP8 dot product insns was relying on the
> architectural dependency:
> 
> FEAT_FP8DOT2 => FEAT_FP8DOT4
> 
> which was relaxed in GCC as of
> r15-7480-g299a8e2dc667e795991bc439d2cad5ea5bd379e2, thus leading to
> unrecognisable insn ICEs when compiling a two-way FDOT with just
> +fp8dot2.  This patch fixes the predication of the insns to test for
> the correct feature bit depending on the value of the mode iterator.
> 
> Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk and backport
> to GCC 15?
> 
> Thanks,
> Alex
> 
> gcc/ChangeLog:
> 
> PR target/120986
> * config/aarch64/aarch64-sve2.md (@aarch64_sve_dot):
> Adjust insn predicate to use new mode attribute which checks
> for the correct feature bit depending on the mode.
> (@aarch64_sve_dot_lane): Likewise.
> * config/aarch64/iterators.md (HAVE_FP8_DOT_INSN): New.
> 
> gcc/testsuite/ChangeLog:
> 
> PR target/120986
> * gcc.target/aarch64/torture/pr120986-1.c: New test.
> ---
> gcc/config/aarch64/aarch64-sve2.md|  4 ++--
> gcc/config/aarch64/iterators.md   |  4 
> gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c | 10 ++
> 3 files changed, 16 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c
> 
> <0001-aarch64-Fix-predication-of-FP8-FDOT-insns-PR120986.patch>

+(define_mode_attr HAVE_FP8_DOT_INSN [(VNx4SF "TARGET_SSVE_FP8DOT4")
+(VNx8HF "TARGET_SSVE_FP8DOT2")])
+
+

I think a more natural way would be to have a new mode iterator with the same
contents as SVE_FULL_HSF, say SVE_FULL_HSF_FP8, but have the TARGET_* conditions
guarding those modes.
For example:
(define_mode_iterator SVE_FULL_HSF_FP8 [(VNx4SF "TARGET_SSVE_FP8DOT4")
					(VNx8HF "TARGET_SSVE_FP8DOT2")])

diff --git a/gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c b/gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c
new file mode 100644
index 000..8777f1b7711
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.2-a+sve2+fp8dot2" } */

Very minor, but since this bug is about arch gating rather than something
that’s hit with optimization, I don’t think there’s value in putting it in the
torture directory. I’d put it under the normal gcc.target/aarch64.

Ok with these changes if you agree.
Thanks,
Kyrill

Re: [PATCH] c++, libstdc++, v5: Implement C++26 P3068R5 - constexpr exceptions [PR117785]

2025-07-15 Thread Jakub Jelinek

On Tue, Jul 15, 2025 at 08:30:33AM -0400, Jason Merrill wrote:
> > The diagnostic only changed for C++26, and I'm not sure why (the _S_at
> > function isn't constexpr, so why wasn't it failing there before?)
> 
> In C++23, we could see that the outermost thing in the expression is a call
> to _M_error, and it's non-constexpr, so we give up at that point and never
> look at the call to _S_at.

Yeah.  Though, even in GNU++23 we could have
  foo (({ if (x == 42) return 14; 23; }), 12);
with non-constexpr foo, so maybe we could do that regardless of the C++ level.
We've never handled that before during constant evaluation though and now
handle it at least in some (for C++26 hopefully most or all) cases, say
  ({ if (x == 42) return 14; 23; }) + x
etc.

> In C++26, evaluating the arguments to a function (including the object
> argument) might throw before we get to calling the non-constexpr function,
> so we need to evaluate them before giving up.

There can be other reasons for different diagnostics between C++23 and C++26
related to the constexpr exceptions patch, e.g.
potential_constant_expression for C++26 can let some expressions through
as potentially constant because it has to assume some call could throw,
while in C++23 and earlier that would be impossible, and so something can be
diagnosed only later for C++26.

Jakub

Re: [PATCH] RISC-V: Improve bswap8 when zbb is enabled

2025-07-15 Thread Jeff Law





On 7/10/25 8:44 AM, Dusan Stojkovic wrote:

This peephole pattern combines the following instructions:
bswap8:
        rev8    a5,a0
     -> li      a4,-65536
     -> srai    a5,a5,32
     -> and     a5,a5,a4
     -> roriw   a5,a5,16
        and     a0,a0,a4
        or      a0,a0,a5
        sext.w  a0,a0
        ret

And emits this assembly:
bswap8:
        rev8    a5,a0
     -> li      a4,-65536
     -> srai    a5,a5,48
        and     a0,a0,a4
        or      a0,a0,a5
        sext.w  a0,a0
        ret

Since the load instruction is required for the rest of the test function
in the PR, the pattern conserves the load.

2025-07-10  Dusan Stojkovic  

 PR target/120920

gcc/ChangeLog:

 * config/riscv/peephole.md: New pattern.

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/zbb_bswap8.c: New test.

So I'm not sure this transformation is correct.


Let's consider the case where a5 has the value 0x at the 
"li" instruction.


a5 = 0xff00

li a4, -65536  // a4 = 0x
srai a5,a5,32  // a5 = 0xff00
and a5,a5,a4   // a5 = 0xff00
roriw a5,a5,16 // a5 = 0xff00

So with your change:

a5 = 0xff00
li a4, -65536 // a4 = 0x
srai a5,a5,48 // a5 = 0xf000

Am I missing something?
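
A minimal sketch of the concern, modelling the original srai/and/roriw
sequence and the proposed single srai on 64-bit register values.  The
helper names and the sample inputs are assumptions chosen for
illustration; they are not taken from the PR or from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Models RV64 `srai`: arithmetic shift right of a 64-bit register (1 <= n <= 63).
constexpr std::uint64_t srai(std::uint64_t v, unsigned n)
{
  std::uint64_t shifted = v >> n;
  if (v >> 63)                                  // negative: fill with ones
    shifted |= ~std::uint64_t{0} << (64 - n);
  return shifted;
}

// Models `roriw`: rotate the low 32 bits right (1 <= n <= 31), then
// sign-extend the 32-bit result to 64 bits.
constexpr std::uint64_t roriw(std::uint64_t v, unsigned n)
{
  std::uint32_t lo = static_cast<std::uint32_t>(v);
  std::uint32_t rot = (lo >> n) | (lo << (32 - n));
  return (rot & 0x80000000u) ? (0xffffffff00000000ULL | rot) : rot;
}

// The three marked instructions before the peephole (a4 comes from `li a4,-65536`).
constexpr std::uint64_t original_sequence(std::uint64_t a5)
{
  const std::uint64_t a4 = 0xffffffffffff0000ULL;
  a5 = srai(a5, 32);
  a5 &= a4;
  return roriw(a5, 16);
}

// The single shift emitted by the proposed peephole.
constexpr std::uint64_t proposed_sequence(std::uint64_t a5)
{
  return srai(a5, 48);
}

// Top bit set: the original zero-extends the high halfword, the new form sign-extends it.
static_assert(original_sequence(0xff00000000000000ULL) == 0x000000000000ff00ULL);
static_assert(proposed_sequence(0xff00000000000000ULL) == 0xffffffffffffff00ULL);
// Top bit clear: both forms agree.
static_assert(original_sequence(0x1234000000000000ULL)
              == proposed_sequence(0x1234000000000000ULL));
```

Under this model the two sequences only differ when bit 63 of a5 is set
at the "li", which is the case in question.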

Jeff

Re: [PATCH] aarch64: Use SVE2 NBSL for vector NOR and NAND for Advanced SIMD modes

2025-07-15 Thread Remi Machet


On 7/15/25 08:57, Kyrylo Tkachov wrote:
> Hi all,
>
> We already have patterns to use the NBSL instruction to implement vector
> NOR and NAND operations for SVE types and modes. It is straightforward to
> have similar patterns for the fixed-width Advanced SIMD modes as well, though
> it requires combine patterns without the predicate operand and an explicit 'Z'
> output modifier. This patch does so.
>
> So now for example we generate for:
>
> uint64x2_t nand_q(uint64x2_t a, uint64x2_t b) { return NAND(a, b); }
> uint64x2_t nor_q(uint64x2_t a, uint64x2_t b) { return NOR(a, b); }
>
> nand_q:
>  nbsl z0.d, z0.d, z1.d, z1.d
>  ret
>
> nor_q:
>  nbsl z0.d, z0.d, z1.d, z0.d
>  ret
>
> instead of the previous:
> nand_q:
>  and v0.16b, v0.16b, v1.16b
>  not v0.16b, v0.16b
>  ret
>
> nor_q:
>  orr v0.16b, v0.16b, v1.16b
>  not v0.16b, v0.16b
>  ret
>
> The tied operand requirements for NBSL mean that we can generate the MOVPRFX
> when the operands fall that way, but I guess having a 2-insn MOVPRFX form is
> not worse than the current 2-insn codegen at least, and the MOVPRFX can be
> fused by many cores.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for trunk?

Looks good to me.

Remi

> Thanks,
> Kyrill
>
> Signed-off-by: Kyrylo Tkachov 
>
> gcc/
>
>  * config/aarch64/aarch64-sve2.md (*aarch64_sve2_unpred_nor):
>  New define_insn.
>  (*aarch64_sve2_nand_unpred): Likewise.
>
> gcc/testsuite/
>
>  * gcc.target/aarch64/sve2/nbsl_nor_nand_neon.c: New test.
>

Re: [PATCH] c, c++: Extend -Wunused-but-set-* warnings [PR44677]

2025-07-15 Thread Jason Merrill


On 7/15/25 5:26 AM, Jakub Jelinek wrote:

On Mon, Jul 14, 2025 at 11:58:32PM -0400, Jason Merrill wrote:

Coming back to this comment, it still seems to me that we can generalize
this and ignore anything cast to void here, as in the below; after something
has been cast to void, it can no longer be read.  I don't get any
regressions from this simplification, either.

We might generalize to anything of void type, but I haven't tested that.


We do want to warn for b and not for a e.g. on -Wunused-but-set-variable
void
foo ()
{
   int a, b;
   a = 1;
   b = 2;
   (void) a;
}
Though, from what I can see, we warn correctly even with your patch.


Yes, because the cast itself marks a as read (in convert_to_void); we 
don't need to also look through the cast later.



I admit I don't know any longer what was the reason for all the special
cases in mark_exp_read and whether over the years some of them might be
unneeded.  We have some test coverage, but only limited (40KB of tests
in c-c++-common & g++.dg), plus sure full bootstrap/regtest does test it
further.

Could it be committed separately though, in case there is some regression
that bisection can find whether it is this simplification or the original
patch?


Given the above that seems rather unlikely, but I suppose it's fine if 
you want to do it that way.  The patch is OK either way.


Jason

Re: [PATCH] c++, libstdc++, v5: Implement C++26 P3068R5 - constexpr exceptions [PR117785]

2025-07-15 Thread Jason Merrill


On 7/15/25 7:49 AM, Jonathan Wakely wrote:

On Thu, 10 Jul 2025 at 09:06, Jakub Jelinek wrote:


On Wed, Jul 09, 2025 at 06:45:41PM -0400, Jason Merrill wrote:

+ && reduced_constant_expression_p (val))


And a value doesn't need to be constant to be printable, we should be able
to print it unconditionally.


Sure, the question is whether printing a non-constant value is better for users.
The change to use %qE unconditionally results in
/usr/src/gcc/gcc/testsuite/g++.dg/cpp26/constexpr-eh12.C:71:49: error: uncaught exception '(E*)(& heap )'
while previously it was
/usr/src/gcc/gcc/testsuite/g++.dg/cpp26/constexpr-eh12.C:71:49: error: uncaught exception of type 'E*'
I've kept the conditional for now, but if you really want that change, I can
remove it in the 2 spots and tweak the constexpr-eh12.C test's expectations.


Is there a reason not to add it to heap_vars here?


In the earlier patch I had:


+case CXA_ALLOCATE_EXCEPTION:

...

+ tree var = cxa_allocate_exception (loc, type, size_zero_node);
+ ctx->global->heap_vars.safe_push (var);
+ ctx->global->put_value (var, NULL_TREE);

vs.

+case CXA_BAD_CAST:
+case CXA_BAD_TYPEID:
+case CXA_THROW_BAD_ARRAY_NEW_LENGTH:

...

+ tree var = cxa_allocate_exception (loc, type, size_one_node);
+ tree ctor
+   = build_special_member_call (var, complete_ctor_identifier,
+NULL, type, LOOKUP_NORMAL,
+ctx->quiet ? tf_none
+: tf_warning_or_error);
+ if (ctor == error_mark_node)
+   {
+ *non_constant_p = true;
+ return call;
+   }
+ ctx->global->heap_vars.safe_push (var);


and so it wasn't added to heap_vars if build_special_member_call failed.
But thinking about it now, we really don't care about what is in heap_vars
if *non_constant_p, so I've made the change.

The rest incorporated into the following version of the patch, passes
GXX_TESTSUITE_STDS=98,11,14,17,20,23,26 make check-g++ 
RUNTESTFLAGS="--target_board=unix/-fpie dg.exp='constexpr-eh* constexpr-asm-5.C 
static_assert1.C feat-cxx26.C constexpr-throw.C constexpr-84192.C constexpr-dynamic*.C 
consteval34.C constexpr-new27.C consteval-memfn1.C constexpr-typeid5.C 
constexpr-ellipsis*'"
so far.

2025-07-10  Jakub Jelinek  

 PR c++/117785
gcc/c-family/
 * c-cppbuiltin.cc (c_cpp_builtins): Predefine
 __cpp_constexpr_exceptions=202411L for C++26.
gcc/cp/
 * constexpr.cc: Implement C++26 P3068R5 - constexpr exceptions.
 (class constexpr_global_ctx): Add caught_exceptions and
 uncaught_exceptions members.
 (constexpr_global_ctx::constexpr_global_ctx): Initialize
 uncaught_exceptions.
 (returns, breaks, continues, switches): Move earlier.
 (throws): New function.
 (exception_what_str, diagnose_std_terminate,
 diagnose_uncaught_exception): New functions.
 (enum cxa_builtin): New type.
 (cxx_cxa_builtin_fn_p, cxx_eval_cxa_builtin_fn): New functions.
 (cxx_eval_builtin_function_call): Add jump_target argument.  Call
 cxx_eval_cxa_builtin_fn for __builtin_eh_ptr_adjust_ref.  Adjust
 cxx_eval_constant_expression calls, if it results in jmp_target,
 set *jump_target to it and return.
 (cxx_bind_parameters_in_call): Add jump_target argument.  Pass
 it through to cxx_eval_constant_expression.  If it sets *jump_target,
 break.
 (fold_operand): Adjust cxx_eval_constant_expression caller.
 (cxx_eval_assert): Likewise.  If it set jmp_target, return true.
 (cxx_eval_internal_function): Add jump_target argument.  Pass it
 through to cxx_eval_constant_expression.  Return early if
 *jump_target after recursing on args.
 (cxx_eval_dynamic_cast_fn): Likewise.  Don't set reference_p for
 C++26 with -fexceptions.
 (cxx_eval_thunk_call): Add jump_target argument.  Pass it through
 to cxx_eval_constant_expression.
 (cxx_set_object_constness): Likewise.  Don't set TREE_READONLY if
 throws (jump_target).
 (cxx_eval_call_expression): Add jump_target argument.  Pass it
 through to cxx_eval_internal_function, cxx_eval_builtin_function_call,
 cxx_eval_thunk_call, cxx_eval_dynamic_cast_fn and
 cxx_set_object_constness.  Pass it through also
 cxx_eval_constant_expression on arguments, cxx_bind_parameters_in_call
 and cxx_fold_indirect_ref and for those cases return early if
 *jump_target.  Call cxx_eval_cxa_builtin_fn for cxx_cxa_builtin_fn_p
 functions.  For cxx_eval_constant_expression on body, pass address of
 cleared jmp_target automatic variable, if it throws propagate to
 *jump_target and make it non-cacheable.  For C++26 don't diagnose
 calls to non-constexpr functions before cxx_bind_parameters_in_call
 could report some

[committed] libstdc++: Tweak dg-error patterns for C++26 constexpr changes

2025-07-15 Thread Jonathan Wakely

libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/copy/debug/constexpr_neg.cc:
* testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc:
* testsuite/25_algorithms/equal/debug/constexpr_neg.cc:
* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc:
* testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc:
* testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc:
* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_neg.cc:
* testsuite/25_algorithms/upper_bound/debug/constexpr_partitioned_pred_neg.cc:
* testsuite/25_algorithms/upper_bound/debug/constexpr_valid_range_neg.cc:
---

Tested powerpc64le-linux. Pushed to trunk.

 .../testsuite/25_algorithms/copy/debug/constexpr_neg.cc | 2 +-
 .../25_algorithms/copy_backward/debug/constexpr_neg.cc  | 2 +-
 .../testsuite/25_algorithms/equal/debug/constexpr_neg.cc| 2 +-
 .../lower_bound/debug/constexpr_partitioned_neg.cc  | 2 +-
 .../lower_bound/debug/constexpr_partitioned_pred_neg.cc | 2 +-
 .../lower_bound/debug/constexpr_valid_range_neg.cc  | 2 +-
 .../upper_bound/debug/constexpr_partitioned_neg.cc  | 2 +-
 .../upper_bound/debug/constexpr_partitioned_pred_neg.cc | 2 +-
 .../upper_bound/debug/constexpr_valid_range_neg.cc  | 2 +-
 9 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/debug/constexpr_neg.cc 
b/libstdc++-v3/testsuite/25_algorithms/copy/debug/constexpr_neg.cc
index 0e80977ecc5a..384052477717 100644
--- a/libstdc++-v3/testsuite/25_algorithms/copy/debug/constexpr_neg.cc
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/debug/constexpr_neg.cc
@@ -33,7 +33,7 @@ test1()
 }
 
 static_assert(test1()); // { dg-error "non-constant condition" }
-// { dg-error "_Error_formatter::_M_error()" "" { target *-*-* } 0 }
+// { dg-error "_Error_formatter::(_M_error|_S_at)" "" { target *-*-* } 0 }
 
 constexpr bool
 test2()
diff --git 
a/libstdc++-v3/testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc 
b/libstdc++-v3/testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc
index 410c235adf9b..d5d84b1e290e 100644
--- a/libstdc++-v3/testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc
+++ b/libstdc++-v3/testsuite/25_algorithms/copy_backward/debug/constexpr_neg.cc
@@ -35,4 +35,4 @@ test()
 
 static_assert(test()); // { dg-error "non-constant condition" }
 
-// { dg-prune-output "_Error_formatter::_M_error()" }
+// { dg-prune-output "_Error_formatter::(_M_error|_S_at)" }
diff --git a/libstdc++-v3/testsuite/25_algorithms/equal/debug/constexpr_neg.cc 
b/libstdc++-v3/testsuite/25_algorithms/equal/debug/constexpr_neg.cc
index cbc75092f145..6c1531d42127 100644
--- a/libstdc++-v3/testsuite/25_algorithms/equal/debug/constexpr_neg.cc
+++ b/libstdc++-v3/testsuite/25_algorithms/equal/debug/constexpr_neg.cc
@@ -32,7 +32,7 @@ test01()
 }
 
 static_assert(test01()); // { dg-error "non-constant condition" }
-// { dg-error "_Error_formatter::_M_error()" "" { target *-*-* } 0 }
+// { dg-error "_Error_formatter::(_M_error|_S_at)" "" { target *-*-* } 0 }
 
 constexpr bool
 test02()
diff --git 
a/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc
 
b/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc
index c07145c19269..b44cb4be1d90 100644
--- 
a/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc
+++ 
b/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_neg.cc
@@ -43,5 +43,5 @@ test()
 
 static_assert(test()); // { dg-error "" }
 
-// { dg-prune-output "_Error_formatter::_M_error()" }
+// { dg-prune-output "_Error_formatter::(_M_error|_S_at)" }
 // { dg-prune-output "in 'constexpr'" }
diff --git 
a/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc
 
b/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc
index 09ae26f9b984..7835b30a0e49 100644
--- 
a/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc
+++ 
b/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_partitioned_pred_neg.cc
@@ -33,4 +33,4 @@ test()
 
 static_assert(test()); // { dg-error "" }
 
-// { dg-prune-output "_Error_formatter::_M_error()" }
+// { dg-prune-output "_Error_formatter::(_M_error|_S_at)" }
diff --git 
a/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc
 
b/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc
index 20eb026e728c..911880b59aa4 100644
--- 
a/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc
+++ 
b/libstdc++-v3/testsuite/25_algorithms/lower_bound/debug/constexpr_valid_range_neg.cc
@@ -46,5 +46,5 @@ test2()
 
 static_assert(test2()); // { dg-error "" }
 
-// {

Re: ICE with new IMPORT feature

2025-07-15 Thread Jerry Delisle

Yes Ok to push. Thanks. Jerry

On Tue, Jul 15, 2025, 9:44 AM Paul Richard Thomas <
paul.richard.tho...@gmail.com> wrote:

> Dear All,
>
> The failure that Steve mentioned below is fixed. ChangeLogs and patch are
> self-describing.
>
> Regtests fine - OK for mainline?
>
> Paul
>
> On Sat, 12 Jul 2025 at 19:57, Steve Kargl <
> s...@troutmask.apl.washington.edu> wrote:
>
>> All, Paul,
>>
>> In testing Paul's recent addition of support for IMPORT,
>> I have uncovered an ICE due to mangled source code.  The
>> code leads to a NULL pointer dereference.  The patch that
>> follows my .sig fixes the issue.  Note the testcase has one
>> FAIL.
>>
>> % gmake check-fortran RUNTESTFLAGS="dg.exp=import13.f90"
>>   ...
>> Running /home/kargl/gcc/gcc/gcc/testsuite/gfortran.dg/dg.exp ...
>> FAIL: gfortran.dg/import13.f90   -O  (test for excess errors)
>> ...
>>
>>

Re: [PATCH, V3] Add -mcpu=future to the PowerPC

2025-07-15 Thread Segher Boessenkool

Hi!

On Tue, Jul 01, 2025 at 12:14:32PM -0400, Michael Meissner wrote:
> This patch adds the support that can be used in developing GCC support
> for potential future PowerPC processors.
> 
> These changes are being added so that hardware designers can evaluate
> potential new features to be added to the PowerPC processors in the
> future.  It may be these features will be incorporated into real
> hardware using a different name in the future. Or it may be these
> features will not be incorporated into actual PowerPC hardware in the
> future.

None of this belongs in the commit message.  In the future please post
stuff exactly as you want it to be committed!

>   * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): If
>   -mcpu=future, define _ARCH_FUTURE.
>   * config/rs6000/rs6000-cpus.def (FUTURE_MASKS_SERVER): New macro.
>   (POWERPC_MASKS): Add OPTION_MASK_FUTURE.
>   (future cpu): Add 'future' cpu.
>   * config/rs6000/rs6000-tables.opt: Regenerate.
>   * config/rs6000/rs6000.opt (-mfuture-internal): New internal option to
>   indicate the user used -mcpu=future.

NAK, as said before.

If the user wants to see what rs6000_cpu is set to, the user should just
look at rs6000_cpu, don't go via more indirect roundabout ways.

> +/* At present, we are not providing a unique processor option for -mcpu=future.

What does this comment mean even?

> +   Until we define these changes, just use the power11 defaults.  */
> +RS6000_CPU ("future", PROCESSOR_POWER11, MASK_POWERPC64 | FUTURE_MASKS_SERVER)

It isn't talking about this code for sure.

> +;; Users should not use -mfuture-internal, but we need to use a bit to identify
> +;; when the user changes the default cpu via #pragma GCC target("cpu=future")
> +;; and then resets it later.

What does that mean?  It hints at a bug elsewhere, but not more than
hints unfortunately.


Segher

Re: [PATCH] aarch64: Use SVE2 BSL2N for vector EON

2025-07-15 Thread Richard Sandiford

Kyrylo Tkachov  writes:
> Hi all,
>
> SVE2 BSL2N (x, y, z) = (x & z) | (~y & ~z). When x == y this computes:
> (x & z) | (~x & ~z) which is ~(x ^ z).
> Thus, we can use it to match RTL patterns (not (xor (...) (...))) for both
> Advanced SIMD and SVE modes when TARGET_SVE2.
> This patch does that. The tied register requirements of BSL2N and the MOVPRFX
> rules mean we can't use the MOVPRFX form here so I have not included that
> alternative. Correct me if I'm wrong on this.

I think we can still use BSL2N, similarly to the patch from the other day.
It's just that the asm would need to be:

 movprfx\t%0, %1\;bsl2n\t%0, %0, %1, %2

(with constraints &w/w/w).

Thanks,
Richard

>
> For code like:
>
> uint64x2_t eon_q(uint64x2_t a, uint64x2_t b) { return EON(a, b); }
> svuint64_t eon_z(svuint64_t a, svuint64_t b) { return EON(a, b); }
> svuint64_t eon_z_mp(svuint64_t c, svuint64_t a, svuint64_t b) { return EON(a, b); }
>
> We now generate:
> eon_q:
>   bsl2n z0.d, z0.d, z0.d, z1.d
>   ret
>
> eon_z:
>   bsl2n z0.d, z0.d, z0.d, z1.d
>   ret
>
> eon_z_mp:
>   bsl2n z1.d, z1.d, z1.d, z2.d
>   mov z0.d, z1.d
>   ret
>
> instead of the previous:
> eon_q:
>   eor v0.16b, v0.16b, v1.16b
>   not v0.16b, v0.16b
>   ret
>
> eon_z:
>   eor z0.d, z0.d, z1.d
>   ptrue p3.b, all
>   not z0.d, p3/m, z0.d
>   ret
>
> eon_z_mp:
>   eor z0.d, z1.d, z2.d
>   ptrue p3.b, all
>   not z0.d, p3/m, z0.d
>   ret
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for trunk?
> Thanks,
> Kyrill
>
> Signed-off-by: Kyrylo Tkachov 
>
> gcc/
>
>   * config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl2n_eon):
>   New pattern.
>   (*aarch64_sve2_eon_bsl2n_unpred): Likewise.
>
> gcc/testsuite/
>
>   * gcc.target/aarch64/sve2/eon_bsl2n.c: New test.
>
> From 761b14804c8bbeae745cb7a2ab58e26a3775b096 Mon Sep 17 00:00:00 2001
> From: Kyrylo Tkachov 
> Date: Fri, 11 Jul 2025 07:23:16 -0700
> Subject: [PATCH 2/2] aarch64: Use SVE2 BSL2N for vector EON
>
> SVE2 BSL2N (x, y, z) = (x & z) | (~y & ~z). When x == y this computes:
> (x & z) | (~x & ~z) which is ~(x ^ z).
> Thus, we can use it to match RTL patterns (not (xor (...) (...))) for both
> Advanced SIMD and SVE modes when TARGET_SVE2.
> This patch does that.  The tied register requirements of BSL2N and the MOVPRFX
> rules mean we can't use the MOVPRFX form here so I have not included that
> alternative.  Correct me if I'm wrong on this.
>
> For code like:
>
> uint64x2_t eon_q(uint64x2_t a, uint64x2_t b) { return EON(a, b); }
> svuint64_t eon_z(svuint64_t a, svuint64_t b) { return EON(a, b); }
> svuint64_t eon_z_mp(svuint64_t c, svuint64_t a, svuint64_t b) { return EON(a, b); }
>
> We now generate:
> eon_q:
> bsl2n   z0.d, z0.d, z0.d, z1.d
> ret
>
> eon_z:
> bsl2n   z0.d, z0.d, z0.d, z1.d
> ret
>
> eon_z_mp:
> bsl2n   z1.d, z1.d, z1.d, z2.d
> mov z0.d, z1.d
> ret
>
> instead of the previous:
> eon_q:
> eor v0.16b, v0.16b, v1.16b
> not v0.16b, v0.16b
> ret
>
> eon_z:
> eor z0.d, z0.d, z1.d
> ptrue   p3.b, all
> not z0.d, p3/m, z0.d
> ret
>
> eon_z_mp:
> eor z0.d, z1.d, z2.d
> ptrue   p3.b, all
> not z0.d, p3/m, z0.d
> ret
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Signed-off-by: Kyrylo Tkachov 
>
> gcc/
>
>   * config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl2n_eon):
>   New pattern.
>   (*aarch64_sve2_eon_bsl2n_unpred): Likewise.
>
> gcc/testsuite/
>
>   * gcc.target/aarch64/sve2/eon_bsl2n.c: New test.
> ---
>  gcc/config/aarch64/aarch64-sve2.md| 28 ++
>  .../gcc.target/aarch64/sve2/eon_bsl2n.c   | 52 +++
>  2 files changed, 80 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/eon_bsl2n.c
>
> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
> b/gcc/config/aarch64/aarch64-sve2.md
> index 6d6dc94cd81..a011b947f51 100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -2053,6 +2053,34 @@
>}
>  )
>  
> +;; Vector EON (~(x ^ y)) using BSL2N.
> +(define_insn_and_rewrite "*aarch64_sve2_bsl2n_eon"
> +  [(set (match_operand:SVE_FULL_I 0 "register_operand" "=w")
> + (unspec:SVE_FULL_I
> +   [(match_operand 3)
> +(not:SVE_FULL_I
> +  (xor:SVE_FULL_I
> + (match_operand:SVE_FULL_I 1 "register_operand" "%0")
> + (match_operand:SVE_FULL_I 2 "register_operand" "w")))]
> + UNSPEC_PRED_X))]
> +  "TARGET_SVE2"
> +  "bsl2n\t%0.d, %0.d, %0.d, %2.d"
> +  "&& !CONSTANT_P (operands[3])"
> +  {
> +operands[3] = CONSTM1_RTX (mode);
> +  }
> +)
> +
> +(define_insn "*aarch64_sve2_eon_bsl2n_unpred"
> +  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
> +   (not:VDQ_I
> + (xor:VDQ_I
> +   (match_operand:VDQ_I 1 "register_operand" "%0

Re: [PATCH] c++, libstdc++, v5: Implement C++26 P3068R5 - constexpr exceptions [PR117785]

2025-07-15 Thread Jonathan Wakely

On Tue, 15 Jul 2025 at 13:55, Jakub Jelinek  wrote:
>
> On Tue, Jul 15, 2025 at 08:30:33AM -0400, Jason Merrill wrote:
> > > The diagnostic only changed for C++26, and I'm not sure why (the _S_at
> > > function isn't constexpr, so why wasn't it failing there before?)
> >
> > In C++23, we could see that the outermost thing in the expression is a call
> > to _M_error, and it's non-constexpr, so we give up at that point and never
> > look at the call to _S_at.
>
> Yeah.  Though, even in GNU++23 we could have
>   foo (({ if (x == 42) return 14; 23; }), 12);
> with non-constexpr foo, so maybe we could do that regardless of the C++ level.
> We've never handled that before during constant evaluation though and now
> handle it at least in some (for C++26 hopefully most or all) cases, say
>   ({ if (x == 42) return 14; 23; }) + x
> etc.
>
> > In C++26, evaluating the arguments to a function (including the object
> > argument) might throw before we get to calling the non-constexpr function,
> > so we need to evaluate them before giving up.
>
> There can be other reasons for different diagnostics between C++23 and C++26
> related to the constexpr exceptions patch, e.g.
> potential_constant_expression for C++26 can let some expressions through
> as potentially constant because it has to assume some call could throw,
> while in C++23 and earlier that would be impossible, and so something can be
> diagnosed only later for C++26.


OK, thank you both, I'm happy to just tweak the dg-error patterns now then.

[PATCH v2] libstdc++: Conditionalize LWG 3569 changes to join_view

2025-07-15 Thread Patrick Palka

Tested on x86_64-pc-linux-gnu, does this look OK for trunk only
(since it impacts ABI)?

Changes in v2:

 - Condition on forward_iterator instead of default_initializable.

-- >8 --

LWG 3569 adjusted join_view's iterator specification to handle non
default-constructible iterators by wrapping the corresponding data member
in std::optional, and we followed suit in r13-2649-g7aa80c82ecf3a3.

But this wrapping is unnecessary for iterators that are already
default-constructible.  Rather than unconditionally using std::optional
here, which introduces time/space overhead, this patch conditionalizes
our LWG 3569 changes on the iterator in question being non-forward (and
thus non default-constructible).  We check forwardness instead of
default-constructibility in order to accommodate input-only iterators
whose default constructor might be underconstrained.
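
As a rough illustration of the storage choice described above, here is a
standalone C++17 sketch.  It is not the libstdc++ code: is_forward_v,
input_only_iter and inner_holder are hypothetical names, and the
iterator category tag stands in for the forward_iterator concept.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <optional>
#include <type_traits>

// Hypothetical C++17 stand-in for the forward_iterator concept.
template<typename It>
inline constexpr bool is_forward_v
  = std::is_base_of_v<std::forward_iterator_tag,
                      typename std::iterator_traits<It>::iterator_category>;

// Hypothetical input-only iterator that has no default constructor.
struct input_only_iter
{
  using iterator_category = std::input_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = const int*;
  using reference = const int&;

  explicit input_only_iter(const int* p) : ptr(p) { }
  const int& operator*() const { return *ptr; }

  const int* ptr;
};

template<typename It>
struct inner_holder
{
  // Store It directly when it is forward; otherwise wrap it in optional.
  std::conditional_t<is_forward_v<It>, It, std::optional<It>> inner{};

  // Uniform access to the stored iterator, whichever representation is used.
  It& get()
  {
    if constexpr (is_forward_v<It>)
      return inner;
    else
      return *inner;
  }

  // Return to the "no inner iterator" state.
  void reset()
  {
    if constexpr (is_forward_v<It>)
      inner = It();
    else
      inner.reset();
  }
};

static_assert(std::is_same_v<decltype(inner_holder<const int*>::inner),
                             const int*>);
static_assert(std::is_same_v<decltype(inner_holder<input_only_iter>::inner),
                             std::optional<input_only_iter>>);
```

The forward case pays nothing beyond the iterator itself, while the
input-only case keeps the optional wrapper.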

libstdc++-v3/ChangeLog:

* include/std/ranges (join_view::_Iterator::_M_satisfy):
Adjust to handle non-std::optional _M_inner as per before LWG 3569.
(join_view::_Iterator::_M_get_inner): New.
(join_view::_Iterator::_M_inner): Don't wrap in std::optional if
the iterator is forward.  Initialize.
(join_view::_Iterator::operator*): Use _M_get_inner instead
of *_M_inner.
(join_view::_Iterator::operator++): Likewise.
(join_view::_Iterator::iter_move): Likewise.
(join_view::_Iterator::iter_swap): Likewise.
---
 libstdc++-v3/include/std/ranges | 49 +
 1 file changed, 37 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index efe62969d657..c9dc25ee52ef 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -2971,7 +2971,12 @@ namespace views::__adaptor
  }
 
if constexpr (_S_ref_is_glvalue)
- _M_inner.reset();
+ {
+   if constexpr (forward_iterator<_Inner_iter>)
+ _M_inner = _Inner_iter();
+   else
+ _M_inner.reset();
+ }
  }
 
  static constexpr auto
@@ -3011,6 +3016,24 @@ namespace views::__adaptor
  return *_M_parent->_M_outer;
  }
 
+ constexpr _Inner_iter&
+ _M_get_inner()
+ {
+   if constexpr (forward_iterator<_Inner_iter>)
+ return _M_inner;
+   else
+ return *_M_inner;
+ }
+
+ constexpr const _Inner_iter&
+ _M_get_inner() const
+ {
+   if constexpr (forward_iterator<_Inner_iter>)
+ return _M_inner;
+   else
+ return *_M_inner;
+ }
+
  constexpr
  _Iterator(_Parent* __parent, _Outer_iter __outer) requires 
forward_range<_Base>
: _M_outer(std::move(__outer)), _M_parent(__parent)
@@ -3024,7 +3047,9 @@ namespace views::__adaptor
  [[no_unique_address]]
__detail::__maybe_present_t, _Outer_iter> 
_M_outer
  = decltype(_M_outer)();
- optional<_Inner_iter> _M_inner;
+ __conditional_t,
+ _Inner_iter, optional<_Inner_iter>> _M_inner
+   = decltype(_M_inner)();
  _Parent* _M_parent = nullptr;
 
public:
@@ -3048,7 +3073,7 @@ namespace views::__adaptor
 
  constexpr decltype(auto)
  operator*() const
- { return **_M_inner; }
+ { return *_M_get_inner(); }
 
  // _GLIBCXX_RESOLVE_LIB_DEFECTS
  // 3500. join_view::iterator::operator->() is bogus
@@ -3056,7 +3081,7 @@ namespace views::__adaptor
  operator->() const
requires __detail::__has_arrow<_Inner_iter>
  && copyable<_Inner_iter>
- { return *_M_inner; }
+ { return _M_get_inner(); }
 
  constexpr _Iterator&
  operator++()
@@ -3067,7 +3092,7 @@ namespace views::__adaptor
  else
return *_M_parent->_M_inner;
}();
-   if (++*_M_inner == ranges::end(__inner_range))
+   if (++_M_get_inner() == ranges::end(__inner_range))
  {
++_M_get_outer();
_M_satisfy();
@@ -3097,9 +3122,9 @@ namespace views::__adaptor
  {
if (_M_outer == ranges::end(_M_parent->_M_base))
  _M_inner = ranges::end(__detail::__as_lvalue(*--_M_outer));
-   while (*_M_inner == ranges::begin(__detail::__as_lvalue(*_M_outer)))
- *_M_inner = ranges::end(__detail::__as_lvalue(*--_M_outer));
-   --*_M_inner;
+   while (_M_get_inner() == 
ranges::begin(__detail::__as_lvalue(*_M_outer)))
+ _M_get_inner() = ranges::end(__detail::__as_lvalue(*--_M_outer));
+   --_M_get_inner();
return *this;
  }
 
@@ -3126,14 +3151,14 @@ namespace views::__adaptor
 
  friend constexpr decltype(auto)
  iter_move(const _Iterator& __i)
- noexce

[PATCH] libgcc: Fix aarch64 build

2025-07-15 Thread Andrew Pinski

For aarch64, libgcc is built with -Werror, after the latest
-Wunused-but-set* commit (r16-2258-g0eac9cfee8cb0b21d), a new warning
showed up:
```
../../../gcc/libgcc/config/libbid/bid_binarydecimal.c: In function
‘__binary32_to_bid128’:
../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:130:31: error:
variable ‘c3’ set but not used [-Werror=unused-but-set-variable=]
  130 | { unsigned long long c0,c1,c2,c3;   \
  |   ^~
../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:146842:5: note:
in expansion of macro ‘__mul_10x256_to_256’
146842 | __mul_10x256_to_256 (z.w[5], z.w[4], z.w[3], z.w[2],
z.w[5], z.w[4],
   | ^~~
```

This fixes it by casting c3 to void after the last __mul_10x64 in
__mul_10x256_to_256 macro to mark it as being "used".
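
A reduced sketch of the same idiom, for illustration only.  MUL10_STEP
and mul10_two_limbs are hypothetical stand-ins and not the libbid
macros; the point is just that the last carry is written by the macro,
never read, and then explicitly cast to void.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one multiply-by-10 step on a 16-bit limb:
// writes the low 16 bits to `out` and the overflow to `carry_out`.
#define MUL10_STEP(out, carry_out, in, carry_in)        \
  do {                                                  \
    std::uint64_t t_ = (in) * 10u + (carry_in);         \
    (out) = t_ & 0xffffu;                               \
    (carry_out) = t_ >> 16;                             \
  } while (0)

// Multiplies a value held as two 16-bit limbs by 10, dropping any overflow
// out of the top limb.
inline void mul10_two_limbs(std::uint64_t& p0, std::uint64_t& p1,
                            std::uint64_t a0, std::uint64_t a1)
{
  std::uint64_t c0, c1;
  MUL10_STEP(p0, c0, a0, 0u);
  MUL10_STEP(p1, c1, a1, c0);
  (void) c1;  // final carry is intentionally unused; the cast counts as a read
}
```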


OK?

libgcc/config/libbid/ChangeLog:

* bid_binarydecimal.c (__mul_10x256_to_256): Mark c3 as being
used.

Signed-off-by: Andrew Pinski 
---
 libgcc/config/libbid/bid_binarydecimal.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libgcc/config/libbid/bid_binarydecimal.c 
b/libgcc/config/libbid/bid_binarydecimal.c
index daca2ffe306..12e32b9667a 100644
--- a/libgcc/config/libbid/bid_binarydecimal.c
+++ b/libgcc/config/libbid/bid_binarydecimal.c
@@ -132,6 +132,7 @@ UINT64 CY;  
\
   __mul_10x64(p1,c1,a1,c0); \
   __mul_10x64(p2,c2,a2,c1); \
   __mul_10x64(p3,c3,a3,c2); \
+  (void)c3;\
 }
 
 // Set up indices for low and high parts, depending on the endian-ness.
-- 
2.43.0

[Patch, fortran] PR121060 - ICE when argument is associate name created from type-bound operator result

2025-07-15 Thread Paul Richard Thomas

At first this PR looked as if it was going to be one of those horrible
type-bound procedure/operator tangles. However, it turned out that this was
entirely down to the attempt to extract the specific procedure for the
ASSOCIATE name during parsing. Returning NULL before expression resolution
fixed the problem.

Regtests fine - OK for mainline? I am inclined to backport, since the fix
is both simple and, at worst, defers the ICE. OK?

Paul


Change.Logs
Description: Binary data
diff --git a/gcc/fortran/interface.cc b/gcc/fortran/interface.cc
index f74fbf0f6e5..d08f683498d 100644
--- a/gcc/fortran/interface.cc
+++ b/gcc/fortran/interface.cc
@@ -4781,6 +4781,13 @@ matching_typebound_op (gfc_expr** tb_base,
 		gfc_actual_arglist* argcopy;
 		bool matches;
 
+		/* If expression matching comes here during parsing, eg. when
+		   parsing ASSOCIATE, generic TBPs have not yet been resolved
+		   and g->specific will not have been set. Wait for expression
+		   resolution by returning NULL.  */
+		if (!g->specific && !gfc_current_ns->resolved)
+		  return NULL;
+
 		gcc_assert (g->specific);
 		if (g->specific->error)
 		  continue;
diff --git a/gcc/testsuite/gfortran.dg/associate_75.f90 b/gcc/testsuite/gfortran.dg/associate_75.f90
new file mode 100644
index 000..c7c461a5cb6
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/associate_75.f90
@@ -0,0 +1,50 @@
+! { dg-do run }
+!
+! Test fix for PR121060.
+!
+! Contributed by Damian Rouson  
+!
+module subdomain_m
+  implicit none
+
+  type subdomain_t 
+real :: s_ = 99.
+  contains
+generic :: operator(.laplacian.) => laplacian
+procedure laplacian
+  end type
+
+contains
+
+  function laplacian(rhs)
+class(subdomain_t), intent(in) :: rhs
+type(subdomain_t) laplacian
+laplacian%s_ = rhs%s_ + 42
+  end function
+
+end module
+
+  use subdomain_m
+  implicit none
+
+  type operands_t
+real :: s_
+  end type
+
+  type(subdomain_t) phi
+  type(operands_t) operands
+
+  associate(laplacian_phi => .laplacian. phi) ! ICE because specific not found.
+operands = approximates(laplacian_phi%s_)
+  end associate
+
+  if (int (operands%s_) /= 42) stop 1
+contains
+
+  function approximates(actual)
+real actual 
+type(operands_t) approximates
+approximates%s_ = actual - 99
+  end function
+
+end

[PATCH] aarch64: Use SVE2 BSL2N for vector EON

2025-07-15 Thread Kyrylo Tkachov

Hi all,

SVE2 BSL2N (x, y, z) = (x & z) | (~y & ~z). When x == y this computes:
(x & z) | (~x & ~z) which is ~(x ^ z).
Thus, we can use it to match RTL patterns (not (xor (...) (...))) for both
Advanced SIMD and SVE modes when TARGET_SVE2.
This patch does that. The tied register requirements of BSL2N and the MOVPRFX
rules mean we can't use the MOVPRFX form here so I have not included that
alternative. Correct me if I'm wrong on this.
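
The identity at the start of this message can be checked with a small
scalar model.  This is only a sketch: bsl2n and eon below are
hypothetical helper functions operating on one 64-bit lane, using the
BSL2N formula exactly as written above.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of BSL2N as stated above: (x & z) | (~y & ~z).
constexpr std::uint64_t bsl2n(std::uint64_t x, std::uint64_t y, std::uint64_t z)
{
  return (x & z) | (~y & ~z);
}

// Scalar model of EON: ~(a ^ b).
constexpr std::uint64_t eon(std::uint64_t a, std::uint64_t b)
{
  return ~(a ^ b);
}

// The operation is bitwise, so one word whose low four bits enumerate every
// (x, z) bit combination covers the whole truth table: x = 0b0011, z = 0b0101.
static_assert(bsl2n(0x3, 0x3, 0x5) == eon(0x3, 0x5));
```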

For code like:

uint64x2_t eon_q(uint64x2_t a, uint64x2_t b) { return EON(a, b); }
svuint64_t eon_z(svuint64_t a, svuint64_t b) { return EON(a, b); }
svuint64_t eon_z_mp(svuint64_t c, svuint64_t a, svuint64_t b) { return EON(a, b); }

We now generate:
eon_q:
bsl2n z0.d, z0.d, z0.d, z1.d
ret

eon_z:
bsl2n z0.d, z0.d, z0.d, z1.d
ret

eon_z_mp:
bsl2n z1.d, z1.d, z1.d, z2.d
mov z0.d, z1.d
ret

instead of the previous:
eon_q:
eor v0.16b, v0.16b, v1.16b
not v0.16b, v0.16b
ret

eon_z:
eor z0.d, z0.d, z1.d
ptrue p3.b, all
not z0.d, p3/m, z0.d
ret

eon_z_mp:
eor z0.d, z1.d, z2.d
ptrue p3.b, all
not z0.d, p3/m, z0.d
ret

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov 

gcc/

* config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl2n_eon):
New pattern.
(*aarch64_sve2_eon_bsl2n_unpred): Likewise.

gcc/testsuite/

* gcc.target/aarch64/sve2/eon_bsl2n.c: New test.


0002-aarch64-Use-SVE2-BSL2N-for-vector-EON.patch
Description: 0002-aarch64-Use-SVE2-BSL2N-for-vector-EON.patch

[PATCH] aarch64: Use SVE2 NBSL for vector NOR and NAND for Advanced SIMD modes

2025-07-15 Thread Kyrylo Tkachov

Hi all,

We already have patterns to use the NBSL instruction to implement vector
NOR and NAND operations for SVE types and modes. It is straightforward to
have similar patterns for the fixed-width Advanced SIMD modes as well, though
it requires combine patterns without the predicate operand and an explicit 'Z'
output modifier. This patch does so.

So now for example we generate for:

uint64x2_t nand_q(uint64x2_t a, uint64x2_t b) { return NAND(a, b); }
uint64x2_t nor_q(uint64x2_t a, uint64x2_t b) { return NOR(a, b); }

nand_q:
nbsl z0.d, z0.d, z1.d, z1.d
ret

nor_q:
nbsl z0.d, z0.d, z1.d, z0.d
ret

instead of the previous:
nand_q:
and v0.16b, v0.16b, v1.16b
not v0.16b, v0.16b
ret

nor_q:
orr v0.16b, v0.16b, v1.16b
not v0.16b, v0.16b
ret

The tied operand requirements for NBSL mean that we can generate the MOVPRFX
when the operands fall that way, but I guess having a 2-insn MOVPRFX form is
not worse than the current 2-insn codegen at least, and the MOVPRFX can be
fused by many cores.
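
The operand choices in the nand_q and nor_q sequences above can be checked with a small scalar model in C++. The formula used for NBSL here, ~((x & z) | (y & ~z)), is my reading of the instruction (an inverted bitwise select) and is not stated in this mail, so treat it as an assumption; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed bitwise model of NBSL x, x, y, z: take bits of x where z is set,
// bits of y elsewhere, then invert the result.
constexpr std::uint64_t nbsl(std::uint64_t x, std::uint64_t y, std::uint64_t z) {
  return ~((x & z) | (y & ~z));
}

// "nbsl z0.d, z0.d, z1.d, z1.d": the select mask is the second input.
constexpr std::uint64_t nand_via_nbsl(std::uint64_t a, std::uint64_t b) {
  return nbsl(a, b, b);
}

// "nbsl z0.d, z0.d, z1.d, z0.d": the select mask is the first input.
constexpr std::uint64_t nor_via_nbsl(std::uint64_t a, std::uint64_t b) {
  return nbsl(a, b, a);
}

constexpr std::uint64_t kA = 0x0123456789abcdefULL;
constexpr std::uint64_t kB = 0xfedcba9876543210ULL;
static_assert(nand_via_nbsl(kA, kB) == static_cast<std::uint64_t>(~(kA & kB)), "NAND");
static_assert(nor_via_nbsl(kA, kB) == static_cast<std::uint64_t>(~(kA | kB)), "NOR");
```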

Bootstrapped and tested on aarch64-none-linux-gnu.
Ok for trunk?
Thanks,
Kyrill

Signed-off-by: Kyrylo Tkachov 

gcc/

* config/aarch64/aarch64-sve2.md (*aarch64_sve2_unpred_nor):
New define_insn.
(*aarch64_sve2_nand_unpred): Likewise.

gcc/testsuite/

* gcc.target/aarch64/sve2/nbsl_nor_nand_neon.c: New test.



0001-aarch64-Use-SVE2-NBSL-for-vector-NOR-and-NAND-for-Ad.patch
Description: 0001-aarch64-Use-SVE2-NBSL-for-vector-NOR-and-NAND-for-Ad.patch

[PATCH] openmp, fortran: Fix ICE when the procedure name cannot be found in declare variant directives [PR104428]

2025-07-15 Thread Kwok Cheung Yeung

This patch fixes an ICE due to a null-pointer dereference when finding 
the symbol for the procedure name in a declare variant directive, which 
occurs because the result of gfc_find_sym_tree is being dereferenced 
unconditionally. The result is now checked, and the symbol is set to 
NULL if it can't be found, resulting in a subsequent error.
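
The shape of the fix can be sketched outside the compiler as follows. This is only an illustration in C++: Symbol, SymTree, find_sym_tree and lookup_variant_proc are hypothetical stand-ins, not the gfortran data structures or the signature of gfc_find_sym_tree.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for a symbol and its symbol-tree node.
struct Symbol { std::string name; };
struct SymTree { Symbol *sym; };

// Hypothetical lookup: stores a null pointer in *result when the name is unknown.
inline void find_sym_tree(const std::map<std::string, SymTree> &table,
                          const std::string &name, const SymTree **result) {
  auto it = table.find(name);
  *result = (it == table.end()) ? nullptr : &it->second;
}

// Check the lookup result before dereferencing it, so that a missing name
// yields a null symbol (which the caller reports as an error) instead of a crash.
inline const Symbol *lookup_variant_proc(const std::map<std::string, SymTree> &table,
                                         const std::string &name) {
  const SymTree *st = nullptr;
  find_sym_tree(table, name, &st);
  return st ? st->sym : nullptr;
}
```

A caller would then treat a null return the same way the patch does: emit the "Cannot find symbol" diagnostic and skip the directive.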


If the symbol is implicitly typed, then the error will also occur. I 
think this makes sense as implicit variables are of type integer or 
real, which cannot be used to specify the variant procedure.


Tested on x86_64 host, okay for trunk?

Kwok

From 5b76d307bc4328c9ee580cfadccd3225837dce37 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Thu, 10 Jul 2025 17:19:28 +0100
Subject: [PATCH] openmp, fortran: Fix ICE when the procedure name cannot be
 found in declare variant directives [PR104428]

The result of searching for the procedure name symbol should be checked in
case the symbol cannot be found to avoid a null dereference.

gcc/fortran/

PR fortran/104428
* trans-openmp.cc (gfc_trans_omp_declare_variant): Check that proc_st
is non-NULL before dereferencing.  Add line number to error message.

gcc/testsuite/

PR fortran/104428
* gfortran.dg/gomp/pr104428.f90: New.
---
 gcc/fortran/trans-openmp.cc |  5 +++--
 gcc/testsuite/gfortran.dg/gomp/pr104428.f90 | 16 
 2 files changed, 19 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/pr104428.f90

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index f3d7cd4ffee..278e91c2c49 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -9714,11 +9714,12 @@ gfc_trans_omp_declare_variant (gfc_namespace *ns, 
gfc_namespace *parent_ns)
{
  gfc_symtree *proc_st;
  gfc_find_sym_tree (variant_proc_name, gfc_current_ns, 1, &proc_st);
- variant_proc_sym = proc_st->n.sym;
+ variant_proc_sym = proc_st ? proc_st->n.sym : NULL;
}
   if (variant_proc_sym == NULL)
{
- gfc_error ("Cannot find symbol %qs", variant_proc_name);
+ gfc_error ("Cannot find symbol %qs at %L", variant_proc_name,
+&odv->where);
  continue;
}
   set_selectors = omp_check_context_selector
diff --git a/gcc/testsuite/gfortran.dg/gomp/pr104428.f90 
b/gcc/testsuite/gfortran.dg/gomp/pr104428.f90
new file mode 100644
index 000..a09e8853a1d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/pr104428.f90
@@ -0,0 +1,16 @@
+! { dg-do compile }
+! { dg-options "-fopenmp" }
+
+program p
+  interface
+subroutine x
+end subroutine x
+  end interface
+contains
+  subroutine foo
+!$omp declare variant(x) match(construct={do})
+  end
+  subroutine bar
+!$omp declare variant(y) match(construct={do}) ! { dg-error "Cannot find symbol .y." }
+  end
+end
-- 
2.43.0

Re: [PATCH 2/2] aarch64: Relax fpm_t assert to allow const_ints [PR120986]

2025-07-15 Thread Kyrylo Tkachov




> On 15 Jul 2025, at 15:01, Alex Coplan  wrote:
> 
> Hi,
> 
> This relaxes an overzealous assert that required the fpm_t argument to
> be in DImode when expanding FP8 intrinsics.  Of course this fails to
> account for modeless const_ints.
> 
> Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk and backport
> to GCC 15?
> 


Ok.
Thanks,
Kyrill

> Thanks,
> Alex
> 
> gcc/ChangeLog:
> 
> PR target/120986
> * config/aarch64/aarch64-sve-builtins.cc
> (function_expander::expand): Relax fpm_t assert to allow
> modeless const_ints.
> 
> gcc/testsuite/ChangeLog:
> 
> PR target/120986
> * gcc.target/aarch64/torture/pr120986-2.c: New test.
> ---
> gcc/config/aarch64/aarch64-sve-builtins.cc| 5 +++--
> gcc/testsuite/gcc.target/aarch64/torture/pr120986-2.c | 7 +++
> 2 files changed, 10 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/torture/pr120986-2.c
> 
> <0002-aarch64-Relax-fpm_t-assert-to-allow-const_ints-PR120.patch>

Re: [PATCH 2/2] libstdc++: Conditionalize LWG 3569 changes to join_view

2025-07-15 Thread Patrick Palka

On Tue, 15 Jul 2025, Tomasz Kaminski wrote:

> On Tue, Jul 15, 2025 at 5:51 AM Patrick Palka  wrote:
>   Tested on x86_64-pc-linux-gnu, does this look OK for trunk only
>   (since it impacts ABI)?
> 
> In theory an Iterator that meets all semantic requirements of the 
> input_iterator
> concept, could provide a default constructor that is unconstrained, but 
> ill-formed
> when invoked. This can be easily done accidentally, by having a default 
> member initializer.
> 
> #include <concepts>
> 
> struct NoDefault
> { NoDefault(int); };
> 
> template<typename T>
> struct Iterator {
>   T x = T();
> };
> 
> static_assert(std::default_initializable<Iterator<NoDefault>>);
> 
> Default member initializers are not in immediate context, and checking 
> "std::default_initializable"
> is ill-formed. clang emits error here: https://godbolt.org/z/EafKn6h16
> 
> You can however, do this optimization for forward_iterator. The difference 
> here is that user-defined
> iterators provides iterator_category/iterator_concept that maps to 
> forward_iterator_tag or stronger,
> so we can check default_initializable.

Good point...  But it seems this is not only an issue in join_view (with
this patch), we already require elsewhere in <ranges> that the default
ctor of an input iterator is properly constrained:

  filter_view, transform_view, elements_view, stride_view, enumerate_view,
  to_input_view (and perhaps iota_view also counts)

And so these views would break similarly if the default ctor is
underconstrained IIUC.  I don't see why we'd want to start caring about
such iterators in join_view, if other fundamental views already don't?

> 
> 
> 
>   -- >8 --
> 
>   LWG 3569 adjusted join_view's iterator to handle adapting
>   non-default-constructible (input) iterators by wrapping the
>   corresponding data member with std::optional, and we followed suit in
>   r13-2649-g7aa80c82ecf3a3.
> 
>   But this wrapping is unnecessary for iterators that are already
>   default-constructible.  Rather than unconditionally using std::optional
>   here, which introduces time/space overhead, this patch conditionalizes
>   our LWG 3569 changes on the iterator in question being
>   non-default-constructible. 
> 
> 
>   libstdc++-v3/ChangeLog:
> 
>           * include/std/ranges (join_view::_Iterator::_M_satisfy):
>           Adjust to handle non-std::optional _M_inner as per before LWG 
> 3569.
>           (join_view::_Iterator::_M_get_inner): New.
>           (join_view::_Iterator::_M_inner): Don't wrap in std::optional if
>           the iterator is already default constructible.  Initialize.
>           (join_view::_Iterator::operator*): Use _M_get_inner instead
>           of *_M_inner.
>           (join_view::_Iterator::operator++): Likewise.
>           (join_view::_Iterator::iter_move): Likewise.
>           (join_view::_Iterator::iter_swap): Likewise.
>   ---
>    libstdc++-v3/include/std/ranges | 49 +
>    1 file changed, 37 insertions(+), 12 deletions(-)
> 
>   diff --git a/libstdc++-v3/include/std/ranges 
> b/libstdc++-v3/include/std/ranges
>   index efe62969d657..799fa7611ce2 100644
>   --- a/libstdc++-v3/include/std/ranges
>   +++ b/libstdc++-v3/include/std/ranges
>   @@ -2971,7 +2971,12 @@ namespace views::__adaptor
>                 }
> 
>               if constexpr (_S_ref_is_glvalue)
>   -             _M_inner.reset();
>   +             {
>   +               if constexpr (default_initializable<_Inner_iter>)
>   +                 _M_inner = _Inner_iter();
>   +               else
>   +                 _M_inner.reset();
>   +             }
>             }
> 
>             static constexpr auto
>   @@ -3011,6 +3016,24 @@ namespace views::__adaptor
>                 return *_M_parent->_M_outer;
>             }
> 
>   +         constexpr _Inner_iter&
>   +         _M_get_inner()
>   +         {
>   +           if constexpr (default_initializable<_Inner_iter>)
>   +             return _M_inner;
>   +           else
>   +             return *_M_inner;
>   +         }
>   +
>   +         constexpr const _Inner_iter&
>   +         _M_get_inner() const
>   +         {
>   +           if constexpr (default_initializable<_Inner_iter>)
>   +             return _M_inner;
>   +           else
>   +             return *_M_inner;
>   +         }
>   +
>             constexpr
>             _Iterator(_Parent* __parent, _Outer_iter __outer) requires 
> forward_range<_Base>
>               : _M_outer(std::move(__outer)), _M_parent(__parent)
>   @@ -3024,7 +3047,9 @@ namespace views::__adaptor
>             [[no_unique_address]]
>               __detail::__maybe_present_t<forward_range<_Base>, _Outer_iter> _M_outer
>                 = decltype(_M_outer)();
>   -         optional<_Inner_iter> _M_inner;

Re: [PATCH] arm: avoid gcc_s dependency

2025-07-15 Thread Pierre Ossman


On 14/07/2025 15:26, Kyrylo Tkachov wrote:

+/* ARM EABI dummy unwinding routines.
+   Copyright 2014 Pierre Ossman for Cendio AB
+
+   This file is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by the
+   Free Software Foundation; either version 3, or (at your option) any
+   later version.

IANAL but for one the date should include at least the current year.


The code has been untouched since then, but I can update it if you like.


Also, I believe you need to either ensure you have the paperwork sorted to 
contribute the copyright to the FSF (see the copyright notice in other files in 
that directory) or otherwise contribute under the DCO as per:
https://gcc.gnu.org/dco.html

I cannot recommend one option or the other, that’s something for you (and your 
employer?) to agree on.



We have already set up copyright assignment to the FSF for previous 
patches. I assume that is still in effect. Is there some specific 
reference needed?


Regards,
--
Pierre Ossman   Software Development
Cendio AB   https://cendio.com
Teknikringen 8  https://twitter.com/ThinLinc
583 30 Linköping  https://facebook.com/ThinLinc
Phone: +46-13-214600

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Re: [PATCH] aarch64: Use SVE2 NBSL for vector NOR and NAND for Advanced SIMD modes

2025-07-15 Thread Richard Sandiford

Kyrylo Tkachov  writes:
> From 930789b3c366777c49d4eb2f4dc84b0374601504 Mon Sep 17 00:00:00 2001
> From: Kyrylo Tkachov 
> Date: Fri, 11 Jul 2025 02:50:32 -0700
> Subject: [PATCH 1/2] aarch64: Use SVE2 NBSL for vector NOR and NAND for
>  Advanced SIMD modes
>
> We already have patterns to use the NBSL instruction to implement vector
> NOR and NAND operations for SVE types and modes.  It is straightforward to
> have similar patterns for the fixed-width Advanced SIMD modes as well, though
> it requires combine patterns without the predicate operand and an explicit 'Z'
> output modifier.  This patch does so.
>
> So now for example we generate for:
>
> uint64x2_t nand_q(uint64x2_t a, uint64x2_t b) { return NAND(a, b); }
> uint64x2_t nor_q(uint64x2_t a, uint64x2_t b) { return NOR(a, b); }
>
> nand_q:
> nbsl z0.d, z0.d, z1.d, z1.d
> ret
>
> nor_q:
> nbsl z0.d, z0.d, z1.d, z0.d
> ret
>
> instead of the previous:
> nand_q:
> and v0.16b, v0.16b, v1.16b
> not v0.16b, v0.16b
> ret
>
> nor_q:
> orr v0.16b, v0.16b, v1.16b
> not v0.16b, v0.16b
> ret
>
> The tied operand requirements for NBSL mean that we can generate the MOVPRFX
> when the operands fall that way, but I guess having a 2-insn MOVPRFX form is
> not worse than the current 2-insn codegen at least, and the MOVPRFX can be
> fused by many cores.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Signed-off-by: Kyrylo Tkachov 
>
> gcc/
>
>   * config/aarch64/aarch64-sve2.md (*aarch64_sve2_unpred_nor):
>   New define_insn.
>   (*aarch64_sve2_nand_unpred): Likewise.
>
> gcc/testsuite/
>
>   * gcc.target/aarch64/sve2/nbsl_nor_nand_neon.c: New test.

OK, thanks.

Richard

> ---
>  gcc/config/aarch64/aarch64-sve2.md| 29 
>  .../aarch64/sve2/nbsl_nor_nand_neon.c | 68 +++
>  2 files changed, 97 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/nbsl_nor_nand_neon.c
>
> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
> b/gcc/config/aarch64/aarch64-sve2.md
> index 233a9b51c25..6d6dc94cd81 100644
> --- a/gcc/config/aarch64/aarch64-sve2.md
> +++ b/gcc/config/aarch64/aarch64-sve2.md
> @@ -1645,6 +1645,20 @@
>}
>  )
>  
> +(define_insn "*aarch64_sve2_unpred_nor"
> +  [(set (match_operand:VDQ_I 0 "register_operand")
> + (and:VDQ_I
> +   (not:VDQ_I
> + (match_operand:VDQ_I 1 "register_operand"))
> +   (not:VDQ_I
> + (match_operand:VDQ_I 2 "register_operand"))))]
> +  "TARGET_SVE2"
> +  {@ [ cons: =0 , %1 , 2 ; attrs: movprfx ]
> + [ w, 0  , w ; *  ] nbsl\t%Z0.d, %Z0.d, %Z2.d, %Z0.d
> + [ ?&w  , w  , w ; yes] movprfx\t%Z0, %Z1\;nbsl\t%Z0.d, %Z0.d, %Z2.d, %Z1.d
> +  }
> +)
> +
>  ;; Use NBSL for vector NAND.
>  (define_insn_and_rewrite "*aarch64_sve2_nand"
>[(set (match_operand:SVE_FULL_I 0 "register_operand")
> @@ -1667,6 +1681,21 @@
>}
>  )
>  
> +;; Same as above but unpredicated and including Advanced SIMD modes.
> +(define_insn "*aarch64_sve2_nand_unpred"
> +  [(set (match_operand:VDQ_I 0 "register_operand")
> + (ior:VDQ_I
> +   (not:VDQ_I
> + (match_operand:VDQ_I 1 "register_operand"))
> +   (not:VDQ_I
> + (match_operand:VDQ_I 2 "register_operand"))))]
> +  "TARGET_SVE2"
> +  {@ [ cons: =0 , %1 , 2 ; attrs: movprfx ]
> + [ w, 0  , w ; *  ] nbsl\t%Z0.d, %Z0.d, %Z2.d, %Z2.d
> + [ ?&w  , w  , w ; yes] movprfx\t%Z0, %Z1\;nbsl\t%Z0.d, %Z0.d, %Z2.d, %Z2.d
> +  }
> +)
> +
>  ;; Unpredicated bitwise select.
>  ;; (op3 ? bsl_mov : bsl_dup) == (((bsl_mov ^ bsl_dup) & op3) ^ bsl_dup)
>  (define_expand "@aarch64_sve2_bsl"
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/nbsl_nor_nand_neon.c 
> b/gcc/testsuite/gcc.target/aarch64/sve2/nbsl_nor_nand_neon.c
> new file mode 100644
> index 000..09bfc194f88
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/nbsl_nor_nand_neon.c
> @@ -0,0 +1,68 @@
> +/* { dg-options "-O2" } */
> +/* { dg-final { check-function-bodies "**" "" "" } } */
> +
> +#include 
> +
> +#define NAND(x, y)  (~((x) & (y)))
> +#define NOR(x, y)   (~((x) | (y)))
> +
> +/*
> +** nand_d:
> +**   nbsl z0.d, z0.d, z1.d, z1.d
> +**   ret
> +*/
> +uint32x2_t nand_d(uint32x2_t a, uint32x2_t b) { return NAND(a, b); }
> +
> +/*
> +** nand_d_mp:
> +**   movprfx z0, z1
> +**   nbsl z0.d, z0.d, z2.d, z2.d
> +**   ret
> +*/
> +uint32x2_t nand_d_mp(uint32x2_t c, uint32x2_t a, uint32x2_t b) { return NAND(a, b); }
> +
> +/*
> +** nor_d:
> +**   nbsl z0.d, z0.d, z1.d, z0.d
> +**   ret
> +*/
> +uint32x2_t nor_d(uint32x2_t a, uint32x2_t b) { return NOR(a, b); }
> +
> +/*
> +** nor_d_mp:
> +**   movprfx z0, z1
> +**   nbsl z0.d, z0.d, z2.d, z1.d
> +**   ret
> +*/
> +uint32x2_t nor_d_mp(uint32x2_t c, uint32x2_t a, uint32x2_t b) { return NOR(a, b); }
> +
> +/*
> +** nand_q:
> +**   nbsl z0.d, z0.d, z1.d,

[PATCH v3 1/2] AArch64: precommit test for masked load vectorisation.

2025-07-15 Thread Karl Meakin

Commit the test file `mask_load_2.c` before the vectorisation analysis
is changed, so that the changes in codegen are more obvious in the next
commit.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/mask_load_2.c: New test.
---
 .../gcc.target/aarch64/sve/mask_load_2.c  | 23 +++
 1 file changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/mask_load_2.c

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/mask_load_2.c 
b/gcc/testsuite/gcc.target/aarch64/sve/mask_load_2.c
new file mode 100644
index 000..38fcf4f7206
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/mask_load_2.c
@@ -0,0 +1,23 @@
+// { dg-do compile }
+// { dg-options "-march=armv8-a+sve -msve-vector-bits=128 -O3" }
+
+typedef struct Array {
+int elems[3];
+} Array;
+
+int loop(Array **pp, int len, int idx) {
+int nRet = 0;
+
+#pragma GCC unroll 0
+for (int i = 0; i < len; i++) {
+Array *p = pp[i];
+if (p) {
+nRet += p->elems[idx];
+}
+}
+
+return nRet;
+}
+
+// { dg-final { scan-assembler-times {ld1w\tz[0-9]+\.d, p[0-7]/z} 0 } }
+// { dg-final { scan-assembler-times {add\tz[0-9]+\.s, p[0-7]/m}  0 } }
-- 
2.48.1

[PATCH v3 2/2] middle-end: Enable masked load with non-constant offset

2025-07-15 Thread Karl Meakin

The function `vect_check_gather_scatter` requires the `base` of the load
to be loop-invariant and the `off`set to be not loop-invariant. When faced
with a scenario where `base` is not loop-invariant, instead of giving up
immediately we can try swapping the `base` and `off`, if `off` is
actually loop-invariant.

Previously, it would only swap if `off` was the constant zero (and so
trivially loop-invariant). This is too conservative: we can still
perform the swap if `off` is a more complex but still loop-invariant
expression, such as a variable defined outside of the loop.
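
The decision described above can be sketched in isolation like this. It is a toy model in C++: Expr and normalize_base_and_offset are hypothetical, and a boolean flag stands in for what expr_invariant_in_loop_p computes on real trees.

```cpp
#include <cassert>
#include <string_view>
#include <utility>

// Hypothetical stand-in for a tree expression: a label plus a flag saying
// whether the expression is invariant in the loop being vectorised.
struct Expr {
  std::string_view name;
  bool loop_invariant;
};

// Make sure BASE is the loop-invariant operand.  If BASE varies in the loop
// but OFF does not, swap them; if both vary, give up (return false).
inline bool normalize_base_and_offset(Expr &base, Expr &off) {
  if (!base.loop_invariant) {
    if (!off.loop_invariant)
      return false;
    std::swap(base, off);
  }
  return true;
}
```

For the example loop in this mail, `p` varies per iteration while `idx` is defined outside the loop, so the two would be swapped and `idx` would end up as the invariant base.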

This allows loops like the function below to be vectorised, if the
target has masked loads and sufficiently large vector registers (e.g.
`-march=armv8-a+sve -msve-vector-bits=128`):

```c
typedef struct Array {
int elems[3];
} Array;

int loop(Array **pp, int len, int idx) {
int nRet = 0;

for (int i = 0; i < len; i++) {
Array *p = pp[i];
if (p) {
nRet += p->elems[idx];
}
}

return nRet;
}
```

gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_check_gather_scatter): Swap
`base` and `off` in more scenarios. Also assert at the end of
the function that `base` and `off` are loop-invariant and not
loop-invariant respectively.

gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/mask_load_2.c: Update tests.
---
 .../gcc.target/aarch64/sve/mask_load_2.c  |  4 +--
 gcc/tree-vect-data-refs.cc| 26 ---
 2 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/mask_load_2.c 
b/gcc/testsuite/gcc.target/aarch64/sve/mask_load_2.c
index 38fcf4f7206..66d95101a14 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/mask_load_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/mask_load_2.c
@@ -19,5 +19,5 @@ int loop(Array **pp, int len, int idx) {
 return nRet;
 }
 
-// { dg-final { scan-assembler-times {ld1w\tz[0-9]+\.d, p[0-7]/z} 0 } }
-// { dg-final { scan-assembler-times {add\tz[0-9]+\.s, p[0-7]/m}  0 } }
+// { dg-final { scan-assembler-times {ld1w\tz[0-9]+\.d, p[0-7]/z} 1 } }
+// { dg-final { scan-assembler-times {add\tz[0-9]+\.s, p[0-7]/m}  1 } }
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index c84cd29116e..ef03a9fb178 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -4667,26 +4667,19 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, 
loop_vec_info loop_vinfo,
   if (off == NULL_TREE)
 off = size_zero_node;
 
-  /* If base is not loop invariant, either off is 0, then we start with just
- the constant offset in the loop invariant BASE and continue with base
- as OFF, otherwise give up.
- We could handle that case by gimplifying the addition of base + off
- into some SSA_NAME and use that as off, but for now punt.  */
+  /* BASE must be loop invariant.  If it is not invariant, but OFF is, then we
+   * can fix that by swapping BASE and OFF.  */
   if (!expr_invariant_in_loop_p (loop, base))
 {
-  if (!integer_zerop (off))
+  if (!expr_invariant_in_loop_p (loop, off))
return false;
-  off = base;
-  base = size_int (pbytepos);
-}
-  /* Otherwise put base + constant offset into the loop invariant BASE
- and continue with OFF.  */
-  else
-{
-  base = fold_convert (sizetype, base);
-  base = size_binop (PLUS_EXPR, base, size_int (pbytepos));
+
+  std::swap (base, off);
 }
 
+  base = fold_convert (sizetype, base);
+  base = size_binop (PLUS_EXPR, base, size_int (pbytepos));
+
   /* OFF at this point may be either a SSA_NAME or some tree expression
  from get_inner_reference.  Try to peel off loop invariants from it
  into BASE as long as possible.  */
@@ -4864,6 +4857,9 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, 
loop_vec_info loop_vinfo,
   offset_vectype = NULL_TREE;
 }
 
+  gcc_checking_assert (expr_invariant_in_loop_p (loop, base));
+  gcc_checking_assert (!expr_invariant_in_loop_p (loop, off));
+
   info->ifn = ifn;
   info->decl = decl;
   info->base = base;
-- 
2.48.1

Re: [PATCH] c, c++: Extend -Wunused-but-set-* warnings [PR44677]

2025-07-15 Thread Andrew Pinski

On Tue, Jul 15, 2025 at 6:06 AM Jakub Jelinek  wrote:
>
> On Tue, Jul 15, 2025 at 08:21:50AM -0400, Jason Merrill wrote:
> > Given the above that seems rather unlikely, but I suppose it's fine if you
> > want to do it that way.  The patch is OK either way.
>
> Committed just the v2 patch.  I can test your patch next with other patches,
> or do you want to test/commit it yourself?

Just an FYI, this broke the aarch64 build, since libgcc is built with
-Werror there and we now get warnings in the libbid code:
```
../../../gcc/libgcc/config/libbid/bid_binarydecimal.c: In function
‘__binary32_to_bid128’:
../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:130:31: error:
variable ‘c3’ set but not used [-Werror=unused-but-set-variable=]
  130 | { unsigned long long c0,c1,c2,c3;   \
  |   ^~
../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:146842:5: note:
in expansion of macro ‘__mul_10x256_to_256’
146842 | __mul_10x256_to_256 (z.w[5], z.w[4], z.w[3], z.w[2],
z.w[5], z.w[4],
   | ^~~
```
https://builder.sourceware.org/buildbot/#/builders/311/builds/1483

I will see if there is anything simple to be done in the next few hours.

Thanks,
Andrew


>
> Jakub
>

[pushed] c++: constexpr uninitialized union [PR120577]

2025-07-15 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

This was failing for two reasons:

1) We were wrongly treating the basic_string constructor as
zero-initializing the object, which it doesn't.
2) Given that, when we went to look for a value for the anonymous union,
we concluded that it was value-initialized, and trying to evaluate that
broke because we weren't setting ctx->ctor for it.

This patch fixes both issues, #1 by setting CONSTRUCTOR_NO_CLEARING and #2
by inserting a new CONSTRUCTOR for the member rather than evaluate it out of
context, which is consistent with cxx_eval_store_expression.

PR c++/120577

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Set
CONSTRUCTOR_NO_CLEARING on initial value for ctor.
(cxx_eval_component_reference): Make value-initialization
of an aggregate member explicit.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-union9.C: New test.
---
 gcc/cp/constexpr.cc   | 18 --
 gcc/testsuite/g++.dg/cpp2a/constexpr-union9.C | 33 +++
 2 files changed, 48 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-union9.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index eb19784dbba..ee06858f715 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -4201,10 +4201,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, 
tree t,
  && TREE_CODE (new_obj) == COMPONENT_REF
  && TREE_CODE (TREE_TYPE (TREE_OPERAND (new_obj, 0))) == 
UNION_TYPE)
{
+ tree ctor = build_constructor (TREE_TYPE (new_obj), NULL);
+ CONSTRUCTOR_NO_CLEARING (ctor) = true;
  tree activate = build2 (INIT_EXPR, TREE_TYPE (new_obj),
- new_obj,
- build_constructor (TREE_TYPE (new_obj),
-NULL));
+ new_obj, ctor);
  cxx_eval_constant_expression (ctx, activate,
lval, non_constant_p, overflow_p,
jump_target);
@@ -5793,6 +5793,18 @@ cxx_eval_component_reference (const constexpr_ctx *ctx, 
tree t,
 }
 
   /* If there's no explicit init for this field, it's value-initialized.  */
+
+  if (AGGREGATE_TYPE_P (TREE_TYPE (t)))
+{
+  /* As in cxx_eval_store_expression, insert an empty CONSTRUCTOR
+and copy the flags.  */
+  constructor_elt *e = get_or_insert_ctor_field (whole, part);
+  e->value = value = build_constructor (TREE_TYPE (part), NULL);
+  CONSTRUCTOR_ZERO_PADDING_BITS (value)
+   = CONSTRUCTOR_ZERO_PADDING_BITS (whole);
+  return value;
+}
+
   value = build_value_init (TREE_TYPE (t), tf_warning_or_error);
   return cxx_eval_constant_expression (ctx, value,
   lval,
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-union9.C 
b/gcc/testsuite/g++.dg/cpp2a/constexpr-union9.C
new file mode 100644
index 000..7db1030ab20
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-union9.C
@@ -0,0 +1,33 @@
+// PR c++/120577
+// { dg-do compile { target c++20 } }
+
+template <typename _Tp> struct optional {
+  union {
+_Tp __val_;
+  };
+  template <typename... _Args>
+  constexpr optional(_Args... __args)
+  : __val_(__args...) {}
+};
+template <typename _Tp, typename... _Args>
+constexpr optional<_Tp> make_optional(_Args... __args) {
+  return optional<_Tp>(__args...);
+}
+
+struct __non_trivial_if {
+  constexpr __non_trivial_if() {}
+};
+struct allocator : __non_trivial_if {};
+struct __padding {};
+struct __short {
+  [[__no_unique_address__]] __padding __padding_;
+  int __data_;
+};
+struct basic_string {
+  union {
+__short __s;
+  };
+  [[__no_unique_address__]] allocator __alloc_;
+  constexpr basic_string(int, int) {}
+};
+auto opt = make_optional<basic_string>(4, 'X');

base-commit: 9956dc37cb3d71c1fc7a89b45cc291645c11817b
-- 
2.50.0

Re: [PATCH 2/2] libstdc++: Conditionalize LWG 3569 changes to join_view

2025-07-15 Thread Patrick Palka

On Tue, 15 Jul 2025, Patrick Palka wrote:

> On Tue, 15 Jul 2025, Tomasz Kaminski wrote:
> 
> > 
> > 
> > On Tue, Jul 15, 2025 at 3:55 PM Patrick Palka  wrote:
> >   On Tue, 15 Jul 2025, Tomasz Kaminski wrote:
> > 
> >   > On Tue, Jul 15, 2025 at 5:51 AM Patrick Palka  
> > wrote:
> >   >       Tested on x86_64-pc-linux-gnu, does this look OK for trunk 
> > only
> >   >       (since it impacts ABI)?
> >   >
> >   > In theory an Iterator that meets all semantic requirements of the 
> > input_iterator
> >   > concept, could provide a default constructor that is unconstrained, 
> > but ill-formed
> >   > when invoked. This can be easily done accidentally, by having a 
> > default member initializer.
> >   >
> >   > #include <concepts>
> >   >
> >   > struct NoDefault
> >   > { NoDefault(int); };
> >   >
> >   > template<typename T>
> >   > struct Iterator {
> >   >   T x = T();
> >   > };
> >   >
> >   > static_assert(std::default_initializable<Iterator<NoDefault>>);
> >   >
> >   > Default member initializers are not in immediate context, and 
> > checking "std::default_initializable"
> >   > is ill-formed. clang emits error here: 
> > https://godbolt.org/z/EafKn6h16
> >   >
> >   > You can however, do this optimization for forward_iterator. The 
> > difference here is that user-defined
> >   > iterators provides iterator_category/iterator_concept that maps to 
> > forward_iterator_tag or stronger,
> >   > so we can check default_initializable.
> > 
> >   Good point...  But it seems this is not only an issue in join_view 
> > (with
> >   this patch), we already require elsewhere in <ranges> that the default
> >   ctor of an input iterator is properly constrained:
> > 
> >     filter_view, transform_view, elements_view, stride_view, 
> > enumerate_view,
> >     to_input_view (and perhaps iota_view also counts)
> > 
> > Could you please elaborate? I understand that for this view, they default 
> > constructor
> > requires that iterator is default_initializable, but if you are never 
> > calling this constructor,
> > the view will function correctly for my not-properly constrained iterator.
> >
> > However, for this case the standard does not impose default_initializable 
> > requirement
> > for iterator, when we are incrementing it.
> 
> I might be confused, is supporting such underconstrained iterators
> a QoI issue or a correctness issue?
> 
> If it's QoI, note that before P2325R3 (approved June 2021) even C++20
> input iterators were required to satisfy default_initializable, so I
> reckon it's quite rare to see a C++20 input iterator written after that
> with an underconstrained default ctor.  I'm not sure supporting them is
> worth the tradeoff of pessimizing join_view for input-only iterators.

... worth the tradeoff of pessimizing join_view for properly constrained
default initializable input-only iterators rather.  That's a bit of a
corner case that I guess we shouldn't care much about either, so I don't
feel strongly about using default_initializable vs forward_iterator even
if this is a QoI issue, and am happy to use forward_iterator as you
prefer :)

> 
> If it's a correctness issue, I definitely agree with your changes :)
> 
> >  
> > 
> >   And so these views would break similarly if the default ctor is
> >   underconstrained IIUC.  I don't see why we'd want to start caring 
> > about
> >   such iterators in join_view, if other fundamental views already don't?
> > 
> >   >
> >   >
> >   >
> >   >       -- >8 --
> >   >
> >   >       LWG 3569 adjusted join_view's iterator to handle adapting
> >   >       non-default-constructible (input) iterators by wrapping the
> >   >       corresponding data member with std::optional, and we followed 
> > suit in
> >   >       r13-2649-g7aa80c82ecf3a3.
> >   >
> >   >       But this wrapping is unnecessary for iterators that are 
> > already
> >   >       default-constructible.  Rather than unconditionally using 
> > std::optional
> >   >       here, which introduces time/space overhead, this patch 
> > conditionalizes
> >   >       our LWG 3569 changes on the iterator in question being
> >   >       non-default-constructible. 
> >   >
> >   >
> >   >       libstdc++-v3/ChangeLog:
> >   >
> >   >               * include/std/ranges 
> > (join_view::_Iterator::_M_satisfy):
> >   >               Adjust to handle non-std::optional _M_inner as per 
> > before LWG 3569.
> >   >               (join_view::_Iterator::_M_get_inner): New.
> >   >               (join_view::_Iterator::_M_inner): Don't wrap in 
> > std::optional if
> >   >               the iterator is already default constructible.  
> > Initialize.
> >   >               (join_view::_Iterator::operator*): Use _M_get_inner 
> > instead
> >   >               of *_M_inner.
> >   >               (join_view::_Iterator::

Re: [PATCH] c, c++: Extend -Wunused-but-set-* warnings [PR44677]

2025-07-15 Thread Jakub Jelinek

On Tue, Jul 15, 2025 at 08:36:46AM -0700, Andrew Pinski wrote:
> On Tue, Jul 15, 2025 at 6:06 AM Jakub Jelinek  wrote:
> >
> > On Tue, Jul 15, 2025 at 08:21:50AM -0400, Jason Merrill wrote:
> > > Given the above that seems rather unlikely, but I suppose it's fine if you
> > > want to do it that way.  The patch is OK either way.
> >
> > Committed just the v2 patch.  I can test your patch next with other patches,
> > or do you want to test/commit it yourself?
> 
> Just an FYI, this broke aarch64 build since libgcc has -Werror on
> there and we get warnings now in libbid code:
> ```
> ../../../gcc/libgcc/config/libbid/bid_binarydecimal.c: In function
> ‘__binary32_to_bid128’:
> ../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:130:31: error:
> variable ‘c3’ set but not used [-Werror=unused-but-set-variable=]
>   130 | { unsigned long long c0,c1,c2,c3;   \
>   |   ^~
> ../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:146842:5: note:
> in expansion of macro ‘__mul_10x256_to_256’
> 146842 | __mul_10x256_to_256 (z.w[5], z.w[4], z.w[3], z.w[2],
> z.w[5], z.w[4],
>| ^~~
> ```
> https://builder.sourceware.org/buildbot/#/builders/311/builds/1483
> 
> I will see if there is anything simple to be done in the next few hours.

#define __mul_10x64(sum,carryout,input,carryin) \
{ unsigned long long s3 = (input) + ((input) >> 2); \
  (carryout) = ((s3 < (unsigned long long)(input))<<3) + (s3>>61);  \
  s3 = (s3<<3) + ((input&3)<<1);\
  (sum) = s3 + (carryin);   \
  if ((unsigned long long)(sum) < s3) ++(carryout); \
}

// Multiply a 256-bit number by 10, assuming no overflow

#define __mul_10x256_to_256(p3,p2,p1,p0,a3,a2,a1,a0)\
{ unsigned long long c0,c1,c2,c3;   \
  __mul_10x64(p0,c0,a0,0ull);   \
  __mul_10x64(p1,c1,a1,c0); \
  __mul_10x64(p2,c2,a2,c1); \
  __mul_10x64(p3,c3,a3,c2); \
}

So yeah, it is clear why -Wunused-but-set-variable=1 doesn't
warn about it: there is a useless assignment to c3 and then a conditional
++c3.  And it is clear why it does warn now.
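
To make the shape of the problem concrete, here is a minimal sketch using a
hypothetical two-limb addition rather than the libbid macros.  ADD_LIMB,
ADD_128 and add_128 are illustrative names only, nothing in libgcc.  The
final carry is assigned and conditionally incremented but never read, and
the (void) cast is one way of marking it as used:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-limb add with carry in and carry out.  */
#define ADD_LIMB(sum, carryout, a, b, carryin)          \
  { uint64_t t_ = (uint64_t) (a) + (uint64_t) (b);      \
    (carryout) = t_ < (uint64_t) (a);                   \
    (sum) = t_ + (carryin);                             \
    if ((uint64_t) (sum) < t_) ++(carryout);            \
  }

/* Two-limb add assuming no overflow.  c1 is assigned and then
   conditionally incremented, but its value is never read.  */
#define ADD_128(p1, p0, a1, a0, b1, b0)                 \
  { uint64_t c0, c1;                                    \
    ADD_LIMB (p0, c0, a0, b0, 0ull);                    \
    ADD_LIMB (p1, c1, a1, b1, c0);                      \
    (void) c1;  /* mark the final carry as used */      \
  }

/* out[0] is the low limb, out[1] the high limb.  */
static void
add_128 (uint64_t out[2], const uint64_t a[2], const uint64_t b[2])
{
  assert (out != 0);
  ADD_128 (out[1], out[0], a[1], a[0], b[1], b[0]);
}
```

Dropping c1 altogether would need a separate final-step macro without a
carry out, which is the trade-off mentioned for c3.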

Not having c3 would mean having a special macro for the final step,
so I think it is better to just add a (void) cast:

2025-07-15  Jakub Jelinek  

* bid_binarydecimal.c (__mul_10x256_to_256): Quiet
-Wunused-but-set-variable=3 warning.

--- libgcc/config/libbid/bid_binarydecimal.c.jj 2025-04-08 14:09:53.599413906 
+0200
+++ libgcc/config/libbid/bid_binarydecimal.c2025-07-15 18:02:15.494219552 
+0200
@@ -132,6 +132,7 @@ UINT64 CY;
   __mul_10x64(p1,c1,a1,c0); \
   __mul_10x64(p2,c2,a2,c1); \
   __mul_10x64(p3,c3,a3,c2); \
+  (void)c3;\
 }
 
 // Set up indices for low and high parts, depending on the endian-ness.


Untested...

Jakub

Re: [PATCH] libgcc: Fix aarch64 build

2025-07-15 Thread Jakub Jelinek

On Tue, Jul 15, 2025 at 09:02:09AM -0700, Andrew Pinski wrote:
> For aarch64, libgcc is built with -Werror, after the latest
> -Wunused-but-set* commit (r16-2258-g0eac9cfee8cb0b21d), a new warning
> showed up:
> ```
> ../../../gcc/libgcc/config/libbid/bid_binarydecimal.c: In function
> ‘__binary32_to_bid128’:
> ../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:130:31: error:
> variable ‘c3’ set but not used [-Werror=unused-but-set-variable=]
>   130 | { unsigned long long c0,c1,c2,c3;   \
>   |   ^~
> ../../../gcc/libgcc/config/libbid/bid_binarydecimal.c:146842:5: note:
> in expansion of macro ‘__mul_10x256_to_256’
> 146842 | __mul_10x256_to_256 (z.w[5], z.w[4], z.w[3], z.w[2],
> z.w[5], z.w[4],
>| ^~~
> ```
> 
> This fixes it by casting c3 to void after the last __mul_10x64 in
> __mul_10x256_to_256 macro to mark it as being "used".
> 
> 
> OK?
> 
> libgcc/config/libbid/ChangeLog:
> 
>   * bid_binarydecimal.c (__mul_10x256_to_256): Mark c3 as being
>   used.
> 
> Signed-off-by: Andrew Pinski 

Ok (just posted the same patch ;) ).

Jakub

[PATCH v3 0/2] middle-end: Enable masked load with non-constant offset

2025-07-15 Thread Karl Meakin

Resending this patch series because it seems the v2 patch didn't go through
properly (commit v2 1/2 is missing on sourceware). No changes from v2.

I also forgot to mention I do not have commit rights to the repo yet. So I will
need someone else to merge this for me. Thanks

Changelog:
* v3: No changes.
* v2: Make assertions in `vect_check_gather_scatter` into checking assertions.
* v1: Initial patch

Karl Meakin (2):
  AArch64: precommit test for masked load vectorisation.
  middle-end: Enable masked load with non-constant offset

 .../gcc.target/aarch64/sve/mask_load_2.c  | 23 
 gcc/tree-vect-data-refs.cc| 26 ---
 2 files changed, 34 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/mask_load_2.c

--
2.48.1

Re: [PATCH v5] RISC-V: Mips P8700 Conditional Move Support.

2025-07-15 Thread Jeff Law





On 7/14/25 11:34 PM, Umesh Kalappa wrote:

Updated the test for rv32 accordingly and no regressions were found for runs like
"runtest --tool gcc 
--target_board='riscv-sim/-march=rv32gc_zba_zbb_zbc_zbs/-mabi=ilp32d/-mcmodel=medlow' 
riscv.exp" and
"runtest --tool gcc 
--target_board='riscv-sim/-march=rv64gc_zba_zbb_zbc_zbs/-mabi=lp64d/-mcmodel=medlow' 
riscv.exp"

lint warnings can be ignored for riscv-cores.def and riscv-ext-mips.def

gcc/ChangeLog:

 *config/riscv/riscv-cores.def(RISCV_CORE): Updated the supported march.
 *config/riscv/riscv-ext-mips.def(DEFINE_RISCV_EXT):
 New file added for mips conditional mov extension.
 *config/riscv/riscv-ext.def: Likewise.
 *config/riscv/t-riscv: Generates riscv-ext.opt
 *config/riscv/riscv-ext.opt: Generated file.
 *config/riscv/riscv.cc(riscv_expand_conditional_move): Updated for 
mips cmov
 and outlined some code that handle arch cond move.
 *config/riscv/riscv.md(movcc): updated expand for MIPS CCMOV.
 *config/riscv/mips-insn.md: New file for mips-p8700 ccmov insn.
 *gcc/doc/riscv-ext.texi: Updated for mips cmov.

gcc/testsuite/ChangeLog:

 *testsuite/gcc.target/riscv/mipscondmov.c: Test file for mips.ccmov 
insn.
---




diff --git a/gcc/config/riscv/mips-insn.md b/gcc/config/riscv/mips-insn.md
new file mode 100644
index 000..de53638d587
--- /dev/null
+++ b/gcc/config/riscv/mips-insn.md
@@ -0,0 +1,36 @@
+;; Machine description for MIPS custom instructions.
+;; Copyright (C) 2025 Free Software Foundation, Inc.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; .
+
+(define_insn "*movcc_bitmanip"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+   (if_then_else:GPR
+ (any_eq:X (match_operand:X 1 "register_operand" "r")
+(match_operand:X 2 "const_0_operand" "J"))
+(match_operand:GPR 3 "reg_or_0_operand" "rJ")
+(match_operand:GPR 4 "reg_or_0_operand" "rJ")))]
This was misformatted.  The "any_eq" should either be on the same line 
as the if_then_else or indented relative to the if_then_else.  The final 
selection of the location of the any_eq expression will influence how 
operands 2, 3 and 4 get formatted.  Essentially operand 2 would line up 
with operand 1 (since they're both operands of the any_eq).  Operands 3 
and 4 would line up with the any_eq since, like the any_eq, they are 
operands of the if_then_else.  I've fixed this too.







@@ -4897,3 +4897,4 @@
  (include "sifive-p600.md")
  (include "generic-vector-ooo.md")
  (include "generic-ooo.md")
+(include "mips-insn.md")
Minor adjustment needed here.  Kito recently clarified that we have 
groups of includes.  Some are ratified RVI extensions, there's another 
group for vector extensions and a final group for scheduling models.


Your patch puts the mips-insn.md into the group for scheduling models. 
It needs to move into the vector extension group.  I've taken care of that.



I fixed various nits in the ChangeLog to make the pre-commit hooks happy 
and pushed this to the trunk.


jeff

Re: [Patch, fortran] PR121060 - ICE when argument is associate name created from type-bound operator result

2025-07-15 Thread Jerry Delisle

OK Paul, thanks. We probably ought to backport it to 15.

On Tue, Jul 15, 2025, 9:22 AM Paul Richard Thomas <
paul.richard.tho...@gmail.com> wrote:

> At first this PR looked as if it was going to be one of those horrible
> typebound procedure/operator tangles. However, it turned out that this was
> entirely down to the attempt  to extract the specific procedure for the
> ASSOCIATE name during parsing. Returning NULL before expression resolution
> fixed the problem.
>
> Regtests fine - OK for mainline? I am inclined to backport, since the fix
> is both simple and, at worst, defers the ICE. OK?
>
> Paul
>
>

ICE with new IMPORT feature

2025-07-15 Thread Paul Richard Thomas

Dear All,

The failure that Steve mentioned below is fixed. ChangeLogs and patch are
self-describing.

Regtests fine - OK for mainline?

Paul

On Sat, 12 Jul 2025 at 19:57, Steve Kargl 
wrote:

> All, Paul,
>
> In testing Paul's recent addition of support for IMPORT,
> I have uncovered an ICE due to mangled source code.  The
> code leads to a NULL pointer dereference.  The patch that
> follows my .sig fixes the issue.  Note the testcase has one
> FAIL.
>
> % gmake check-fortran RUNTESTFLAGS="dg.exp=import13.f90"
>   ...
> Running /home/kargl/gcc/gcc/gcc/testsuite/gfortran.dg/dg.exp ...
> FAIL: gfortran.dg/import13.f90   -O  (test for excess errors)
> ...
>
>
diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index 111ebc5f845..af425754d08 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -5272,13 +5272,15 @@ gfc_match_import (void)
   switch (m)
 	{
 	case MATCH_YES:
-	  if (gfc_current_ns->parent !=  NULL
+	  if (gfc_current_ns->parent != NULL
 	  && gfc_find_symbol (name, gfc_current_ns->parent, 1, &sym))
 	{
 	   gfc_error ("Type name %qs at %C is ambiguous", name);
 	   return MATCH_ERROR;
 	}
-	  else if (!sym && gfc_current_ns->proc_name->ns->parent !=  NULL
+	  else if (!sym
+		   && gfc_current_ns->proc_name
+		   && gfc_current_ns->proc_name->ns->parent
 		   && gfc_find_symbol (name,
    gfc_current_ns->proc_name->ns->parent,
    1, &sym))
@@ -5289,7 +5291,8 @@ gfc_match_import (void)
 
 	  if (sym == NULL)
 	{
-	  if (gfc_current_ns->proc_name->attr.if_source == IFSRC_IFBODY)
+	  if (gfc_current_ns->proc_name
+		  && gfc_current_ns->proc_name->attr.if_source == IFSRC_IFBODY)
 		{
 		  gfc_error ("Cannot IMPORT %qs from host scoping unit "
 			 "at %C - does not exist.", name);
diff --git a/gcc/testsuite/gfortran.dg/import13.f90 b/gcc/testsuite/gfortran.dg/import13.f90
new file mode 100644
index 000..3bcfec33723
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/import13.f90
@@ -0,0 +1,21 @@
+! { dg-do compile }
+!
+! Contributed by Steve Kargl  
+!
+program foo
+   implicit none
+   integer i
+   i = 42
+   if (i /= 42) stop 1
+   call bah
+   contains
+  subroutine bah   ! { dg-error "is already defined at" }
+ i = 43
+ if (i /= 43) stop 2
+  end subroutine bah
+  subroutine bah   ! { dg-error "is already defined at" }
+ ! import statement missing a comma
+ import none   ! { dg-error "Unexpected IMPORT statement" }
+ i = 44! { dg-error "Unexpected assignment" }
+  end subroutine bah   ! { dg-error "Expecting END PROGRAM" }
+end program foo
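
The guard added in the decl.cc hunk above relies on && short-circuiting, so
each pointer in the chain is tested before it is dereferenced.  A minimal
sketch of that pattern follows; toy_ns, toy_sym and host_of_proc are
hypothetical stand-ins, not gfortran's real types or functions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the front end's namespace and symbol types;
   only the fields needed for the illustration are present.  */
struct toy_ns;
struct toy_sym { struct toy_ns *ns; };
struct toy_ns  { struct toy_ns *parent; struct toy_sym *proc_name; };

/* Return the parent of the namespace owning NS's procedure, or NULL if
   any link in the chain is missing.  && short-circuits, so each pointer
   is tested before it is dereferenced.  */
static struct toy_ns *
host_of_proc (const struct toy_ns *ns)
{
  if (ns && ns->proc_name && ns->proc_name->ns)
    return ns->proc_name->ns->parent;
  return NULL;
}

/* Usage: a well-formed chain yields the host namespace; a namespace with
   no proc_name yields NULL instead of crashing.  */
static void
host_of_proc_demo (void)
{
  struct toy_ns host = { NULL, NULL };
  struct toy_ns inner = { &host, NULL };
  struct toy_sym proc = { &inner };
  struct toy_ns cur = { NULL, &proc };
  struct toy_ns no_proc = { NULL, NULL };

  assert (host_of_proc (&cur) == &host);
  assert (host_of_proc (&no_proc) == NULL);
}
```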



[PATCH 1/2] aarch64: Fix predication of FP8 FDOT insns [PR120986]

2025-07-15 Thread Alex Coplan

Hi,

The predication of the SVE2 FP8 dot product insns was relying on the
architectural dependency:

FEAT_FP8DOT2 => FEAT_FP8DOT4

which was relaxed in GCC as of
r15-7480-g299a8e2dc667e795991bc439d2cad5ea5bd379e2, thus leading to
unrecognisable insn ICEs when compiling a two-way FDOT with just
+fp8dot2.  This patch fixes the predication of the insns to test for
the correct feature bit depending on the value of the mode iterator.

Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk and backport
to GCC 15?

Thanks,
Alex

gcc/ChangeLog:

PR target/120986
* config/aarch64/aarch64-sve2.md (@aarch64_sve_dot):
Adjust insn predicate to use new mode attribute which checks
for the correct feature bit depending on the mode.
(@aarch64_sve_dot_lane): Likewise.
* config/aarch64/iterators.md (HAVE_FP8_DOT_INSN): New.

gcc/testsuite/ChangeLog:

PR target/120986
* gcc.target/aarch64/torture/pr120986-1.c: New test.
---
 gcc/config/aarch64/aarch64-sve2.md|  4 ++--
 gcc/config/aarch64/iterators.md   |  4 
 gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c | 10 ++
 3 files changed, 16 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c

diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index 660901d4b3f..0c96d1305d2 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -2155,7 +2155,7 @@ (define_insn "@aarch64_sve_dot"
 	   (match_operand:VNx16QI 3 "register_operand")
 	   (reg:DI FPM_REGNUM)]
 	  UNSPEC_DOT_FP8))]
-  "TARGET_SSVE_FP8DOT4 && !(mode == VNx8HFmode && !TARGET_SSVE_FP8DOT2)"
+  ""
   {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ]
  [ w, 0 , w , w ; *  ] fdot\t%0., %2.b, %3.b
  [ ?&w  , w , w , w ; yes] movprfx\t%0, %1\;fdot\t%0., %2.b, %3.b
@@ -2171,7 +2171,7 @@ (define_insn "@aarch64_sve_dot_lane"
 	   (match_operand:SI 4 "const_int_operand")
 	   (reg:DI FPM_REGNUM)]
 	  UNSPEC_DOT_LANE_FP8))]
-  "TARGET_SSVE_FP8DOT4 && !(mode == VNx8HFmode && !TARGET_SSVE_FP8DOT2)"
+  ""
   {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ]
  [ w, 0 , w , y ; *  ] fdot\t%0., %2.b, %3.b[%4]
  [ ?&w  , w , w , y ; yes] movprfx\t%0, %1\;fdot\t%0., %2.b, %3.b[%4]
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index c59fcd679d7..b5a51a8598e 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -2725,6 +2725,10 @@ (define_mode_attr aligned_fpr [(VNx16QI "w") (VNx8HI "w")
 (define_mode_attr LD1_EXTENDQ_MEM [(VNx4SI "VNx1SI") (VNx4SF "VNx1SI")
    (VNx2DI "VNx1DI") (VNx2DF "VNx1DI")])
 
+(define_mode_attr HAVE_FP8_DOT_INSN [(VNx4SF "TARGET_SSVE_FP8DOT4")
+ (VNx8HF "TARGET_SSVE_FP8DOT2")])
+
+
 ;; ---
 ;; Code Iterators
 ;; ---
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c b/gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c
new file mode 100644
index 000..8777f1b7711
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/torture/pr120986-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.2-a+sve2+fp8dot2" } */
+#include 
+
+/* This triggered an ICE with an unrecognizable insn due to incorrect gating of
+   the insn in the backend.  */
+svfloat16_t foo(svfloat16_t a, svmfloat8_t b, svmfloat8_t c, unsigned long fpm)
+{
+  return svdot_lane_fpm (a, b, c, 0, fpm);
+}
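
As a rough illustration of why the old predicate failed with only +fp8dot2,
the two conditions can be modelled with plain booleans.  This is a sketch
under stated assumptions: toy_mode, toy_features and the fdot_enabled_*
functions are hypothetical names, and only the mode-to-feature mapping is
taken from the HAVE_FP8_DOT_INSN attribute in the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the two modes and feature bits named in the patch.  */
enum toy_mode { TOY_VNx4SF, TOY_VNx8HF };
struct toy_features { bool fp8dot4; bool fp8dot2; };

/* Shape of the old insn condition: FP8DOT4 is always required.  */
static bool
fdot_enabled_old (enum toy_mode mode, struct toy_features f)
{
  return f.fp8dot4 && !(mode == TOY_VNx8HF && !f.fp8dot2);
}

/* Shape of the new condition: each mode tests only its own feature bit,
   following the HAVE_FP8_DOT_INSN mapping.  */
static bool
fdot_enabled_new (enum toy_mode mode, struct toy_features f)
{
  return mode == TOY_VNx4SF ? f.fp8dot4 : f.fp8dot2;
}

/* With only the two-way feature enabled, the old condition rejects the
   VNx8HF insn while the new one accepts it.  */
static void
fdot_predicate_demo (void)
{
  struct toy_features only_dot2 = { false, true };

  assert (!fdot_enabled_old (TOY_VNx8HF, only_dot2));
  assert (fdot_enabled_new (TOY_VNx8HF, only_dot2));
  assert (!fdot_enabled_new (TOY_VNx4SF, only_dot2));
}
```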

[PATCH 2/2] aarch64: Relax fpm_t assert to allow const_ints [PR120986]

2025-07-15 Thread Alex Coplan

Hi,

This relaxes an overzealous assert that required the fpm_t argument to
be in DImode when expanding FP8 intrinsics.  Of course this fails to
account for modeless const_ints.

Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk and backport
to GCC 15?

Thanks,
Alex

gcc/ChangeLog:

PR target/120986
* config/aarch64/aarch64-sve-builtins.cc
(function_expander::expand): Relax fpm_t assert to allow
modeless const_ints.

gcc/testsuite/ChangeLog:

PR target/120986
* gcc.target/aarch64/torture/pr120986-2.c: New test.
---
 gcc/config/aarch64/aarch64-sve-builtins.cc| 5 +++--
 gcc/testsuite/gcc.target/aarch64/torture/pr120986-2.c | 7 +++
 2 files changed, 10 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/torture/pr120986-2.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 2b627a95060..19d6e36d948 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -4589,8 +4589,9 @@ function_expander::expand ()
 {
   /* The last element of these functions is always an fpm_t that must be
  written to FPMR before the call to the instruction itself. */
-  gcc_assert (args.last ()->mode == DImode);
-  emit_move_insn (gen_rtx_REG (DImode, FPM_REGNUM), args.last ());
+  rtx fpm = args.last ();
+  gcc_assert (CONST_INT_P (fpm) || GET_MODE (fpm) == DImode);
+  emit_move_insn (gen_rtx_REG (DImode, FPM_REGNUM), fpm);
 }
   return base->expand (*this);
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/pr120986-2.c b/gcc/testsuite/gcc.target/aarch64/torture/pr120986-2.c
new file mode 100644
index 000..1218dead9dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/torture/pr120986-2.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-march=armv8.2-a+sve2+fp8dot2" } */
+#include 
+svfloat16_t foo(svfloat16_t a, svmfloat8_t b, svmfloat8_t c)
+{
+  return svdot_lane_fpm (a, b, c, 0, 0);
+}
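
For readers less familiar with RTL, the point about modeless const_ints can
be sketched with a toy model.  The types below are hypothetical and are not
GCC's real rtx representation; the only facts assumed are that integer
constants carry no mode of their own and that the relaxed assert accepts
either a constant or a DImode operand:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, not GCC's real rtx representation.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };
enum toy_code { TOY_REG, TOY_CONST_INT };

struct toy_rtx
{
  enum toy_code code;
  enum toy_mode mode;   /* TOY_VOIDmode for integer constants */
  long value;
};

/* Old-style check: only accepts operands whose own mode is DImode.  */
static bool
fpm_ok_strict (const struct toy_rtx *x)
{
  return x->mode == TOY_DImode;
}

/* Relaxed check mirroring the patched assert: constants are accepted
   regardless of mode.  */
static bool
fpm_ok_relaxed (const struct toy_rtx *x)
{
  return x->code == TOY_CONST_INT || x->mode == TOY_DImode;
}

/* A literal 0 passed as the fpm argument fails the strict check but
   passes the relaxed one; a DImode register passes both.  */
static void
fpm_check_demo (void)
{
  struct toy_rtx reg = { TOY_REG, TOY_DImode, 0 };
  struct toy_rtx zero = { TOY_CONST_INT, TOY_VOIDmode, 0 };

  assert (fpm_ok_strict (&reg) && fpm_ok_relaxed (&reg));
  assert (!fpm_ok_strict (&zero));
  assert (fpm_ok_relaxed (&zero));
}
```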

Re: [PATCH] c, c++: Extend -Wunused-but-set-* warnings [PR44677]

2025-07-15 Thread Jakub Jelinek

On Tue, Jul 15, 2025 at 08:21:50AM -0400, Jason Merrill wrote:
> Given the above that seems rather unlikely, but I suppose it's fine if you
> want to do it that way.  The patch is OK either way.

Committed just the v2 patch.  I can test your patch next with other patches,
or do you want to test/commit it yourself?

Jakub

Re: [PATCH] aarch64: Use SVE2 BSL2N for vector EON

2025-07-15 Thread Richard Sandiford

Kyrylo Tkachov  writes:
>> On 15 Jul 2025, at 15:50, Richard Sandiford  
>> wrote:
>> 
>> Kyrylo Tkachov  writes:
>>> Hi all,
>>> 
>>> SVE2 BSL2N (x, y, z) = (x & z) | (~y & ~z). When x == y this computes:
>>> (x & z) | (~x & ~z) which is ~(x ^ z).
>>> Thus, we can use it to match RTL patterns (not (xor (...) (...))) for both
>>> Advanced SIMD and SVE modes when TARGET_SVE2.
>>> This patch does that. The tied register requirements of BSL2N and the 
>>> MOVPRFX
>>> rules mean we can't use the MOVPRFX form here so I have not included that
>>> alternative. Correct me if I'm wrong on this.
>> 
>> I think we can still use BSL2N, similarly to the patch from the other day.
>> It's just that the asm would need to be:
>> 
>> movprfx\t%0, %1\;bsl2n\t%0, %0, %1, %2
>> 
>> (with constraints &w/w/w).
>
> Thanks, something like the attached seems to work.
> I’ll do wider testing…

LGTM, thanks.  OK if testing passes.

Richard
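
The identity the patch relies on, BSL2N (x, x, z) == ~(x ^ z), can be
sanity-checked with a scalar model.  This is only a sketch: bsl2n_model
implements the formula exactly as quoted below, on 64-bit scalars rather
than vectors, and all function names here are made up for the illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the BSL2N formula quoted in the thread:
   (x & z) | (~y & ~z).  */
static uint64_t
bsl2n_model (uint64_t x, uint64_t y, uint64_t z)
{
  return (x & z) | (~y & ~z);
}

/* EON: bitwise exclusive-or followed by NOT.  */
static uint64_t
eon_model (uint64_t a, uint64_t b)
{
  return ~(a ^ b);
}

/* Check bsl2n_model (a, a, b) == eon_model (a, b) on a few bit patterns.  */
static void
check_bsl2n_eon (void)
{
  static const uint64_t samples[] = {
    0, UINT64_MAX, 0x5555555555555555ull, 0xaaaaaaaaaaaaaaaaull,
    0x0123456789abcdefull, 0xfedcba9876543210ull
  };
  const unsigned n = sizeof samples / sizeof samples[0];

  for (unsigned i = 0; i < n; i++)
    for (unsigned j = 0; j < n; j++)
      assert (bsl2n_model (samples[i], samples[i], samples[j])
              == eon_model (samples[i], samples[j]));
}
```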

>
> Kyrill
>
>> 
>> Thanks,
>> Richard
>> 
>>> 
>>> For code like:
>>> 
>>> uint64x2_t eon_q(uint64x2_t a, uint64x2_t b) { return EON(a, b); }
>>> svuint64_t eon_z(svuint64_t a, svuint64_t b) { return EON(a, b); }
>>> svuint64_t eon_z_mp(svuint64_t c, svuint64_t a, svuint64_t b) { return 
>>> EON(a, b); }
>>> 
>>> We now generate:
>>> eon_q:
>>> bsl2n z0.d, z0.d, z0.d, z1.d
>>> ret
>>> 
>>> eon_z:
>>> bsl2n z0.d, z0.d, z0.d, z1.d
>>> ret
>>> 
>>> eon_z_mp:
>>> bsl2n z1.d, z1.d, z1.d, z2.d
>>> mov z0.d, z1.d
>>> ret
>>> 
>>> instead of the previous:
>>> eon_q:
>>> eor v0.16b, v0.16b, v1.16b
>>> not v0.16b, v0.16b
>>> ret
>>> 
>>> eon_z:
>>> eor z0.d, z0.d, z1.d
>>> ptrue p3.b, all
>>> not z0.d, p3/m, z0.d
>>> ret
>>> 
>>> eon_z_mp:
>>> eor z0.d, z1.d, z2.d
>>> ptrue p3.b, all
>>> not z0.d, p3/m, z0.d
>>> ret
>>> 
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>> Ok for trunk?
>>> Thanks,
>>> Kyrill
>>> 
>>> Signed-off-by: Kyrylo Tkachov 
>>> 
>>> gcc/
>>> 
>>> * config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl2n_eon):
>>> New pattern.
>>> (*aarch64_sve2_eon_bsl2n_unpred): Likewise.
>>> 
>>> gcc/testsuite/
>>> 
>>> * gcc.target/aarch64/sve2/eon_bsl2n.c: New test.
>>> 
>>> From 761b14804c8bbeae745cb7a2ab58e26a3775b096 Mon Sep 17 00:00:00 2001
>>> From: Kyrylo Tkachov 
>>> Date: Fri, 11 Jul 2025 07:23:16 -0700
>>> Subject: [PATCH 2/2] aarch64: Use SVE2 BSL2N for vector EON
>>> 
>>> SVE2 BSL2N (x, y, z) = (x & z) | (~y & ~z). When x == y this computes:
>>> (x & z) | (~x & ~z) which is ~(x ^ z).
>>> Thus, we can use it to match RTL patterns (not (xor (...) (...))) for both
>>> Advanced SIMD and SVE modes when TARGET_SVE2.
>>> This patch does that.  The tied register requirements of BSL2N and the 
>>> MOVPRFX
>>> rules mean we can't use the MOVPRFX form here so I have not included that
>>> alternative.  Correct me if I'm wrong on this.
>>> 
>>> For code like:
>>> 
>>> uint64x2_t eon_q(uint64x2_t a, uint64x2_t b) { return EON(a, b); }
>>> svuint64_t eon_z(svuint64_t a, svuint64_t b) { return EON(a, b); }
>>> svuint64_t eon_z_mp(svuint64_t c, svuint64_t a, svuint64_t b) { return 
>>> EON(a, b); }
>>> 
>>> We now generate:
>>> eon_q:
>>>bsl2n   z0.d, z0.d, z0.d, z1.d
>>>ret
>>> 
>>> eon_z:
>>>bsl2n   z0.d, z0.d, z0.d, z1.d
>>>ret
>>> 
>>> eon_z_mp:
>>>bsl2n   z1.d, z1.d, z1.d, z2.d
>>>mov z0.d, z1.d
>>>ret
>>> 
>>> instead of the previous:
>>> eon_q:
>>>eor v0.16b, v0.16b, v1.16b
>>>not v0.16b, v0.16b
>>>ret
>>> 
>>> eon_z:
>>>eor z0.d, z0.d, z1.d
>>>ptrue   p3.b, all
>>>not z0.d, p3/m, z0.d
>>>ret
>>> 
>>> eon_z_mp:
>>>eor z0.d, z1.d, z2.d
>>>ptrue   p3.b, all
>>>not z0.d, p3/m, z0.d
>>>ret
>>> 
>>> Bootstrapped and tested on aarch64-none-linux-gnu.
>>> 
>>> Signed-off-by: Kyrylo Tkachov 
>>> 
>>> gcc/
>>> 
>>> * config/aarch64/aarch64-sve2.md (*aarch64_sve2_bsl2n_eon):
>>> New pattern.
>>> (*aarch64_sve2_eon_bsl2n_unpred): Likewise.
>>> 
>>> gcc/testsuite/
>>> 
>>> * gcc.target/aarch64/sve2/eon_bsl2n.c: New test.
>>> ---
>>> gcc/config/aarch64/aarch64-sve2.md| 28 ++
>>> .../gcc.target/aarch64/sve2/eon_bsl2n.c   | 52 +++
>>> 2 files changed, 80 insertions(+)
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/eon_bsl2n.c
>>> 
>>> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
>>> b/gcc/config/aarch64/aarch64-sve2.md
>>> index 6d6dc94cd81..a011b947f51 100644
>>> --- a/gcc/config/aarch64/aarch64-sve2.md
>>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>>> @@ -2053,6 +2053,34 @@
>>>   }
>>> )
>>> 
>>> +;; Vector EON (~(x, y)) using BSL2N.
>>> +(define_insn_and_rewrite "*aarch64_sve2_bsl2n_eon"
>>> +  [(set (match_operand:SVE_FULL_I 0 "register_operand" "=w")
>>> + (unspec:SVE_FULL_I
>>> +   [(match_operand 3)
>>> +(not:SVE_FULL_I
>>> +  (xor:SVE_FULL_I
>>> + (match_operand:SVE_FULL_I 1 "register_operand" "%0")
>>> + (match_operand:SVE_FULL_I 2 "register_operand" "w")))]
>>>

Re: [PATCH] expand: Allow fixed-point arithmetic for RDIV_EXPR.

2025-07-15 Thread Richard Biener

On Tue, Jul 15, 2025 at 11:32 AM Robin Dapp  wrote:
>
> Hi,
>
> r16-2175-g5aa21765236730 introduced an assert for floating-point modes
> when expanding an RDIV_EXPR but forgot fixed-point modes.  This patch
> adds ALL_FIXED_POINT_MODE_P to the assert.
>
> Bootstrap and regtest running on x86, aarch64, and power10.  Regtested
> on rv64gcv.  Regtest on arm running, needed to set it up still.

OK.

> Regards
>  Robin
>
> PR middle-end/121065
>
> gcc/ChangeLog:
>
> * cfgexpand.cc (expand_debug_expr): Allow fixed-point modes for
> RDIV_EXPR.
> * optabs-tree.cc (optab_for_tree_code): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/arm/pr121065.c: New test.
> ---
>  gcc/cfgexpand.cc|  3 ++-
>  gcc/optabs-tree.cc  |  3 ++-
>  gcc/testsuite/gcc.target/arm/pr121065.c | 11 +++
>  3 files changed, 15 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/arm/pr121065.c
>
> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> index a656ccebf17..8a55f4f472a 100644
> --- a/gcc/cfgexpand.cc
> +++ b/gcc/cfgexpand.cc
> @@ -5358,7 +5358,8 @@ expand_debug_expr (tree exp)
>return simplify_gen_binary (MULT, mode, op0, op1);
>
>  case RDIV_EXPR:
> -  gcc_assert (FLOAT_MODE_P (mode));
> +  gcc_assert (FLOAT_MODE_P (mode)
> + || ALL_FIXED_POINT_MODE_P (mode));
>/* Fall through.  */
>  case TRUNC_DIV_EXPR:
>  case EXACT_DIV_EXPR:
> diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
> index 9308a6dfd65..0de74c7966a 100644
> --- a/gcc/optabs-tree.cc
> +++ b/gcc/optabs-tree.cc
> @@ -82,7 +82,8 @@ optab_for_tree_code (enum tree_code code, const_tree type,
> return unknown_optab;
>/* FALLTHRU */
>  case RDIV_EXPR:
> -  gcc_assert (FLOAT_TYPE_P (type));
> +  gcc_assert (FLOAT_TYPE_P (type)
> + || ALL_FIXED_POINT_MODE_P (TYPE_MODE (type)));
>/* FALLTHRU */
>  case TRUNC_DIV_EXPR:
>  case EXACT_DIV_EXPR:
> diff --git a/gcc/testsuite/gcc.target/arm/pr121065.c 
> b/gcc/testsuite/gcc.target/arm/pr121065.c
> new file mode 100644
> index 000..dfc6059a46d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/pr121065.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mcpu=cortex-m55" } */
> +
> +_Accum sa;
> +char c;
> +
> +void
> +div_csa ()
> +{
> +  c /= sa;
> +}
> --
> 2.50.0
>
>
