Re: [PATCH] libstdc++: Fix ref_view branch of views::as_const [PR119135]

2025-03-13 Thread Jonathan Wakely
On Thu, 13 Mar 2025 at 03:54, Patrick Palka  wrote:
>
> On Wed, 12 Mar 2025, Patrick Palka wrote:
>
> > On Wed, 12 Mar 2025, Patrick Palka wrote:
> >
> > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk/14 and
> > > perhaps 13?


OK for trunk and 14 and 13.


> > >
> > > N.B. the use of a constrained auto instead of a separate static_assert
> > > in the testcase is unfortunate but I opted for local consistency for
> > > now.
> > >
> > > -- >8 --
> > >
> > > > Unlike for span<X> and empty_view<X>, the range_reference_t of
> > > > ref_view<X> doesn't correspond to X.  This patch fixes the ref_view
> > > branch of views::as_const to correctly query the underlying range
> > > type X.
> > >
> > > PR libstdc++/119135
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > > * include/std/ranges: Include .
> > > (views::__detail::__is_ref_view): Replace with ...
> > > (views::__detail::__is_constable_ref_view): ... this.
> > > (views::_AsConst::operator()): Correct the ref_view branch.
> > > * testsuite/std/ranges/adaptors/as_const/1.cc (test03): Extend
> > > test.
> > > ---
> > >  libstdc++-v3/include/std/ranges  | 12 ++--
> > >  .../testsuite/std/ranges/adaptors/as_const/1.cc  |  4 
> > >  2 files changed, 10 insertions(+), 6 deletions(-)
> > >
> > > > diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> > > index c2a2d6f4e05..31d62454895 100644
> > > --- a/libstdc++-v3/include/std/ranges
> > > +++ b/libstdc++-v3/include/std/ranges
> > > @@ -48,6 +48,7 @@
> > >  #include 
> > >  #include 
> > >  #if __cplusplus > 202002L
> > > +#include 
> > >  #include 
> > >  #endif
> > >  #include 
> > > @@ -9324,10 +9325,11 @@ namespace views::__adaptor
> > >  namespace __detail
> > >  {
> > >template
> > > -   inline constexpr bool __is_ref_view = false;
> > > +   inline constexpr bool __is_constable_ref_view = false;
> > >
> > >template
> > > -   inline constexpr bool __is_ref_view> = true;
> > > +   inline constexpr bool __is_constable_ref_view>
> > > + = constant_range;
> > >
> > >template
> > > > concept __can_as_const_view = requires { as_const_view(std::declval<_Range>()); };
> > > @@ -9349,10 +9351,8 @@ namespace views::__adaptor
> > >   return views::empty;
> > > else if constexpr (std::__detail::__is_span<_Tp>)
> > >   return span > > _Tp::extent>(std::forward<_Range>(__r));
> > > -   else if constexpr (__detail::__is_ref_view<_Tp>
> > > -  && constant_range)
> > > - return ref_view(static_cast
> > > - (std::forward<_Range>(__r).base()));
> > > +   else if constexpr (__detail::__is_constable_ref_view<_Tp>)
> > > + return ref_view(std::as_const(__r.base()));
> >
> > Whoops, just noticed that I got rid of the perfect forwarding of __r
> > here for no good reason.  It shouldn't matter since its base() member
> > function is const, but consider the std::forward restored for
> > consistency.
>
> Like so:
>
> -- >8 --
>
> Subject: [PATCH v2] libstdc++: Fix ref_view branch of views::as_const [PR119135]
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk/14 and
> perhaps 13?
>
> N.B. the use of a constrained auto instead of a separate static_assert
> in the testcase is unfortunate but I opted for local consistency for
> now.
>
> -- >8 --
>
> Unlike for span<X> and empty_view<X>, the range_reference_t of
> ref_view<X> doesn't correspond to X.  This patch fixes the ref_view
> branch of views::as_const to correctly query its underlying range
> type X.
>
> PR libstdc++/119135
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges: Include .
> (views::__detail::__is_ref_view): Replace with ...
> (views::__detail::__is_constable_ref_view): ... this.
> (views::_AsConst::operator()): Correct the ref_view branch.
> * testsuite/std/ranges/adaptors/as_const/1.cc (test03): Extend
> test.
> ---
>  libstdc++-v3/include/std/ranges  | 12 ++--
>  .../testsuite/std/ranges/adaptors/as_const/1.cc  |  4 
>  2 files changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index c2a2d6f4e05..ef277b81bd3 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -48,6 +48,7 @@
>  #include 
>  #include 
>  #if __cplusplus > 202002L
> +#include 
>  #include 
>  #endif
>  #include 
> @@ -9324,10 +9325,11 @@ namespace views::__adaptor
>  namespace __detail
>  {
>template
> -   inline constexpr bool __is_ref_view = false;
> +   inline constexpr bool __is_constable_ref_view = false;
>
>template
> -   inline constexpr bool __is_ref_view> = true;
> +   inline constexpr bool __is_constable_ref_view>
> + = constant_range;
>
>template
> concept __can_as_const_view = requires { 
> as_const_view(std::declval<_Rang

[PATCH v2 2/2] match.pd: Extend pointer alignment folds

2025-03-13 Thread Richard Sandiford
We have long had the fold:

/* Pattern match
 tem = (sizetype) ptr;
 tem = tem & algn;
 tem = -tem;
 ... = ptr p+ tem;
   and produce the simpler and easier to analyze with respect to alignment
 ... = ptr & ~algn;  */

But the gimple in gcc.target/aarch64/sve/pr98119.c has a variant in
which a constant is added before the conversion, giving:

 tem = (sizetype) (ptr p+ CST);
 tem = tem & algn;
 tem = -tem;
 ... = ptr p+ tem;

This case is also valid if algn fits within the trailing zero bits
of CST.  Adding CST then has no effect.

Similarly the testcase has:

 tem = (sizetype) (ptr p+ CST1);
 tem = tem & algn;
 tem = CST2 - tem;
 ... = ptr p+ tem;

This folds to:

 ... = (ptr & ~algn) p+ CST2;

if algn fits within the trailing zero bits of both CST1 and CST2.

An alternative would be:

 ... = (ptr p+ CST2) & ~algn;

but I would expect the alignment to be more easily shareable than
the CST2 addition, given that the CST2 addition wasn't being applied
by a POINTER_PLUS_EXPR.
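
As a sanity check on the arithmetic above, here is a minimal sketch that
compares the unfolded and folded forms on plain integers.  The function
names and the address window are hypothetical and only for illustration;
this is not the match.pd code, and integers are used instead of pointers
to keep the arithmetic well defined.

```cpp
#include <cstdint>

// Unfolded form: ptr + (CST2 - ((ptr + CST1) & algn)).
constexpr std::uintptr_t unfolded(std::uintptr_t ptr, std::uintptr_t cst1,
                                  std::uintptr_t cst2, std::uintptr_t algn)
{
  return ptr + (cst2 - ((ptr + cst1) & algn));
}

// Folded form: (ptr & ~algn) + CST2.
constexpr std::uintptr_t folded(std::uintptr_t ptr, std::uintptr_t cst2,
                                std::uintptr_t algn)
{
  return (ptr & ~algn) + cst2;
}

// Returns true if both forms agree for every address in a small window.
// The window is arbitrary; it only needs to cover all low-bit patterns.
constexpr bool forms_agree(std::uintptr_t cst1, std::uintptr_t cst2,
                           std::uintptr_t algn)
{
  for (std::uintptr_t ptr = 0x1000; ptr < 0x1100; ++ptr)
    if (unfolded(ptr, cst1, cst2, algn) != folded(ptr, cst2, algn))
      return false;
  return true;
}
```

With algn = 15, forms_agree (32, 16, 15) holds because 32 has at least
four trailing zero bits, whereas forms_agree (97, 0, 15) does not hold,
which matches the split between the two new tests.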

gcc/
* match.pd: Extend pointer alignment folds so that they handle
the case where a constant is added before or after the alignment.

gcc/testsuite/
* gcc.dg/pointer-arith-11.c: New test.
* gcc.dg/pointer-arith-12.c: Likewise.
---
 gcc/match.pd| 27 
 gcc/testsuite/gcc.dg/pointer-arith-11.c | 39 
 gcc/testsuite/gcc.dg/pointer-arith-12.c | 82 +
 3 files changed, 148 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pointer-arith-11.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-arith-12.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 7017fd15277..89612d1b15b 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3037,6 +3037,33 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (with { tree algn = wide_int_to_tree (TREE_TYPE (@0), ~wi::to_wide (@1)); }
(bit_and @0 { algn; })))
 
+/* Also match cases in which a constant is applied:
+
+   (1) tem = (sizetype) ptr;  ---> tem = (sizetype) (ptr + CST);
+   (2) tem = -tem ---> tem = CST - tem;
+
+   and where "& align" masks only trailing zeros of CST.  (1) then has no
+   effect, whereas (2) adds CST to the result.  */
+(simplify
+ (pointer_plus @0 (negate (bit_and (convert (pointer_plus @0 INTEGER_CST@1))
+  INTEGER_CST@2)))
+ (if (tree_int_cst_min_precision (@2, UNSIGNED) <= tree_ctz (@1))
+  (with { tree algn = wide_int_to_tree (TREE_TYPE (@0), ~wi::to_wide (@2)); }
+   (bit_and @0 { algn; }))))
+(simplify
+ (pointer_plus @0 (minus:s INTEGER_CST@1 (bit_and (convert @0) INTEGER_CST@2)))
+ (if (tree_int_cst_min_precision (@2, UNSIGNED) <= tree_ctz (@1))
+  (with { tree algn = wide_int_to_tree (TREE_TYPE (@0), ~wi::to_wide (@2)); }
+   (pointer_plus (bit_and @0 { algn; }) @1))))
+(simplify
+ (pointer_plus @0 (minus:s INTEGER_CST@1
+  (bit_and (convert (pointer_plus @0 INTEGER_CST@2))
+   INTEGER_CST@3)))
+ (with { auto mask_width = tree_int_cst_min_precision (@3, UNSIGNED); }
+  (if (mask_width <= tree_ctz (@1) && mask_width <= tree_ctz (@2))
+   (with { tree algn = wide_int_to_tree (TREE_TYPE (@0), ~wi::to_wide (@3)); }
+(pointer_plus (bit_and @0 { algn; }) @1)))))
+
 /* Try folding difference of addresses.  */
 (simplify
  (minus (convert ADDR_EXPR@0) (convert (pointer_plus @1 @2)))
diff --git a/gcc/testsuite/gcc.dg/pointer-arith-11.c b/gcc/testsuite/gcc.dg/pointer-arith-11.c
new file mode 100644
index 00000000000..e9390ef0838
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pointer-arith-11.c
@@ -0,0 +1,39 @@
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* { dg-final { scan-tree-dump-times { & -16B?;} 4 "optimized" { target lp64 } } } */
+/* { dg-final { scan-tree-dump-times { \+ 16;} 3 "optimized" } } */
+/* { dg-final { scan-tree-dump-not { & 15;} "optimized" } } */
+/* { dg-final { scan-tree-dump-not { \+ 96;} "optimized" } } */
+
+typedef __UINTPTR_TYPE__ uintptr_t;
+
+char *
+f1 (char *x)
+{
+  char *y = x + 32;
+  x += -((uintptr_t) y & 15);
+  return x;
+}
+
+char *
+f2 (char *x)
+{
+  x += 16 - ((uintptr_t) x & 15);
+  return x;
+}
+
+char *
+f3 (char *x)
+{
+  char *y = x + 32;
+  x += 16 - ((uintptr_t) y & 15);
+  return x;
+}
+
+char *
+f4 (char *x)
+{
+  char *y = x + 16;
+  x += 16 - ((uintptr_t) y & 15);
+  return x;
+}
diff --git a/gcc/testsuite/gcc.dg/pointer-arith-12.c b/gcc/testsuite/gcc.dg/pointer-arith-12.c
new file mode 100644
index 00000000000..ebdcbd3c967
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pointer-arith-12.c
@@ -0,0 +1,82 @@
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* { dg-final { scan-tree-dump-not { & -16B?;} "optimized" } } */
+/* { dg-final { scan-tree-dump-times { & 15;} 10 "optimized" } } */
+
+typedef __UINTPTR_TYPE__ uintptr_t;
+
+char *
+f1 (char *x)
+{
+  char *y = x + 97;
+  x += -((uintptr_t) y & 15);
+  return x;
+}
+
+char *
+f2 (char *x)
+{
+  char *y = x + 98;
+  x +=

Re: [patch] honor prefix and suffix when installing cobol binaries

2025-03-13 Thread Richard Biener
On Thu, Mar 13, 2025 at 11:09 AM Matthias Klose  wrote:
>
> the gcobol and gcobc binaries are currently installed without honoring
> the program prefix and program suffix. Honor these while installing.
>
> Ok for the trunk?

OK.

Thanks,
Richard.

> Matthias
>


Re: [PATCH][RFC] add -[DU]_FORTIFY_SOURCE[=n] to DW_AT_producer

2025-03-13 Thread Jakub Jelinek
On Thu, Mar 13, 2025 at 04:11:51PM +0100, Andreas Schwab wrote:
> On Mar 13 2025, Richard Biener wrote:
> 
> > The following makes sure to record -D_FORTIFY_SOURCE=n and
> > -U_FORTIFY_SOURCE in the DW_AT_producer debuginfo attribute when
> > present on the compiler command line.
> 
> Should this also handle defines passed via -Wp?

Yes, that is the common way of doing that (I think from the times
when it conflicted with gcj -D option).

Jakub



[PATCH] c++: Make explicit instantiations not vague linkage

2025-03-13 Thread Nathaniel Shead
I discovered from some further testing that I broke 'import std' in some
cases with my last patch; this fixes that.

I've found that this still isn't sufficient to fix PR119154 completely, as
there are still more cases where this assert fires due to performing
import_export_decl on non-DECL_REALLY_EXTERN at EOF.  I'll continue
trying to reduce and find each case.

But either way I think this is a needed improvement; bootstrapped and
regtested on x86_64-pc-linux-gnu (so far just modules.exp), OK for trunk
if full regtest succeeds?

-- >8 --

My change in r15-8012 for PR c++/119154 caused a bug with explicit
instantiation declarations.  The change cleared DECL_INTERFACE_KNOWN for
all vague-linkage entities, including explicit instantiations.  When we
then perform lazy loading at EOF (due to processing deferred function
bodies), expand_or_defer_fn ends up calling import_export_decl which
will error because DECL_INTERFACE_KNOWN is still unset but no definition
is available in the file, violating some assertions.

It turns out that for function templates marked inline we would not
respect an 'extern template' imported in general, either; this patch
fixes both of these issues by always treating explicit instantiations as
external, and so marking DECL_INTERFACE_KNOWN eagerly.

For an explicit instantiation declaration we don't want to emit the body
of the function as it must be emitted in a different TU anyway.  And for
explicit instantiation definitions we similarly know that it will have
been emitted in the interface TU we streamed it in from, so there's
no need to emit it.
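
For readers less familiar with the two forms, here is a minimal
single-file sketch of an explicit instantiation declaration followed by
an explicit instantiation definition.  The names twice and call_twice
are hypothetical, and no modules are involved; it only illustrates the
language feature, not the compiler change.

```cpp
#include <cassert>

// A function template, as it might appear in a header or module interface.
template <typename T>
T twice(T x) { return x + x; }

// Explicit instantiation declaration: states that twice<int> is
// instantiated somewhere, so users need not emit their own copy.
extern template int twice<int>(int);

// Explicit instantiation definition: emits twice<int> here.  Normally
// this appears in exactly one translation unit, possibly a different one.
template int twice<int>(int);

int call_twice(int x) { return twice<int>(x); }

// Small usage check.
void demo_explicit_instantiation()
{
  assert(call_twice(21) == 42);
}
```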

gcc/cp/ChangeLog:

* decl2.cc (vague_linkage_p): Explicit instantiations are not
vague linkage.

gcc/testsuite/ChangeLog:

* g++.dg/modules/extern-tpl-3_a.C: New test.
* g++.dg/modules/extern-tpl-3_b.C: New test.
* g++.dg/modules/extern-tpl-4_a.C: New test.
* g++.dg/modules/extern-tpl-4_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/decl2.cc   |  3 ++
 gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C | 11 +
 gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C | 12 +
 gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C | 24 ++
 gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C | 46 +++
 5 files changed, 96 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 4a9fb1c3c00..712fdc45d40 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -2480,6 +2480,9 @@ vague_linkage_p (tree decl)
   /* Unfortunately, import_export_decl has not always been called
  before the function is processed, so we cannot simply check
  DECL_COMDAT.  */
+  if (DECL_LANG_SPECIFIC (decl)
+  && DECL_EXPLICIT_INSTANTIATION (decl))
+return false;
   if (DECL_COMDAT (decl)
   || (TREE_CODE (decl) == FUNCTION_DECL
  && DECL_DECLARED_INLINE_P (decl)
diff --git a/gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C b/gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C
new file mode 100644
index 00000000000..def3cd1413d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C
@@ -0,0 +1,11 @@
+// { dg-additional-options "-fmodules -Wno-global-module" }
+// { dg-module-cmi M }
+
+module;
+template 
+struct S {
+  S() {}
+};
+export module M;
+extern template class S;
+S s;
diff --git a/gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C b/gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C
new file mode 100644
index 00000000000..5d96937ce02
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C
@@ -0,0 +1,12 @@
+// { dg-additional-options "-fmodules" }
+
+template 
+struct S {
+  S() {}
+};
+
+void foo() { S x;}
+
+import M;
+
+// Lazy loading of extern S at EOF should not ICE
diff --git a/gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C b/gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C
new file mode 100644
index 00000000000..16f1b041307
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C
@@ -0,0 +1,24 @@
+// { dg-additional-options "-fmodules" }
+// { dg-module-cmi M }
+
+export module M;
+
+export template  inline void a() {}
+extern template void a();
+extern template void a();
+template void a();
+
+export template  void b() {}
+extern template void b();
+extern template void b();
+template void b();
+
+export template  inline int c = 123;
+extern template int c;
+extern template int c;
+template int c;
+
+export template  int d = 123;
+extern template int d;
+extern template int d;
+template int d;
diff --git a/gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C b/gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C
new file mode 100644
index 00000000000..1dd4afe9f6b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C
@@ -0,0 +1,46 @@
+// { dg-additional-options "-fmodule

Re: [PATCH][RFC] add -[DU]_FORTIFY_SOURCE[=n] to DW_AT_producer

2025-03-13 Thread Jakub Jelinek
On Thu, Mar 13, 2025 at 03:51:15PM +0100, Richard Biener wrote:
> On Thu, 13 Mar 2025, Jakub Jelinek wrote:
> 
> > On Thu, Mar 13, 2025 at 03:44:21PM +0100, Richard Biener wrote:
> > > +  case OPT_D:
> > > +  case OPT_U:
> > > + if (strncmp (options[i].arg, "_FORTIFY_SOURCE",
> > > +  strlen ("_FORTIFY_SOURCE")) == 0)
> > 
> > I'd say you want to verify that after that substring there is either
> > '\0' or "=".
> > Otherwise you'll record -D_FORTIFY_SOURCE_NOT_REALLY=1 which doesn't
> > matter at all.
> 
> I had that first and thought it wasn't worth the cycles, but I can
> surely add that (and thus also separate -U and -D handling).

If you use sizeof ("_FORTIFY_SOURCE") - 1 instead of strlen, there won't
be too many extra cycles even at -O0.
And I think it is fine to handle -U and -D together.
  if (startswith (options[i].arg, "_FORTIFY_SOURCE")
      && (options[i].arg[sizeof ("_FORTIFY_SOURCE") - 1] == '\0'
          || (options[i].opt_index == OPT_D
              && options[i].arg[sizeof ("_FORTIFY_SOURCE") - 1] == '=')))
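
A standalone sketch of the same check, for experimenting outside the
compiler.  The helper name is_fortify_source_arg and its bool parameter
are hypothetical stand-ins for the options[i] fields, and strncmp
replaces GCC's internal startswith.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns true if the argument of a -D or -U option
// names exactly _FORTIFY_SOURCE.  For -D an "=value" suffix is accepted too.
bool is_fortify_source_arg(const char *arg, bool is_define)
{
  static const char name[] = "_FORTIFY_SOURCE";
  const std::size_t len = sizeof(name) - 1;  // compile-time length, no strlen
  if (std::strncmp(arg, name, len) != 0)
    return false;
  // The prefix matched, so arg[len] is at worst the terminating NUL.
  return arg[len] == '\0' || (is_define && arg[len] == '=');
}

// Small usage check covering the cases discussed above.
void demo_fortify_matching()
{
  assert(is_fortify_source_arg("_FORTIFY_SOURCE", false));     // -U_FORTIFY_SOURCE
  assert(is_fortify_source_arg("_FORTIFY_SOURCE=2", true));    // -D_FORTIFY_SOURCE=2
  assert(!is_fortify_source_arg("_FORTIFY_SOURCE_NOT_REALLY=1", true));
}
```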

Jakub



Re: [PATCH] c, c++: Set DECL_NOT_GIMPLE_REG_P on *PART_EXPR operand on lhs of MODIFY_EXPR [PR119120]

2025-03-13 Thread Richard Biener
On Thu, 13 Mar 2025, Jason Merrill wrote:

> On 3/13/25 3:54 AM, Richard Biener wrote:
> > On Thu, 13 Mar 2025, Jakub Jelinek wrote:
> > 
> >> Hi!
> >>
> >> On Wed, Mar 12, 2025 at 02:01:14PM +0100, Richard Biener wrote:
> >>> On Wed, 12 Mar 2025, Jakub Jelinek wrote:
> >>>
>  On Tue, Mar 11, 2025 at 12:13:13PM +0100, Richard Biener wrote:
> > On Tue, 11 Mar 2025, Jakub Jelinek wrote:
> >
> >> On Tue, Mar 11, 2025 at 10:18:18AM +0100, Richard Biener wrote:
> >>> I think the patch as-is is more robust, but still - ugh ... I wonder
> >>> whether we can instead avoid introducing the COMPLEX_EXPR at all
> >>> at -O0?
> >>
> >> Can we set DECL_NOT_GIMPLE_REG_P at -O0 during gimplification (where
> >> we've already handled some uses/setters of it), at least when
> >> gimplify_modify_expr_complex_part sees {REAL,IMAG}PART_EXPR on
> >> {VAR,PARM,RESULT}_DECL?
> >
> > Yes, that should work for LHS __real / __imag.
> 
>  Unfortunately it doesn't.
> 
>  Although successfully bootstrapped on x86_64-linux and i686-linux,
>  it caused g++.dg/cpp1z/decomp2.C, g++.dg/torture/pr109262.C and
>  g++.dg/torture/pr88149.C regressions.
> 
>  Minimal testcase is -O0:
>  void
>  foo (float x, float y)
>  {
> __complex__ float z = x + y * 1.0fi;
> __real__ z = 1.0f;
>  }
>  which ICEs with
>  pr88149.c: In function ‘foo’:
>  pr88149.c:2:1: error: non-register as LHS of binary operation
>   2 | foo (float x, float y)
> | ^~~
>  z = COMPLEX_EXPR <_2, y.0>;
>  pr88149.c:2:1: internal compiler error: ‘verify_gimple’ failed
>  When the initialization is being gimplified, z is still
>  not DECL_NOT_GIMPLE_REG_P and so is_gimple_reg is true for it and
>  so it gimplifies it as
> z = COMPLEX_EXPR <_2, y.0>;
>  later, instead of building
> _3 = IMAGPART_EXPR <z>;
> z = COMPLEX_EXPR <1.0e+0, _3>;
>  like before, the patch forces z to be not a gimple reg and uses
> REALPART_EXPR <z> = 1.0e+0;
>  but it is too late, nothing fixes up the gimplification of the
>  COMPLEX_EXPR
>  anymore.
> >>>
> >>> Ah, yeah - setting DECL_NOT_GIMPLE_REG_P "after the fact" doesn't work.
> >>>
>  So, I think we'd really need to do it the old way with adjusted naming
>  of the flag, so assume for all non-addressable
>  VAR_DECLs/PARM_DECLs/RESULT_DECLs with COMPLEX_TYPE if (!optimize) they
>  are DECL_NOT_GIMPLE_REG_P (perhaps with the exception of
>  get_internal_tmp_var), and at some point (what) if at all optimize that
>  away if the partial accesses aren't done.
> >>>
> >>> We could of course do that in is_gimple_reg (), but I'm not sure if
> >>> all places that would need to check do so.  Alternatively gimplify
> >>>
> >>> __real x = ..
> >>>
> >>> into
> >>>
> >>> tem[DECL_NOT_GIMPLE_REG_P] = x;
> >>> __real tem = ...;
> >>> x = tem;
> >>
> >> We can't do that, that again causes the undesirable copying of often
> >> uninitialized part(s).
> >>
> >>> when 'x' is a is_gimple_reg?  Of course for -O0 this would be quite bad.
> >>> Likewise for your idea - where would we do this optimization when not
> >>> optimizing?
> >>>
> >>> So it would need to be the frontend(s) setting DECL_NOT_GIMPLE_REG_P
> >>> when producing lvalue __real/__imag accesses?
> >>
> >> The following patch sets it in the FEs during genericization.
> >> I think Fortran doesn't have a way to modify just the real or just the
> >> imaginary part separately.
> >>
> >> In short, this patch is for code like
> >>_ComplexT __t;
> >>__real__ __t = __z.real();
> >>__imag__ __t = __z.imag();
> >>_M_value *= __t;
> >>return *this;
> >> at -O0 which used to appear widely even in libstdc++ before GCC 9
> >> and happens in real-world code.  At -O0 for debug info reasons (see
> >> PR119190) we don't want to aggressively DCE statements, and since
> >> r0-100845 we try to rewrite vars with COMPLEX_TYPE into SSA form
> >> aggressively, so the above results in copying of uninitialized data
> >> when expanding COMPLEX_EXPRs added so that the vars can be in SSA form.
> >> The patch detects during genericization the partial initialization and
> >> doesn't rewrite such vars to SSA at -O0.  This has to be done before
> >> gimplification starts, otherwise e.g. the attached testcase ICEs.
> >>
> >> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > LGTM, please leave frontend maintainers a chance to comment though.
> 
> No objection.
> 
> Though I notice that the documentation of DECL_NOT_GIMPLE_REG_P seems
> backwards?

Oops - I'll fix that up.

Richard.

> Jason
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH] c, c++: Support musttail attribute even using __attribute__ form [PR116545]

2025-03-13 Thread Jakub Jelinek
Hi!

Apparently some programs in the wild use
#if __has_attribute(musttail)
  __attribute__((musttail)) return foo ();
#else
  return foo ();
#endif
clang supports musttail both as a standard attribute ([[clang::musttail]]
which we also support for compatibility) and the above worked just
fine with GCC 14 which had __has_attribute(musttail) 0.  Now that it is
non-zero, this doesn't compile anymore.
So, either we need to ensure that __has_attribute(musttail) is 0
and just __has_c{,pp}_attribute({gnu,clang}::musttail) are non-zero,
or IMHO better we just make it work in the attribute form, especially for
C < C23 I can see why some projects would prefer that form.
Meanwhile, [[gnu::musttail]] is rejected as an error in C11 etc. before GCC 15,
rather than just being handled as an unknown attribute.
I view this as both a regression and compatibility issue.
The patch handles it in similar spots to fallthrough/assume attributes
inside of __attribute__.
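
A minimal sketch of the usage pattern in question, wrapped in a macro.
MUSTTAIL and sum_to are hypothetical names; the macro expands to the GNU
attribute form only when __has_attribute reports support, and to nothing
otherwise.

```cpp
#include <cassert>

// Hypothetical macro: GNU-style attribute when available, empty otherwise.
#if defined(__has_attribute)
# if __has_attribute(musttail)
#  define MUSTTAIL __attribute__((musttail))
# endif
#endif
#ifndef MUSTTAIL
# define MUSTTAIL
#endif

// Sums 1..n with an accumulator so the recursive call is in tail position
// and has the same signature as its caller.
long sum_to(long n, long acc)
{
  if (n == 0)
    return acc;
  MUSTTAIL return sum_to(n - 1, acc + n);
}

// Small usage check.
void demo_musttail()
{
  assert(sum_to(10, 0) == 55);
}
```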

While working on it, I've noticed we weren't diagnosing arguments to the
clang::musttail attribute (fixed by the c-attribs.cc hunk) and newly
on the __attribute__ form attribute (in that case the arguments aren't just
skipped, they are always parsed and because we don't call decl_attributes
etc., it wouldn't be diagnosed without a manual check).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2025-03-13  Jakub Jelinek  

PR c/116545
gcc/
* doc/extend.texi (musttail statement attribute): Document
that musttail GNU attribute can be used as well.
gcc/c-family/
* c-attribs.cc (c_common_clang_attributes): Add musttail.
gcc/c/
* c-parser.cc (c_parser_declaration_or_fndef): Parse
__attribute__((musttail)) return.
(c_parser_handle_musttail): Diagnose attribute arguments.
(c_parser_statement_after_labels): Parse
__attribute__((musttail)) return.
gcc/cp/
* parser.cc (cp_parser_expression_statement): Parse
__attribute__((musttail)) return.
gcc/testsuite/
* c-c++-common/musttail15.c: New test.
* c-c++-common/musttail16.c: New test.
* c-c++-common/musttail17.c: New test.
* c-c++-common/musttail18.c: New test.
* c-c++-common/musttail19.c: New test.
* c-c++-common/musttail20.c: New test.
* c-c++-common/musttail21.c: New test.
* c-c++-common/musttail22.c: New test.
* c-c++-common/musttail23.c: New test.
* c-c++-common/musttail24.c: New test.
* g++.dg/musttail7.C: New test.
* g++.dg/musttail8.C: New test.
* g++.dg/musttail12.C: New test.
* g++.dg/musttail13.C: New test.

--- gcc/doc/extend.texi.jj  2025-03-11 22:45:16.731350772 +0100
+++ gcc/doc/extend.texi 2025-03-13 11:48:29.285124840 +0100
@@ -10241,18 +10241,22 @@ have to optimize it to just @code{return
 @cindex @code{musttail} statement attribute
 @item musttail
 
-The @code{gnu::musttail} or @code{clang::musttail} attribute
-can be applied to a @code{return} statement with a return-value expression
-that is a function call.  It asserts that the call must be a tail call that
-does not allocate extra stack space, so it is safe to use tail recursion
-to implement long running loops.
+The @code{gnu::musttail} or @code{clang::musttail} standard attribute
+or @code{musttail} GNU attribute can be applied to a @code{return} statement
+with a return-value expression that is a function call.  It asserts that the
+call must be a tail call that does not allocate extra stack space, so it is
+safe to use tail recursion to implement long running loops.
 
 @smallexample
 [[gnu::musttail]] return foo();
 @end smallexample
 
+@smallexample
+__attribute__((musttail)) return bar();
+@end smallexample
+
 If the compiler cannot generate a @code{musttail} tail call it will report
-an error. On some targets tail calls may never be supported.
+an error.  On some targets tail calls may never be supported.
 Tail calls cannot reference locals in memory, which may affect
 builds without optimization when passing small structures, or passing
 or returning large structures.  Enabling @option{-O1} or @option{-O2} can
--- gcc/c-family/c-attribs.cc.jj  2025-03-08 00:07:01.862908135 +0100
+++ gcc/c-family/c-attribs.cc   2025-03-13 11:41:15.400098638 +0100
@@ -651,7 +651,9 @@ const struct scoped_attribute_specs c_co
 /* Attributes also recognized in the clang:: namespace.  */
 const struct attribute_spec c_common_clang_attributes[] = {
   { "flag_enum", 0, 0, false, true, false, false,
- handle_flag_enum_attribute, NULL }
+ handle_flag_enum_attribute, NULL },
+  { "musttail",  0, 0, false, false, false,
+ false, handle_musttail_attribute, NULL }
 };
 
 const struct scoped_attribute_specs c_common_clang_attribute_table =
--- gcc/c/c-parser.cc.jj  2025-03-13 00:41:44.018181529 +0100
+++ gcc/c/c-parser.cc   2025-03-13 11:43:23.5463

COBOL: testsuite and running NIST85 (was: Re: [PATCH][v3] Simple cobol.dg testsuite)

2025-03-13 Thread Simon Sobisch



On 13.03.2025 at 12:49, Richard Biener wrote:

On Thu, 13 Mar 2025, Sam James wrote:


Simon Sobisch  writes:


Thanks for your work on adding a testsuite. Can you please explain why
you do this when a complete testsuite exists in autoconf (autotest)
format (which goes back to a decade of work in GnuCOBOL, with all
copyrights for that already with the FSF)?



I don't think any of us were aware of it ("we" being "the general GCC
developer community", not the COBOL folks, for the purposes of this
email) until yesterday when richi mused about it on IRC maybe existing
and we went looking out of curiosity.

I agree that having that testsuite integrated would be fantastic.


Is the existence of this in upstream [1] just unknown (because it was
not part of the initial patches [for reasons I do not understand])?



I would've personally liked to see the NIST testsuite integration at
least in the initial patches, but it is what it is. I don't think the
GnuCOBOL testsuite was brought up at all (and I think most of us weren't
aware of it) in the patch upstreaming discussions.

Now that we *are* aware of it, it seems desirable to have for sure.


Is the format such a big issue (note: previous discussions elaborated
"a test suite is very important and other frontends also use a
framework other than dejagnu")?

If dejagnu is the way to go:

* Shouldn't there be deprecation of autotest in autoconf (of course
   only if that preference is also outside of gcc)?


It's a GCC / GNU toolchain-only preference because it allows easily
doing cross + simulator testing, and all of our tools are used to its
format.


That's indeed the main reason.


Thanks for the explanation. That's totally fine.




It's definitely not perfect. Years ago (way before I followed GCC),
there was talk of replacing dejagnu, but those efforts failed.



* Shouldn't there be a (at least semi automated) script / migration
   tool (at least for this specific time in place to convert the "UAT"
   once into dejagnu format)?


Yes. Having testsuite integration is seen as critical at this
point. richi just wanted to present this as a non-COBOL person to give
us something to play with.


Yes, and to give people familiar with how GCC tests are done a place
to put regression tests going forward.

I do think that integrating the testsuites the COBOLworx folks have
is important and of course integrating tests from GNU Cobol is desirable
as well.  Whether we can or want to integrate tests based on autotest
is another question - I'd probably avoid that, even as short-term
solution, as such tend to stay forever.


I agree. Note: COBOLworx started by using the GnuCOBOL testsuite; even 
with the current UAT's state it would be a lot of manual work to 
re-synchronize them, so going one step further to dejagnu seems to not 
make it much harder either.
It will definitely be useful if the "original test file names" (like 
run_subscripts.at, or at least run_subscripts) are kept somewhere - a 
comment like "auto-translated from run_subscripts.at" is enough - and 
maybe they can stay in one file each (I don't know enough about dejagnu 
to comment on that).


The main point is that it seems most reasonable to convert those files 
into dejagnu format once (so obviously a "script that works well enough, 
not installed" comes to mind), instead of writing them from scratch.



What would be nice is to have a common separate test harness you can
test an installed compiler against - I'm not sure whether the GNU
Cobol test harness or the COBOLworx one qualifies here.  The NIST
one probably does, but it seems to require "plumbing" that's not
part of NIST and that, in implementation, might differ from GNU Cobol
to COBOLworx.


That's a good opportunity to be picky: it is GnuCOBOL (one word, COBOL 
in upper-case) :-)


And yes: a common separate test harness is most reasonable and that's 
exactly what the idea of NIST was.
If you ever wonder: GnuCOBOL uses make (with one sub-directory per 
"Module") along with perl [2].
This allows not only testing (or just extraction of the files), 
along with counting and tracking time, but also automating some of the 
checks marked "needs manual inspection".


And given gcobc, I'd argue that gcobol should not fail the following 
(and ideally show its superior compile and run time):


$> tar -xvf gnucobol-3.*.tar.*
$> cd gnucobol-3.*/
$> ./configure  # for automake and autoconf doing the setup
$> cd tests/cobol85
$> make test COBC=gcobc-15

... just tried that:
gcobol: error: unrecognized command-line option ‘-std=cobol85’

--> seems like gcobc should drop that and set the right flags for 
gcobol here (I know, should be on bugzilla, or just fixed)



$> make test COBC=gcobc-15 COBC_FLAGS=--debug
Compiling EXEC85 program
warning: --debug implies -fstack-check: ignored
EXEC85.cob:1:73: error: syntax error, unexpected NAME, expecting FUNCTION or PROGRAM-ID
1 | 000100 IDENTIFICATION DIVISION.                                         EXEC84.2
  |                                                                         ^

cobol1: error: failed c

[pushed] testsuite: Add -fno-tree-sink to sve/pr96357.c

2025-03-13 Thread Richard Sandiford
gcc.target/aarch64/sve/pr96357.c started failing after
r15-518-g99b1daae18c095d6, which tweaked the heuristics
about when to sink code.  The testcase has:

  double i = d, j = 1.0 - f, k = j ? d : j;
  if (k == 1.0)
i = 0.0;
  *l = *n = *g = *h = i * 0.5;

where k == 1.0 is false if j is zero (since k is then also 0).
So we end up with a diamond whose condition is j != 0 && d == 1.
The else branch of the diamond is the only one that uses the result
of i = d, so after the patch, we sink the conversion to there.
And that seems like a reasonable thing to do.
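
The claimed simplification of the condition can be checked with a small
sketch.  The function names are hypothetical, d is an int as in the
testcase, and treating f as a double is an assumption made only for this
illustration.

```cpp
// Mirrors the testcase's computation of k and its "k == 1.0" test.
constexpr bool original_condition(int d, double f)
{
  double j = 1.0 - f;
  double k = j ? d : j;   // k is d when j is nonzero, otherwise 0.0
  return k == 1.0;
}

// The simplified diamond condition: j != 0 && d == 1.
constexpr bool simplified_condition(int d, double f)
{
  double j = 1.0 - f;
  return j != 0 && d == 1;
}
```

Both functions return the same result for any d and f: when j is zero,
k is zero and cannot equal 1.0, and when j is nonzero, k is just d.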

However, aarch64 doesn't yet allow int->double conversions to be
predicated, so ifcvt cannot handle the sunk form, meaning that we
can no longer vectorise.

The testcase is highly artificial and so shouldn't IMO be used
to tune the sinking heuristics.  Instead I think we should just
disable sinking for the test.  An alternative would be to add
-ffast-math, but I think that would interfere more with the
original intent.

Tested on aarch64-linux-gnu & pushed, but I'm happy to revisit
if others would prefer a different fix.

Richard


gcc/testsuite/
* gcc.target/aarch64/sve/pr96357.c: Add -fno-tree-sink.
---
 gcc/testsuite/gcc.target/aarch64/sve/pr96357.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr96357.c b/gcc/testsuite/gcc.target/aarch64/sve/pr96357.c
index 5d8fd8b53c3..9a7f912e529 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pr96357.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr96357.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-fpermissive -O3 -march=armv8.2-a+sve" } */
+/* { dg-options "-fpermissive -O3 -march=armv8.2-a+sve -fno-tree-sink" } */
 
 int d;
 
-- 
2.25.1



Re: [PATCH 1/6] [RFC] c: Add front-end hook for related vector type.

2025-03-13 Thread Richard Biener
On Thu, 13 Mar 2025, Tejas Belagod wrote:

> On 3/12/25 4:45 PM, Richard Biener wrote:
> > On Wed, 12 Mar 2025, Tejas Belagod wrote:
> > 
> >> On 3/10/25 7:21 PM, Richard Biener wrote:
> >>> On Sat, 8 Mar 2025, Tejas Belagod wrote:
> >>>
>  On 3/8/25 12:55 AM, Tejas Belagod wrote:
> > On 3/7/25 5:34 PM, Richard Biener wrote:
> >> On Fri, 7 Mar 2025, Tejas Belagod wrote:
> >>
> >>> On 3/7/25 4:38 PM, Richard Biener wrote:
>  On Fri, 7 Mar 2025, Tejas Belagod wrote:
> 
> > > Given a vector mode and its corresponding element mode, this new
> > > language hook returns a vector type that has the properties of the
> > > vector mode and the element type of the element mode.  For example, on
> > > AArch64, given VNx16BI and QImode it returns VNx16QI, i.e. the wider
> > > mode to BImode that is an SVE mode.
> 
>  What's the rationale for this to be a new frontend hook?  It seems
>  to be a composition of a target hook (related_mode) and a
>  frontend hook (type_for_mode).
> >>>
> >>> I don't know this part of the FE very well, so pardon me if this is the
> >>> wrong way to do it.
> >>>
> >>> I was trying to find a generic way to determine a wider vtype for a given
> >>> vmode in a language-agnostic way. It looks like lang hooks are the generic
> >>> way for the mid-end to ask the FE to determine the types the FE sees fit,
> >>> so I decided to make it a langhook.
> >>
> >> Who is supposed to call this hook and for what reason?  Why would the
> >> frontend be involved here?
> >>
> >
> > Ah, sorry, I should've mentioned. This hook is called in Patch 4/6 during
> > gimplification (gimplify_compound_lval) that implements the subscript
> > operator for svbool_t - this hook returns a 'container' type for an n-bit
> > boolean type, which in this case is a byte vector for the 1-bit svbool_t
> > vector. I involve the FE here on the same principle as e.g. TYPE_FOR_SIZE,
> > as the FE is best placed to return the right 'container' type as defined by
> > the language. The type returned by the FE is used to unpack svbool_t to its
> > container vector type to implement the subscript operator.
> >
> >>> And how can it survive without
>  a default implementation?
> 
> >>>
> >>> I saw a comment in langhooks-def.h that says:
> >>>
> >>> /* Types hooks.  There are no reasonable defaults for most of them,
> >>>       so we create a compile-time error instead.  */
> >>>
> >>> So I assumed it was OK to have a NULL default which presumably fails at
> >>> build time when a hook is not defined for a language. Is there a more
> >>> graceful way to remedy this?
> >>
> >> Well, you made the default a NULL pointer - what do you do at the use
> >> point of the hook when it's not defined?
> >>
> >
> > True. I saw some of the other type hooks had NULL, so AIUI, I imagined it
> > could be NULL and it would crash when used for a FE that didn't implement
> > it. I admit I'm not even sure if this is the right way to do this.
> >
> > So, before I embarked on a 'default' implementation (which I'm not fully
> > sure how to do) my main intention was to clarify (via this RFC) if the
> > langhook approach was indeed the best way for gimple to obtain the related
> > vtype it needed for the vbool type it was unpacking to do the subscript
> > operation on?
> >
> 
>  Thinking about this a bit more, I realize my mistake - I've made this a
>  langhook only for the purpose of gimplify to communicate to the FE to call
>  c_common_related_vtype_for_mode (). I think I need to go back to the
>  drawing board on this one - I'm not so convinced now that this is actually
>  serving a new langhook need.
> 
>  If this new 'hook' is just a wrapper for targetm.vectorize.related_mode ()
>  and type_for_mode () I can probably just call them directly during gimplify
>  or c-common.cc instead of inventing this new hook.
> >>>
> >>> That was my thinking.  The alternative is to (pre-)gimplify this either
> >>> during genericization or in the already existing gimplify_expr
> >>> langhook.  Or perform the "decay" as part of GENERIC building.
> >>>
> >>
> >> When I tried to decay svbool to char[] during genericization, it is quite
> >> straightforward for svbool[] reads, but for writes I need to introduce a
> >> temporary (because VEC_CONVERT IFN can't be an lvalue) and I don't know if
> >> Generic is too early to allow for introducing temporaries.
> > 
> > You might be doi

Re: [PATCH] Move COMP/XOR optimization from match.pd into reassoc [PR116860]

2025-03-13 Thread Konstantinos Eleftheriou
Hi, thanks for the feedback!

I have sent a new version, keeping the match.pd patterns, fixing the
formatting issues and changing std::set to hash_set
(https://gcc.gnu.org/pipermail/gcc-patches/2025-March/677526.html).

Konstantinos

On Mon, Mar 10, 2025 at 6:49 PM Andrew Pinski  wrote:
>
> On Mon, Mar 10, 2025 at 7:52 AM Konstantinos Eleftheriou
>  wrote:
> >
> > Testcases for patterns `((a ^ b) & c) cmp d | a != b -> (0 cmp d | a != b)`
> > and `(a ^ b) cmp c | a != b -> (0 cmp c | a != b)` were failing on some
> > targets, like PowerPC.
> >
> > This patch moves the optimization to reassoc. Doing so, we can now handle
> > cases where the related conditions appear in an AND expression too. Also,
> > we can optimize cases where we have intermediate expressions between the
> > related ones in the AND/OR expression on some targets. This is not handled
> > on targets like PowerPC, where each condition of the AND/OR expression
> > is placed into a different basic block.
> >
> > Bootstrapped/regtested on x86 and AArch64.
> >
> > PR tree-optimization/116860
> >
> > gcc/ChangeLog:
> >
> > * match.pd: Remove the following patterns:
> > ((a ^ b) & c) cmp d | a != b -> (0 cmp d | a != b)
> > (a ^ b) cmp c | a != b -> (0 cmp c | a != b)
>
> I suspect removing them from match is wrong.
>
> > * tree-ssa-reassoc.cc (INCLUDE_SET): Include  for std::set.
> > (INCLUDE_ALGORITHM): Include  for std::set_intersection.
> > (solve_expr): New function.
> > (find_terminal_nodes): New function.
> > (get_terminal_nodes): New function.
> > (optimize_cmp_xor_exprs): New function.
> > (optimize_range_tests): Call optimize_cmp_xor_exprs.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/tree-ssa/fold-xor-and-or.c: Renamed to fold-xor-and-or-1.c.
> > * gcc.dg/tree-ssa/fold-xor-and-or-1.c:
> > Add new test cases, remove logical-op-non-short-circuit=1.
> > * gcc.dg/tree-ssa/fold-xor-or.c: Likewise.
> > * gcc.dg/tree-ssa/fold-xor-and-or-2.c: New test.
> > ---
> >  gcc/match.pd  |  30 --
> >  ...{fold-xor-and-or.c => fold-xor-and-or-1.c} |  42 ++-
> >  .../gcc.dg/tree-ssa/fold-xor-and-or-2.c   |  55 +++
> >  gcc/testsuite/gcc.dg/tree-ssa/fold-xor-or.c   |  42 ++-
> >  gcc/tree-ssa-reassoc.cc   | 344 ++
> >  5 files changed, 465 insertions(+), 48 deletions(-)
> >  rename gcc/testsuite/gcc.dg/tree-ssa/{fold-xor-and-or.c => fold-xor-and-or-1.c} (50%)
> >  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-2.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 5c679848bdf..b78ee6eaf4c 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3871,36 +3871,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >  (if (types_match (type, TREE_TYPE (@0)))
> >   (bit_xor @0 { build_one_cst (type); } ))
> >
> > -/* ((a ^ b) & c) cmp d || a != b --> (0 cmp d || a != b). */
> > -(for cmp (simple_comparison)
> > -  (simplify
> > -(bit_ior
> > -  (cmp:c
> > -   (bit_and:c
> > - (bit_xor:c @0 @1)
> > - tree_expr_nonzero_p@2)
> > -   @3)
> > -  (ne@4 @0 @1))
> > -(bit_ior
> > -  (cmp
> > -   { build_zero_cst (TREE_TYPE (@0)); }
> > -   @3)
> > -  @4)))
> > -
> > -/* (a ^ b) cmp c || a != b --> (0 cmp c || a != b). */
> > -(for cmp (simple_comparison)
> > -  (simplify
> > -(bit_ior
> > -  (cmp:c
> > -   (bit_xor:c @0 @1)
> > -   @2)
> > -  (ne@3 @0 @1))
> > -(bit_ior
> > -  (cmp
> > -   { build_zero_cst (TREE_TYPE (@0)); }
> > -   @2)
> > -  @3)))
> > -
> >  /* We can't reassociate at all for saturating types.  */
> >  (if (!TYPE_SATURATING (type))
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or.c b/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-1.c
> > similarity index 50%
> > rename from gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or.c
> > rename to gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-1.c
> > index 99e83d8e5aa..23edf9f4342 100644
> > --- a/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or.c
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-1.c
> > @@ -1,55 +1,79 @@
> >  /* { dg-do compile } */
> > -/* { dg-options "-O3 -fdump-tree-optimized --param logical-op-non-short-circuit=1" } */
> > +/* { dg-options "-O3 -fdump-tree-optimized" } */
> >
> >  typedef unsigned long int uint64_t;
> >
> > -int cmp1(int d1, int d2) {
> > +int cmp1_or(int d1, int d2) {
> >if (((d1 ^ d2) & 0xabcd) == 0 || d1 != d2)
> >  return 0;
> >return 1;
> >  }
> >
> > -int cmp2(int d1, int d2) {
> > +int cmp2_or(int d1, int d2) {
> >if (d1 != d2 || ((d1 ^ d2) & 0xabcd) == 0)
> >  return 0;
> >return 1;
> >  }
> >
> > -int cmp3(int d1, int d2) {
> > +int cmp3_or(int d1, int d2) {
> >if (10 > (0xabcd & (d2 ^ d1)) || d2 != d1)
> >  return 0;
> >return 1;
> >  }
> >
> > -int cmp4(int d1, int d

Re: libatomic: use HWCAPs in AArch64 ifunc tests

2025-03-13 Thread Wilco Dijkstra
Hi Richard,

> Could you give details?  I thought it was always known that trapped
> system register accesses were slow.  In the previous versions, the
> checks seemed to be presented as an up-front price worth paying for
> faster atomic operations, on the systems that would use those paths.
> Now the checks are being presented as something that are good to remove
> to make the code simpler and faster.

The system register checks came from early versions (~2 years ago) when
there was no HWCAP defined yet, so Victor added them for testing.
The idea was to leave them for a while so you could get new atomics
on an older kernel, and remove them once newer kernels became available.

> There have been a few changes to this code in the current release cycle,
> and each time it seems like the new version is being presented as better
> than the previous one with single-sentence justifications.

I'm not sure which commits with single-sentence justifications you mean?
There has been only 1 commit in the current cycle after the rcpc3 code was
added, and that was a minor cleanup that also fixed a bug.

> Could we instead have a comment in the code discussing the various
> approaches that we could take, including the ones that previous versions
> took, describes the trade-offs, and explains why we've chosen to do what
> we've chosen to do?

This should be in the kernel documentation - the advice is: use HWCAPs
and avoid system register reads. Note I fixed libgcc/config/aarch64/cpuinfo.c
to remove all system register reads as well.
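
As a rough illustration of the HWCAP approach on Linux, a feature test
can go through getauxval instead of a system register read.  This is
only a sketch: the helper name hwcap_has is hypothetical, and the caller
is assumed to pass the relevant bit (for example HWCAP_ATOMICS from
<asm/hwcap.h> on AArch64).

```c
#include <sys/auxv.h>

/* Returns nonzero if every bit in MASK is set in AT_HWCAP.
   hwcap_has is a hypothetical helper name; MASK would be a HWCAP_*
   constant from <asm/hwcap.h> on the target in question.  */
static int
hwcap_has (unsigned long mask)
{
  unsigned long hwcap = getauxval (AT_HWCAP);
  return (hwcap & mask) == mask;
}
```

The kernel fills in the auxiliary vector at program start, so a lookup
like this does not involve a trapped register access.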

Cheers,
Wilco

Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-03-13 Thread Qing Zhao


> On Mar 12, 2025, at 12:40, Martin Uecker  wrote:
> 
> Am Mittwoch, dem 12.03.2025 um 16:20 + schrieb Qing Zhao:
>> 
>>> On Mar 10, 2025, at 15:34, Martin Uecker  wrote:
>>> 
>>> Am Montag, dem 10.03.2025 um 15:00 -0400 schrieb John McCall:
 
>>> 
>>> ...
>>> 
 That said, my preference is still to just give preference to the field name,
 which sidesteps any need for disambiguation syntax and avoids this whole
 problem where structs can be broken by just adding a global variable that
 happens to collide with a field.
>>> 
>>> I don't think it is a good idea when the 'n' in 'buf' refers to the
>>> previous global 'n' coming before and the 'n' in attribute 
>>> refers to a member 'n' coming later in the following example.
>>> 
>>> constexpr int n = 1;
>>> 
>>> struct foo {
>>> char *p [[gnu::counted_by(n)]];
>>> char buf[n];
>>> int n;
>>> };
>>> 
>>> How are you going to explain this to anyone?
>>> 
>>> 
>>> And neither global names nor struct members may always be under
>>> the control of the programmer.  Also that simply bringing
>>> a new identifier into scope can break code elsewhere worries me.
>>> 
>>> 
>>> Finally, the following does not even compile in C++.
>>> 
>>> struct foo {
>>> char buf[n];
>>> const static int n = 2;
>>> };
>>> 
>>> While the next example is also ok in C++.
>>> 
>>> constexpr int n = 2;
>>> 
>>> struct foo {
>>> char buf[n];
>>> };
>>> 
>>> With both declarations of 'n' the example has UB in C++. 
>>> So I am not convinced the proposed rules make a lot
>>> of sense for C++ either.
>>> 
>>> 
>>> Disambiguation with '__self__.'  completely avoids all these issues
>>> while keeping the door open for later improvements.  
>>> 
>>> I still think one could use designator syntax, i.e. '.n', which
>>> would be clearer and intuitive for both C and C++ programmers.
>> 
>> I think the major reason to use __self.n instead of .n is:
>> 
>> The dot (.) operator, i.e., the member access operator in C, is used to
>> access the member of an _instance_ of a structure/union.
>> We should declare a variable with a structure type first, then append
>> this member access operator to this variable, followed by the member name,
>> to access the member, and then use it in the expressions.
> 
> For a designator
> 
> struct foo { int n; } a = { .n = 1 };
> 
> we also refer to a member 'n' of an instance 'a' of a structure type.
> The instance is simply implied by the context.
> 
> For 
> 
> struct foo { int n; char *x __counted_by(.n) };
> 
> it also refers to a member of an instance of the struct. The
> instance is the 'a' which is later used in an expression 'a.x'
> So the instance would again be implied by the context.
> 
> So for me this makes perfect sense in both cases (and
> for both C and C++)

Why does ‘.n' also make sense in C++?

Qing
> 
>> 
>> To me, this is clearer. But I am okay with the designator syntax.
> 
> I am also okay with __self__ if people have concerns about
> reusing the designator syntax.  We could still always drop the
> requirement for writing __self__  later. 
> 
> Martin
> 
>> 
>> Qing
>> 
>>> 
>>> 
>>> Martin
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 
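
The scoping concern raised in the quoted discussion can be seen in
today's C, where an array bound inside a struct resolves to the
file-scope name even if a member of the same name is declared later.
A minimal sketch, using an enum constant in place of the constexpr from
the quoted example so that it also compiles before C23; the helper names
are hypothetical:

```c
#include <stddef.h>

enum { n = 4 };            /* file-scope 'n' */

struct foo {
  char buf[n];             /* bound uses the file-scope 'n' */
  int n;                   /* member 'n' lives in the member name space */
};

/* Hypothetical helper: number of elements in foo.buf.  */
static size_t
foo_buf_len (void)
{
  return sizeof (((struct foo *) 0)->buf);
}

/* Designator syntax names the member, with the instance implied.  */
static int
foo_member_n (void)
{
  struct foo a = { .n = 7 };
  return a.n;
}
```

Here the bound of buf comes from the enum constant, while '.n' in the
initializer can only mean the member.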



Re: [PATCH] c, c++: Set DECL_NOT_GIMPLE_REG_P on *PART_EXPR operand on lhs of MODIFY_EXPR [PR119120]

2025-03-13 Thread Jakub Jelinek
On Thu, Mar 13, 2025 at 10:00:48AM -0400, Jason Merrill wrote:
> On 3/13/25 3:54 AM, Richard Biener wrote:
> > On Thu, 13 Mar 2025, Jakub Jelinek wrote:
> > 
> > > Hi!
> > > 
> > > On Wed, Mar 12, 2025 at 02:01:14PM +0100, Richard Biener wrote:
> > > > On Wed, 12 Mar 2025, Jakub Jelinek wrote:
> > > > 
> > > > > On Tue, Mar 11, 2025 at 12:13:13PM +0100, Richard Biener wrote:
> > > > > > On Tue, 11 Mar 2025, Jakub Jelinek wrote:
> > > > > > 
> > > > > > > On Tue, Mar 11, 2025 at 10:18:18AM +0100, Richard Biener wrote:
> > > > > > > > I think the patch as-is is more robust, but still - ugh ... I wonder
> > > > > > > > whether we can instead avoid introducing the COMPLEX_EXPR at all
> > > > > > > > at -O0?
> > > > > > > 
> > > > > > > Can we set DECL_NOT_GIMPLE_REG_P at -O0 during gimplification (where
> > > > > > > we've already handled some uses/setters of it), at least when
> > > > > > > gimplify_modify_expr_complex_part sees {REAL,IMAG}PART_EXPR on
> > > > > > > {VAR,PARM,RESULT}_DECL?
> > > > > > 
> > > > > > Yes, that should work for LHS __real / __imag.
> > > > > 
> > > > > Unfortunately it doesn't.
> > > > > 
> > > > > Although successfully bootstrapped on x86_64-linux and i686-linux,
> > > > > it caused g++.dg/cpp1z/decomp2.C, g++.dg/torture/pr109262.C and
> > > > > g++.dg/torture/pr88149.C regressions.
> > > > > 
> > > > > Minimal testcase is -O0:
> > > > > void
> > > > > foo (float x, float y)
> > > > > {
> > > > >__complex__ float z = x + y * 1.0fi;
> > > > >__real__ z = 1.0f;
> > > > > }
> > > > > which ICEs with
> > > > > pr88149.c: In function ‘foo’:
> > > > > pr88149.c:2:1: error: non-register as LHS of binary operation
> > > > >  2 | foo (float x, float y)
> > > > >| ^~~
> > > > > z = COMPLEX_EXPR <_2, y.0>;
> > > > > pr88149.c:2:1: internal compiler error: ‘verify_gimple’ failed
> > > > > When the initialization is being gimplified, z is still
> > > > > not DECL_NOT_GIMPLE_REG_P and so is_gimple_reg is true for it and
> > > > > so it gimplifies it as
> > > > >z = COMPLEX_EXPR <_2, y.0>;
> > > > > later, instead of building
> > > > >_3 = IMAGPART_EXPR ;
> > > > >z = COMPLEX_EXPR <1.0e+0, _3>;
> > > > > like before, the patch forces z to be not a gimple reg and uses
> > > > >REALPART_EXPR  = 1.0e+0;
> > > > > but it is too late, nothing fixes up the gimplification of the COMPLEX_EXPR
> > > > > anymore.
> > > > 
> > > > Ah, yeah - setting DECL_NOT_GIMPLE_REG_P "after the fact" doesn't work.
> > > > 
> > > > > So, I think we'd really need to do it the old way with adjusted naming
> > > > > of the flag, so assume for all non-addressable
> > > > > VAR_DECLs/PARM_DECLs/RESULT_DECLs with COMPLEX_TYPE if (!optimize) they
> > > > > are DECL_NOT_GIMPLE_REG_P (perhaps with the exception of
> > > > > get_internal_tmp_var), and at some point (what) if at all optimize that
> > > > > away if the partial accesses aren't done.
> > > > 
> > > > We could of course do that in is_gimple_reg (), but I'm not sure if
> > > > all places that would need to check do so.  Alternatively gimplify
> > > > 
> > > > __real x = ..
> > > > 
> > > > into
> > > > 
> > > > tem[DECL_NOT_GIMPLE_REG_P] = x;
> > > > __real tem = ...;
> > > > x = tem;
> > > 
> > > We can't do that, that again causes the undesirable copying of often
> > > uninitialized part(s).
> > > 
> > > > when 'x' is a is_gimple_reg?  Of course for -O0 this would be quite bad.
> > > > Likewise for your idea - where would we do this optimization when not
> > > > optimizing?
> > > > 
> > > > So it would need to be the frontend(s) setting DECL_NOT_GIMPLE_REG_P
> > > > when producing lvalue __real/__imag accesses?
> > > 
> > > The following patch sets it in the FEs during genericization.
> > > > I think Fortran doesn't have a way to modify just the real or just the
> > > > imaginary part separately.
> > > 
> > > In short, this patch is for code like
> > > _ComplexT __t;
> > > __real__ __t = __z.real();
> > > __imag__ __t = __z.imag();
> > > _M_value *= __t;
> > > return *this;
> > > at -O0 which used to appear widely even in libstdc++ before GCC 9
> > > and happens in real-world code.  At -O0 for debug info reasons (see
> > > PR119190) we don't want to aggressively DCE statements and when we
> > > since r0-100845 try to rewrite vars with COMPLEX_TYPE into SSA form
> > > aggressively, the above results in copying of uninitialized data
> > > when expanding COMPLEX_EXPRs added so that the vars can be in SSA form.
> > > The patch detects during genericization the partial initialization and
> > > doesn't rewrite such vars to SSA at -O0.  This has to be done before
> > > gimplification starts, otherwise e.g. the attached testcase ICEs.
> > > 
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > LGTM, please leave frontend maintainers a chance to comment though.
> 
> No objection.
> 
> Though I notice that the documentation of DECL_NOT_GIMP

[pushed] libstdc++: Add P1206R7 from_range members to container adaptors [PR111055]

2025-03-13 Thread tkaminsk
From: Tomasz Kamiński 

This is another piece of P1206R7, adding new members to std::stack,
std::queue, and std::priority_queue.

PR libstdc++/111055

libstdc++-v3/ChangeLog:

* include/bits/stl_queue.h (queue(from_range_t, _Rg&&))
(queue(from_range_t, _Rg&&, const _Alloc&), push_range):
Define.
(priority_queue(from_range_t, R&&, const Compare&))
(push_range): Define.
* include/bits/stl_stack.h (stack(from_range_t, R&&))
(stack(from_range_t, R&&, const Alloc&), push_range): Define.
* testsuite/util/testsuite_iterators.h (test_range_nocopy): Define.
* testsuite/23_containers/priority_queue/cons_from_range.cc: New test.
* testsuite/23_containers/priority_queue/members/push_range.cc: New 
test.
* testsuite/23_containers/queue/cons_from_range.cc: New test.
* testsuite/23_containers/queue/members/push_range.cc: New test.
* testsuite/23_containers/stack/cons_from_range.cc: New test.
* testsuite/23_containers/stack/members/push_range.cc: New test.
---
Pushed to trunk.
Approved by Jonathan Wakely on the Sourceware forge:
https://forge.sourceware.org/gcc/gcc-TEST/pulls/43#issuecomment-780.

 libstdc++-v3/include/bits/stl_queue.h | 102 
 libstdc++-v3/include/bits/stl_stack.h |  46 
 .../priority_queue/cons_from_range.cc | 111 ++
 .../priority_queue/members/push_range.cc  |  86 ++
 .../23_containers/queue/cons_from_range.cc|  88 ++
 .../23_containers/queue/members/push_range.cc |  73 
 .../23_containers/stack/cons_from_range.cc|  89 ++
 .../23_containers/stack/members/push_range.cc |  74 
 .../testsuite/util/testsuite_iterators.h  |  11 ++
 9 files changed, 680 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/23_containers/priority_queue/cons_from_range.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/priority_queue/members/push_range.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/queue/cons_from_range.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/queue/members/push_range.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/stack/cons_from_range.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/stack/members/push_range.cc

diff --git a/libstdc++-v3/include/bits/stl_queue.h b/libstdc++-v3/include/bits/stl_queue.h
index 627d5e4e63b..2a4b62918a0 100644
--- a/libstdc++-v3/include/bits/stl_queue.h
+++ b/libstdc++-v3/include/bits/stl_queue.h
@@ -61,6 +61,10 @@
 #if __cplusplus >= 201103L
 # include 
 #endif
+#if __glibcxx_ranges_to_container // C++ >= 23
+# include  // ranges::to
+# include  // ranges::copy
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -209,6 +213,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
: c(__first, __last, __a) { }
 #endif
 
+#if __glibcxx_ranges_to_container // C++ >= 23
+  /**
+   * @brief Construct a queue from a range.
+   * @since C++23
+   */
+  template<__detail::__container_compatible_range<_Tp> _Rg>
+   queue(from_range_t, _Rg&& __rg)
+   : c(ranges::to<_Sequence>(std::forward<_Rg>(__rg)))
+   { }
+
+  /**
+   * @brief Construct a queue from a range.
+   * @since C++23
+   */
+  template<__detail::__container_compatible_range<_Tp> _Rg,
+  typename _Alloc>
+   queue(from_range_t, _Rg&& __rg, const _Alloc& __a)
+   : c(ranges::to<_Sequence>(std::forward<_Rg>(__rg), __a))
+   { }
+#endif
+
   /**
*  Returns true if the %queue is empty.
*/
@@ -301,6 +326,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 #endif
 
+#if __glibcxx_ranges_to_container // C++ >= 23
+  template<__detail::__container_compatible_range<_Tp> _Rg>
+   void
+   push_range(_Rg&& __rg)
+   {
+ if constexpr (requires { c.append_range(std::forward<_Rg>(__rg)); })
+   c.append_range(std::forward<_Rg>(__rg));
+ else
+   ranges::copy(__rg, std::back_inserter(c));
+   }
+#endif
+
   /**
*  @brief  Removes first element.
*
@@ -359,6 +396,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 queue(_InputIterator, _InputIterator, _Allocator)
 -> queue<_ValT, deque<_ValT, _Allocator>>;
 #endif
+
+#if __glibcxx_ranges_to_container // C++ >= 23
+  template
+queue(from_range_t, _Rg&&) -> queue>;
+
+  template
+queue(from_range_t, _Rg&&, _Alloc)
+-> queue,
+deque, _Alloc>>;
+#endif
 #endif
 
   /**
@@ -719,6 +766,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 #endif
 
+#if __glibcxx_ranges_to_container // C++ >= 23
+  /**
+   * @brief Construct a priority_queue from a range.
+   * @since C++23
+   *
+   * @{
+   */
+  template<__detail::__container_compatible_range<_Tp> _Rg>
+   priority_queue(from_range_t, _Rg&& __rg,
+  const _Compare& __x = _Compare())
+   : c(ranges::

Re: libatomic: use HWCAPs in AArch64 ifunc tests

2025-03-13 Thread Richard Sandiford
Wilco Dijkstra  writes:
> Hi Richard,
>
>> Could you give details?  I thought it was always known that trapped
>> system register accesses were slow.  In the previous versions, the
>> checks seemed to be presented as an up-front price worth paying for
>> faster atomic operations, on the systems that would use those paths.
>> Now the checks are being presented as something that are good to remove
>> to make the code simpler and faster.
>
> The system register checks came from early versions (~2 years ago) when
> there was no HWCAP defined yet, so Victor added them for testing.
> The idea was to leave them for a while so you could get new atomics
> on an older kernel, and remove them once newer kernels became available.

Hmm, ok.  But the RCPC3 one was added in June last year (so part of the
current GCC 15 release cycle), and then changed to a different form by
your patch in January this year.  Now the proposal (two months later)
is to remove it altogether.

But perhaps the most constructive thing is for me to bow out of this
one and leave the decision to other maintainers.

Thanks,
Richard


Re: [PATCH] c, c++: Set DECL_NOT_GIMPLE_REG_P on *PART_EXPR operand on lhs of MODIFY_EXPR [PR119120]

2025-03-13 Thread Jason Merrill

On 3/13/25 3:54 AM, Richard Biener wrote:

On Thu, 13 Mar 2025, Jakub Jelinek wrote:


Hi!

On Wed, Mar 12, 2025 at 02:01:14PM +0100, Richard Biener wrote:

On Wed, 12 Mar 2025, Jakub Jelinek wrote:


On Tue, Mar 11, 2025 at 12:13:13PM +0100, Richard Biener wrote:

On Tue, 11 Mar 2025, Jakub Jelinek wrote:


On Tue, Mar 11, 2025 at 10:18:18AM +0100, Richard Biener wrote:

I think the patch as-is is more robust, but still - ugh ... I wonder
whether we can instead avoid introducing the COMPLEX_EXPR at all
at -O0?


Can we set DECL_NOT_GIMPLE_REG_P at -O0 during gimplification (where
we've already handled some uses/setters of it), at least when
gimplify_modify_expr_complex_part sees {REAL,IMAG}PART_EXPR on
{VAR,PARM,RESULT}_DECL?


Yes, that should work for LHS __real / __imag.


Unfortunately it doesn't.

Although successfully bootstrapped on x86_64-linux and i686-linux,
it caused g++.dg/cpp1z/decomp2.C, g++.dg/torture/pr109262.C and
g++.dg/torture/pr88149.C regressions.

Minimal testcase is -O0:
void
foo (float x, float y)
{
   __complex__ float z = x + y * 1.0fi;
   __real__ z = 1.0f;
}
which ICEs with
pr88149.c: In function ‘foo’:
pr88149.c:2:1: error: non-register as LHS of binary operation
 2 | foo (float x, float y)
   | ^~~
z = COMPLEX_EXPR <_2, y.0>;
pr88149.c:2:1: internal compiler error: ‘verify_gimple’ failed
When the initialization is being gimplified, z is still
not DECL_NOT_GIMPLE_REG_P and so is_gimple_reg is true for it and
so it gimplifies it as
   z = COMPLEX_EXPR <_2, y.0>;
later, instead of building
   _3 = IMAGPART_EXPR ;
   z = COMPLEX_EXPR <1.0e+0, _3>;
like before, the patch forces z to be not a gimple reg and uses
   REALPART_EXPR  = 1.0e+0;
but it is too late, nothing fixes up the gimplification of the COMPLEX_EXPR
anymore.


Ah, yeah - setting DECL_NOT_GIMPLE_REG_P "after the fact" doesn't work.


So, I think we'd really need to do it the old way with adjusted naming
of the flag, so assume for all non-addressable
VAR_DECLs/PARM_DECLs/RESULT_DECLs with COMPLEX_TYPE if (!optimize) they
are DECL_NOT_GIMPLE_REG_P (perhaps with the exception of
get_internal_tmp_var), and at some point (what) if at all optimize that
away if the partial accesses aren't done.


We could of course do that in is_gimple_reg (), but I'm not sure if
all places that would need to check do so.  Alternatively gimplify

__real x = ..

into

tem[DECL_NOT_GIMPLE_REG_P] = x;
__real tem = ...;
x = tem;


We can't do that, that again causes the undesirable copying of often
uninitialized part(s).


when 'x' is a is_gimple_reg?  Of course for -O0 this would be quite bad.
Likewise for your idea - where would we do this optimization when not
optimizing?

So it would need to be the frontend(s) setting DECL_NOT_GIMPLE_REG_P
when producing lvalue __real/__imag accesses?


The following patch sets it in the FEs during genericization.
I think Fortran doesn't have a way to modify just the real or just the
imaginary part separately.

In short, this patch is for code like
  _ComplexT __t;
  __real__ __t = __z.real();
  __imag__ __t = __z.imag();
  _M_value *= __t;
  return *this;
at -O0 which used to appear widely even in libstdc++ before GCC 9
and happens in real-world code.  At -O0 for debug info reasons (see
PR119190) we don't want to aggressively DCE statements and when we
since r0-100845 try to rewrite vars with COMPLEX_TYPE into SSA form
aggressively, the above results in copying of uninitialized data
when expanding COMPLEX_EXPRs added so that the vars can be in SSA form.
The patch detects during genericization the partial initialization and
doesn't rewrite such vars to SSA at -O0.  This has to be done before
gimplification starts, otherwise e.g. the attached testcase ICEs.
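
For reference, the source-level idiom under discussion looks like this
in GNU C.  This is only a sketch of the pattern (the function name is
made up here); it does not reproduce the ICE or the patch itself.

```c
/* Builds a complex value by writing the real and imaginary parts
   separately, using the GNU __real__/__imag__ extensions.  Until the
   second store, T is only partially initialized.  */
static _Complex float
make_complex (float re, float im)
{
  _Complex float t;
  __real__ t = re;
  __imag__ t = im;
  return t;
}
```

Each of the two stores is a partial definition of t, which is the case
the thread wants kept out of SSA form at -O0.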

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


LGTM, please leave frontend maintainers a chance to comment though.


No objection.

Though I notice that the documentation of DECL_NOT_GIMPLE_REG_P seems
backwards?


Jason



[PATCH] Fixup DECL_NOT_GIMPLE_REG_P description

2025-03-13 Thread Richard Biener
When I changed DECL_GIMPLE_REG_P over to DECL_NOT_GIMPLE_REG_P I
failed to update its description.

Pushed to trunk.

* tree.h (DECL_NOT_GIMPLE_REG_P): Update description.
---
 gcc/tree.h | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/tree.h b/gcc/tree.h
index 21f3cd5525c..6f45359f103 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -2999,12 +2999,11 @@ extern tree vector_element_bits_tree (const_tree);
   (DECL_P (DECL)   \
&& (lookup_attribute ("persistent", DECL_ATTRIBUTES (DECL)) != NULL_TREE))
 
-/* For function local variables of COMPLEX and VECTOR types,
-   indicates that the variable is not aliased, and that all
-   modifications to the variable have been adjusted so that
-   they are killing assignments.  Thus the variable may now
-   be treated as a GIMPLE register, and use real instead of
-   virtual ops in SSA form.  */
+/* For function local variables indicates that the variable
+   should not be treated as a GIMPLE register.  In particular
+   this means that partial definitions can appear and the
+   variable cannot be written into SSA form and instead uses
+   virtual operands to represent the use-def dataflow.  */
 #define DECL_NOT_GIMPLE_REG_P(DECL) \
   DECL_COMMON_CHECK (DECL)->decl_common.not_gimple_reg_flag
 
-- 
2.43.0


[PATCH][RFC] add -[DU]_FORTIFY_SOURCE[=n] to DW_AT_producer

2025-03-13 Thread Richard Biener
The following makes sure to record -D_FORTIFY_SOURCE=n and
-U_FORTIFY_SOURCE in the DW_AT_producer debuginfo attribute when
present on the compiler command line.
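
The check behind this is a simple prefix match on the macro argument.
A standalone sketch of that test, outside of GCC's option machinery
(the function name is illustrative only):

```c
#include <string.h>

/* Returns nonzero if ARG, the text following -D or -U, starts with
   _FORTIFY_SOURCE.  Being a prefix match, it accepts both
   "_FORTIFY_SOURCE" and "_FORTIFY_SOURCE=2".  */
static int
is_fortify_source_arg (const char *arg)
{
  static const char prefix[] = "_FORTIFY_SOURCE";
  return strncmp (arg, prefix, sizeof prefix - 1) == 0;
}
```

Note that a pure prefix match would also accept a longer macro name
such as _FORTIFY_SOURCE_FOO.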

We seem to want this internally now, not sure whether others also
need this (I'm happily carrying this downstream).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

* opts.cc (gen_producer_string): Record -D and -U
with _FORTIFY_SOURCE prefix.
---
 gcc/opts.cc | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/opts.cc b/gcc/opts.cc
index 4eda7ea49d0..7ed0563a651 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -3823,9 +3823,7 @@ gen_command_line_string (cl_decoded_option *options,
   case OPT_v:
   case OPT_w:
   case OPT_L:
-  case OPT_D:
   case OPT_I:
-  case OPT_U:
   case OPT_SPECIAL_unknown:
   case OPT_SPECIAL_ignore:
   case OPT_SPECIAL_warn_removed:
@@ -3861,6 +3859,16 @@ gen_command_line_string (cl_decoded_option *options,
   case OPT_fchecking_:
/* Ignore these.  */
continue;
+  case OPT_D:
+  case OPT_U:
+   if (strncmp (options[i].arg, "_FORTIFY_SOURCE",
+strlen ("_FORTIFY_SOURCE")) == 0)
+ {
+   switches.safe_push (options[i].orig_option_with_args_text);
+   len += strlen (options[i].orig_option_with_args_text) + 1;
+ }
+   /* Otherwise ignore these. */
+   continue;
   case OPT_flto_:
{
  const char *lto_canonical = "-flto";
-- 
2.43.0


[PATCH+wwwdocs] Add link to the algo...@gcc.gnu.org mailing list

2025-03-13 Thread Jose E. Marchesi
Hello people!

This patch adds a link to the Algol 68 front-end development list to
lists.html.  OK?

Thanks!

---
 htdocs/lists.html | 5 +
 1 file changed, 5 insertions(+)

diff --git a/htdocs/lists.html b/htdocs/lists.html
index 03e4a2a2..d5f1d3c8 100644
--- a/htdocs/lists.html
+++ b/htdocs/lists.html
@@ -94,6 +94,11 @@ before subscribing and posting to 
these lists.
   Patches for the jit branch should go to both this list and
   gcc-patches.
 
+  https://gcc.gnu.org/ml/algol68/";>algol68 is
+  the discussion and development list for the Algol 68 language front
+  end of GCC, and the corresponding runtime library.  Patches to ga68
+  and libga68 should go to this list.
+
   https://gcc.gnu.org/ml/gnutools-advocacy/";>gnutools-advocacy
   is for discussion of marketing, promotion, recruiting and advocacy for
   the entire GNU Toolchain (Binutils, GAS, GCC, GDB, GLIBC, GLD, and 
Gold).
-- 
2.30.2



Re: [PATCH][RFC] add -[DU]_FORTIFY_SOURCE[=n] to DW_AT_producer

2025-03-13 Thread Jakub Jelinek
On Thu, Mar 13, 2025 at 03:44:21PM +0100, Richard Biener wrote:
> +  case OPT_D:
> +  case OPT_U:
> + if (strncmp (options[i].arg, "_FORTIFY_SOURCE",
> +  strlen ("_FORTIFY_SOURCE")) == 0)

I'd say you want to verify that after that substring there is either
'\0' or "=".
Otherwise you'll record -D_FORTIFY_SOURCE_NOT_REALLY=1 which doesn't
matter at all.

> +   {
> + switches.safe_push (options[i].orig_option_with_args_text);
> + len += strlen (options[i].orig_option_with_args_text) + 1;
> +   }
> + /* Otherwise ignore these. */
> + continue;
>case OPT_flto_:
>   {
> const char *lto_canonical = "-flto";

Otherwise LGTM.

Jakub
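
For illustration only, here is a minimal sketch of the check suggested above. It assumes C++17, and the helper name is_fortify_source_arg is hypothetical (it does not appear in the patch). The macro name is accepted only when it is followed by the end of the string or by '='.

```cpp
#include <string_view>

// Hypothetical helper, not part of the patch.  ARG is the text that
// follows -D or -U on the command line.
constexpr bool
is_fortify_source_arg (std::string_view arg)
{
  constexpr std::string_view name = "_FORTIFY_SOURCE";
  // The argument must start with the macro name ...
  if (arg.substr (0, name.size ()) != name)
    return false;
  // ... and the name must end there or be followed by a value.
  return arg.size () == name.size () || arg[name.size ()] == '=';
}

static_assert (is_fortify_source_arg ("_FORTIFY_SOURCE"), "bare name");
static_assert (is_fortify_source_arg ("_FORTIFY_SOURCE=2"), "with value");
static_assert (!is_fortify_source_arg ("_FORTIFY_SOURCE_NOT_REALLY=1"),
	       "longer macro name must be rejected");
```

In opts.cc the same test could presumably be written with the existing strncmp call plus a look at the character at offset strlen ("_FORTIFY_SOURCE") in options[i].arg.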



Re: [PATCH][_Hashtable] Fix hash code cache usage

2025-03-13 Thread Florian Weimer
* Jonathan Wakely:

> On Thu, 13 Mar 2025 at 09:24, Florian Weimer  wrote:
>>
>> * Jonathan Wakely:
>>
>> > On Thu, 13 Mar 2025 at 06:50, Florian Weimer  wrote:
>> >>
>> >> * François Dumont:
>> >>
>> >> > +  // Get hash code for a node that comes from another _Hashtable.
>> >> > +  // Reuse a cached hash code if the hash function is stateless,
>> >> > +  // otherwise recalculate it using our own hash function.
>> >> > +  __hash_code
>> >> > +  _M_hash_code_ext(const __node_value_type& __from) const
>> >> > +  {
>> >> > + if constexpr (__and_<__hash_cached, is_empty<_Hash>>::value)
>> >> > +   return __from._M_hash_code;
>> >> > + else
>> >> > +   return this->_M_hash_code(_ExtractKey{}(__from._M_v()));
>> >> > +  }
>> >>
>> >> Does C++ support stateful hash functions?  I don't think so, and I don't
>> >> see it documented as a GNU extension, either.
>> >
>> > It does, yes. That's why the hash function isn't required to be
>> > default constructible, and has to be stored in the container and why
>> > doing swap on two containers has to swap the hash functions as well.
>>
>> Interesting.  I have trouble reconciling this with the Cpp17Hash
>> requirement that “The value” h(k) “shall depend only on the argument k
>> for the duration of the program.”
>
>
> That's for a given value of the type, h. For any two values of the
> type h and h2, it's not required that h2(k) == h(k).

But the wording for the comp object for ordered containers is very
different and makes it clear that the instance matters.

I still think this is more confusing than necessary.  If there isn't
some general rule that empty objects can be considered stateless, that
should be added somewhere, too. 8-)

Thanks,
Florian
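
To make the distinction under discussion concrete, a small sketch with two hypothetical hash types (neither is taken from libstdc++). For one given seeded_hash object the result depends only on the key, but two objects with different seeds disagree, so a hash code cached by one container cannot be reused by another. The patch treats an empty hash type such as plain_hash as stateless; whether the standard guarantees that is the open question raised above.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stateful hash: the result depends on the per-object seed.
struct seeded_hash
{
  std::size_t seed;

  constexpr std::size_t
  operator() (unsigned key) const
  { return (key ^ seed) * 0x9e3779b9u; }
};

// Hypothetical stateless hash: an empty class with no per-object data.
struct plain_hash
{
  constexpr std::size_t
  operator() (unsigned key) const
  { return key * 0x9e3779b9u; }
};

// Same object state, same key: same hash code.
static_assert (seeded_hash{1} (42) == seeded_hash{1} (42), "h(k) is stable");
// Different object state, same key: the codes differ.
static_assert (seeded_hash{1} (42) != seeded_hash{2} (42), "h2(k) != h(k)");
// This is the property that the is_empty<_Hash> test in the patch looks at.
static_assert (!std::is_empty<seeded_hash>::value, "stateful");
static_assert (std::is_empty<plain_hash>::value, "treated as stateless");
```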



Re: [PATCH v2 0/2] Two match.pd folds for sve/pr98119.c

2025-03-13 Thread Richard Biener
On Thu, 13 Mar 2025, Richard Sandiford wrote:

> Updates in v2:
> 
> - Make one of the pointer-arith-11.c scans require LP64, since
>   the constant is printed differently for ILP32, and would be
>   printed differently still for 16-bit targets.
> 
> - Avoid use of element_precision in scalar-only folds
> 
> - Use tree_int_cst_sgn instead of wi::to_widest.
> 
> Original message:
> 
> gcc.target/aarch64/sve/pr98119.c has been failing since r15-268.
> The test expects pointer alignment to be done with a plain AND,
> but that no longer happens.
> 
> It was combine that previously generated the ANDs, but I think it's
> fair to argue that it isn't combine's job to handle this case.
> The test's gimple output is pretty suboptimal.
> 
> The current gimple is:
> 
>   _21 = (unsigned long) vectp_x.4_22;
>   _20 = _21 >> 1;
>   _17 = (unsigned int) _20;
>   _16 = _17 & 15;
>   _9 = (sizetype) _16;
>   _18 = (unsigned int) _9;
>   _32 = _18 w* 2;
>   _33 = -_32;// -(_21 & 30)
>   vectp_x.6_10 = x_13(D) + _33;  // x_31(D) & ~30
>   _41 = 18446744073709551584 - _32;  // -32 - (_21 & 30)
>   vectp_x.9_38 = x_13(D) + _41;  // (x_31(D) - 32) & ~30
>   _55 = _16 + 1000;
> 
> This series adds two groups of folds that together give:
> 
>   _21 = (unsigned long) vectp_x.4_22;
>   _20 = _21 >> 1;
>   _17 = (unsigned int) _20;
>   _16 = _17 & 15;
>   vectp_x.6_10 = x_13(D) & -31B;
>   _18 = x_13(D) & -31B;
>   vectp_x.9_38 = _18 + 18446744073709551584;
>   _55 = _16 + 1000;
> 
> The duplicate "x_13(D) & -31B"s are unfortunate, but RTL CSE does get
> rid of them.
> 
> I asked on IRC whether new folds to fix this kind of regression were
> acceptable, but I admit that the patches ended up being bigger than
> I'd imagined.  I'll fully understand if they seem like too much for
> stage 4 after all.
> 
> Bootstrapped & regression-tested on aarch64-linux-gnu.  Also tested
> in its original form on x86_64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.

> Richard
> 
> 
> Richard Sandiford (2):
>   match.pd: Fold ((X >> C1) & C2) * (1 << C1)
>   match.pd: Extend pointer alignment folds
> 
>  gcc/match.pd | 55 +
>  gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c | 59 ++
>  gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c | 15 
>  gcc/testsuite/gcc.dg/pointer-arith-11.c  | 39 ++
>  gcc/testsuite/gcc.dg/pointer-arith-12.c  | 82 
>  5 files changed, 250 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/pointer-arith-11.c
>  create mode 100644 gcc/testsuite/gcc.dg/pointer-arith-12.c
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH 1/2] match.pd: Fold ((X >> C1) & C2) * (1 << C1)

2025-03-13 Thread Richard Biener
On Wed, 12 Mar 2025, Richard Sandiford wrote:

> Thanks for the review!
> 
> Andrew Pinski  writes:
> > On Wed, Mar 12, 2025 at 12:00 PM Richard Sandiford
> >  wrote:
> >>
> >> Using a combination of rules, we were able to fold
> >>
> >>   ((X >> C1) & C2) * (1 << C1)  -->  X & (C2 << C1)
> >>
> >> if everything was done at the same precision, but we couldn't fold
> >> it if the AND was done at a different precision.  The optimisation is
> >> often (but not always) valid for that case too.
> >>
> >> This patch adds a dedicated rule for the case where different precisions
> >> are involved.
> >>
> >> An alternative would be to extend the individual folds that together
> >> handle the same-precision case so that those rules handle differing
> >> precisions.  But the risk is that that could replace narrow operations
> >> with wide operations, which would be especially harmful on targets
> >> like avr.  It's also not obviously free of cycles.
> >>
> >> I also wondered whether the converts should be non-optional.
> >>
> >> gcc/
> >> * match.pd: Fold ((X >> C1) & C2) * (1 << C1) to X & (C2 << C1).
> >>
> >> gcc/testsuite/
> >> * gcc.dg/fold-mul-and-lshift-1.c: New test.
> >> * gcc.dg/fold-mul-and-lshift-2.c: Likewise.
> >> ---
> >>  gcc/match.pd | 29 ++
> >>  gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c | 59 
> >>  gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c | 15 +
> >>  3 files changed, 103 insertions(+)
> >>  create mode 100644 gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c
> >>  create mode 100644 gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c
> >>
> >> diff --git a/gcc/match.pd b/gcc/match.pd
> >> index 5c679848bdf..3197d1cac75 100644
> >> --- a/gcc/match.pd
> >> +++ b/gcc/match.pd
> >> @@ -5231,6 +5231,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >>   (if (mask)
> >>(bit_op (shift (convert @0) @1) { mask; })))
> >>
> >> +/* Fold ((X >> C1) & C2) * (1 << C1) into X & (C2 << C1), including cases 
> >> where
> >> +   the & happens in a different type.  It is the conversion case that 
> >> isn't
> >> +   a composition of other folds.
> >> +
> >> +   Let the type of the * and >> be T1 and the type of the & be T2.
> >> +   The fold is valid if the conversion to T2 preserves all information;
> >> +   that is, if T2 is wider than T1 or drops no more than C1 bits from T1.
> >> +   In that case, the & might operate on bits that are dropped by the
> >> +   later conversion to T1 and the multiplication by (1 << C1), but those
> >> +   bits are also dropped by ANDing with C2 << C1 (converted to T1).
> >> +
> >> +   If the conversion to T2 is not information-preserving, we have to be
> >> +   careful about the later conversion to T1 acting as a sign extension.
> >> +   We need either T2 to be unsigned or the top (sign) bit of C2 to be 
> >> clear.
> >> +   That is equivalent to testing whether C2 is nonnegative.  */
> >> +(simplify
> >> + (mult
> >> +  (convert? (bit_and (convert? (rshift @0 INTEGER_CST@1)) INTEGER_CST@2))
> >> +  INTEGER_CST@3)
> >> + (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
> >> +  (with { auto prec = element_precision (type); }
> > Since we know this needs to be a scalar, Using TREE_PRECISION here is fine 
> > too.
> 
> Yeah, agreed.  I'd wondered whether to use TREE_PRECISION instead,
> but then I'd also wondered about trying to make the fold work for vectors.
> Guess I ended up between two stools.
> 
> >> +   (if (wi::ltu_p (wi::to_widest (@1), prec))
> >
> > I think using wi::to_wide is better than using wi::to_widest here.
> 
> What's the reason for preferring wi::to_wide?  wi::to_widest should
> usually be more efficient for this kind of check, since the tree
> representation allows the underlying HWIs to be used directly.
> wi::to_wide instead requires masking off bits above the precision.

I prefer ::to_widest.

The original patch is OK with changing element_precision to 
TYPE_PRECISION.

Richard.

> E.g. on an --enable-checking=release compiler:
> 
> bool
> foo (tree t, unsigned int n)
> {
>   return wi::ltu_p (wi::to_widest (t), n);
> }
> 
> gives:
> 
> 188c:   79400c02    ldrh    w2, [x0, #6]
> 1890:   7100045f    cmp     w2, #0x1
> 1894:   5460        b.eq    18a0  int)+0x14>  // b.none
> 1898:   5280        mov     w0, #0x0    // #0
> 189c:   d65f03c0    ret
> 18a0:   f9400800    ldr     x0, [x0, #16]
> 18a4:   eb21401f    cmp     x0, w1, uxtw
> 18a8:   1a9f27e0    cset    w0, cc  // cc = lo, ul, last
> 18ac:   d65f03c0    ret
> 
> whereas:
> 
> bool
> foo (tree t, unsigned int n)
> {
>   return wi::ltu_p (wi::to_wide (t), n);
> }
> 
> gives:
> 
> 188c:   79400802    ldrh    w2, [x0, #4]
> 1890:   7100045f    cmp     w2, #0x1
> 1894:   5460        b.eq    18a0  int)+0x14>  // b.none
> 1898:   5280        mov 

Re: [PATCH][_Hashtable] Fix hash code cache usage

2025-03-13 Thread Jonathan Wakely
On Thu, 13 Mar 2025 at 11:51, Florian Weimer  wrote:
>
> * Jonathan Wakely:
>
> > On Thu, 13 Mar 2025 at 09:24, Florian Weimer  wrote:
> >>
> >> * Jonathan Wakely:
> >>
> >> > On Thu, 13 Mar 2025 at 06:50, Florian Weimer  wrote:
> >> >>
> >> >> * François Dumont:
> >> >>
> >> >> > +  // Get hash code for a node that comes from another _Hashtable.
> >> >> > +  // Reuse a cached hash code if the hash function is stateless,
> >> >> > +  // otherwise recalculate it using our own hash function.
> >> >> > +  __hash_code
> >> >> > +  _M_hash_code_ext(const __node_value_type& __from) const
> >> >> > +  {
> >> >> > + if constexpr (__and_<__hash_cached, is_empty<_Hash>>::value)
> >> >> > +   return __from._M_hash_code;
> >> >> > + else
> >> >> > +   return this->_M_hash_code(_ExtractKey{}(__from._M_v()));
> >> >> > +  }
> >> >>
> >> >> Does C++ support stateful hash functions?  I don't think so, and I don't
> >> >> see it documented as a GNU extension, either.
> >> >
> >> > It does, yes. That's why the hash function isn't required to be
> >> > default constructible, and has to be stored in the container and why
> >> > doing swap on two containers has to swap the hash functions as well.
> >>
> >> Interesting.  I have trouble reconciling this with the Cpp17Hash
> >> requirement that “The value” h(k) “shall depend only on the argument k
> >> for the duration of the program.”
> >
> >
> > That's for a given value of the type, h. For any two values of the
> > type h and h2, it's not required that h2(k) == h(k).
>
> But the wording for the comp object for ordered containers is very
> different and makes it clear that the instance matters.
>
> I still think this is more confusing than necessary.  If there isn't
> some general rule that empty objects can be considered stateless, that
> should be added somewhere, too. 8-)

Could you submit an issue (or two)?
https://cplusplus.github.io/LWG/lwg-active.html#submit_issue


[PATCH v2 1/2] match.pd: Fold ((X >> C1) & C2) * (1 << C1)

2025-03-13 Thread Richard Sandiford
Using a combination of rules, we were able to fold

  ((X >> C1) & C2) * (1 << C1)  -->  X & (C2 << C1)

if everything was done at the same precision, but we couldn't fold
it if the AND was done at a different precision.  The optimisation is
often (but not always) valid for that case too.

This patch adds a dedicated rule for the case where different precisions
are involved.

An alternative would be to extend the individual folds that together
handle the same-precision case so that those rules handle differing
precisions.  But the risk is that that could replace narrow operations
with wide operations, which would be especially harmful on targets
like avr.  It's also not obviously free of cycles.

I also wondered whether the converts should be non-optional.
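
As an illustration of the "often (but not always) valid" remark, here is a sketch of two mixed-precision cases, written as C++14 constexpr functions so that the claims are checked at compile time. The function names are hypothetical. The first case has the shape of f3 in the new test; the second has the shape of f1 in fold-mul-and-lshift-2.c, with the multiplier 32768 taken from that test's dump scan. The signed narrowing conversion relies on GCC's modulo behaviour, which is implementation-defined before C++20.

```cpp
#include <cstdint>

// C1 = 1, C2 = 255, AND done in a narrower unsigned type.  The
// conversion drops more than C1 bits, but C2 is nonnegative, so the
// later widening cannot act as a sign extension and the fold is valid.
constexpr std::uint32_t
f3_unfolded (std::uint32_t x)
{
  x >>= 1;
  std::uint16_t y = static_cast<std::uint16_t> (x);
  y &= 255;
  x = y;
  x *= 2;
  return x;
}

constexpr std::uint32_t
f3_folded (std::uint32_t x)
{
  return x & (255u << 1);
}

// C1 = 15, C2 = -2, AND done in a signed 16-bit type.  The conversion
// drops 16 bits (more than C1) and C2 is negative, so widening back to
// 32 bits sign-extends and the fold must not be applied.
constexpr std::uint32_t
sign_extending (std::uint32_t x)
{
  x >>= 15;
  std::int16_t y = static_cast<std::int16_t> (x);
  y &= -2;
  x = static_cast<std::uint32_t> (y);
  x *= 32768u;
  return x;
}

// What the fold would produce if it were (wrongly) applied.
constexpr std::uint32_t
sign_extending_wrongly_folded (std::uint32_t x)
{
  return x & (static_cast<std::uint32_t> (-2) << 15);
}

static_assert (f3_unfolded (0xdeadbeefu) == f3_folded (0xdeadbeefu),
	       "valid fold gives the same value");
static_assert (sign_extending (0x40000000u)
	       != sign_extending_wrongly_folded (0x40000000u),
	       "invalid fold would change the value");
```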

gcc/
* match.pd: Fold ((X >> C1) & C2) * (1 << C1) to X & (C2 << C1).

gcc/testsuite/
* gcc.dg/fold-mul-and-lshift-1.c: New test.
* gcc.dg/fold-mul-and-lshift-2.c: Likewise.
---
 gcc/match.pd | 28 ++
 gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c | 59 
 gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c | 15 +
 3 files changed, 102 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c
 create mode 100644 gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 5c679848bdf..7017fd15277 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5231,6 +5231,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (mask)
   (bit_op (shift (convert @0) @1) { mask; })))
 
+/* Fold ((X >> C1) & C2) * (1 << C1) into X & (C2 << C1), including cases where
+   the & happens in a different type.  It is the conversion case that isn't
+   a composition of other folds.
+
+   Let the type of the * and >> be T1 and the type of the & be T2.
+   The fold is valid if the conversion to T2 preserves all information;
+   that is, if T2 is wider than T1 or drops no more than C1 bits from T1.
+   In that case, the & might operate on bits that are dropped by the
+   later conversion to T1 and the multiplication by (1 << C1), but those
+   bits are also dropped by ANDing with C2 << C1 (converted to T1).
+
+   If the conversion to T2 is not information-preserving, we have to be
+   careful about the later conversion to T1 acting as a sign extension.
+   We need either T2 to be unsigned or the top (sign) bit of C2 to be clear.
+   That is equivalent to testing whether C2 is nonnegative.  */
+(simplify
+ (mult (convert? (bit_and (convert? (rshift @0 INTEGER_CST@1)) INTEGER_CST@2))
+   INTEGER_CST@3)
+ (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
+  && wi::ltu_p (wi::to_widest (@1), TYPE_PRECISION (type)))
+  (with { unsigned int shift = tree_to_uhwi (@1);
+ unsigned int prec = TYPE_PRECISION (type); }
+   (if ((prec <= TYPE_PRECISION (TREE_TYPE (@2)) + shift
+|| tree_int_cst_sgn (@2) >= 0)
+   && wi::to_wide (@3) == wi::set_bit_in_zero (shift, prec))
+(with { auto mask = wide_int::from (wi::to_wide (@2), prec, UNSIGNED); }
+ (bit_and @0 { wide_int_to_tree (type, mask << shift); }))
+
 /* ~(~X >> Y) -> X >> Y (for arithmetic shift).  */
 (simplify
  (bit_not (convert1?:s (rshift:s (convert2?@0 (bit_not @1)) @2)))
diff --git a/gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c 
b/gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c
new file mode 100644
index 000..b1ce10495e3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c
@@ -0,0 +1,59 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not { >> } "optimized" } } */
+/* { dg-final { scan-tree-dump-not { \* } "optimized" } } */
+
+unsigned int
+f1 (unsigned int x)
+{
+x >>= 1;
+unsigned long y = x;
+y &= 255;
+x = y;
+x *= 2;
+return x;
+}
+
+unsigned int
+f2 (unsigned int x)
+{
+x >>= 1;
+unsigned long y = x;
+y &= -2UL;
+x = y;
+x *= 2;
+return x;
+}
+
+unsigned int
+f3 (unsigned int x)
+{
+x >>= 1;
+unsigned short y = x;
+y &= 255;
+x = y;
+x *= 2;
+return x;
+}
+
+unsigned int
+f4 (unsigned int x)
+{
+x >>= 1;
+short y = x;
+y &= (unsigned short) ~0U >> 1;
+x = y;
+x *= 2;
+return x;
+}
+
+unsigned int
+f5 (unsigned int x)
+{
+x >>= 16;
+short y = x;
+y &= -2;
+x = y;
+x *= 1 << 16;
+return x;
+}
diff --git a/gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c 
b/gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c
new file mode 100644
index 000..86eabef0fef
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target int32 } } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump { >> 15;} "optimized" } } */
+/* { dg-final { scan-tree-dump { \* 32768;} "optimized" } } */
+
+unsigned int
+f1 (unsigned int x)
+{
+x >>= 15;
+short y = x;
+y &= -2;
+x = y;
+x *= 

[PATCH v2 0/2] Two match.pd folds for sve/pr98119.c

2025-03-13 Thread Richard Sandiford
Updates in v2:

- Make one of the pointer-arith-11.c scans require LP64, since
  the constant is printed differently for ILP32, and would be
  printed differently still for 16-bit targets.

- Avoid use of element_precision in scalar-only folds

- Use tree_int_cst_sgn instead of wi::to_widest.

Original message:

gcc.target/aarch64/sve/pr98119.c has been failing since r15-268.
The test expects pointer alignment to be done with a plain AND,
but that no longer happens.

It was combine that previously generated the ANDs, but I think it's
fair to argue that it isn't combine's job to handle this case.
The test's gimple output is pretty suboptimal.

The current gimple is:

  _21 = (unsigned long) vectp_x.4_22;
  _20 = _21 >> 1;
  _17 = (unsigned int) _20;
  _16 = _17 & 15;
  _9 = (sizetype) _16;
  _18 = (unsigned int) _9;
  _32 = _18 w* 2;
  _33 = -_32;// -(_21 & 30)
  vectp_x.6_10 = x_13(D) + _33;  // x_31(D) & ~30
  _41 = 18446744073709551584 - _32;  // -32 - (_21 & 30)
  vectp_x.9_38 = x_13(D) + _41;  // (x_31(D) - 32) & ~30
  _55 = _16 + 1000;

This series adds two groups of folds that together give:

  _21 = (unsigned long) vectp_x.4_22;
  _20 = _21 >> 1;
  _17 = (unsigned int) _20;
  _16 = _17 & 15;
  vectp_x.6_10 = x_13(D) & -31B;
  _18 = x_13(D) & -31B;
  vectp_x.9_38 = _18 + 18446744073709551584;
  _55 = _16 + 1000;

The duplicate "x_13(D) & -31B"s are unfortunate, but RTL CSE does get
rid of them.
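
The equivalence behind the first alignment fold can be checked in isolation. The following is only a sketch: it models the pointer as a 64-bit integer, and the function names are hypothetical. align_unfolded follows the statements of the current gimple up to vectp_x.6_10, and align_folded is the single AND that the series produces.

```cpp
#include <cstdint>

// Mirrors the unfolded gimple: ((x >> 1) & 15) is computed in 32 bits,
// widened, doubled and then subtracted from the pointer value.
constexpr std::uint64_t
align_unfolded (std::uint64_t x)
{
  std::uint32_t low = static_cast<std::uint32_t> (x >> 1) & 15u;
  std::uint64_t offset = static_cast<std::uint64_t> (low) * 2; // _18 w* 2
  return x - offset;                                           // x + -_32
}

// Mirrors the folded gimple: one AND with ~30, printed as -31B.
constexpr std::uint64_t
align_folded (std::uint64_t x)
{
  return x & ~static_cast<std::uint64_t> (30);
}

static_assert (align_unfolded (0x1234567fu) == align_folded (0x1234567fu),
	       "both forms clear bits 1 to 4");
```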

I asked on IRC whether new folds to fix this kind of regression were
acceptable, but I admit that the patches ended up being bigger than
I'd imagined.  I'll fully understand if they seem like too much for
stage 4 after all.

Bootstrapped & regression-tested on aarch64-linux-gnu.  Also tested
in its original form on x86_64-linux-gnu.  OK to install?

Richard


Richard Sandiford (2):
  match.pd: Fold ((X >> C1) & C2) * (1 << C1)
  match.pd: Extend pointer alignment folds

 gcc/match.pd | 55 +
 gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c | 59 ++
 gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c | 15 
 gcc/testsuite/gcc.dg/pointer-arith-11.c  | 39 ++
 gcc/testsuite/gcc.dg/pointer-arith-12.c  | 82 
 5 files changed, 250 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/fold-mul-and-lshift-1.c
 create mode 100644 gcc/testsuite/gcc.dg/fold-mul-and-lshift-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-arith-11.c
 create mode 100644 gcc/testsuite/gcc.dg/pointer-arith-12.c

-- 
2.25.1



COBOL: Implementation of STOP RUN / GOBACK [was: [PATCH][v3] Simple cobol.dg testsuite]

2025-03-13 Thread Simon Sobisch

> Earlier in this discussion of a testsuite, the question came up about
> generating an error return in COBOL source code.
>
> In COBOL, "GOBACK ERROR 1." is the equivalent of a C "return 1;".
> When executed in  the initial "top-level" program-id, it results in
> the value 1 being passed back to the _start stub.
>
> "STOP RUN ERROR 1." is the equivalent of (and is in fact implemented
> with) "exit(1)".
>
> Bob D.

Let's speak COBOL here and please re-consider if this is the best option 
available in GCC [note: I'm also interested for the implementation in 
GnuCOBOL, but that's a tangent for this list].


The syntax of the STOP statement and the rules are the following 
("noise" words written in parenthesis, optional items in brackets, pipe 
gives alternatives within braces):


--

STOP  RUN
[ (WITH)  { ERROR | NORMAL }  (STATUS)  [ { identifier | literal } ]  ]


rules: literal should be non-zero length, if numeric it should be an 
integer (no sign, no decimal place);
any constraints on the value of the literal or the contents of the 
identifier are  defined by the implementor



If the ERROR phrase is specified, the operating system will indicate an 
error termination of the run unit if such a capability exists within the 
operating system.


If the NORMAL phrase is specified, the operating system will indicate a 
normal termination of the run unit if such a capability exists within 
the operating system.



During execution of the STOP statement with a literal or an identifier 
specified, the literal or the contents of the identifier are passed to 
the operating system.


--

exit() allows us to "pass to the operating system" directly; but it 
doesn't directly say "success" or "fail".



Obviously the statements
   STOP RUN WITH NORMAL STATUS 41
and
   STOP RUN ERROR 41

should have a different result for the operating system.
As those numbers must be unsigned, it could be reasonable to translate 
that to  exit (41)  and  exit (-41).


While a "STOP RUN ERROR 0" would be possible as well, there could be an 
implementor-defined constraint (which can be enforced for literals) that 
zero is not valid.


This would mean that STOP RUN == STOP RUN WITH NORMAL STATUS == STOP RUN 
WITH NORMAL STATUS 0 == exit (0) and

STOP RUN WITH ERROR STATUS == STOP RUN WITH ERROR STATUS 1 == exit (-1)
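
A small sketch of this mapping, with hypothetical function names. The second helper rests on an assumption about the host: a POSIX parent process that collects the status with wait() sees only its low 8 bits, so the two statements above stay distinguishable, but not as 41 and -41.

```cpp
// Hypothetical mapping following the proposal above: NORMAL keeps the
// status, ERROR negates it.
constexpr int
stop_run_exit_status (bool error, int status)
{
  return error ? -status : status;
}

// Assumption: only the low 8 bits of the value passed to exit() reach a
// POSIX parent process.
constexpr unsigned
visible_exit_status (int status)
{
  return static_cast<unsigned> (status) & 0xffu;
}

static_assert (stop_run_exit_status (false, 41) == 41, "NORMAL STATUS 41");
static_assert (stop_run_exit_status (true, 41) == -41, "ERROR 41");
static_assert (visible_exit_status (stop_run_exit_status (true, 41)) == 215,
	       "what a POSIX parent would observe for exit (-41)");
```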


Then we have the additional question of how to translate

   STOP RUN WITH ERROR "Don't do that, Jon!"

in which case something like

   fflush (stderr);
   fprintf(stderr, "%s\n", "Don't do that, Jon!");
   exit (-1);

or even something along

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* _() is assumed to be the usual gettext translation macro.  */
__attribute__((noreturn)) void
cobol_stop_run (bool error, int status, const char *message)
{
   /* An error termination is reported with the negated status.  */
   const int exit_status = error ? -status : status;
   FILE *stream = error ? stderr : stdout;
   const char *format = error
      ? _("runtime exited with error status %d: %s\n")
      : _("runtime exited with normal status %d: %s\n");

   fflush (stream);
   fprintf (stream, format, status, message);
   exit (exit_status);
}

Side note: I'd strongly suggest keeping abort() for runtime-covered error 
handling (index out of bounds, program not found, ...)


Simon


Re: [PATCH] Fix a pasto in ao_compare::compare_ao_refs

2025-03-13 Thread Richard Biener
On Mon, Mar 10, 2025 at 11:59 PM Martin Jambor  wrote:
>
> Hi,
>
> when reading the function ao_compare::compare_ao_refs I came across
> what I believe to be a copy-and-paste error which this patch fixes.
>
> Bootstrapped, LTO-bootstrapped and tested on x86_64-linux.  OK for
> master?

OK.  I'll note this must have previously disallowed almost everything?

Thanks,
Richard.

> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
> 2025-03-10  Martin Jambor  
>
> * tree-ssa-alias.cc (ao_compare::compare_ao_refs): Fix a
> copy-and-paste error.
> ---
>  gcc/tree-ssa-alias.cc | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
> index 2489aa6b808..e93d5187d50 100644
> --- a/gcc/tree-ssa-alias.cc
> +++ b/gcc/tree-ssa-alias.cc
> @@ -4355,12 +4355,13 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref 
> *ref2,
> c1 = p1, nskipped1 = i;
>i++;
>  }
> +  i = 0;
>for (tree p2 = ref2->ref; handled_component_p (p2); p2 = TREE_OPERAND (p2, 
> 0))
>  {
>if (component_ref_to_zero_sized_trailing_array_p (p2))
> end_struct_ref2 = p2;
>if (ends_tbaa_access_path_p (p2))
> -   c2 = p2, nskipped1 = i;
> +   c2 = p2, nskipped2 = i;
>i++;
>  }
>
> --
> 2.47.1
>


[PATCH v2] reassoc: Optimize CMP/XOR expressions [PR116860]

2025-03-13 Thread Konstantinos Eleftheriou
Testcases for match.pd patterns
`((a ^ b) & c) cmp d | a != b -> (0 cmp d | a != b)` and
`(a ^ b) cmp c | a != b -> (0 cmp c | a != b)` were failing on some targets,
like PowerPC.

This patch adds an implementation for the optimization in reassoc. Doing so,
we can now handle cases where the related conditions appear in an AND
expression too. Also, we can optimize cases where we have intermediate
expressions between the related ones in the AND/OR expression on some targets.
This is not handled on targets like PowerPC, where each condition of the
AND/OR expression is placed into a different basic block.
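
The reasoning behind the transformation: whenever a == b, a ^ b is 0, so the XOR-based comparison reduces to `0 cmp d`; whenever a != b, the other operand already decides the result. A sketch that checks this at compile time, using functions of the same shape as cmp1_or and cmp1_and in the updated test (written as constexpr conditional expressions here, which is a presentation choice and not how the test is written):

```cpp
// Same shape as cmp1_or: if d1 == d2 the first operand is (0 & c) == 0,
// otherwise the second operand is true, so the OR is always true.
constexpr int
cmp1_or (int d1, int d2)
{
  return (((d1 ^ d2) & 0xabcd) == 0 || d1 != d2) ? 0 : 1;
}

// Same shape as cmp1_and: d1 == d2 forces the first operand to be false,
// so the AND is always false.
constexpr int
cmp1_and (int d1, int d2)
{
  return (!(((d1 ^ d2) & 0xabcd) == 0) && d1 == d2) ? 0 : 1;
}

static_assert (cmp1_or (5, 5) == 0 && cmp1_or (0xabcd, 0) == 0,
	       "the OR form always returns 0");
static_assert (cmp1_and (5, 5) == 1 && cmp1_and (0xabcd, 0) == 1,
	       "the AND form always returns 1");
```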

Bootstrapped/regtested on x86 and AArch64.

PR tree-optimization/116860

gcc/ChangeLog:

* tree-ssa-reassoc.cc (solve_expr): New function.
(find_terminal_nodes): New function.
(get_terminal_nodes): New function.
(optimize_cmp_xor_exprs): New function.
(optimize_range_tests): Call optimize_cmp_xor_exprs.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/fold-xor-and-or.c: Renamed to fold-xor-and-or-1.c.
* gcc.dg/tree-ssa/fold-xor-and-or-1.c:
Add new test cases, remove logical-op-non-short-circuit=1.
* gcc.dg/tree-ssa/fold-xor-or.c: Likewise.
* gcc.dg/tree-ssa/fold-xor-and-or-2.c: New test.
---
 ...{fold-xor-and-or.c => fold-xor-and-or-1.c} |  42 ++-
 .../gcc.dg/tree-ssa/fold-xor-and-or-2.c   |  59 +++
 gcc/testsuite/gcc.dg/tree-ssa/fold-xor-or.c   |  42 ++-
 gcc/tree-ssa-reassoc.cc   | 354 ++
 4 files changed, 479 insertions(+), 18 deletions(-)
 rename gcc/testsuite/gcc.dg/tree-ssa/{fold-xor-and-or.c => 
fold-xor-and-or-1.c} (50%)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-2.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or.c 
b/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-1.c
similarity index 50%
rename from gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or.c
rename to gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-1.c
index 99e83d8e5aa..23edf9f4342 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-1.c
@@ -1,55 +1,79 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -fdump-tree-optimized --param 
logical-op-non-short-circuit=1" } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
 
 typedef unsigned long int uint64_t;
 
-int cmp1(int d1, int d2) {
+int cmp1_or(int d1, int d2) {
   if (((d1 ^ d2) & 0xabcd) == 0 || d1 != d2)
 return 0;
   return 1;
 }
 
-int cmp2(int d1, int d2) {
+int cmp2_or(int d1, int d2) {
   if (d1 != d2 || ((d1 ^ d2) & 0xabcd) == 0)
 return 0;
   return 1;
 }
 
-int cmp3(int d1, int d2) {
+int cmp3_or(int d1, int d2) {
   if (10 > (0xabcd & (d2 ^ d1)) || d2 != d1)
 return 0;
   return 1;
 }
 
-int cmp4(int d1, int d2) {
+int cmp4_or(int d1, int d2) {
   if (d2 != d1 || 10 > (0xabcd & (d2 ^ d1)))
 return 0;
   return 1;
 }
 
-int cmp1_64(uint64_t d1, uint64_t d2) {
+int cmp1_and(int d1, int d2) {
+  if (!(((d1 ^ d2) & 0xabcd) == 0) && d1 == d2)
+return 0;
+  return 1;
+}
+
+int cmp2_and(int d1, int d2) {
+  if (d1 == d2 && !(((d1 ^ d2) & 0xabcd) == 0))
+return 0;
+  return 1;
+}
+
+int cmp1_or_64(uint64_t d1, uint64_t d2) {
   if (((d1 ^ d2) & 0xabcd) == 0 || d1 != d2)
 return 0;
   return 1;
 }
 
-int cmp2_64(uint64_t d1, uint64_t d2) {
+int cmp2_or_64(uint64_t d1, uint64_t d2) {
   if (d1 != d2 || ((d1 ^ d2) & 0xabcd) == 0)
 return 0;
   return 1;
 }
 
-int cmp3_64(uint64_t d1, uint64_t d2) {
+int cmp3_or_64(uint64_t d1, uint64_t d2) {
   if (10 > (0xabcd & (d2 ^ d1)) || d2 != d1)
 return 0;
   return 1;
 }
 
-int cmp4_64(uint64_t d1, uint64_t d2) {
+int cmp4_or_64(uint64_t d1, uint64_t d2) {
   if (d2 != d1 || 10 > (0xabcd & (d2 ^ d1)))
 return 0;
   return 1;
 }
 
+int cmp1_and_64(uint64_t d1, uint64_t d2) {
+  if (!(((d1 ^ d2) & 0xabcd) == 0) && d1 == d2)
+return 0;
+  return 1;
+}
+
+int cmp2_and_64(uint64_t d1, uint64_t d2) {
+  if (d1 == d2 && !(((d1 ^ d2) & 0xabcd) == 0))
+return 0;
+  return 1;
+}
+
 /* The if should be removed, so the condition should not exist */
 /* { dg-final { scan-tree-dump-not "d1_\[0-9\]+.D. \\^ d2_\[0-9\]+.D." 
"optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-2.c
new file mode 100644
index 000..eea44d616b4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/fold-xor-and-or-2.c
@@ -0,0 +1,59 @@
+/* This test is not working across all targets (e.g. it fails on PowerPC, 
+   because each condition of the AND/OR expression is placed into
+   a different basic block). Therefore, it is gated for x86-64 and AArch64,
+   where we know that it has to pass.  */
+/* { dg-do compile { target { aarch64-*-* x86_64-*-* } } } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+
+typedef unsigned long int uint64_t;
+
+int cmp1_or_inter(int d1, int d2, int d3) {
+  if (((d1 ^ d2) & 0xabcd) == 0 || d3 != 10 || d1 != d2)

[committed v2 2/2] libstdc++: Implement <stdckdint.h> for C++26 (P3370R1)

2025-03-13 Thread Jonathan Wakely
This is the second part of the P3370R1 proposal just approved by the
committee in Wrocław. This adds C++ equivalents of the functions added
to C23 by WG14 N2683.

These functions are in the global namespace, but to avoid collisions
with the same functions defined by other standard library
implementations, this change defines them in namespace __gnu_cxx and
then adds them to the global namespace.
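
For readers unfamiliar with the C23 interface: ckd_add, ckd_sub and ckd_mul store the wrapped result through a pointer and return true if the mathematical result did not fit in the result type. Below is a minimal sketch of that contract using the GCC/Clang builtin __builtin_add_overflow. The names checked_add and add_overflows are hypothetical, and whether the new header is implemented in terms of this builtin is an assumption, since its contents are not shown here.

```cpp
#include <climits>

// Hypothetical stand-in for ckd_add with all three types equal: stores
// the wrapped sum in *RESULT and returns true on overflow.
template<typename T>
  constexpr bool
  checked_add (T *result, T a, T b)
  {
    return __builtin_add_overflow (a, b, result);
  }

// Small helper so that the overflow flag can be tested at compile time.
constexpr bool
add_overflows (int a, int b)
{
  int sum = 0;
  return checked_add (&sum, a, b);
}

static_assert (!add_overflows (1, 2), "1 + 2 fits in int");
static_assert (add_overflows (INT_MAX, 1), "INT_MAX + 1 does not fit in int");
```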

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add stdckdint.h.
* include/Makefile.in: Regenerate.
* src/c++23/std.compat.cc.in: Export <stdckdint.h> functions.
* include/c_compatibility/stdckdint.h: New file.
* testsuite/26_numerics/stdckdint/1.cc: New test.
* testsuite/26_numerics/stdckdint/2_neg.cc: New test.

Reviewed-by: Patrick Palka 
---

Reviewed by Tomasz and Patrick at 
https://forge.sourceware.org/gcc/gcc-TEST/pulls/28

Tested x86_64-linux. Pushed to trunk.

 libstdc++-v3/include/Makefile.am  |   1 +
 libstdc++-v3/include/Makefile.in  |   1 +
 .../include/c_compatibility/stdckdint.h   | 113 ++
 libstdc++-v3/src/c++23/std.compat.cc.in   |   8 ++
 .../testsuite/26_numerics/stdckdint/1.cc  |  63 ++
 .../testsuite/26_numerics/stdckdint/2_neg.cc  |  39 ++
 6 files changed, 225 insertions(+)
 create mode 100644 libstdc++-v3/include/c_compatibility/stdckdint.h
 create mode 100644 libstdc++-v3/testsuite/26_numerics/stdckdint/1.cc
 create mode 100644 libstdc++-v3/testsuite/26_numerics/stdckdint/2_neg.cc

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index a8ff87fb600..4dc771a540c 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -912,6 +912,7 @@ c_compatibility_headers = \
${c_compatibility_srcdir}/math.h \
${c_compatibility_srcdir}/stdatomic.h \
${c_compatibility_srcdir}/stdbit.h \
+   ${c_compatibility_srcdir}/stdckdint.h \
${c_compatibility_srcdir}/stdlib.h
 endif
 
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index 859cbee53d6..0e3d09b3a75 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -1249,6 +1249,7 @@ c_compatibility_builddir = .
 @GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/math.h \
 @GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/stdatomic.h \
 @GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/stdbit.h \
+@GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/stdckdint.h \
 @GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/stdlib.h
 
 @GLIBCXX_C_HEADERS_C_STD_TRUE@c_compatibility_headers = 
diff --git a/libstdc++-v3/include/c_compatibility/stdckdint.h 
b/libstdc++-v3/include/c_compatibility/stdckdint.h
new file mode 100644
index 000..1de2d18dc1a
--- /dev/null
+++ b/libstdc++-v3/include/c_compatibility/stdckdint.h
@@ -0,0 +1,113 @@
+// C compatibility header  -*- C++ -*-
+
+// Copyright The GNU Toolchain Authors.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+/** @file include/stdckdint.h
+ *  This is a Standard C++ Library header.
+ */
+
+#ifndef _GLIBCXX_STDCKDINT_H
+#define _GLIBCXX_STDCKDINT_H
+
+#if __cplusplus > 202302L
+#include 
+#include 
+
+#define __STDC_VERSION_STDCKDINT_H__ 202311L
+
+#ifndef _GLIBCXX_DOXYGEN
+// We define these in our own namespace, but let Doxygen think otherwise.
+namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
+{
+#endif
+/// @cond undocumented
+namespace __detail
+{
+  template
+concept __cv_unqual_signed_or_unsigned_integer_type
+  = std::same_as<_Tp, std::remove_cv_t<_Tp>>
+ && std::__is_standard_integer<_Tp>::value;
+}
+/// @endcond
+
+/** Checked integer arithmetic
+ *
+ * Performs arithmetic on `__a` and `__b` and stores the result in `*__result`,
+ * with overflow detection.
+ * The arithmetic is performed in infinite signed precision, without overflow,
+ * then converted to the result type, `_Tp1`. If the converted result is not
+ * equal to th

[committed v2 1/2] libstdc++: Implement <stdbit.h> for C++26 (P3370R1)

2025-03-13 Thread Jonathan Wakely
This is the first part of the P3370R1 proposal just approved by the
committee in Wrocław. This adds C++ equivalents of the functions added
to C23 by WG14 N3022.

These functions are in the global namespace, but to avoid collisions
with the same functions defined by other standard library
implementations, this change defines them in namespace __gnu_cxx and
then adds them to the global namespace.
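
The "define in a private namespace, then add to the global namespace" approach can be sketched as follows. This is only an illustration under stated assumptions: `demo_impl` and `leading_zeros` are hypothetical names standing in for `__gnu_cxx` and the `stdc_*` functions, and the loop is a portable substitute for `std::countl_zero` so the sketch does not need C++20.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical implementation namespace, standing in for __gnu_cxx.
namespace demo_impl
{
  // Count the leading zero bits of an unsigned value with a simple loop.
  template<typename T>
  inline unsigned int
  leading_zeros(T value)
  {
    static_assert(std::numeric_limits<T>::is_integer
                  && !std::numeric_limits<T>::is_signed,
                  "unsigned integer type required");
    const unsigned int width = std::numeric_limits<T>::digits;
    unsigned int count = 0;
    // Walk a one-bit probe from the most significant bit downwards.
    for (T probe = T(T(1) << (width - 1));
         probe != 0 && !(value & probe); probe >>= 1)
      ++count;
    return count;
  }
}

// A using-declaration makes the name visible in the global namespace
// while the definition itself stays in the private namespace.
using demo_impl::leading_zeros;
```

Because the entity lives in `demo_impl`, a second library could ship its own global `leading_zeros` defined elsewhere without the two definitions being the same entity, which is the collision the commit message wants to avoid.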

libstdc++-v3/ChangeLog:

* include/Makefile.am: Add stdbit.h.
* include/Makefile.in: Regenerate.
* src/c++23/std.compat.cc.in: Export <stdbit.h> functions.
* include/c_compatibility/stdbit.h: New file.
* testsuite/20_util/stdbit/1.cc: New test.
* testsuite/20_util/stdbit/2_neg.cc: New test.

Reviewed-by: Patrick Palka 
---

Reviewed by Tomasz and Patrick at 
https://forge.sourceware.org/gcc/gcc-TEST/pulls/28

Tested x86_64-linux. Pushed to trunk.

 libstdc++-v3/include/Makefile.am  |   1 +
 libstdc++-v3/include/Makefile.in  |   1 +
 libstdc++-v3/include/c_compatibility/stdbit.h | 582 ++
 libstdc++-v3/src/c++23/std.compat.cc.in   |  29 +
 libstdc++-v3/testsuite/20_util/stdbit/1.cc| 320 ++
 .../testsuite/20_util/stdbit/2_neg.cc |  45 ++
 6 files changed, 978 insertions(+)
 create mode 100644 libstdc++-v3/include/c_compatibility/stdbit.h
 create mode 100644 libstdc++-v3/testsuite/20_util/stdbit/1.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/stdbit/2_neg.cc

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index de25aadd219..a8ff87fb600 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -911,6 +911,7 @@ c_compatibility_headers = \
${c_compatibility_srcdir}/tgmath.h \
${c_compatibility_srcdir}/math.h \
${c_compatibility_srcdir}/stdatomic.h \
+   ${c_compatibility_srcdir}/stdbit.h \
${c_compatibility_srcdir}/stdlib.h
 endif
 
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index 5a20dfb69b0..859cbee53d6 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -1248,6 +1248,7 @@ c_compatibility_builddir = .
 @GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/tgmath.h \
 @GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/math.h \
 @GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/stdatomic.h \
+@GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/stdbit.h \
 @GLIBCXX_C_HEADERS_C_GLOBAL_TRUE@  ${c_compatibility_srcdir}/stdlib.h
 
 @GLIBCXX_C_HEADERS_C_STD_TRUE@c_compatibility_headers = 
diff --git a/libstdc++-v3/include/c_compatibility/stdbit.h 
b/libstdc++-v3/include/c_compatibility/stdbit.h
new file mode 100644
index 000..1fb691e36c1
--- /dev/null
+++ b/libstdc++-v3/include/c_compatibility/stdbit.h
@@ -0,0 +1,582 @@
+// C compatibility header  -*- C++ -*-
+
+// Copyright The GNU Toolchain Authors.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+/** @file include/stdbit.h
+ *  This is a Standard C++ Library header.
+ */
+
+#ifndef _GLIBCXX_STDBIT_H
+#define _GLIBCXX_STDBIT_H
+
+#if __cplusplus > 202302L
+#include 
+
+#define __STDC_VERSION_STDBIT_H__ 202311L
+
+#define __STDC_ENDIAN_BIG__ __ORDER_BIG_ENDIAN__
+#define __STDC_ENDIAN_LITTLE__  __ORDER_LITTLE_ENDIAN__
+#define __STDC_ENDIAN_NATIVE__  __BYTE_ORDER__
+
+#ifndef _GLIBCXX_DOXYGEN
+// We define these in our own namespace, but let Doxygen think otherwise.
+namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
+{
+#endif
+
+/** Count the number of leading zero bits
+ *
+ * @param  __value An unsigned integer.
+ * @since C++26
+ * @{
+ */
+template
+inline unsigned int
+stdc_leading_zeros(_Tp __value)
+{
+  static_assert(std::__unsigned_integer<_Tp>);
+  return std::countl_zero(__value);
+}
+
+inline unsigned int
+stdc_leading_zeros_uc(unsigned char __value)
+{ return stdc_leading_zeros(__value); }
+
+inline unsigned int
+stdc_leading_zeros_us(unsigned short __value)
+{ return stdc_leading_

Re: [PATCH][RFC] add -[DU]_FORTIFY_SOURCE[=n] to DW_AT_producer

2025-03-13 Thread Andreas Schwab
On Mar 13 2025, Richard Biener wrote:

> The following makes sure to record -D_FORTIFY_SOURCE=n and
> -U_FORTIFY_SOURCE in the DW_AT_producer debuginfo attribute when
> present on the compiler command line.

Should this also handle defines passed via -Wp?

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[pushed] testsuite: Fix sve/mask_struct_load_3_run.c [PR113965]

2025-03-13 Thread Richard Sandiford
Among other things, this testcase tests an addition of the four
values (n*4+[0:3])*9//2 for each n in [0:99].  The addition is
done in multiple integer and floating-point types and the test
is compiled with -ffast-math.

One of the floating-point types is _Float16, and as Andrew says
in the PR, _Float16's limited precision means that the order of the
additions begins to matter for higher n.  Specifically, some orders
begin to give different results from others at n=38, and at many
higher n as well.

This patch uses 5/3 rather than 9/2.  I tested locally that
all addition orders give the same result over the test range.
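
The effect can be modelled without _Float16 hardware. The sketch below is not the testcase: `round_to_half`, `half_add` and `first_order_mismatch` are hypothetical helpers, and the model only rounds to the 11 significant bits of IEEE binary16, ignoring subnormals and overflow (neither occurs for these inputs). It compares just two addition orders, left-to-right and pairwise.

```cpp
#include <cmath>

// Round a finite double to 11 significant bits, the precision of IEEE
// binary16.  Assumption: default round-to-nearest-even mode; subnormals
// and overflow are not modelled.
inline double
round_to_half(double x)
{
  if (x == 0.0)
    return x;
  int exp;
  double mant = std::frexp(x, &exp);   // x == mant * 2^exp, 0.5 <= |mant| < 1
  double scaled = std::nearbyint(std::ldexp(mant, 11));
  return std::ldexp(scaled, exp - 11);
}

// One binary16-like addition: exact sum in double, then a single rounding.
inline double
half_add(double a, double b)
{ return round_to_half(a + b); }

// Return the first n in [0, limit) for which the left-to-right and the
// pairwise sum of the four inputs (n*4+k)*mul/div differ, or -1 if the
// two orders agree over the whole range.
inline int
first_order_mismatch(int mul, int div, int limit)
{
  for (int n = 0; n < limit; ++n)
    {
      double v[4];
      for (int k = 0; k < 4; ++k)
        v[k] = (n * 4 + k) * mul / div;   // integer division, as in the test
      double sequential
        = half_add(half_add(half_add(v[0], v[1]), v[2]), v[3]);
      double pairwise
        = half_add(half_add(v[0], v[1]), half_add(v[2], v[3]));
      if (sequential != pairwise)
        return n;
    }
  return -1;
}
```

Under this model `first_order_mismatch(9, 2, 100)` returns 38 and `first_order_mismatch(5, 3, 100)` returns -1: with 9/2 a three-term partial sum first exceeds 2048 (where binary16 stops representing odd integers) at n=38, whereas with 5/3 every partial sum stays below 2048.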

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/testsuite/
PR testsuite/113965
* gcc.target/aarch64/sve/mask_struct_load_3_run.c: Use an
input range that is suitable for _Float16.
---
 gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_3_run.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_3_run.c 
b/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_3_run.c
index 8bc3b08fcf4..c0a7416cfaf 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_3_run.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/mask_struct_load_3_run.c
@@ -18,7 +18,7 @@
asm volatile ("" ::: "memory"); \
   }\
 for (int i = 0; i < N * 4; ++i)\
-  in[i] = i * 9 / 2;   \
+  in[i] = i * 5 / 3;   \
 NAME##_4 (out, in, mask, N);   \
 for (int i = 0; i < N; ++i)\
   {\
-- 
2.25.1



[PATCH] arm: Fix REVERSIBLE_CC_MODE [PR110796...]

2025-03-13 Thread Christophe Lyon
Since we have vcmp and vcmpe instructions (vcmpe raises an "Invalid
Operation" exception in presence of a NaN operand), we need to tell
the compiler it is not safe to reverse comparisons of floating-point
arguments.
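
As a reminder of why reversal is unsafe, here is a minimal sketch (plain C++, not GCC internals; the function names are hypothetical). Logically negating `a < b` is not the same as evaluating `a >= b` once a NaN is involved, because every ordered comparison with a NaN is false. The sketch assumes IEEE semantics, i.e. it is not compiled with -ffast-math or -ffinite-math-only.

```cpp
#include <limits>

// "Not less than": true whenever a < b is false, including when either
// operand is a NaN.
inline bool
not_less(double a, double b)
{ return !(a < b); }

// "Greater than or equal": false when either operand is a NaN.
inline bool
greater_equal(double a, double b)
{ return a >= b; }
```

For ordinary operands the two agree, but `not_less(NaN, 1.0)` is true while `greater_equal(NaN, 1.0)` is false, so replacing one condition by the other changes behaviour (and, with a signalling compare such as vcmpe, possibly which exceptions are raised).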

On armv8-m.main+dsp+fp (cortex-m33):
PASS: gcc.dg/torture/builtin-iseqsig-1.c
at -O1, -O2, -O3, -Os

On armv8.1-m.main+mve.fp+fp.dp (cortex-m55):
PASS: gcc.dg/torture/builtin-iseqsig-1.c
PASS: gcc.dg/torture/builtin-iseqsig-2.c
PASS: gcc.dg/torture/builtin-iseqsig-3.c
at -O1, -O2, -O3, -Os

On armv7e-m+fp.dp (cortex-m7):
PASS: gcc.dg/torture/builtin-iseqsig-1.c
PASS: gcc.dg/torture/builtin-iseqsig-2.c
PASS: gcc.dg/torture/builtin-iseqsig-3.c
PASS: gcc.dg/torture/pr82692.c
at -O1, -O2, -O3, -Os

On armv8-a+simd:
PASS: gcc.dg/torture/builtin-iseqsig-1.c
PASS: gcc.dg/torture/builtin-iseqsig-2.c
PASS: gcc.dg/torture/builtin-iseqsig-3.c
PASS: gfortran.dg/ieee/comparisons_3.F90
at -Os (they already passed at other optimization levels)

gcc/
PR target/110796
PR target/118446
* config/arm/arm.h (REVERSIBLE_CC_MODE): Take floating-point modes
into account.
---
 gcc/config/arm/arm.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 8472b756127..3c9c7c795cb 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2257,7 +2257,10 @@ extern int making_const_table;
 
 #define SELECT_CC_MODE(OP, X, Y)  arm_select_cc_mode (OP, X, Y)
 
-#define REVERSIBLE_CC_MODE(MODE) 1
+/* Having an integer comparison mode guarantees that we can use
+   reverse_condition, but the usual restrictions apply to floating-point
+   comparisons.  */
+#define REVERSIBLE_CC_MODE(MODE) ((MODE) != CCFPmode && (MODE) != CCFPEmode)
 
 #define REVERSE_CONDITION(CODE,MODE) \
   (((MODE) == CCFPmode || (MODE) == CCFPEmode) \
-- 
2.34.1



Re: [RFC] PR81358: Enable automatic linking of libatomic

2025-03-13 Thread Prathamesh Kulkarni






From: Sam James
Sent: Friday, March 7, 2025 11:39 PM
To: Prathamesh Kulkarni
Cc: Thomas Schwinge; Tobias Burnus; Joseph Myers; Xi Ruoyao; Matthew Malcomson; gcc-patches@gcc.gnu.org; Tom de Vries
Subject: Re: [RFC] PR81358: Enable automatic linking of libatomic

External email: Use caution opening links or attachments

Sam James writes:

> Prathamesh Kulkarni writes:
>
>>> -----Original Message-----
>>> From: Prathamesh Kulkarni
>>> Sent: 10 January 2025 09:48
>>> To: Thomas Schwinge
>>> Cc: Tobias Burnus; Joseph Myers; Xi Ruoyao; Matthew Malcomson;
>>> gcc-patches@gcc.gnu.org; Tom de Vries
>>> Subject: RE: [RFC] PR81358: Enable automatic linking of libatomic
>>>
>>> > -----Original Message-----
>>> > From: Thomas Schwinge
>>> > Sent: 07 January 2025 17:45
>>> > To: Prathamesh Kulkarni
>>> > Cc: Tobias Burnus; Joseph Myers; Xi Ruoyao; Matthew Malcomson;
>>> > gcc-patches@gcc.gnu.org; Tom de Vries
>>> > Subject: RE: [RFC] PR81358: Enable automatic linking of libatomic
>>> >
>>> > Hi Prathamesh!
>>> Hi Thomas, thanks for the review!
>>> >
>>> > Thanks for working on this!
>>> >
>>> > Per my understanding, this patch won't automagically resolve the need to
>>> > (occasionally) having to specify '-foffload-options=nvptx-none=-latomic'
>>> > for nvptx offloading: it doesn't use 'LINK_LIBATOMIC_SPEC', currently
>>> > only used via 'GNU_USER_TARGET_LINK_GCC_C_SEQUENCE_SPEC' from
>>> > 'gcc/config/gnu-user.h' (general issue, affecting a lot of
>>> > configurations, to be addressed as necessary):
>>> >
>>> > > --- a/gcc/config/gnu-user.h
>>> > > +++ b/gcc/config/gnu-user.h
>>> >
>>> > >  #define GNU_USER_TARGET_LINK_GCC_C_SEQUENCE_SPEC \
>>> > > -  "%{static|static-pie:--start-group} %G %{!nolibc:%L} \
>>> > > +  "%{static|static-pie:--start-group} %G %{!nolibc:"
>>> > > + LINK_LIBATOMIC_SPEC "%L} \
>>> > > %{static|static-pie:--end-group}%{!static:%{!static-pie:%G}}"
>>> >
>>> > > --- a/gcc/gcc.cc
>>> > > +++ b/gcc/gcc.cc
>>> >
>>> > >  /* Here is the spec for running the linker, after compiling all
>>> > > files.  */
>>> > >
>>> > > +#if defined(TARGET_PROVIDES_LIBATOMIC) && defined(USE_LD_AS_NEEDED)
>>> > > +#define LINK_LIBATOMIC_SPEC "%{!fno-link-libatomic:" LD_AS_NEEDED_OPTION \
>>> > > + " -latomic " LD_NO_AS_NEEDED_OPTION "} "
>>> > > +#else
>>> > > +#define LINK_LIBATOMIC_SPEC ""
>>> > > +#endif
>>> > > +
>>> > >  /* This is overridable by the target in case they need to specify the
>>> > > -lgcc and -lc order specially, yet not require them to override all
>>> > > of LINK_COMMAND_SPEC.  */
>>> >
>>> > ..., and the nvptx linker doesn't support '--as-needed'.
>>> >
>>> > I'll think about it; indeed it'd be good to get that resolved, too.
>>> >
>>> > On 2024-12-20T15:37:42+, Prathamesh Kulkarni wrote:
>>> > > [...] copying libatomic.a  over to $(gcc_objdir)$(MULTISUBDIR)/, and
>>> > > can confirm that 64-bit libatomic.a is copied to $build/gcc/ and
>>> > > 32-bit libatomic.a is copied to $build/gcc/32/.
>>> >
>>> > So this:
>>> >
>>> > > --- a/libatomic/Makefile.am
>>> > > +++ b/libatomic/Makefile.am
>>> >
>>> > > @@ -162,6 +162,11 @@ libatomic_convenience_la_LIBADD = $(libatomic_la_LIBADD)
>>> > >  # when it is reloaded during the build of all-multi.
>>> > >  all-multi: $(libatomic_la_LIBADD)
>>> > >
>>> > > +gcc_objdir = $(MULTIBUILDTOP)../../$(host_subdir)/gcc
>>> > > +all: all-multi libatomic.la libatomic_convenience.la
>>> > > + $(INSTALL_DATA) .libs/libatomic.a $(gcc_objdir)$(MULTISUBDIR)/
>>> > > + chmod 644 $(gcc_objdir)$(MULTISUBDIR)/libatomic.a
>>> >
>>> > ... is conceptually modelled after libgcc, where the libraries get
>>> > copied into 'gcc/'?  However, here we only copy the static
>>> > 'libatomic.a'; libgcc does a 'make install-leaf', see
>>> > 'libgcc/Makefile.in':
>>> >
>>> > all: all-multi
>>> > # Now that we have built all the objects, we need to copy
>>> > # them back to the GCC directory.  Too many things (other
>>> > # in-tree libraries, and DejaGNU) know about the layout
>>> > # of the build tree, for now.
>>> > $(MAKE) install-leaf DESTDIR=$(gcc_objdir) \
>>> >   slibdir= libsubdir= MULTIOSDIR=$(MULTIDIR)
>>> >
>>> > ..., which also installs dynamic libraries.  Is that difference
>>> > intentional and/or possibly important?
>>> Well, I wasn't sure what extension to use for shared libraries, and
>>> initially avoided copying them.
>>> libgcc seems to use $(SHLIB_EXT) to specify extension name for shared
>>> libraries, which can be overridden by targets.
>

Re: [PATCH] c, c++: Support musttail attribute even using __attribute__ form [PR116545]

2025-03-13 Thread Jakub Jelinek
On Thu, Mar 13, 2025 at 01:45:43PM -0400, Jason Merrill wrote:
> On 3/13/25 11:27 AM, Jakub Jelinek wrote:
> Parsing a jump-statement under cp_parser_expression_statement just because
> it happens to start with __attribute is pretty strange.

It is true that it is pretty strange, but that is where we handle the
empty declarations with GNU attributes (or shall those be called attribute
declarations) as well.  Or are empty statements with GNU attribute
expression-statement with no expression?

> How about changing cp_parser_std_attribute_spec_seq in cp_parser_statement
> to cp_parser_attributes_opt?

I'd be afraid that would be quite significant change of behavior everywhere,
something that C doesn't allow (like mixing std and GNU attributes in any
orders or [[]] __attribute__(()) [[]][[]] __attribute__(())
expression-statement).  Or it would allow __attribute__(()) on while, do,
for, if, switch, ..., again something that wasn't accepted before.

In any case, calling cp_parser_jump_statement from cp_parser_statement
directly rather than from cp_parser_expression_statement is easily possible
just by doing
  else if (cp_next_tokens_can_be_gnu_attribute_p (parser))
    {
      unsigned int n = cp_parser_skip_gnu_attributes_opt (parser, 1);
      if (cp_lexer_nth_token_is_keyword (parser->lexer, n, RID_RETURN))
        {
          tree attr = cp_parser_gnu_attributes_opt (parser);
          for (tree a = lookup_attribute ("musttail", attr);
               a; a = lookup_attribute ("musttail", TREE_CHAIN (a)))
            if (TREE_VALUE (a))
              error ("%qs attribute does not take any arguments",
                     "musttail");
          statement = cp_parser_jump_statement (parser, attr);
          if (attr != NULL_TREE && any_nonignored_attribute_p (attr))
            warning_at (loc, OPT_Wattributes,
                        "attributes at the beginning of statement are "
                        "ignored");
          return statement;
        }
    }
after
  else if (token->type == CPP_EOF)
    {
      cp_parser_error (parser, "expected statement");
      return;
    }

Or the GNU attributes on empty statement stuff could move there as well.

Jakub



Re: [PATCH] COBOL v3: 3/14 80K bld: config and build machinery

2025-03-13 Thread Rainer Orth
Hi James,

> Our intention, tell me if you disagree, is that cobol is enabled if
>
> 1.  --enable-languages=all, and
> 2.  the host and target are "known good", x86_64 or aarch64

you tend to forget there's a world outside of Linux, actually: as you
can see at least in PRs cobol/119217 and cobol/119218, the code is
currently riddled with lots of Linux-specific assumptions which break
left and right on e.g. Darwin/x86_64 and Solaris/amd64.  Until this is
sorted out, this should be x86_64-*-linux* and aarch64-*-linux*, I
believe.

> or
>
> 3.  --enable-languages=cobol, and
> 4.  the host and target are "plausible", 64-bit LE.

I don't think that's right: --enable-languages=cobol should enable cobol
unconditionally if the user forces it, say to try on a different
architecture (like sparcv9, which is 64-bit BE).  No need to
second-guess here or force users to modify configure here.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-03-13 Thread Joseph Myers
On Thu, 13 Mar 2025, JeanHeyd Meneide wrote:

> On Thu, Mar 13, 2025 Qing Zhao  wrote:
> 
> > ...
> >
> > Is N3188 the following:
> > https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3188.htm
> >
> > What’s the status of this proposal?
> 
> 
>  N3188 was discussed during the January 2024 Meeting in Strasbourg,
> France. There was "along the lines" (opinion poll) consensus for more work
> to be done:

Note that since I had 23 comments on this proposal on the reflector, there 
is surely a *lot* more work needing to be done.

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH] COBOL v3: 3/14 80K bld: config and build machinery

2025-03-13 Thread James K. Lowden
On Tue, 11 Mar 2025 11:18:22 +0100
Andreas Schwab  wrote:

> > +  
> > +# It's early days for COBOL, and it is known to compile on only some host and
> > +# target systems.  We remove COBOL from other builds with a warning.
> > +
> > +cobol_is_okay_host="no"
> > +cobol_is_okay_target="no"
> > +
> > +case "${host}" in
> > +  x86_64-*-*)
> > +cobol_is_okay_host="yes"
> > +;;
> > +  aarch64-*-*)
> > +cobol_is_okay_host="yes"
> > +;;
> > +esac
> > +case "${target}" in
> > +  x86_64-*-*)
> > +cobol_is_okay_target="yes"
> > +;;
> > +  aarch64-*-*)
> > +cobol_is_okay_target="yes"
> > +;;
> > +esac
> 
> libgcobol/configure.tgt says it's supported on powerpc64le.

Our intention, tell me if you disagree, is that cobol is enabled if

1.  --enable-languages=all, and
2.  the host and target are "known good", x86_64 or aarch64

or

3.  --enable-languages=cobol, and
4.  the host and target are "plausible", 64-bit LE.

In the latter case, the user is trying something we haven't, which might or 
might not work, and could provide feedback.  

As of today there is no 64-bit architecture known not to work.  There is only 
1) tested, and 2) computera incognita.  

--jkl


Re: [PATCH] c, c++: Support musttail attribute even using __attribute__ form [PR116545]

2025-03-13 Thread Jason Merrill

On 3/13/25 3:16 PM, Jakub Jelinek wrote:
> On Thu, Mar 13, 2025 at 01:45:43PM -0400, Jason Merrill wrote:
> > On 3/13/25 11:27 AM, Jakub Jelinek wrote:
> > Parsing a jump-statement under cp_parser_expression_statement just because
> > it happens to start with __attribute is pretty strange.
>
> It is true that it is pretty strange, but that is where we handle the
> empty declarations with GNU attributes (or shall those be called attribute
> declarations) as well.  Or are empty statements with GNU attribute
> expression-statement with no expression?

Yes, empty statements are technically expression-statements.

> > How about changing cp_parser_std_attribute_spec_seq in cp_parser_statement
> > to cp_parser_attributes_opt?
>
> I'd be afraid that would be quite significant change of behavior everywhere,
> something that C doesn't allow (like mixing std and GNU attributes in any
> orders or [[]] __attribute__(()) [[]][[]] __attribute__(())
> expression-statement).  Or it would allow __attribute__(()) on while, do,
> for, if, switch, ..., again something that wasn't accepted before.

Do you think those changes are undesirable?  We've previously had to fix
cases where we were failing to support mixing of std and GNU attributes.

> In any case, calling cp_parser_jump_statement from cp_parser_statement
> directly rather than from cp_parser_expression_statement is easily possible
> just by doing
>   else if (cp_next_tokens_can_be_gnu_attribute_p (parser))
>     {
>       unsigned int n = cp_parser_skip_gnu_attributes_opt (parser, 1);
>       if (cp_lexer_nth_token_is_keyword (parser->lexer, n, RID_RETURN))
>         {
>           tree attr = cp_parser_gnu_attributes_opt (parser);
>           for (tree a = lookup_attribute ("musttail", attr);
>                a; a = lookup_attribute ("musttail", TREE_CHAIN (a)))
>             if (TREE_VALUE (a))
>               error ("%qs attribute does not take any arguments",
>                      "musttail");
>           statement = cp_parser_jump_statement (parser, attr);
>           if (attr != NULL_TREE && any_nonignored_attribute_p (attr))
>             warning_at (loc, OPT_Wattributes,
>                         "attributes at the beginning of statement are "
>                         "ignored");
>           return statement;
>         }
>     }
> after
>   else if (token->type == CPP_EOF)
>     {
>       cp_parser_error (parser, "expected statement");
>       return;
>     }
>
> Or the GNU attributes on empty statement stuff could move there as well.

Yes, it seems desirable to move all the attribute handling out of
cp_parser_expression_statement; in the grammar, the attributes aren't
part of the expression-statement, and it's odd to handle fallthrough in
both places.


Jason



[PATCH 1/2] arm: Add support for NEON vsqrt builtins (hf, sf, df)

2025-03-13 Thread Ayan Shafqat
Introduce support for a new set of NEON square-root intrinsics for half,
single, and double precision.

modified:   gcc/config/arm/arm-builtins.cc
1. Define the df_UP macro to map to E_DFmode.
2. Add CODE_FOR_neon_vsqrtsf and CODE_FOR_neon_vsqrtdf constants that
   reference the underlying VFP sqrt RTL patterns (sqrtsf2 and sqrtdf2).

modified:   gcc/config/arm/arm_vfp_builtins.def
1. Replace the single-mode entry for vsqrt with a unified VAR3 entry
   that supports hf, sf, and df modes.

These modifications enable the use of __builtin_neon_vsqrt{hf,sf,df} in user
code and ensure the correct mode is selected for each precision variant.

Signed-off-by: Ayan Shafqat 
Signed-off-by: Andrew Pinski 
---
 gcc/config/arm/arm-builtins.cc  | 3 +++
 gcc/config/arm/arm_vfp_builtins.def | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc
index c56ab5db985..acc86c7e8a1 100644
--- a/gcc/config/arm/arm-builtins.cc
+++ b/gcc/config/arm/arm-builtins.cc
@@ -694,6 +694,7 @@ arm_set_sat_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define hi_UP    E_HImode
 #define void_UP E_VOIDmode
 #define sf_UP   E_SFmode
+#define df_UP E_DFmode
 #define UP(X) X##_UP
 
 typedef struct {
@@ -710,6 +711,8 @@ constexpr insn_code CODE_FOR_neon_usdotv8qi = CODE_FOR_neon_usdotv2siv8qi;
 constexpr insn_code CODE_FOR_neon_sdotv16qi = CODE_FOR_neon_sdotv4siv16qi;
 constexpr insn_code CODE_FOR_neon_udotv16qi = CODE_FOR_neon_udotv4siv16qi;
 constexpr insn_code CODE_FOR_neon_usdotv16qi = CODE_FOR_neon_usdotv4siv16qi;
+constexpr insn_code CODE_FOR_neon_vsqrtsf  = CODE_FOR_sqrtsf2;
+constexpr insn_code CODE_FOR_neon_vsqrtdf  = CODE_FOR_sqrtdf2;
 
 #define CF(N,X) CODE_FOR_neon_##N##X
 
diff --git a/gcc/config/arm/arm_vfp_builtins.def 
b/gcc/config/arm/arm_vfp_builtins.def
index 1fbf71e728e..8cafd72b565 100644
--- a/gcc/config/arm/arm_vfp_builtins.def
+++ b/gcc/config/arm/arm_vfp_builtins.def
@@ -40,7 +40,7 @@ VAR1 (UNOP, vrndm, hf)
 VAR1 (UNOP, vrndn, hf)
 VAR1 (UNOP, vrndp, hf)
 VAR1 (UNOP, vrndx, hf)
-VAR1 (UNOP, vsqrt, hf)
+VAR3 (UNOP, vsqrt, hf, sf, df)
 
 VAR2 (BINOP, vcvths_n, hf, si)
 VAR2 (BINOP, vcvthu_n, hf, si)
-- 
2.43.0



Re: [PATCH][v3] Simple cobol.dg testsuite

2025-03-13 Thread David Malcolm
On Thu, 2025-03-13 at 12:11 +0100, Simon Sobisch wrote:
> Thanks for your work on adding a testsuite. Can you please explain why
> you do this when a complete testsuite exists in autoconf (autotest)
> format (which roots back to decade of work in GnuCOBOL, with all
> copyrights for that already with the FSF)?
>
> Is the existence of this in upstream [1] just unknown (because it was
> not part of the initial patches [for reasons I not understood])?
>
> Is the format such a big issue (note: previous discussions elaborated "a
> test suite is very important and other frontends also use a framework
> other than dejagnu)?
>
> If dejagnu is the way to go:
>
> * Shouldn't there be deprecation of autotest in autoconf (of course only
> if that preference is also outside of gcc)?
>
> * Shouldn't there be a (at least semi automated) script / migration tool
> (at least for this specific time in place to convert the "UAT" once into
> dejagnu format)?
> 
> 
> 
> Thanks for giving me some context on this,
> Simon
> 
> 
> [1]: 
> https://gitlab.cobolworx.com/COBOLworx/gcc-cobol/-/tree/master+cobol/gcc/cobol/UAT
> 

Hi Simon

Does the UAT testsuite have coverage for what happens on invalid code?
 
For example, in
https://gcc.gnu.org/pipermail/gcc-patches/2025-March/677481.html
my patch adds test coverage for the output on one kind of typo (or, at
least, I tried to, my knowledge of COBOL is essentially 0); I put this
in Richard's DejaGnu suite since I have lots of similar tests for other
frontends.

Having a good user experience on incorrect code is important, so we
need some kind of test coverage for this.  The new DejaGnu-based tests
seem to work well for that (as per the patch).  I don't know if this is
something that could/should be shared with GnuCOBOL, given that there
might well be differences in error-handling behavior between GCC COBOL
and GnuCOBOL.

Thoughts?
Dave



[PATCH v2 1/2] Aarch64: Add FMA and FMAF intrinsic and corresponding tests

2025-03-13 Thread Ayan Shafqat
This patch introduces inline definitions for the __fma and __fmaf
functions in arm_acle.h for AArch64 targets. These definitions rely on
__builtin_fma and __builtin_fmaf to ensure proper inlining and to meet
the ACLE requirements [1].

The patch has been tested locally using a crosstool-NG sysroot for
AArch64, confirming that the generated code uses the expected fused
multiply-accumulate instructions (fmadd).

[1]
https://arm-software.github.io/acle/main/acle.html#fused-multiply-accumulate-fma
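
To illustrate what "fused" buys, here is a small portable sketch. It does not use the new arm_acle.h functions: `fma_like` and `square_residual` are hypothetical names, and `std::fma` stands in for `__builtin_fma`. A fused multiply-add rounds once, after the addition, so it can recover the rounding error of a product exactly.

```cpp
#include <cmath>

// Hypothetical stand-in for the __fma wrapper added by the patch;
// std::fma computes x * y + z with a single rounding.
inline double
fma_like(double x, double y, double z)
{ return std::fma(x, y, z); }

// Given x and the double nearest to x * x, return the exact rounding
// error of that product (x * x - rounded_square), which a fused
// operation produces without an intermediate rounding.
inline double
square_residual(double x, double rounded_square)
{ return fma_like(x, x, -rounded_square); }
```

For x = 1 + 2^-27 the exact square is 1 + 2^-26 + 2^-54, whose nearest double is 1 + 2^-26; `square_residual` then returns exactly 2^-54. A separately rounded multiply followed by a subtract would return 0 instead, unless the compiler contracts the two operations into an FMA on its own.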

gcc/ChangeLog:

* config/aarch64/arm_acle.h (__fma): New function.
(__fmaf): New function.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/acle_fma.c: New test.
---
 gcc/config/aarch64/arm_acle.h   | 14 ++
 .../gcc.target/aarch64/acle/acle_fma.c  | 17 +
 2 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/acle_fma.c

diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
index 7976c117daf..d9e2401ea9f 100644
--- a/gcc/config/aarch64/arm_acle.h
+++ b/gcc/config/aarch64/arm_acle.h
@@ -129,6 +129,20 @@ __jcvt (double __a)
 
 #pragma GCC pop_options
 
+__extension__ extern __inline double
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__fma (double __x, double __y, double __z)
+{
+  return __builtin_fma (__x, __y, __z);
+}
+
+__extension__ extern __inline float
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__fmaf (float __x, float __y, float __z)
+{
+  return __builtin_fmaf (__x, __y, __z);
+}
+
 #pragma GCC push_options
 #pragma GCC target ("+nothing+frintts")
 __extension__ extern __inline float
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/acle_fma.c 
b/gcc/testsuite/gcc.target/aarch64/acle/acle_fma.c
new file mode 100644
index 000..9363a75b593
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/acle_fma.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "arm_acle.h"
+
+double test_acle_fma (double x, double y, double z)
+{
+  return __fma (x, y, z);
+}
+
+float test_acle_fmaf (float x, float y, float z)
+{
+  return __fmaf (x, y, z);
+}
+
+/* { dg-final { scan-assembler-times "fmadd\td\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "fmadd\ts\[0-9\]" 1 } } */
-- 
2.43.0



[PATCH v2 2/2] arm: Add FMA and FMAF intrinsics with corresponding tests

2025-03-13 Thread Ayan Shafqat
This patch introduces inline definitions for the __fma and __fmaf
functions in arm_acle.h for arm targets. These definitions rely on
__builtin_fma and __builtin_fmaf to ensure proper inlining and to meet
the ACLE requirements [1].

The patch has been tested locally using a crosstool-NG sysroot for
arm-cortexa9_neon-linux-gnueabihf, confirming that the generated code
uses the expected fused multiply-accumulate instructions:

vfma.f32 for single precision
vfma.f64 for double precision

[1] https://arm-software.github.io/acle/main/acle.html#fused-multiply-accumulate-fma

gcc/ChangeLog:

* config/arm/arm_acle.h (__attribute__):
(__fma): New Function
(__fmaf): New Function

gcc/testsuite/ChangeLog:

* gcc.target/arm/acle/acle_fma.c: New test.
---
 gcc/config/arm/arm_acle.h| 18 ++
 gcc/testsuite/gcc.target/arm/acle/acle_fma.c | 17 +
 2 files changed, 35 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/acle/acle_fma.c

diff --git a/gcc/config/arm/arm_acle.h b/gcc/config/arm/arm_acle.h
index c6c03fdce27..14c28f11b9c 100644
--- a/gcc/config/arm/arm_acle.h
+++ b/gcc/config/arm/arm_acle.h
@@ -829,6 +829,24 @@ __crc32cd (uint32_t __a, uint64_t __b)
 #endif /* __ARM_FEATURE_CRC32  */
 #pragma GCC pop_options
 
+#ifdef __ARM_FEATURE_FMA
+__extension__ extern __inline double
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__fma (double __x, double __y, double __z)
+{
+  return __builtin_fma (__x, __y, __z);
+}
+#endif
+
+#ifdef __ARM_FEATURE_FMA
+__extension__ extern __inline float
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__fmaf (float __x, float __y, float __z)
+{
+  return __builtin_fmaf (__x, __y, __z);
+}
+#endif
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/gcc/testsuite/gcc.target/arm/acle/acle_fma.c b/gcc/testsuite/gcc.target/arm/acle/acle_fma.c
new file mode 100644
index 000..4177ac81f07
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/acle/acle_fma.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a -mfpu=neon-vfpv4 -mfloat-abi=hard" } */
+
+#include "arm_acle.h"
+
+double test_acle_fma (double x, double y, double z)
+{
+  return __fma (x, y, z);
+}
+
+float test_acle_fmaf (float x, float y, float z)
+{
+  return __fmaf (x, y, z);
+}
+
+/* { dg-final { scan-assembler-times "vfma.f64\td\[0-9\]," 1 } } */
+/* { dg-final { scan-assembler-times "vfma.f32\ts\[0-9\]" 1 } } */
-- 
2.43.0



[PATCH] c, c++, v2: Support musttail attribute even using __attribute__ form [PR116545]

2025-03-13 Thread Jakub Jelinek
On Thu, Mar 13, 2025 at 03:52:25PM -0400, Jason Merrill wrote:
> > > How about changing cp_parser_std_attribute_spec_seq in cp_parser_statement
> > > to cp_parser_attributes_opt?
> > 
> > I'd be afraid that would be quite significant change of behavior everywhere,
> > something that C doesn't allow (like mixing std and GNU attributes in any
> > orders or [[]] __attribute__(()) [[]][[]] __attribute__(())
> > expression-statement).  Or it would allow __attribute__(()) on while, do,
> > for, if, switch, ..., again something that wasn't accepted before.
> 
> Do you think those changes are undesirable?  We've previously had to fix
> cases where we were failing to support mixing of std and GNU attributes.

I know, but we've never allowed GNU attributes on most of those, neither
does clang, we don't allow it in C, and with the exception of
fallthrough/assume on an empty statement and musttail on return we currently
don't even have any uses for it.  If we start accepting it, we'd then
need to support it forever.

> Yes, it seems desirable to move all the attribute handling out of
> cp_parser_expression_statement; in the grammar, the attributes aren't part
> of the expression-statement, and it's odd to handle fallthrough in both
> places.

Here is an adjusted patch which just moves the __attribute__((musttail)) return
handling to cp_parser_statement.  So far tested on dg.exp=musttail*.
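
For context, a minimal sketch of what a musttail return looks like in use.
The patch is about accepting the GNU spelling,
__attribute__((musttail)) return f ();, but the sketch deliberately uses the
standard-attribute spelling behind a feature check so that it still compiles
(as an ordinary return) on compilers without musttail support.  The macro
SKETCH_MUSTTAIL and the function sum_to are hypothetical names chosen for
illustration.

```cpp
#include <cassert>
#include <cstdint>

// Pick a spelling the compiler advertises; otherwise fall back to a plain
// return.  SKETCH_MUSTTAIL is a made-up macro for this example only.
#if defined(__has_cpp_attribute)
# if __has_cpp_attribute(clang::musttail)
#  define SKETCH_MUSTTAIL [[clang::musttail]]
# elif __has_cpp_attribute(gnu::musttail)
#  define SKETCH_MUSTTAIL [[gnu::musttail]]
# endif
#endif
#ifndef SKETCH_MUSTTAIL
# define SKETCH_MUSTTAIL
#endif

// Hypothetical example: sums 1..n with an accumulator so that the recursive
// call is in tail position and has the same signature as its caller.
std::uint64_t sum_to (std::uint64_t n, std::uint64_t acc)
{
  if (n == 0)
    return acc;
  SKETCH_MUSTTAIL return sum_to (n - 1, acc + n);
}
```

Where the attribute is honoured, the compiler either emits a tail call or
reports an error; where it is not available the macro expands to nothing and
the function is plain recursion.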

Moving the handling of the GNU assume/fallthrough attributes on an empty
statement out of cp_parser_expression_statement would regress
int bar (int x) { return x; }

void
foo (void)
{
  if (__attribute__(()); true)
;
  if (__attribute__((assume (bar (1; true)
;
//  if (__attribute__((fallthrough)); true)
//;
}
which g++ 13 and 14 accept (fallthrough is commented out because it is
diagnosed with an error).

2025-03-13  Jakub Jelinek  

PR c/116545
gcc/
* doc/extend.texi (musttail statement attribute): Document
that musttail GNU attribute can be used as well.
gcc/c-family/
* c-attribs.cc (c_common_clang_attributes): Add musttail.
gcc/c/
* c-parser.cc (c_parser_declaration_or_fndef): Parse
__attribute__((musttail)) return.
(c_parser_handle_musttail): Diagnose attribute arguments.
(c_parser_statement_after_labels): Parse
__attribute__((musttail)) return.
gcc/cp/
* parser.cc (cp_parser_statement): Parse __attribute__((musttail))
return.
gcc/testsuite/
* c-c++-common/musttail15.c: New test.
* c-c++-common/musttail16.c: New test.
* c-c++-common/musttail17.c: New test.
* c-c++-common/musttail18.c: New test.
* c-c++-common/musttail19.c: New test.
* c-c++-common/musttail20.c: New test.
* c-c++-common/musttail21.c: New test.
* c-c++-common/musttail22.c: New test.
* c-c++-common/musttail23.c: New test.
* c-c++-common/musttail24.c: New test.
* g++.dg/musttail7.C: New test.
* g++.dg/musttail8.C: New test.
* g++.dg/musttail12.C: New test.
* g++.dg/musttail13.C: New test.

--- gcc/doc/extend.texi.jj  2025-03-13 14:04:54.160230977 +0100
+++ gcc/doc/extend.texi 2025-03-13 20:36:28.719373035 +0100
@@ -10241,18 +10241,22 @@ have to optimize it to just @code{return
 @cindex @code{musttail} statement attribute
 @item musttail
 
-The @code{gnu::musttail} or @code{clang::musttail} attribute
-can be applied to a @code{return} statement with a return-value expression
-that is a function call.  It asserts that the call must be a tail call that
-does not allocate extra stack space, so it is safe to use tail recursion
-to implement long running loops.
+The @code{gnu::musttail} or @code{clang::musttail} standard attribute
+or @code{musttail} GNU attribute can be applied to a @code{return} statement
+with a return-value expression that is a function call.  It asserts that the
+call must be a tail call that does not allocate extra stack space, so it is
+safe to use tail recursion to implement long running loops.
 
 @smallexample
 [[gnu::musttail]] return foo();
 @end smallexample
 
+@smallexample
+__attribute__((musttail)) return bar();
+@end smallexample
+
 If the compiler cannot generate a @code{musttail} tail call it will report
-an error. On some targets tail calls may never be supported.
+an error.  On some targets tail calls may never be supported.
 Tail calls cannot reference locals in memory, which may affect
 builds without optimization when passing small structures, or passing
 or returning large structures.  Enabling @option{-O1} or @option{-O2} can
--- gcc/c-family/c-attribs.cc.jj2025-03-13 14:04:54.041232614 +0100
+++ gcc/c-family/c-attribs.cc   2025-03-13 20:36:28.735372814 +0100
@@ -651,7 +651,9 @@ const struct scoped_attribute_specs c_co
 /* Attributes also recognized in the clang:: namespace.  */
 const struct attribute_spec c_common_clang_attributes[] = {
   { "flag_enum", 0, 0, false, true, false, false,
-

[PATCH] libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]

2025-03-13 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk, and 14 after
a while?

-- >8 --

The type tuple<tuple<any>> is clearly copy/move constructible, but for
reasons that are not yet completely understood checking this triggers
constraint recursion with our C++20 tuple implementation (and not the
C++17 implementation).

It turns out this recursion stems from considering the non-template
tuple(const _Elements&) constructor when checking for copy/move
constructibility.  Checking this constructor is of course redundant,
since the defaulted copy/move constructors are better matches.

GCC has a non-standard "perfect candidate" optimization[1] that causes
overload resolution to shortcut considering template candidates if we
find a (non-template) perfect candidate.  So to work around this tuple
bug (and as a general compile-time optimization) this patch turns the
problematic constructor into a template so that GCC doesn't consider it
when checking for copy/move constructibility.

Changing the template-ness of a constructor can affect the outcome of
overload resolution (since template-ness is a tiebreaker) so there's a
risk this change could cause overload resolution ambiguities.  The tuple
constructor set doesn't seem particularly reliant on this tiebreaker
though.
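
The tiebreaker mentioned above can be seen in isolation with a toy class
(Picker and the two helper functions are hypothetical, not libstdc++ code).
When a non-template constructor and a constructor template are equally good
matches, overload resolution prefers the non-template one; the template only
wins when it is a strictly better match.

```cpp
#include <cassert>

// Hypothetical type used only to show the template-ness tiebreaker.
struct Picker
{
  int which;

  // Non-template constructor.
  Picker (const int&) : which (1) { }

  // Constructor template; for an int argument it is an equally good match.
  template<typename T>
  Picker (const T&) : which (2) { }
};

// int argument: both candidates are exact matches, the non-template wins.
inline int pick_for_int () { return Picker (5).which; }

// double argument: the template is an exact match, while the non-template
// constructor needs a conversion, so the template wins.
inline int pick_for_double () { return Picker (5.0).which; }
```

Turning a non-template candidate into a template removes that preference,
which is why the change could in principle alter which constructor is
selected.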

The testcase still fails with Clang (in C++20 mode) since it doesn't
implement said optimization.

PR libstdc++/116440

libstdc++-v3/ChangeLog:

* include/std/tuple (tuple::tuple(const _Elements&...)):
Turn into a template.
* testsuite/20_util/tuple/116440.C: New test.

[1]: See r11-7287-g187d0d5871b1fa and
https://isocpp.org/files/papers/P3606R0.html
---
 libstdc++-v3/include/std/tuple| 14 +
 libstdc++-v3/testsuite/20_util/tuple/116440.C | 29 +++
 2 files changed, 37 insertions(+), 6 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/20_util/tuple/116440.C

diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index 34d790fd6f5..67760471d95 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -966,12 +966,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : _Inherited()
   { }
 
-  constexpr explicit(!__convertible())
-  tuple(const _Elements&... __elements)
-  noexcept(__nothrow_constructible())
-  requires (__constructible())
-  : _Inherited(__elements...)
-  { }
+  // Defined as a template to work around PR libstdc++/116440.
+  template
+   constexpr explicit(!__convertible())
+   tuple(const _Elements&... __elements)
+   noexcept(__nothrow_constructible())
+   requires (__constructible())
+   : _Inherited(__elements...)
+   { }
 
   template
requires (__disambiguating_constraint<_UTypes...>())
diff --git a/libstdc++-v3/testsuite/20_util/tuple/116440.C b/libstdc++-v3/testsuite/20_util/tuple/116440.C
new file mode 100644
index 000..12259134d25
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/tuple/116440.C
@@ -0,0 +1,29 @@
+// PR libstdc++/116440 - std::tuple> does not compile
+// { dg-do compile { target c++17 } }
+
+#include 
+#include 
+#include 
+
+template 
+using TupleTuple = std::tuple>;
+
+struct EmbedAny {
+std::any content;
+};
+
+static_assert(std::is_copy_constructible>::value);
+static_assert(std::is_move_constructible>::value);
+
+static_assert(std::is_copy_constructible>::value);
+static_assert(std::is_move_constructible>::value);
+
+static_assert(std::is_constructible_v>);
+
+struct EmbedAnyWithZeroSizeArray {
+void* pad[0];
+std::any content;
+};
+
+static_assert(std::is_copy_constructible>::value);
+static_assert(std::is_move_constructible>::value);
-- 
2.49.0.rc1.37.ge969bc8759



Re: [PATCH] libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]

2025-03-13 Thread Patrick Palka
On Thu, 13 Mar 2025, Ville Voutilainen wrote:

> On Thu, 13 Mar 2025 at 23:16, Ville Voutilainen
>  wrote:
> >
> > On Thu, 13 Mar 2025 at 23:03, Patrick Palka  wrote:
> > > +  // Defined as a template to work around PR libstdc++/116440.
> > > +  template
> > > +   constexpr explicit(!__convertible())
> > > +   tuple(const _Elements&... __elements)
> >
> > I don't understand how a constructor template declared like this can
> > ever be called. The template parameter pack
> > can't be provided or deduced, and can't have a default. So we're
> > effectively making this signature always lose
> > overload resolution to the one that takes a pack of _UElements&&.
> >
> > Which may be fine. I can't head-compile a test that would fail in that
> > case. If any of the incoming argument isn't one
> > of _Elements, that constructor wins overload resolution anyway. If the
> > incoming arguments are exactly _Elements, that
> > constructor does the same thing as this one. I think.
> 
> Oh, never mind. The pack is just deduced as an empty pack.

Yep that's my understanding, though I don't know where in the standard
this is specified, a quick Ctrl+F is failing me.

I can use template or template if that's
preferred :)
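
A small sketch of the behaviour discussed here, using made-up types (Holder
and HolderDefaulted are hypothetical, not the tuple code).  A trailing
template parameter pack that cannot be deduced from the call arguments is
deduced as empty, so such a constructor template is still callable; as far
as I can tell the wording for this is in [temp.arg.explicit], but treat that
reference as an assumption.  The second type shows the alternative of a
defaulted template parameter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type: the pack appears nowhere in the parameter list, so it
// cannot be deduced from the argument and is deduced as an empty pack.
struct Holder
{
  int value;
  std::size_t pack_size;

  template<typename... Unused>
  explicit Holder (const int& v)
  : value (v), pack_size (sizeof... (Unused))
  { }
};

// Alternative spelling with a defaulted template parameter, which makes it
// more obvious how the constructor can be called.
struct HolderDefaulted
{
  int value;

  template<typename = void>
  explicit HolderDefaulted (const int& v) : value (v) { }
};
```

Both constructors are templates, so neither counts as a non-template
"perfect candidate"; they differ only in how readable the declaration is.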



Re: [PATCH 2/3] Aarch64: Add __sqrt and __sqrtf intrinsics to arm_acle.h

2025-03-13 Thread Jakub Jelinek
On Thu, Mar 13, 2025 at 05:23:00PM -0400, Ayan Shafqat wrote:
> This patch introduces two new inline functions, __sqrt and __sqrtf, in
> arm_acle.h for Aarch64 targets. These functions wrap the new builtins
> __builtin_aarch64_sqrtdf and __builtin_aarch64_sqrtsf, respectively,
> providing direct access to hardware instructions without relying on the
> standard math library or optimization levels.
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/arm_acle.h (__sqrt, __sqrtf): New function.
> 
> Signed-off-by: Ayan Shafqat 
> ---
>  gcc/config/aarch64/arm_acle.h | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
> index 7976c117daf..d972a4e7e7e 100644
> --- a/gcc/config/aarch64/arm_acle.h
> +++ b/gcc/config/aarch64/arm_acle.h
> @@ -118,6 +118,20 @@ __revl (unsigned long __value)
>  return __rev (__value);
>  }
>  
> +__extension__ extern __inline double
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__sqrt(double  __x)

Just formatting nits, there should be space in between the function name
and ( and only one space between double and __x.

Also, it is unclear why it uses __extension__ (but admittedly it is used
elsewhere in the header).

> +{
> +return __builtin_aarch64_sqrtdf (__x);

Just two space indentation rather than 4 spaces.

> +}
> +
> +__extension__ extern __inline float
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__sqrtf(float __x)

See above

> +{
> +return __builtin_aarch64_sqrtsf (__x);

Ditto

Jakub



Re: [PATCH] libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]

2025-03-13 Thread Jonathan Wakely
On Thu, 13 Mar 2025 at 21:29, Patrick Palka  wrote:
>
> On Thu, 13 Mar 2025, Ville Voutilainen wrote:
>
> > On Thu, 13 Mar 2025 at 23:16, Ville Voutilainen
> >  wrote:
> > >
> > > On Thu, 13 Mar 2025 at 23:03, Patrick Palka  wrote:
> > > > +  // Defined as a template to work around PR libstdc++/116440.
> > > > +  template
> > > > +   constexpr explicit(!__convertible())
> > > > +   tuple(const _Elements&... __elements)
> > >
> > > I don't understand how a constructor template declared like this can
> > > ever be called. The template parameter pack
> > > can't be provided or deduced, and can't have a default. So we're
> > > effectively making this signature always lose
> > > overload resolution to the one that takes a pack of _UElements&&.
> > >
> > > Which may be fine. I can't head-compile a test that would fail in that
> > > case. If any of the incoming argument isn't one
> > > of _Elements, that constructor wins overload resolution anyway. If the
> > > incoming arguments are exactly _Elements, that
> > > constructor does the same thing as this one. I think.
> >
> > Oh, never mind. The pack is just deduced as an empty pack.
>
> Yep that's my understanding, though I don't know where in the standard
> this is specified, a quick Ctrl+F is failing me.
>
> I can use template or template if that's
> preferred :)

I would prefer template to the empty pack, I think
the default template argument makes it a little more obvious how that
constructor can be called (I'm sure Ville won't be the only one to
raise an eyebrow at that).

Thanks for figuring this out, and noticing that the template-ness
of that constructor is what changed between C++17 and C++20. I think
when I re-implemented it using concepts I assumed the template-ness
was there for the _ImplicitCtor / _ExplicitCtor stuff, which is done
using explicit(bool) in C++20. I wasn't looking at the tuple(const
_Elements&...) constructor at all, because the errors all pointed to
tuple(_UTypes&&...).

Do we also want to constrain the tuple(const _Elements&...)
constructor with requires sizeof...(_Elements) >= 1, which is present
on the C++17 version?



Re: [PATCH] libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]

2025-03-13 Thread Ville Voutilainen
On Thu, 13 Mar 2025 at 23:57, Jonathan Wakely  wrote:
> > Do we also want to constraint the tuple(const _Elements&...)
> > constructor with requires sizeof...(_Elements) >= 1, which is present
> > on the C++17 version?
>
> Oh we don't need that constraint, because we have an explicit
> specialization for tuple<>.

Are.. ..you sure? What about CTAD? That will look into the primary
template only, and will perform its madness based on that
and nothing else.


Re: [PATCH] libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]

2025-03-13 Thread Ville Voutilainen
On Fri, 14 Mar 2025 at 00:03, Ville Voutilainen
 wrote:
>
> On Thu, 13 Mar 2025 at 23:57, Jonathan Wakely  wrote:
> > > Do we also want to constraint the tuple(const _Elements&...)
> > > constructor with requires sizeof...(_Elements) >= 1, which is present
> > > on the C++17 version?
> >
> > Oh we don't need that constraint, because we have an explicit
> > specialization for tuple<>.
>
> Are.. ..you sure? What about CTAD? That will look into the primary
> template only, and will perform its madness based on that
> and nothing else.

Oh, but CTAD would fail if there's no arguments passed.

So.. maybe we never needed that constraint, and the only reason I
added it was not trusting my (or anyone else's) understanding
of partial ordering and overload resolution.


Re: COBOL: testsuite and running NIST85 (was: Re: [PATCH][v3] Simple cobol.dg testsuite)

2025-03-13 Thread Richard Biener
On Thu, 13 Mar 2025, Simon Sobisch wrote:

> 
> Am 13.03.2025 um 12:49 schrieb Richard Biener:
> > On Thu, 13 Mar 2025, Sam James wrote:
> > 
> >> Simon Sobisch  writes:
> >>
> >>> Thanks for your work on adding a testsuite. Can you please explain why
> >>> you do this when a complete testsuite exists in autoconf (autotest)
> >>> format (which roots back to decade of work in GnuCOBOL, with all
> >>> copyrights for that already with the FSF)?
> >>>
> >>
> >> I don't think any of us were aware of it ("we" being "the general GCC
> >> developer community", not the COBOL folks, for the purposes of this
> >> email) until yesterday when richi mused about it on IRC maybe existing
> >> and we went looking out of curiosity.
> >>
> >> I agree that having that testsuite integrated would be fantastic.
> >>
> >>> Is the existence of this in upstream [1] just unknown (because it was
> >>> not part of the initial patches [for reasons I not understood])?
> >>>
> >>
> >> I would've personally liked to see the NIST testsuite integration at
> >> least in the initial patches, but it is what it is. I don't think the
> >> GnuCOBOL testsuite was brought up at all (and I think most of us weren't
> >> aware of it) in the patch upstreaming discussions.
> >>
> >> Now that we *are* aware of it, it seems desirable to have for sure.
> >>
> >>> Is the format such a big issue (note: previous discussions elaborated
> >>> "a test suite is very important and other frontends also use a
> >>> framework other than dejagnu)?
> >>>
> >>> If dejagnu is the way to go:
> >>>
> >>> * Shouldn't there be deprecation of autotest in autoconf (of course
> >>>only if that preference is also outside of gcc)?
> >>
> >> It's a GCC / GNU toolchain-only preference because it allows easily
> >> doing cross + simulator testing, and all of our tools are used to its
> >> format.
> > 
> > That's indeed the main reason.
> 
> Thanks for the explanation. That's totally fine.
> 
> > 
> >> It's definitely not perfect. Years ago (way before I followed GCC),
> >> there was talk of replacing dejagnu, just efforts failed.
> >>
> >>>
> >>> * Shouldn't there be a (at least semi automated) script / migration
> >>>tool (at least for this specific time in place to convert the "UAT"
> >>>once into dejagnu format)?
> >>
> >> Yes. Having testsuite integration is seen as critical at this
> >> point. richi just wanted to present this as a non-COBOL person to give
> >> us something to play with.
> > 
> > Yes, and to give people familiar with how GCC tests are done a place
> > to put regression tests going forward.
> > 
> > I do think that integrating the testsuites the COBOLworx folks have
> > is important and of course integrating tests from GNU Cobol is desirable
> > as well.  Whether we can or want to integrate tests based on autotest
> > is another question - I'd probably avoid that, even as short-term
> > solution, as such tend to stay forever.
> 
> I agree. Note: COBOLworx started by using the GnuCOBOL testsuite; even with
> the current UAT's state it would be a lot of manual work to re-synchronize
> them, so going one step further to dejagnu seems to not make it much harder
> either.
> It will definitely be useful if the "original test file names" (like
> run_subscripts.at, or at least run_subscripts) are kept somewhere - a comment
> like "auto-translated from run_subscripts.at" is enough - and maybe they can
> stay in one file each (I don't know enough about dejagnu to comment on that).
> 
> The main point is that it seems most reasonable to convert those files into
> dejagnu format once (so obviously a "script working good enough, not
> installed" comes into mind), instead of writing it from scratch.

Definitely.

> > What would be nice is to have a common separate test harness you can
> > test an installed compiler against - I'm not sure whether the GNU
> > Cobol test harness or the COBOLworx one qualifies here.  The NIST
> > one probably does, but it seems to require "plumbing" that's not
> > part of NIST and that, in implementation, might differ from GNU Cobol
> > to COBOLworx.
> 
> That's a good opportunity to be picky: it is GnuCOBOL (one word, COBOL in
> upper-case) :-)
> 
> And yes: a common separate test harness is most reasonable and that's exactly
> what the idea of NIST was.
> If you ever wonder: GnuCOBOL uses make (with one sub-directory per "Module")
> along with perl [2].
> This allows to not only do testing (or just extraction of the files) along
> with counting and tracking time, but also to automate some of the required
> "needs manual inspection".
> 
> And given gcobc, I'd argue that gcobol should not fail the following (and
> ideally show its superior compile and run time):
> 
> $> tar -xvf gnucobol-3.*.tar.*
> $> cd gnucobol-3.*/
> $> ./configure  # for automake and autoconf doing the setup
> $> cd tests/cobol85
> $> make test COBC=gcobc-15
> 
> ... just tried that:
> gcobol: error: unrecognized command-line option ‘-std=cobol85’
> 
> --> seems 

Re: [PATCH][v3] Simple cobol.dg testsuite

2025-03-13 Thread Richard Biener
On Thu, 13 Mar 2025, Sam James wrote:

> Simon Sobisch  writes:
> 
> > Thanks for your work on adding a testsuite. Can you please explain why
> > you do this when a complete testsuite exists in autoconf (autotest)
> > format (which roots back to decade of work in GnuCOBOL, with all
> > copyrights for that already with the FSF)?
> >
> 
> I don't think any of us were aware of it ("we" being "the general GCC
> developer community", not the COBOL folks, for the purposes of this
> email) until yesterday when richi mused about it on IRC maybe existing
> and we went looking out of curiosity.
> 
> I agree that having that testsuite integrated would be fantastic.
> 
> > Is the existence of this in upstream [1] just unknown (because it was
> > not part of the initial patches [for reasons I not understood])?
> >
> 
> I would've personally liked to see the NIST testsuite integration at
> least in the initial patches, but it is what it is. I don't think the
> GnuCOBOL testsuite was brought up at all (and I think most of us weren't
> aware of it) in the patch upstreaming discussions.
> 
> Now that we *are* aware of it, it seems desirable to have for sure.
> 
> > Is the format such a big issue (note: previous discussions elaborated
> > "a test suite is very important and other frontends also use a
> > framework other than dejagnu)?
> >
> > If dejagnu is the way to go:
> >
> > * Shouldn't there be deprecation of autotest in autoconf (of course
> >   only if that preference is also outside of gcc)?
> 
> It's a GCC / GNU toolchain-only preference because it allows easily
> doing cross + simulator testing, and all of our tools are used to its
> format.

That's indeed the main reason.

> It's definitely not perfect. Years ago (way before I followed GCC),
> there was talk of replacing dejagnu, just efforts failed.
> 
> >
> > * Shouldn't there be a (at least semi automated) script / migration
> >   tool (at least for this specific time in place to convert the "UAT"
> >   once into dejagnu format)?
> 
> Yes. Having testsuite integration is seen as critical at this
> point. richi just wanted to present this as a non-COBOL person to give
> us something to play with.

Yes, and to give people familiar with how GCC tests are done a place
to put regression tests going forward.

I do think that integrating the testsuites the COBOLworx folks have
is important and of course integrating tests from GNU Cobol is desirable
as well.  Whether we can or want to integrate tests based on autotest
is another question - I'd probably avoid that, even as short-term 
solution, as such tend to stay forever.

What would be nice is to have a common separate test harness you can
test an installed compiler against - I'm not sure whether the GNU
Cobol test harness or the COBOLworx one qualifies here.  The NIST
one probably does, but it seems to require "plumbing" that's not
part of NIST and that, in implementation, might differ from GNU Cobol
to COBOLworx.

Richard.

> >
> >
> >
> > Thanks for giving me some context on this,
> > Simon
> >
> >
> > [1]:
> > https://gitlab.cobolworx.com/COBOLworx/gcc-cobol/-/tree/master+cobol/gcc/cobol/UAT
> 
> thanks,
> sam
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Fix invalid profile mismatch error

2025-03-13 Thread Jan Hubicka
Hi,
this patch fixes a false inconsistent-profile error message seen when building
SPEC with -fprofile-use -fdump-ipa-profile.
The problem is that with dumping enabled, tree_estimate_probability is run in
dry-run mode to report the success rates of heuristics.  It nevertheless runs
determine_unlikely_bbs, which overwrites some counts with profile_count::zero,
and value profiling later sees the mismatch.

With a sane profile, determine_unlikely_bbs should almost always be a no-op,
since it should only drop to 0 things that are known to be unlikely executed.
What happens here is that there is a comdat whose profile is lost, so we see a
call with a non-zero count calling a function with a zero count, and we "fix"
the profile by giving the call a zero count, too.

I also extended the unlikely predicates to avoid tampering with predictions when
the prediction is believed to be reliable.  This also keeps us from dropping all
EH regions to 0 count, as tested by the testcase.

Jeff, I think we can use the same logic as in isolate-paths and detect
statements which will trigger undefined behavior or erroneous
termination in unlikely_executed_stmt_p.  Are you OK with exporting
stmt_uses_0_or_null_in_undefined_way and using it in
unlikely_executed_stmt_p?

Do we have other ways to detect that given STMT will very likely not be
executed?

Bootstrapped/regtested x86_64-linux, will commit it shortly.

gcc/ChangeLog:

* predict.cc (unlikely_executed_edge_p): Ignore EDGE_EH if profile
is reliable.
(unlikely_executed_stmt_p): Special-case builtin_trap/unreachable and
ignore other heuristics for reliable profiles.
(tree_estimate_probability): Disable unlikely bb detection when
doing dry run.

gcc/testsuite/ChangeLog:

* g++.dg/tree-prof/eh1.C: New test.

diff --git a/gcc/predict.cc b/gcc/predict.cc
index ef31c48bfe2..eefff5908f9 100644
--- a/gcc/predict.cc
+++ b/gcc/predict.cc
@@ -245,7 +245,10 @@ unlikely_executed_edge_p (edge e)
 {
   return (e->src->count == profile_count::zero ()
  || e->probability == profile_probability::never ())
-|| (e->flags & (EDGE_EH | EDGE_FAKE));
+|| (e->flags & EDGE_FAKE)
+/* If we read profile and know EH edge is executed, trust it.
+   Otherwise we consider EH edges never executed.  */
+|| ((e->flags & EDGE_EH) && !e->probability.reliable_p ());
 }
 
 /* Return true if edge E of function FUN is probably never executed.  */
@@ -830,6 +833,26 @@ unlikely_executed_stmt_p (gimple *stmt)
 {
   if (!is_gimple_call (stmt))
 return false;
+
+  /* Those calls are inserted by optimizers when code is known to be
+ unreachable or undefined.  */
+  if (gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE)
+  || gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE_TRAP)
+  || gimple_call_builtin_p (stmt, BUILT_IN_TRAP))
+return false;
+
+  /* Checks below do not need to be fully reliable.  Cold attribute may be
+ misplaced by user and in the presence of comdat we may result in call to
+ function with 0 profile having non-zero profile.
+
+ We later detect that profile is lost and will drop the profile of the
+ comdat.
+
+ So if we think profile count is reliable, do not try to apply these
+ heuristics.  */
+  if (gimple_bb (stmt)->count.reliable_p ()
+  && gimple_bb (stmt)->count.nonzero_p ())
+return gimple_bb (stmt)->count == profile_count::zero ();
   /* NORETURN attribute alone is not strong enough: exit() may be quite
  likely executed once during program run.  */
   if (gimple_call_fntype (stmt)
@@ -3269,7 +3292,8 @@ tree_estimate_probability (bool dry_run)
   calculate_dominance_info (CDI_POST_DOMINATORS);
   /* Decide which edges are known to be unlikely.  This improves later
  branch prediction. */
-  determine_unlikely_bbs ();
+  if (!dry_run)
+determine_unlikely_bbs ();
 
   bb_predictions = new hash_map;
   ssa_expected_value = new hash_map, expected_value>;
diff --git a/gcc/testsuite/g++.dg/tree-prof/eh1.C b/gcc/testsuite/g++.dg/tree-prof/eh1.C
new file mode 100644
index 000..10a35968dc4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-prof/eh1.C
@@ -0,0 +1,34 @@
+/* { dg-options "-O3 -fdump-ipa-profile-details -fno-inline -fdump-tree-fixup_cfg3-details -fdump-tree-optimized-details" } */
+char a[1];
+char b[1];
+int sz = 1000;
+
+__attribute__((noipa))
+ void test2 ()
+{
+  throw (sz);
+}
+void
+test ()
+{
+  try
+  {
+test2 ();
+  }
+  catch (int v)
+  {
+__builtin_memcpy (b, a, v);
+  }
+}
+int
+main ()
+{
+  for (int i = 0; i < 10; i++)
+test ();
+}
+/* { dg-final-use-not-autofdo { scan-ipa-dump-times "Average value sum:1" 2 "profile" } } */
+/* 1 zero count for resx block.  */
+/* { dg-final-use-not-autofdo { scan-tree-dump-times "count: 0" 1 "fixup_cfg3" } } */
+/* 2 zero count for resx block and return block since return gets duplicated by tracer.  */
+/* { dg-final-use-not-autofdo { scan-tree-dump-times "count: 0" 2 "optimized" 
} 

[PATCH] libstdc++: Implement P3137R3 views::to_input for C++26

2025-03-13 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

-- >8 --

libstdc++-v3/ChangeLog:

* include/bits/version.def (ranges_to_input): Define.
* include/bits/version.h: Regenerate.
* include/std/ranges (ranges::to_input_view): Define for C++26.
(views::__detail::__can_to_input): Likewise.
(views::_ToInput, views::to_input): Likewise.
* testsuite/std/ranges/adaptors/to_input/1.cc: New test.
---
 libstdc++-v3/include/bits/version.def |   8 +
 libstdc++-v3/include/bits/version.h   |  10 ++
 libstdc++-v3/include/std/ranges   | 170 ++
 .../std/ranges/adaptors/to_input/1.cc |  58 ++
 4 files changed, 246 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/std/ranges/adaptors/to_input/1.cc

diff --git a/libstdc++-v3/include/bits/version.def 
b/libstdc++-v3/include/bits/version.def
index 2af5a54bff2..c2b5283df89 100644
--- a/libstdc++-v3/include/bits/version.def
+++ b/libstdc++-v3/include/bits/version.def
@@ -1910,6 +1910,14 @@ ftms = {
   };
 };
 
+ftms = {
+  name = ranges_to_input;
+  values = {
+v = 202502;
+cxxmin = 26;
+  };
+};
+
 ftms = {
   name = to_string;
   values = {
diff --git a/libstdc++-v3/include/bits/version.h 
b/libstdc++-v3/include/bits/version.h
index 9833023cfdc..775c8642139 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -2120,6 +2120,16 @@
 #endif /* !defined(__cpp_lib_text_encoding) && 
defined(__glibcxx_want_text_encoding) */
 #undef __glibcxx_want_text_encoding
 
+#if !defined(__cpp_lib_ranges_to_input)
+# if (__cplusplus >  202302L)
+#  define __glibcxx_ranges_to_input 202502L
+#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_ranges_to_input)
+#   define __cpp_lib_ranges_to_input 202502L
+#  endif
+# endif
+#endif /* !defined(__cpp_lib_ranges_to_input) && 
defined(__glibcxx_want_ranges_to_input) */
+#undef __glibcxx_want_ranges_to_input
+
 #if !defined(__cpp_lib_to_string)
 # if (__cplusplus >  202302L) && _GLIBCXX_HOSTED && (__glibcxx_to_chars)
 #  define __glibcxx_to_string 202306L
diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index e21f5284b46..dd97d276ef0 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -69,6 +69,7 @@
 #define __glibcxx_want_ranges_slide
 #define __glibcxx_want_ranges_stride
 #define __glibcxx_want_ranges_to_container
+#define __glibcxx_want_ranges_to_input
 #define __glibcxx_want_ranges_zip
 #include <bits/version.h>
 
@@ -10390,6 +10391,175 @@ namespace ranges
 } // namespace ranges
 #endif // __cpp_lib_ranges_cache_latest
 
+#if __cpp_lib_ranges_to_input // C++ >= 26
+namespace ranges
+{
+  template<input_range _Vp>
+requires view<_Vp>
+  class to_input_view : public view_interface<to_input_view<_Vp>>
+  {
+_Vp _M_base = _Vp();
+
+template<bool _Const>
+class _Iterator;
+
+  public:
+to_input_view() requires default_initializable<_Vp> = default;
+
+constexpr explicit
+to_input_view(_Vp __base)
+: _M_base(std::move(__base))
+{ }
+
+constexpr _Vp
+base() const & requires copy_constructible<_Vp>
+{ return _M_base; }
+
+constexpr _Vp
+base() &&
+{ return std::move(_M_base); }
+
+constexpr auto
+begin() requires (!__detail::__simple_view<_Vp>)
+{ return _Iterator<false>(ranges::begin(_M_base)); }
+
+constexpr auto
+begin() const requires range<const _Vp>
+{ return _Iterator<true>(ranges::begin(_M_base)); }
+
+constexpr auto
+end() requires (!__detail::__simple_view<_Vp>)
+{ return ranges::end(_M_base); }
+
+constexpr auto
+end() const requires range<const _Vp>
+{ return ranges::end(_M_base); }
+
+constexpr auto
+size() requires sized_range<_Vp>
+{ return ranges::size(_M_base); }
+
+constexpr auto
+size() const requires sized_range<const _Vp>
+{ return ranges::size(_M_base); }
+  };
+
+  template<typename _Range>
+to_input_view(_Range&&) -> to_input_view<views::all_t<_Range>>;
+
+  template<input_range _Vp>
+requires view<_Vp>
+  template<bool _Const>
+  class to_input_view<_Vp>::_Iterator
+  {
+using _Base = __maybe_const_t<_Const, _Vp>;
+
+iterator_t<_Base> _M_current = iterator_t<_Base>();
+
+constexpr explicit
+_Iterator(iterator_t<_Base> __current)
+: _M_current(std::move(__current))
+{ }
+
+friend to_input_view;
+friend _Iterator;
+
+  public:
+using difference_type = range_difference_t<_Base>;
+using value_type = range_value_t<_Base>;
+using iterator_concept = input_iterator_tag;
+
+_Iterator() requires default_initializable<iterator_t<_Base>> = default;
+
+_Iterator(_Iterator&&) = default;
+_Iterator& operator=(_Iterator&&) = default;
+
+constexpr
+_Iterator(_Iterator<!_Const> __i)
+  requires _Const && convertible_to<iterator_t<_Vp>, iterator_t<_Base>>
+: _M_current(std::move(__i._M_current))
+{ }
+
+constexpr iterator_t<_Base>
+base() &&
+{ return std::move(_M_current); }
+
+constexpr const iterator_t<_Base>&
+base() const & noexcept
+{ return _M_current; }
+
+constexpr decltype(auto)
+
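
A minimal sketch of how user code might probe for this feature, assuming only the macro name and value that the version.def hunk above defines. The helper name `ranges_to_input_version` is hypothetical and is not part of the patch:

```cpp
#include <version>   // library feature-test macros (C++20 header)

// Hypothetical helper: reports the value of __cpp_lib_ranges_to_input when
// the library defines it, and 0 when views::to_input is not available in
// the current language mode or library.
constexpr long
ranges_to_input_version ()
{
#ifdef __cpp_lib_ranges_to_input
  return __cpp_lib_ranges_to_input;
#else
  return 0;
#endif
}
```

With the patch applied and a C++26 language mode, this would be expected to report 202502; elsewhere it simply reports 0, so callers can fall back to a plain view.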

Re: [PATCH][RFC] add -[DU]_FORTIFY_SOURCE[=n] to DW_AT_producer

2025-03-13 Thread Richard Biener
On Thu, 13 Mar 2025, Jakub Jelinek wrote:

> On Thu, Mar 13, 2025 at 03:44:21PM +0100, Richard Biener wrote:
> > +  case OPT_D:
> > +  case OPT_U:
> > +   if (strncmp (options[i].arg, "_FORTIFY_SOURCE",
> > +strlen ("_FORTIFY_SOURCE")) == 0)
> 
> I'd say you want to verify that after that substring there is either
> '\0' or "=".
> Otherwise you'll record -D_FORTIFY_SOURCE_NOT_REALLY=1 which doesn't
> matter at all.

I had that first and thought it wasn't worth the cycles, but I can
surely add that (and thus also separate -U and -D handling).
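
A minimal sketch of the stricter check suggested above, written as a standalone helper rather than as the actual change to GCC. The name `is_fortify_source_arg` is hypothetical:

```cpp
#include <cstring>

// Hypothetical helper: true only if ARG names the _FORTIFY_SOURCE macro
// itself, i.e. the prefix is followed by the end of the string or by '='.
static bool
is_fortify_source_arg (const char *arg)
{
  static const char name[] = "_FORTIFY_SOURCE";
  const std::size_t len = sizeof (name) - 1;   // length without the NUL

  if (std::strncmp (arg, name, len) != 0)
    return false;
  return arg[len] == '\0' || arg[len] == '=';
}
```

With this shape, "_FORTIFY_SOURCE" and "_FORTIFY_SOURCE=2" are accepted, while "_FORTIFY_SOURCE_NOT_REALLY=1" is rejected.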

Richard.

> > + {
> > +   switches.safe_push (options[i].orig_option_with_args_text);
> > +   len += strlen (options[i].orig_option_with_args_text) + 1;
> > + }
> > +   /* Otherwise ignore these. */
> > +   continue;
> >case OPT_flto_:
> > {
> >   const char *lto_canonical = "-flto";
> 
> Otherwise LGTM.
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-03-13 Thread Qing Zhao


> On Mar 12, 2025, at 18:46, Yeoul Na  wrote:
> 
> 
> 
>> On Mar 12, 2025, at 3:40 PM, Bill Wendling  wrote:
>> 
>> On Wed, Mar 12, 2025 at 3:28 PM Yeoul Na  wrote:
 On Mar 12, 2025, at 2:51 PM, John McCall  wrote:
 
 On 12 Mar 2025, at 16:02, Bill Wendling wrote:
> Qing pointed out in four lines of code how there are two different
> token resolution rules being used: one which is reliant upon C's
> current scoping rules and the other which requires a completely new
> scoping rule. This is no longer a question about what a programmer is
> most likely to infer what is meant or if the documentation makes it
> clear what's happening or any other ambiguous issues that me and
> others put forth before. This is a serious problem. There are
> essentially two options we have:
> 
> 1. Create a proposal to the C standards committee adding an "instance
> scope" into the language and use that for the feature, or
> 2. Find some other way, which doesn't require modifying the base language.
 
 We are actually planning to write a paper for the C committee, so if
 you’re willing to wait for that, perhaps that’s the right path forward.
 
 John.
>>> 
>>> Let me clarify. We never proposed to change the base language. Please check 
>>> my earlier response to Qing:
>>> 
 Yes, I read it and thanks again for writing it up! One clarification 
 related to that is -fbounds-safety never proposed to change the existing C 
 behavior. It’s adding an instantiation scope for the bounds annotations 
 only so members can be accessed within that context, this is similar in a 
 way to your __self proposal in that it’s introducing an instantiation 
 scope so that it can be only accessed through the __self syntax.
>> 
>> I don't think any of us intended to change the base language, but from
>> my reading of Qing's writeup we are. And I don't agree that __self is
>> also adding an instance scope. It's a way to reference the current
>> object, similar to 'this' in C++. Any reference to members within that
>> object must be resolved by reference: __self.member, etc. The members
>> aren't resolved by resorting to a scoping rule.
> 
> I don’t think so, we are changing only if we also changed how VLAs are 
> handled. Qing also wrote this:
> 
>> You can argue to only add the new variable scope for counted_by attribute,
>> not for VLA, then how to handle the following case:
>> 
>> [opc@qinzhao~]$ cat t8.c
>> void boo (int k)
>> {
>>  const int n = 10; // a local variable n
>>  struct foo {
>>int n; // a member variable n
>>int a[n + 10];  // for VLA, this n refers to the local variable n.
>>char *b __attribute__ ((counted_by(n + 10)))
>>  // for counted_by, this n refers to the member variable n.
>>  };
>> }
>> 
>> This will be a disaster.  
>> 
> 

The above example shows a very clear conflict between the default C
scoping rules (only global scope and local scope) and the additional
instance scope introduced by the current counted_by syntax, which really
is a disaster that we should avoid.

Both GCC and Clang made a mistake in the beginning when designing the
syntax for counted_by. It’s still not too late to correct this.

> This should be handled by diagnostics.

No, diagnostics are not the right way to correct this design mistake. I
think we should change the syntax to avoid such confusion.

Qing


> 
> 
>> 
 Then, the question would be potential confusions caused by inconsistency 
 with how array sizes are handled. However, trying to make it consistent 
 with arrays could actually make the counted_by itself more error-prone and 
 confusing because they are inherently different: arrays can never 
 reference members while counted_by should be able to. I’ll share our full 
 reasoning with examples in the RFC that I’ll post soon.
>> 
>> This is exactly why we're suggesting the '__self' (or maybe another
>> idea). :-) It removes all inconsistencies without adding a new scoping
>> rule to the language. However, as I said in my last email, we're past
>> worrying about user confusion because we really shouldn't change the
>> base language (adding a new scoping rule) in ways that haven't yet
>> been approved by the standards committee.
> 
> And yes, again, I don’t oppose to introducing ‘__self’ or some other scope 
> specifiers so that you can write it more clearly when you should. But it 
> doesn’t have to be the default way of writing to most of the code where we 
> don’t have such obscure ambiguous code.
> 
> Cheers,
> Yeoul
> 
> 
>> 
 And the confusing code like below should be able to be handled by 
 diagnostics. The code like this is already confusing to readers and 
 writers of this code no matter which approach we will take and how clear 
 it would be for the compiler and the standard.
 
 void boo (int k)
 {
 const int n = 10; // a local variab

Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-03-13 Thread JeanHeyd Meneide
On Thu, Mar 13, 2025 Qing Zhao  wrote:

> ...
>
> Is N3188 the following:
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3188.htm
>
> What’s the status of this proposal?


 N3188 was discussed during the January 2024 Meeting in Strasbourg,
France. There was "along the lines" (opinion poll) consensus for more work
to be done:

> *Straw poll (opinion):* Would WG14 like to add something along the lines
of N3188 to C2y?
> 10 / 3 / 5

 You can read the minutes here:
https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3227.htm#59-n3188-identifying-array-length-state

Best Wishes,
JeanHeyd


[PATCH] doc: document incremental LTO flags

2025-03-13 Thread Michal Jires
This adds missing documentation for LTO flags.

Ok?

gcc/ChangeLog:

* doc/invoke.texi: (Optimize Options):
Add incremental LTO flags.
---
 gcc/doc/invoke.texi | 26 +++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4fbb4cda101..3efc6602898 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -601,7 +601,8 @@ Objective-C and Objective-C++ Dialects}.
 -floop-block  -floop-interchange  -floop-strip-mine
 -floop-unroll-and-jam  -floop-nest-optimize
 -floop-parallelize-all  -flra-remat  -flto  -flto-compression-level
--flto-partition=@var{alg}  -fmalloc-dce -fmerge-all-constants
+-flto-partition=@var{alg} -flto-incremental=@var{path}
+-flto-incremental-cache-size=@var{n} -fmalloc-dce -fmerge-all-constants
 -fmerge-constants  -fmodulo-sched  -fmodulo-sched-allow-regmoves
 -fmove-loop-invariants  -fmove-loop-stores  -fno-branch-count-reg
 -fno-defer-pop  -fno-fp-int-builtin-inexact  -fno-function-cse
@@ -15086,8 +15087,10 @@ Specify the partitioning algorithm used by the 
link-time optimizer.
 The value is either @samp{1to1} to specify a partitioning mirroring
 the original source files or @samp{balanced} to specify partitioning
 into equally sized chunks (whenever possible) or @samp{max} to create
-new partition for every symbol where possible.  Specifying @samp{none}
-as an algorithm disables partitioning and streaming completely.
+new partition for every symbol where possible or @samp{cache} to
+balance chunk sizes while keeping related symbols together for better
+caching in incremental LTO.  Specifying @samp{none} as an algorithm
+disables partitioning and streaming completely.
 The default value is @samp{balanced}. While @samp{1to1} can be used
 as an workaround for various code ordering issues, the @samp{max}
 partitioning is intended for internal testing only.
@@ -15095,6 +15098,23 @@ The value @samp{one} specifies that exactly one 
partition should be
 used while the value @samp{none} bypasses partitioning and executes
 the link-time optimization step directly from the WPA phase.
 
+@opindex flto-incremental
+@item -flto-incremental=@var{path}
+Enable incremental LTO, with its cache in given existing directory.
+Can significantly shorten edit-compile cycles with LTO.
+
+When used with LTO (@option{-flto}), the output of translation units
+inside LTO is cached. Cached translation units are likely to be
+encountered again when recompiling with small code changes, leading to
+recompile time reduction.
+
+Multiple GCC instances can use the same cache in parallel.
+
+@opindex flto-incremental-cache-size
+@item -flto-incremental-cache-size=@var{n}
+Specifies number of cache entries in incremental LTO after which to prune
+old entries. This is a soft limit, temporarily there may be more entries.
+
 @opindex flto-compression-level
 @item -flto-compression-level=@var{n}
 This option specifies the level of compression used for intermediate
-- 
2.48.1



Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-03-13 Thread Martin Uecker
On Thursday, 13.03.2025 at 15:41 +, Qing Zhao wrote:
> 
> > On Mar 12, 2025, at 12:40, Martin Uecker  wrote:
> > 
> > On Wednesday, 12.03.2025 at 16:20 +, Qing Zhao wrote:
> > > 
> > > > On Mar 10, 2025, at 15:34, Martin Uecker  wrote:
> > > > 
> > > > On Monday, 10.03.2025 at 15:00 -0400, John McCall wrote:
> > > > > 
> > > > 
> > > > ...
> > > > 
> > > > > That said, my preference is still to just give preference to the 
> > > > > field name,
> > > > > which sidesteps any need for disambiguation syntax and avoids this 
> > > > > whole
> > > > > problem where structs can be broken by just adding a global variable 
> > > > > that
> > > > > happens to collide with a field.
> > > > 
> > > > I don't think it is a good idea when the 'n' in 'buf' refers to the
> > > > previous global 'n' coming before and the 'n' in attribute 
> > > > refers to a member 'n' coming later in the following example.
> > > > 
> > > > constexpr int n = 1;
> > > > 
> > > > struct foo {
> > > > char *p [[gnu::counted_by(n)]];
> > > > char buf[n];
> > > > int n;
> > > > };
> > > > 
> > > > How are you going to explain this to anyone?
> > > > 
> > > > 
> > > > And neither global names nor struct members may always be under
> > > > the control of the programmer.  Also that simply bringing
> > > > a new identifier into scope can break code elsewhere worries me.
> > > > 
> > > > 
> > > > Finally, the following does not even compile in C++.
> > > > 
> > > > struct foo {
> > > > char buf[n];
> > > > const static int n = 2;
> > > > };
> > > > 
> > > > While the next example is also ok in C++.
> > > > 
> > > > constexpr int n = 2;
> > > > 
> > > > struct foo {
> > > > char buf[n];
> > > > };
> > > > 
> > > > With both declarations of 'n' the example has UB in C++. 
> > > > So I am not convinced the proposed rules make a lot
> > > > of sense for C++ either.
> > > > 
> > > > 
> > > > Disambiguation with '__self__.'  completely avoids all these issues
> > > > while keeping the door open for later improvements.  
> > > > 
> > > > I still think one could use designator syntax, i.e. '.n', which
> > > > would be clearer and intuitive for both C and C++ programmers.
> > > 
> > > I think the major reason to use __self.n instead of .n is:
> > > 
> > > The dot (.) operator, i.e., the member access operator in C, is used to 
> > > access the member of an _instance_ of 
> > > a structure/union.
> > > We should declare a variable with a structure type first, and then append 
> > > this member access operator to this 
> > > variable and followed by the member name to access the member, and then 
> > > use it in the expressions.
> > 
> > For a designator
> > 
> > struct foo { int n; } a = { .n = 1 };
> > 
> > we also refer to a member 'n' of an instance 'a' of a structure type.
> > The instance is simply implied by the context.
> > 
> > For 
> > 
> > struct foo { int n; char *x __counted_by(.n) };
> > 
> > it also refers to a member of an instance of the struct. The
> > instance is the 'a' which is later used in an expression 'a.x'
> > So the instance would again be implied by the context.
> > 
> > So for me this makes perfect sense in both cases (and
> > for both C and C++)
> 
> Why does ‘.n' also make sense in C++?

From my perspective, it makes sense because C++ also already
uses this syntax of designators, so this syntax should already be
familiar to C++ programmers just like it is for C programmers: 
https://godbolt.org/z/7saEofhEb
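
For illustration, a small sketch of the designator syntax as C++20 accepts it. It is not necessarily the code behind the link above, and the names `foo` and `make_foo` are made up for this example:

```cpp
// Needs C++20 (e.g. -std=c++20) for designated initializers.
struct foo
{
  int n;
  int m;
};

// The object being initialized is implied by the context; '.n' and '.m'
// name members of that object, with no separate scoping rule involved.
inline foo
make_foo ()
{
  return foo{ .n = 1, .m = 2 };
}
```

Unlike C, C++ requires the designators to appear in declaration order, so `{ .m = 2, .n = 1 }` would be rejected here.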

It would also disambiguate the name lookup just as it does in C.

So it seems to be a possible way forward while avoiding
language divergence and without introducing anything too novel
in either language.

(But others still have concerns about .n and prefer __self__.)

Martin





Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-03-13 Thread Martin Uecker
On Thursday, 13.03.2025 at 19:48 +0100, JeanHeyd Meneide wrote:
> On Thu, Mar 13, 2025 Martin Uecker  wrote:
> 
> 
...

>  Part of this problem is self-inflicted: VLAs in structures are
> a GNU extension and not an ISO C feature (for reasons like this one). 

Note that this has nothing to do with the GCC extension because
the exact same issue occurs for ordinary identifiers:

enum { N = 1 };
struct foo {
  char buf[N];
};

or constexpr in C23.



Martin


[PATCH] vect: Fix aarch64/pr99873_2.c ld4/st4 failure

2025-03-13 Thread Richard Sandiford
vect_slp_prefer_store_lanes_p allows an SLP tree to be split even
if the tree could use store-lanes, provided that one of the new
groups would operate on full vectors for each scalar iteration.
That heuristic is no longer firing for gcc.target/aarch64/pr99873_2.c.

The test contains:

void __attribute ((noipa))
foo (uint64_t *__restrict x, uint64_t *__restrict y, int n)
{
  for (int i = 0; i < n; i += 4)
{
  x[i] += y[i];
  x[i + 1] += y[i + 1];
  x[i + 2] |= y[i + 2];
  x[i + 3] |= y[i + 3];
}
}

and wants us to use V2DI for the first two elements and V2DI for
the second two elements, rather than LD4s and ST4s.  This gives:

.L3:
ldp q31, q0, [x0]
add w3, w3, 1
ldp q29, q30, [x1], 32
orr v30.16b, v0.16b, v30.16b
add v31.2d, v29.2d, v31.2d
stp q31, q30, [x0], 32
cmp w2, w3
bhi .L3

instead of:

.L4:
ld4 {v28.2d - v31.2d}, [x2]
ld4 {v24.2d - v27.2d}, [x3], 64
add v24.2d, v28.2d, v24.2d
add v25.2d, v29.2d, v25.2d
orr v26.16b, v30.16b, v26.16b
orr v27.16b, v31.16b, v27.16b
st4 {v24.2d - v27.2d}, [x2], 64
cmp x2, x5
bne .L4

The first loop only handles half the amount of data per iteration,
but it requires far fewer internal permutations.

One reason the heuristic no longer fired looks like a typo: the call
to vect_slp_prefer_store_lanes_p was passing "1" as the new group size,
instead of "i".

However, even with that fixed, vect_analyze_slp later falls back on
single-lane SLP with load/store lanes.  I think that heuristic too
should use vect_slp_prefer_store_lanes_p (but it otherwise looks good).

The question is whether every load should pass
vect_slp_prefer_store_lanes_p or whether just one is enough.
I don't have an example that would make the call either way,
so I went for the latter, given that it's the smaller change
from the status quo.
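
The two policies being weighed can be sketched outside the vectorizer as follows. This is only an illustration under stated assumptions: `load_info` and the function names are hypothetical stand-ins, and the real code works on SLP nodes, not on a vector of flags:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for one permuted load: its lane count and whether
// the target heuristic would prefer load/store-lanes for it.
struct load_info
{
  unsigned lanes;
  bool target_prefers_lanes;
};

// A load counts as preferring lanes if it covers the whole group or the
// target heuristic says so.
static bool
prefers_lanes (const load_info &l, unsigned group_size)
{
  return l.lanes == group_size || l.target_prefers_lanes;
}

// Policy chosen in this patch: a single preferring load is enough.
static bool
prefer_if_any (const std::vector<load_info> &loads, unsigned group_size)
{
  return std::any_of (loads.begin (), loads.end (),
                      [group_size] (const load_info &l)
                      { return prefers_lanes (l, group_size); });
}

// Stricter alternative left as an open question: every load must prefer.
static bool
prefer_if_all (const std::vector<load_info> &loads, unsigned group_size)
{
  return std::all_of (loads.begin (), loads.end (),
                      [group_size] (const load_info &l)
                      { return prefers_lanes (l, group_size); });
}
```

The two only differ when the loads disagree, which is exactly the kind of example that is missing so far.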

This also appears to improve fotonik3d and roms from SPEC2017
(cross-checked against two different systems).

Bootstrapped & regression-tested on aarch64-linux-gnu, where it
fixes the pr99873_2.c regression.  OK to install?

Richard


gcc/
* tree-vect-slp.cc (vect_build_slp_instance): Pass the new group
size (i) rather than 1 to vect_slp_prefer_store_lanes_p.
(vect_analyze_slp): Only force the use of load-lanes and
store-lanes if that is preferred for at least one load/store pair.
---
 gcc/tree-vect-slp.cc | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 9e09f8e980b..ecb4a6521de 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -4161,7 +4161,7 @@ vect_build_slp_instance (vec_info *vinfo,
   && ! STMT_VINFO_SLP_VECT_ONLY (stmt_info)
   && compare_step_with_zero (vinfo, stmt_info) > 0
   && vect_slp_prefer_store_lanes_p (vinfo, stmt_info, NULL_TREE,
-masked_p, group_size, 1));
+masked_p, group_size, i));
  if (want_store_lanes || force_single_lane)
i = 1;
 
@@ -5095,7 +5095,7 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size,
  && !SLP_INSTANCE_TREE (instance)->ldst_lanes)
{
  slp_tree slp_root = SLP_INSTANCE_TREE (instance);
- int group_size = SLP_TREE_LANES (slp_root);
+ unsigned int group_size = SLP_TREE_LANES (slp_root);
  tree vectype = SLP_TREE_VECTYPE (slp_root);
 
  stmt_vec_info rep_info = SLP_TREE_REPRESENTATIVE (slp_root);
@@ -5138,6 +5138,7 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size,
  if (loads_permuted)
{
  bool can_use_lanes = true;
+ bool prefer_load_lanes = false;
  FOR_EACH_VEC_ELT (loads, j, load_node)
if (STMT_VINFO_GROUPED_ACCESS
  (SLP_TREE_REPRESENTATIVE (load_node)))
@@ -5165,9 +5166,21 @@ vect_analyze_slp (vec_info *vinfo, unsigned 
max_tree_size,
can_use_lanes = false;
break;
  }
+   /* Make sure that the target would prefer store-lanes
+  for at least one of the loads.
+
+  ??? Perhaps we should instead require this for
+  all loads?  */
+   prefer_load_lanes
+ = (prefer_load_lanes
+|| SLP_TREE_LANES (load_node) == group_size
+|| (vect_slp_prefer_store_lanes_p
+(vinfo, stmt_vinfo,
+ STMT_VINFO_VECTYPE (stmt_vinfo), masked,
+ group_size, SLP_TREE_LANES (load_node))));
  }
 
- if (can_use_lanes)
+ if (can_use_lanes && pre

[OG14, COMMITTED] OpenMP: Integrate dynamic selectors with dispatch argument handling [PR118457]

2025-03-13 Thread Sandra Loosemore
Support for dynamic selectors in "declare variant" was developed in
parallel with support for the adjust_args/append_args clauses and the
dispatch construct; they collided in a bad way.  This patch fixes the
"sorry" for calls that need both by removing the adjust_args/append_args
code from gimplify_call_expr and invoking it from the new variant
substitution code instead.

This is a backport of commit 44b1d52e2f4db57849ca54b63c52a687294b1793
from mainline, done by hand instead of cherry-picking because of
differences in the dynamic selector code between OG14 and mainline.

gcc/ChangeLog
PR middle-end/118457
* gimplify.cc (modify_call_for_omp_dispatch): New, containing
code split from gimplify_call_expr and modified to emit tree
instead of gimple.  Remove the error for falling through to a call
to the base function.
(gimplify_variant_call_expr):  Add omp_dispatch_p argument and
call modify_call_for_omp_dispatch if needed.
(gimplify_call_expr): Adjust call to gimplify_variant_call_expr.
Move adjust_args/append_args code to modify_call_for_omp_dispatch.

gcc/testsuite/ChangeLog
PR middle-end/118457
* c-c++-common/gomp/adjust-args-6.c: New.
* c-c++-common/gomp/append-args-5.c: Adjust expected output.
* c-c++-common/gomp/append-args-dynamic.c: New.
* c-c++-common/gomp/dispatch-11.c: Adjust expected output.
* gfortran.dg/gomp/dispatch-11.f90: Likewise.
---
 gcc/gimplify.cc   | 687 --
 .../c-c++-common/gomp/adjust-args-6.c |  27 +
 .../c-c++-common/gomp/append-args-5.c |  19 +-
 .../c-c++-common/gomp/append-args-dynamic.c   |  94 +++
 gcc/testsuite/c-c++-common/gomp/dispatch-11.c |  22 +-
 .../gfortran.dg/gomp/dispatch-11.f90  |   5 -
 6 files changed, 447 insertions(+), 407 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/adjust-args-6.c
 create mode 100644 gcc/testsuite/c-c++-common/gomp/append-args-dynamic.c

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 2bcadcf1038..ea46c81364b 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -3892,6 +3892,303 @@ maybe_fold_stmt (gimple_stmt_iterator *gsi)
   return fold_stmt (gsi);
 }
 
+/* OpenMP: Handle the append_args and adjust_args clauses of
+   declare_variant for EXPR, which is a CALL_EXPR whose CALL_EXPR_FN
+   is the variant, within a dispatch construct with clauses DISPATCH_CLAUSES
+   and location DISPATCH_LOC.
+
+   'append_args' causes interop objects are added after the last regular
+   (nonhidden, nonvariadic) arguments of the variant function.
+   'adjust_args' with need_device_{addr,ptr} converts the pointer target of
+   a pointer from a host to a device address. This uses either the default
+   device or the passed device number, which then sets the default device
+   address.  */
+static tree
+modify_call_for_omp_dispatch (tree expr, tree dispatch_clauses,
+ location_t dispatch_loc)
+{
+  tree fndecl = get_callee_fndecl (expr);
+
+  /* Skip processing if we don't get the expected call form.  */
+  if (!fndecl)
+return expr;
+
+  int nargs = call_expr_nargs (expr);
+  tree dispatch_device_num = NULL_TREE;
+  tree dispatch_device_num_init = NULL_TREE;
+  tree dispatch_interop = NULL_TREE;
+  tree dispatch_append_args = NULL_TREE;
+  int nfirst_args = 0;
+  tree dispatch_adjust_args_list
+= lookup_attribute ("omp declare variant variant args",
+   DECL_ATTRIBUTES (fndecl));
+
+  if (dispatch_adjust_args_list)
+{
+  dispatch_adjust_args_list = TREE_VALUE (dispatch_adjust_args_list);
+  dispatch_append_args = TREE_CHAIN (dispatch_adjust_args_list);
+  if (TREE_PURPOSE (dispatch_adjust_args_list) == NULL_TREE
+ && TREE_VALUE (dispatch_adjust_args_list) == NULL_TREE)
+   dispatch_adjust_args_list = NULL_TREE;
+}
+  if (dispatch_append_args)
+{
+  nfirst_args = tree_to_shwi (TREE_PURPOSE (dispatch_append_args));
+  dispatch_append_args = TREE_VALUE (dispatch_append_args);
+}
+  dispatch_device_num = omp_find_clause (dispatch_clauses, OMP_CLAUSE_DEVICE);
+  if (dispatch_device_num)
+dispatch_device_num = OMP_CLAUSE_DEVICE_ID (dispatch_device_num);
+  dispatch_interop = omp_find_clause (dispatch_clauses, OMP_CLAUSE_INTEROP);
+  int nappend = 0, ninterop = 0;
+  for (tree t = dispatch_append_args; t; t = TREE_CHAIN (t))
+nappend++;
+
+  /* FIXME: error checking should be taken out of this function and
+ handled before any attempt at filtering or resolution happens.
+ Otherwise whether or not diagnostics appear is determined by
+ GCC internals, how good the front ends are at constant-folding,
+ the split between early/late resolution, etc instead of the code
+ as written by the user.  */
+  if (dispatch_interop)
+{
+  for (tree t = dispatch_interop; t; t = TREE_CHAIN (t))
+   if (OMP_CLAUSE_CODE (t) == OMP_CLAUSE_

Re: [PATCH] c, c++: Support musttail attribute even using __attribute__ form [PR116545]

2025-03-13 Thread Joseph Myers
On Thu, 13 Mar 2025, Jason Merrill wrote:

> > I'd be afraid that would be quite significant change of behavior everywhere,
> > something that C doesn't allow (like mixing std and GNU attributes in any
> > orders or [[]] __attribute__(()) [[]][[]] __attribute__(())
> > expression-statement).  Or it would allow __attribute__(()) on while, do,
> > for, if, switch, ..., again something that wasn't accepted before.
> 
> Do you think those changes are undesirable?  We've previously had to fix cases
> where we were failing to support mixing of std and GNU attributes.

Mixing attributes is problematic because of different semantics for 
appertainment.  For example, in

  void func() ATTRS;

if ATTRS are standard attributes then they are part of the function 
declarator and appertain to the function type, but if they are GNU 
attributes then they are not part of the declarator, but rather appertain 
to the declaration.  This means completely different parts of the C parser 
handle the different kinds of attributes in what superficially looks like 
the same position (standard attributes handled in parsing a declarator, 
GNU ones in parsing the declaration after the declarator - with more 
complicated declarators such as functions returning pointers to arrays, 
the locations can end up physically separated) and there would be no 
coherent syntax or semantics for appertainment for mixed attributes there.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH] libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]

2025-03-13 Thread Patrick Palka
On Thu, 13 Mar 2025, Patrick Palka wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for trunk, and 14 after
> a while?
> 
> -- >8 --
> 
> The type tuple<tuple<any>> is clearly copy/move constructible, but for
> reasons that are not yet completely understood checking this triggers
> constraint recursion with our C++20 tuple implementation (and not the
> C++17 implementation).
> 
> It turns out this recursion stems from considering the non-template
> tuple(const _Elements&) constructor when checking for copy/move
> constructibility.  Checking this constructor is of course redundant,
> since the defaulted copy/move constructors are better matches.
> 
> GCC has a non-standard "perfect candidate" optimization[1] that causes
> overload resolution to shortcut considering template candidates if we
> find a (non-template) perfect candidate.  So to work around this tuple
> bug (and as a general compile-time optimization) this patch turns the
> problematic constructor into a template so that GCC doesn't consider it
> when checking for copy/move constructibility.
> 
> Changing the template-ness of a constructor can affect the outcome of
> overload resolution (since template-ness is a tiebreaker) so there's a
> risk this change could cause overload resolution ambiguities.  The tuple
> constructor set doesn't seem particularly reliant on this tiebreaker
> though.

Ah, I just realized the C++17 tuple impl already defines the
tuple(const _Elements&...) constructor as a template in order to
constrain it!  So this patch arguably makes the C++20 constructor
set more consistent with the C++17 impl, and so should be quite safe.
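
To make the tiebreaker concrete, here is a small sketch that is independent of std::tuple. The type `probe` and the helper functions are hypothetical and exist only to show which candidate overload resolution picks:

```cpp
struct probe
{
  int which;

  // Non-template candidate.
  probe (const int &) : which (1) { }

  // Template candidate; for T = int it has the same parameter type.
  template<typename T>
    probe (const T &) : which (2) { }
};

// Both constructors are exact matches for an int argument, so the
// non-template one is preferred.
inline int pick_for_int () { return probe (42).which; }

// For a long argument only the template is an exact match, so it wins.
inline int pick_for_long () { return probe (42L).which; }
```

Turning a constructor into a template therefore only changes the result where it previously won on this tiebreaker alone.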

> 
> The testcase still fails with Clang (in C++20 mode) since it doesn't
> implement said optimization.
> 
>   PR libstdc++/116440
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/tuple (tuple::tuple(const _Elements&...)):
>   Turn into a template.
>   * testsuite/20_util/tuple/116440.C: New test.
> 
> [1]: See r11-7287-g187d0d5871b1fa and
> https://isocpp.org/files/papers/P3606R0.html
> ---
>  libstdc++-v3/include/std/tuple| 14 +
>  libstdc++-v3/testsuite/20_util/tuple/116440.C | 29 +++
>  2 files changed, 37 insertions(+), 6 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/20_util/tuple/116440.C
> 
> diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
> index 34d790fd6f5..67760471d95 100644
> --- a/libstdc++-v3/include/std/tuple
> +++ b/libstdc++-v3/include/std/tuple
> @@ -966,12 +966,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>: _Inherited()
>{ }
>  
> -  constexpr explicit(!__convertible())
> -  tuple(const _Elements&... __elements)
> -  noexcept(__nothrow_constructible())
> -  requires (__constructible())
> -  : _Inherited(__elements...)
> -  { }
> +  // Defined as a template to work around PR libstdc++/116440.
> +  template

Whoops, s/class/typename as per convention.

> + constexpr explicit(!__convertible())
> + tuple(const _Elements&... __elements)
> + noexcept(__nothrow_constructible())
> + requires (__constructible())
> + : _Inherited(__elements...)
> + { }
>  
>template
>   requires (__disambiguating_constraint<_UTypes...>())
> diff --git a/libstdc++-v3/testsuite/20_util/tuple/116440.C 
> b/libstdc++-v3/testsuite/20_util/tuple/116440.C
> new file mode 100644
> index 000..12259134d25
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/20_util/tuple/116440.C
> @@ -0,0 +1,29 @@
> +// PR libstdc++/116440 - std::tuple> does not compile
> +// { dg-do compile { target c++17 } }
> +
> +#include 
> +#include 
> +#include 
> +
> +template 
> +using TupleTuple = std::tuple>;
> +
> +struct EmbedAny {
> +std::any content;
> +};
> +
> +static_assert(std::is_copy_constructible>::value);
> +static_assert(std::is_move_constructible>::value);
> +
> +static_assert(std::is_copy_constructible>::value);
> +static_assert(std::is_move_constructible>::value);
> +
> +static_assert(std::is_constructible_v>);
> +
> +struct EmbedAnyWithZeroSizeArray {
> +void* pad[0];
> +std::any content;
> +};
> +
> +static_assert(std::is_copy_constructible>::value);
> +static_assert(std::is_move_constructible>::value);
> -- 
> 2.49.0.rc1.37.ge969bc8759
> 
> 



Re: [PATCH][v3] Simple cobol.dg testsuite

2025-03-13 Thread Simon Sobisch



On 13.03.2025 at 21:35, David Malcolm wrote:

On Thu, 2025-03-13 at 12:11 +0100, Simon Sobisch wrote:

Thanks for your work on adding a testsuite. Can you please explain why
you do this when a complete testsuite exists in autoconf (autotest)
format (which is rooted in a decade of work in GnuCOBOL, with all
copyrights for that already with the FSF)?

Is the existence of this in upstream [1] just unknown (because it was
not part of the initial patches [for reasons I did not understand])?

Is the format such a big issue (note: previous discussions elaborated
"a test suite is very important and other frontends also use a framework
other than dejagnu")?

If dejagnu is the way to go:

* Shouldn't there be deprecation of autotest in autoconf (of course only
if that preference is also outside of gcc)?

* Shouldn't there be a (at least semi automated) script / migration tool
(at least for this specific time in place to convert the "UAT" once into
dejagnu format)?



Thanks for giving me some context on this,
Simon


[1]:
https://gitlab.cobolworx.com/COBOLworx/gcc-cobol/-/tree/master+cobol/gcc/cobol/UAT



Hi Simon

Does the UAT testsuite have coverage for what happens on invalid code?
  
For example, in

https://gcc.gnu.org/pipermail/gcc-patches/2025-March/677481.html
my patch adds test coverage for the output on one kind of typo (or, at
least, I tried to, my knowledge of COBOL is essentially 0); I put this
in Richard's DejaGnu suite since I have lots of similar tests for other
frontends.


Yes. That is what the "syn" tests are about, for example
https://gitlab.cobolworx.com/COBOLworx/gcc-cobol/-/blob/master+cobol/gcc/cobol/UAT/testsuite.src/syn_value.at


# Unexpected return code 0 failure - We don't implement
# type checking TODO
AT_DATA([prog.cob], [
   IDENTIFICATION   DIVISION.
   PROGRAM-ID.  prog.
   DATA DIVISION.
   WORKING-STORAGE  SECTION.
  * Gnu throws ERROR on next line TODO
   01 X-SPACE   PIC 999 VALUE SPACE.
  * Gnu Throws WARNING on next two lines TODO
   01 X-ABC PIC 999 VALUE "abc".
   01 X-12-3PIC 999 VALUE 12.3.
   01 X-123 PIC 999 VALUE 123.
  * Gnu Throws WARNING on next line
   01 X-1234PIC 999 VALUE 1234.
   PROCEDUREDIVISION.
   STOP RUN.
])
AT_CHECK([$COMPILE_ONLY prog.cob], [1], ,
[prog.cob:7:25: error: numeric type X-SPACE VALUE 'SPACES' requires numeric VALUE
7 |01 X-SPACE   PIC 999 VALUE SPACE.
  | ^
prog.cob:9:25: error: numeric type X-ABC VALUE 'abc' requires numeric VALUE
9 |01 X-ABC PIC 999 VALUE "abc".
  | ^
prog.cob:10:25: error: integer type X-12-3 VALUE '12.3' requires integer VALUE
   10 |01 X-12-3PIC 999 VALUE 12.3.
  | ^
prog.cob:13:25: error: numeric X-1234 VALUE '1234' holds only 3 digits
   13 |01 X-1234PIC 999 VALUE 1234.
  | ^
cobol1: error: failed compiling prog.cob
])


Notes:

* in GnuCOBOL we have specific tests that check the "extended" format
  with context; in all other cases we pass (as part of the $COMPILE
  above)  -fdiagnostics-plain-output - because we only want the user
  message, not the formatting tested overall - and producing and
  comparing less output also saves some (minor) computation

* I'd definitely suggest that UAT is adjusted similarly before the
  conversion for the same reasons

* as hinted in the UAT notes above, GnuCOBOL test results are different,
  here the original result from the file this test in UAT was based on:

prog.cob:6: error: invalid VALUE clause
prog.cob:7: warning: numeric value is expected
prog.cob:8: warning: value size exceeds data size
prog.cob:10: warning: value size exceeds data size

[note: according to ISO those should all be errors, but other COBOL 
implementations don't care and truncate "per MOVE rules", so cobc's 
default is a warning here]


Just for your amusement, that's GC 3.2's output (with -Wall)

prog.cob:6: error: invalid VALUE clause
4 |DATA DIVISION.
5 |WORKING-STORAGE  SECTION.
6 >01 X-SPACE   PIC 999 VALUE SPACE.
7 |01 X-ABC PIC 999 VALUE "abc".
8 |01 X-12-3PIC 999 VALUE 12.3.
prog.cob:7: warning: numeric value is expected [-Wothers]
5 |WORKING-STORAGE  SECTION.
6 |01 X-SPACE   PIC 999 VALUE SPACE.
7 >01 X-ABC PIC 999 VALUE "abc".
8 |01 X-12-3PIC 999 VALUE 12.3.
9 |01 X-123 PIC 999 VALUE 123.
prog.cob:8: warning: value size exceeds data size [-Wothers]
6 |01 X-SPACE   PIC 999 VALUE SPACE.
7 |01 X-ABC PIC 999 VALUE "abc".
8 >01 X-12-3PIC 999 VALUE 12.3.
9 |01 X-123 PIC 999 VALUE 123.
   10 |01 X-1234PIC 999 VALUE 1234.
prog.cob:10: warning:

[PATCH v2 1/3] Aarch64: Use BUILTIN_VHSDF_HSDF for vector and scalar sqrt builtins

2025-03-13 Thread Ayan Shafqat
This patch changes the `sqrt` builtin definition from `BUILTIN_VHSDF_DF`
to `BUILTIN_VHSDF_HSDF` in `aarch64-simd-builtins.def`, ensuring the
builtin covers half, single, and double precision variants. The redundant
`VAR1 (UNOP, sqrt, 2, FP, hf)` lines are removed, as they are no longer
needed now that `BUILTIN_VHSDF_HSDF` handles those cases.

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def: Change
BUILTIN_VHSDF_DF to BUILTIN_VHSDF_HSDF

Signed-off-by: Ayan Shafqat 
Signed-off-by: Andrew Pinski 
---
 gcc/config/aarch64/aarch64-simd-builtins.def | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 6cc45b18a72..685bf0dc408 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -57,7 +57,7 @@
   VAR1 (BINOPP, pmull, 0, DEFAULT, v8qi)
   VAR1 (BINOPP, pmull_hi, 0, DEFAULT, v16qi)
   BUILTIN_VHSDF_HSDF (BINOP, fmulx, 0, FP)
-  BUILTIN_VHSDF_DF (UNOP, sqrt, 2, FP)
+  BUILTIN_VHSDF_HSDF (UNOP, sqrt, 2, FP)
   BUILTIN_VDQ_I (BINOP, addp, 0, DEFAULT)
   BUILTIN_VDQ_I (BINOPU, addp, 0, DEFAULT)
   BUILTIN_VDQ_BHSI (UNOP, clrsb, 2, DEFAULT)
@@ -848,9 +848,6 @@
   BUILTIN_VHSDF_HSDF (BINOP_USS, facgt, 0, FP)
   BUILTIN_VHSDF_HSDF (BINOP_USS, facge, 0, FP)
 
-  /* Implemented by sqrt2.  */
-  VAR1 (UNOP, sqrt, 2, FP, hf)
-
   /* Implemented by hf2.  */
   VAR1 (UNOP, floatdi, 2, FP, hf)
   VAR1 (UNOP, floatsi, 2, FP, hf)
-- 
2.43.0



[PATCH v2 2/3] Aarch64: Add __sqrt and __sqrtf intrinsics to arm_acle.h

2025-03-13 Thread Ayan Shafqat
This patch introduces two new inline functions, __sqrt and __sqrtf, in
arm_acle.h for Aarch64 targets. These functions wrap the new builtins
__builtin_aarch64_sqrtdf and __builtin_aarch64_sqrtsf, respectively,
providing direct access to hardware instructions without relying on the
standard math library or optimization levels.

gcc/ChangeLog:

* config/aarch64/arm_acle.h (__sqrt, __sqrtf): New functions.

Signed-off-by: Ayan Shafqat 
---
 gcc/config/aarch64/arm_acle.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
index 7976c117daf..d972a4e7e7e 100644
--- a/gcc/config/aarch64/arm_acle.h
+++ b/gcc/config/aarch64/arm_acle.h
@@ -118,6 +118,20 @@ __revl (unsigned long __value)
 return __rev (__value);
 }
 
+__extension__ extern __inline double
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__sqrt(double  __x)
+{
+return __builtin_aarch64_sqrtdf (__x);
+}
+
+__extension__ extern __inline float
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__sqrtf(float __x)
+{
+return __builtin_aarch64_sqrtsf (__x);
+}
+
 #pragma GCC push_options
 #pragma GCC target ("+nothing+jscvt")
 __extension__ extern __inline int32_t
-- 
2.43.0



[PATCH v2 3/3] Aarch64: Add tests for __sqrt and __sqrtf intrinsic

2025-03-13 Thread Ayan Shafqat
This patch introduces acle_sqrt.c in the AArch64 testsuite, verifying
that the new __sqrt and __sqrtf intrinsics emit the expected fsqrt
instructions for double and float arguments.

Coverage for new intrinsics ensures that __sqrt and __sqrtf are
correctly expanded to hardware instructions and do not fall back to
library calls, regardless of optimization levels.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/acle_sqrt.c: New test.

Signed-off-by: Ayan Shafqat 
---
 .../gcc.target/aarch64/acle/acle_sqrt.c | 17 +
 1 file changed, 17 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/acle_sqrt.c

diff --git a/gcc/testsuite/gcc.target/aarch64/acle/acle_sqrt.c b/gcc/testsuite/gcc.target/aarch64/acle/acle_sqrt.c
new file mode 100644
index 000..1e3ed9eaa6d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/acle_sqrt.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "arm_acle.h"
+
+double test_acle_sqrt (double x)
+{
+  return __sqrt (x);
+}
+
+float test_acle_sqrtf (float x)
+{
+  return __sqrtf (x);
+}
+
+/* { dg-final { scan-assembler-times "fsqrt\td\[0-9\]" 1 } } */
+/* { dg-final { scan-assembler-times "fsqrt\ts\[0-9\]" 1 } } */
-- 
2.43.0



Re: [PATCH] libstdc++: Work around C++20 tuple> constraint recursion [PR116440]

2025-03-13 Thread Ville Voutilainen
On Thu, 13 Mar 2025 at 23:16, Ville Voutilainen
 wrote:
>
> On Thu, 13 Mar 2025 at 23:03, Patrick Palka  wrote:
> > +  // Defined as a template to work around PR libstdc++/116440.
> > +  template
> > +   constexpr explicit(!__convertible())
> > +   tuple(const _Elements&... __elements)
>
> I don't understand how a constructor template declared like this can
> ever be called. The template parameter pack
> can't be provided or deduced, and can't have a default. So we're
> effectively making this signature always lose
> overload resolution to the one that takes a pack of _UElements&&.
>
> Which may be fine. I can't head-compile a test that would fail in that
> case. If any of the incoming argument isn't one
> of _Elements, that constructor wins overload resolution anyway. If the
> incoming arguments are exactly _Elements, that
> constructor does the same thing as this one. I think.

Oh, never mind. The pack is just deduced as an empty pack.
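
A minimal sketch of that behaviour, using hypothetical names (PackBox, Unused) rather than the real tuple code: a trailing template parameter pack that cannot be deduced from the function arguments is deduced as empty (as I read [temp.arg.explicit]), so the constructor remains callable.

```cpp
template<typename... Elements>
struct PackBox
{
  // Unused cannot be deduced from the function parameters, so it is
  // deduced as an empty pack and the constructor stays callable.
  template<typename... Unused>
    constexpr PackBox(const Elements&...)
    : unused_count(sizeof...(Unused))
    { }

  int unused_count;
};

// The call compiles, and the extra pack really is empty.
static_assert(PackBox<int, char>(1, 'a').unused_count == 0);
```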


Re: [PATCH] libstdc++: Work around C++20 tuple> constraint recursion [PR116440]

2025-03-13 Thread Ville Voutilainen
On Thu, 13 Mar 2025 at 23:14, Patrick Palka  wrote:
> Ah, I just realized the C++17 tuple impl already defines the
> tuple(const _Elements&...) constructor as a template in order to
> constrain it!  So this patch arguably makes the C++20 constructor
> set more consistent with the C++17 impl, and so should be quite safe.

Right, but the C++17 tuple impl doesn't introduce additional template
parameter packs, it
introduces template parameters that have default values, so that
constructor (or two of them) certainly remains callable,
because the default arguments allow deduction to succeed.
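
For comparison, a sketch of the default-template-argument form (MiniTuple is a hypothetical stand-in, not the libstdc++ implementation): the defaulted parameter makes the constructor a template while leaving nothing for the caller to deduce or supply.

```cpp
template<typename... Elements>
struct MiniTuple
{
  // The defaulted template parameter makes this a constructor template;
  // deduction succeeds because the default argument is used.
  template<typename = void>
    constexpr MiniTuple(const Elements&...)
    : count(sizeof...(Elements))
    { }

  int count;
};

static_assert(MiniTuple<int, char>(1, 'a').count == 2);
```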


Re: [PATCH v2 3/3] Aarch64: Add tests for __sqrt and __sqrtf intrinsic

2025-03-13 Thread Jakub Jelinek
On Thu, Mar 13, 2025 at 05:25:26PM -0400, Ayan Shafqat wrote:
> This patch introduces acle_sqrt.c in the AArch64 testsuite, verifying
> that the new __sqrt and __sqrtf intrinsics emit the expected fsqrt
> instructions for double and float arguments.
> 
> Coverage for new intrinsics ensures that __sqrt and __sqrtf are
> correctly expanded to hardware instructions and do not fall back to
> library calls, regardless of optimization levels.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/aarch64/acle/acle_sqrt.c: New test.
> 
> Signed-off-by: Ayan Shafqat 

Tests should be in the same patch as the code they are testing,
not committed separately.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/acle/acle_sqrt.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +#include "arm_acle.h"
> +
> +double test_acle_sqrt (double x)

The normal GNU formatting is
double
test_acle_sqrt (double x)
(i.e. function name at the start of line, so one can grep for it).
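
A small illustration of that layout, using a hypothetical function (test_square) instead of the intrinsic under review: the return type sits on its own line so the name starts in column one and a search such as `grep '^test_square'` finds the definition.

```cpp
// Return type on its own line, function name in column one.
constexpr double
test_square (double x)
{
  return x * x;
}

static_assert (test_square (3.0) == 9.0);
```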

> +{
> +  return __sqrt (x);
> +}
> +
> +float test_acle_sqrtf (float x)

Ditto.

Jakub



Re: [PATCH] libstdc++: Work around C++20 tuple> constraint recursion [PR116440]

2025-03-13 Thread Jonathan Wakely
On Thu, 13 Mar 2025, 22:28 Patrick Palka,  wrote:

> On Thu, 13 Mar 2025, Jonathan Wakely wrote:
>
> > On Thu, 13 Mar 2025 at 21:29, Patrick Palka  wrote:
> > >
> > > On Thu, 13 Mar 2025, Ville Voutilainen wrote:
> > >
> > > > On Thu, 13 Mar 2025 at 23:16, Ville Voutilainen
> > > >  wrote:
> > > > >
> > > > > On Thu, 13 Mar 2025 at 23:03, Patrick Palka  wrote:
> > > > > > +  // Defined as a template to work around PR libstdc++/116440.
> > > > > > +  template
> > > > > > +   constexpr explicit(!__convertible())
> > > > > > +   tuple(const _Elements&... __elements)
> > > > >
> > > > > I don't understand how a constructor template declared like this can
> > > > > ever be called. The template parameter pack
> > > > > can't be provided or deduced, and can't have a default. So we're
> > > > > effectively making this signature always lose
> > > > > overload resolution to the one that takes a pack of _UElements&&.
> > > > >
> > > > > Which may be fine. I can't head-compile a test that would fail in that
> > > > > case. If any of the incoming argument isn't one
> > > > > of _Elements, that constructor wins overload resolution anyway. If the
> > > > > incoming arguments are exactly _Elements, that
> > > > > constructor does the same thing as this one. I think.
> > > >
> > > > Oh, never mind. The pack is just deduced as an empty pack.
> > >
> > > Yep that's my understanding, though I don't know where in the standard
> > > this is specified, a quick Ctrl+F is failing me.
> > >
> > > I can use template or template if that's
> > > preferred :)
> >
> > I would prefer template to the empty pack, I think
> > the default template argument makes it a little more obvious how that
> > constructor can be called (I'm sure Ville won't be the only one to
> > raise an eyebrow at that).
>
> Fixed.
>
> >
> > Thanks for figuring this out, and noticing that the template-ness
> > of that constructor is what changed between C++17 and C++20. I think
> > when I re-implemented it using concepts I assumed the template-ness
> > was there for the _ImplicitCtor / _ExplicitCtor stuff, which is done
> > using explicit(bool) in C++20. I wasn't looking at the tuple(const
> > _Elements&...) constructor at all, because the errors all pointed to
> > tuple(_UTypes&&...).
>
> Yeah, I had to whip out GDB in order to find this hidden instantiation
> context.  We do eventually recurse into the _Utypes&&... constructor,
> but it's apparently not the start of it.
>
> >
> > Do we also want to constrain the tuple(const _Elements&...)
> > constructor with requires sizeof...(_Elements) >= 1, which is present
> > on the C++17 version?
>
> I guess we should constrain the corresponding allocator-aware
> constructor too at the same time.  But it does seem the constraint
> isn't necessary (at least nowadays) due to the explicit specialization,
> so I haven't added it.
>
> Updated patch using 'typename = void':
>

OK for trunk and after a bit of time, 14. Thanks!



> -- >8 --
>
> Subject: [PATCH v2] libstdc++: Work around C++20 tuple> constraint
>  recursion [PR116440]
>
> The type tuple> is clearly copy/move constructible, but for
> reasons that are not yet completely understood checking this property
> triggers constraint recursion with our C++20 tuple implementation (but
> not the C++17 implementation).
>
> It turns out this recursion stems from considering the non-template
> tuple(const _Elements&) constructor when checking for copy/move
> constructibility.  Checking this constructor is of course redundant,
> since the defaulted copy/move constructors are better matches.
>
> GCC has a non-standard "perfect candidate" optimization[1] that causes
> overload resolution to shortcut considering template candidates if we
> find a (non-template) perfect candidate.  So to work around this issue
> (and as a general compile-time optimization) this patch turns the
> problematic constructor into a template so that GCC doesn't consider it
> when checking for copy/move constructibility.
>
> Changing the template-ness of a constructor can affect the outcome of
> overload resolution (since template-ness is a tiebreaker) so there's a
> risk this change could e.g. introduce overload resolution ambiguities.
> But the original C++17 implementation has long defined this constructor
> as a template (in order to constrain it etc), so doing the same thing
> in the C++20 mode should naturally be quite safe.
>
> The testcase still fails with Clang (in C++20 mode) since it doesn't
> implement said optimization.
>
> PR libstdc++/116440
>
> libstdc++-v3/ChangeLog:
>
> * include/std/tuple (tuple::tuple(const _Elements&...))
> [C++20]: Turn into a template.
> * testsuite/20_util/tuple/116440.C: New test.
>
> [1]: See r11-7287-g187d0d5871b1fa and
> https://isocpp.org/files/papers/P3606R0.html
> ---
>  libstdc++-v3/include/std/tuple| 14 +
>  libstdc++-v3/testsuite/20_util/tuple/116440.C | 29 +++

Re: [PATCH] COBOL v3: 3/14 80K bld: config and build machinery

2025-03-13 Thread Jakub Jelinek
On Thu, Mar 13, 2025 at 06:33:08PM -0400, James K. Lowden wrote:
> I guess the most controversial engineering choice was to rely on
> __int128 and _Float128.  Those gave us native processing and binary <->
> string conversion.  Having been advised of real.cc, we're taking a
> look.  If we can get 128-bit hardware computation more portably, that's
> all to the good.  

One thing is the compiler and another thing the target library and what
the compiler emits.
gmp was an alternative for wide_int (see https://gcc.gnu.org/PR119242
for a partial patch, which proves it can be just done in wide_int pretty
easily when the conversion from _Float128 inside of the compiler source
is finished).
I see there _Float128 value used to store various values in the FE, is
it used to store not just floating point but also integers?  E.g. all
possible __int128 or unsigned __int128 values certainly can't be represented
in _Float128 (while 64-bit integers can).
Another is use of gmp in libgcobol, I think there is no need for that there.
I'd suggest just to use tree as storage for the values in the FE (at least
where it currently mallocs some memory and stores the values directly in
there) REAL_CSTs for floating point and INTEGER_CSTs for integers.
For REAL_CSTs, the real.cc APIs allow to convert a string to
REAL_VALUE_TYPE (see real_from_string{,2,3}), do rounding, arithmetics,
convert REAL_VALUE_TYPE to string (see real_to_decimal) or convert from
wide_int (real_from_integer).
I saw there multiplication of _Float128 by __int128 (the latter being some
power of 10, from 1e0 to 1e38), not really sure how is that supposed to
work in general because the few largest powers of 10 are not exactly
representable in _Float128.
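
The integer-range point above can be sketched with a more portable pair of types. IEEE binary128 has a 113-bit significand while __int128 has 127 value bits, so not every __int128 fits exactly; the same effect appears between double (53-bit significand, asserted below) and a 64-bit integer. The helper name is hypothetical, and double is used only as an analogy because _Float128 availability varies by compiler.

```cpp
#include <cstdint>
#include <limits>

static_assert(std::numeric_limits<double>::digits == 53,
              "sketch assumes IEEE binary64 double");

// True if v survives a round trip through double unchanged.
// Only meaningful for values well inside the int64_t range, since
// converting an out-of-range double back to an integer is undefined.
constexpr bool survives_double_round_trip(std::int64_t v)
{
  return static_cast<std::int64_t>(static_cast<double>(v)) == v;
}

constexpr std::int64_t two_to_53 = std::int64_t{1} << 53;

// 2^53 is exactly representable, 2^53 + 1 is not.
static_assert(survives_double_round_trip(two_to_53));
static_assert(!survives_double_round_trip(two_to_53 + 1));

// Every 32-bit integer fits, just as every 64-bit integer fits in binary128.
static_assert(survives_double_round_trip(std::numeric_limits<std::int32_t>::max()));
```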

> But I don't see the point, in 2025, of gmp.  It's more important that 
> COBOL take advantage of current hardware than run on a VAX.
> That's where users are. Hardware computation has real benefits, and
> software math has measurable impact.  When the IT budget has a line
> item for electricity, that's what matters.  

Note, __int128 support isn't implemented in hardware almost anywhere, on
64-bit targets it is typically implemented using 64-bit arithmetics,
sometimes with help of 64x64->128bit multiply (or highpart 64x64->64
multiply) and/or 128/64->64bit division/modulo.
Similarly, _Float128 is implemented in hardware only on a few targets,
I think mainly s390x and more recent powerpc (sparc has it in the ISA
but I think it used to be software emulated), so e.g. on x86_64 or I think
on aarch64 too _Float128 is just software emulated.
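
To give a feel for what "implemented using 64-bit arithmetics" means, here is a rough sketch of a 64x64->128-bit multiply built from 32-bit halves. It is not what GCC emits; U128 and mul_64x64 are hypothetical names, and targets with a widening or high-part multiply instruction do this in far fewer operations.

```cpp
#include <cstdint>

// A 128-bit unsigned value held as two 64-bit halves.
struct U128
{
  std::uint64_t hi;
  std::uint64_t lo;
};

// Schoolbook multiplication on 32-bit limbs.
constexpr U128 mul_64x64(std::uint64_t a, std::uint64_t b)
{
  const std::uint64_t mask = 0xffffffffu;
  const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
  const std::uint64_t b_lo = b & mask, b_hi = b >> 32;

  const std::uint64_t p0 = a_lo * b_lo;  // contributes at bit 0
  const std::uint64_t p1 = a_lo * b_hi;  // contributes at bit 32
  const std::uint64_t p2 = a_hi * b_lo;  // contributes at bit 32
  const std::uint64_t p3 = a_hi * b_hi;  // contributes at bit 64

  // Sum everything that lands in bits 32..63; the sum of three 32-bit
  // values cannot overflow 64 bits, and its top half is the carry.
  const std::uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);

  return { p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32),
           (mid << 32) | (p0 & mask) };
}

// (2^64 - 1)^2 == (2^64 - 2) * 2^64 + 1
static_assert(mul_64x64(~std::uint64_t{0}, ~std::uint64_t{0}).hi
              == 0xfffffffffffffffeULL);
static_assert(mul_64x64(~std::uint64_t{0}, ~std::uint64_t{0}).lo == 1);
```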

Jakub



Re: [PATCH] arm: Fix REVERSIBLE_CC_MODE [PR110796...]

2025-03-13 Thread Richard Earnshaw (lists)
On 13/03/2025 08:22, Christophe Lyon wrote:
> Since we have vcmp and vcmpe instructions (vcmpe raises an "Invalid
> Operation" exception in presence of a NaN operand), we need to tell
> the compiler it is not safe to reverse comparisons of floating-point
> arguments.
> 
> On armv8-m.main+dsp+fp (cortex-m33):
> PASS: gcc.dg/torture/builtin-iseqsig-1.c
> at -O1, -O2, -O3, -Os
> 
> On armv8.1-m.main+mve.fp+fp.dp (cortex-m55):
> PASS: gcc.dg/torture/builtin-iseqsig-1.c
> PASS: gcc.dg/torture/builtin-iseqsig-2.c
> PASS: gcc.dg/torture/builtin-iseqsig-3.c
> at -O1, -O2, -O3, -Os
> 
> On armv7e-m+fp.dp (cortex-m7):
> PASS: gcc.dg/torture/builtin-iseqsig-1.c
> PASS: gcc.dg/torture/builtin-iseqsig-2.c
> PASS: gcc.dg/torture/builtin-iseqsig-3.c
> PASS: gcc.dg/torture/pr82692.c
> at -O1, -O2, -O3, -Os
> 
> On armv8-a+simd:
> PASS: gcc.dg/torture/builtin-iseqsig-1.c
> PASS: gcc.dg/torture/builtin-iseqsig-2.c
> PASS: gcc.dg/torture/builtin-iseqsig-3.c
> PASS: gfortran.dg/ieee/comparisons_3.F90
> at -Os (they already passed at other optimization levels)
> 
>   gcc/
>   PR target/110796
>   PR target/118446
>   * config/arm/arm.h (REVERSIBLE_CC_MODE): Take floating-point modes
>   into account.
> ---
>  gcc/config/arm/arm.h | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 8472b756127..3c9c7c795cb 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -2257,7 +2257,10 @@ extern int making_const_table;
>  
>  #define SELECT_CC_MODE(OP, X, Y)  arm_select_cc_mode (OP, X, Y)
>  
> -#define REVERSIBLE_CC_MODE(MODE) 1
> +/* Having an integer comparison mode guarantees that we can use
> +   reverse_condition, but the usual restrictions apply to floating-point
> +   comparisons.  */
> +#define REVERSIBLE_CC_MODE(MODE) ((MODE) != CCFPmode && (MODE) != CCFPEmode)
>  
>  #define REVERSE_CONDITION(CODE,MODE) \
>(((MODE) == CCFPmode || (MODE) == CCFPEmode) \

I'd like to understand the impact of this for conditional execution when the
result comes from a floating-point comparison.  Adding support for
non-reversible conditions would involve adding a substantial number of patterns
to the machine description.

R.



[PATCH] aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

2025-03-13 Thread Andre Vieira (lists)
Apologies for the delay, had been waiting on some other relevant patches 
to go in to make sure we didn't break any valid existing behaviours. It 
should all be working properly now. I think I've also addressed all your 
comments.  Most notable change is that it now uses the 'sorry' mechanism.


Bootstrapped and regression tested on aarch64-none-linux-gnu.


aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

As per the AArch64 ISA FEAT_SME does not require FEAT_SVE2, so we are removing
that false dependency in GCC.  However, we chose for now to not support this
combination of features and will diagnose the combination of FEAT_SME without
FEAT_SVE2 as unsupported by GCC.  We may choose to support this in the future.


gcc/ChangeLog:

* config/aarch64/aarch64-arches.def (SME): Remove SVE2 as prerequisite
and add in FCMA and F16FML.
* config/aarch64/aarch64.cc (aarch64_override_options_internal):
Diagnose use of SME without SVE2.
* doc/invoke.texi (sme): Document that this can only be used with the
sve2 extension.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/no-sve-with-sme-1.c: New.
* gcc.target/aarch64/no-sve-with-sme-2.c: New.
* gcc.target/aarch64/no-sve-with-sme-3.c: New.
* gcc.target/aarch64/sme/streaming_mode_4.c: Diagnose new error.
* gcc.target/aarch64/pragma_cpp_predefs_4.c: Pass +sve2 to existing
+sme pragma.
* gcc.target/aarch64/sve/acle/general-c/binary_int_opt_single_n_2.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_opt_single_n_2.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_single_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_int_opt_single_1.c:
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_2.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_3.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_4.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_opt_single_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_opt_single_2.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_opt_single_3.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_uint_opt_single_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/binaryxn_2.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/clamp_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/compare_scalar_count_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/dot_za_slice_int_lane_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/dot_za_slice_lane_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/dot_za_slice_lane_2.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/dot_za_slice_uint_lane_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowxn_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/storexn_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_group_selection_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/ternary_qq_or_011_lane_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/unary_convertxn_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrow_1.c:
Likewise.
* gcc.target/aarch64/sve/acle/general-c/unaryxn_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/write_za_1.c: Likewise.
* gcc.target/aarch64/sve/acle/general-c/write_za_slice_1.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlalb_lane_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlalb_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlallbb_lane_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlallbb_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlallbt_lane_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlallbt_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlalltb_lane_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlalltb_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlalltt_lane_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlalltt_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlalt_lane_mf8.c: Likewise.
* gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c: Likewise.




On 04/10/2024 13:08, Kyrylo Tkachov wrote:

Hi Andre,


On 2 Oct 2024, at 19:13, Andre Vieira  wrote:


Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-03-13 Thread Qing Zhao


> On Mar 13, 2025, at 12:29, Martin Uecker  wrote:
> 
> On Thursday, 13.03.2025 at 15:41 +, Qing Zhao wrote:
>> 
>>> On Mar 12, 2025, at 12:40, Martin Uecker  wrote:
>>> 
>>> On Wednesday, 12.03.2025 at 16:20 +, Qing Zhao wrote:
 
> On Mar 10, 2025, at 15:34, Martin Uecker  wrote:
> 
> On Monday, 10.03.2025 at 15:00 -0400, John McCall wrote:
>> 
> 
> ...
> 
>> That said, my preference is still to just give preference to the field 
>> name,
>> which sidesteps any need for disambiguation syntax and avoids this whole
>> problem where structs can be broken by just adding a global variable that
>> happens to collide with a field.
> 
> I don't think it is a good idea when the 'n' in 'buf' refers to the
> previous global 'n' coming before and the 'n' in attribute 
> refers to a member 'n' coming later in the following example.
> 
> constexpr int n = 1;
> 
> struct foo {
> char *p [[gnu::counted_by(n)]];
> char buf[n];
> int n;
> };
> 
> How are you going to explain this to anyone?
> 
> 
> And neither global names nor struct members may always be under
> the control of the programmer.  Also that simply bringing
> a new identifier into scope can break code elsewhere worries me.
> 
> 
> Finally, the following does not even compile in C++.
> 
> struct foo {
> char buf[n];
> const static int n = 2;
> };
> 
> While the next example is also ok in C++.
> 
> constexpr int n = 2;
> 
> struct foo {
> char buf[n];
> };
> 
> With both declarations of 'n' the example has UB in C++. 
> So I am not convinced the proposed rules make a lot
> of sense for C++ either.
> 
> 
> Disambiguation with '__self__.'  completely avoids all these issues
> while keeping the door open for later improvements.  
> 
> I still think one could use designator syntax, i.e. '.n', which
> would be clearer and intuitive for both C and C++ programmers.
 
>>>> I think the major reason to use __self.n instead of .n is:
>>>> 
>>>> The dot (.) operator, i.e., the member access operator in C, is used to
>>>> access the member of an _instance_ of a structure/union.
>>>> We should declare a variable with a structure type first, and then append
>>>> this member access operator to this variable, followed by the member
>>>> name to access the member, and then use it in the expressions.
>>> 
>>> For a designator
>>> 
>>> struct foo { int n; } a = { .n = 1 };
>>> 
>>> we also refer to a member 'n' of an instance 'a' of a structure type.
>>> The instance is simply implied by the context.
>>> 
>>> For 
>>> 
>>> struct foo { int n; char *x __counted_by(.n); };
>>> 
>>> it also refers to a member of an instance of the struct. The
>>> instance is the 'a' which is later used in an expression 'a.x'
>>> So the instance would again be implied by the context.
>>> 
>>> So for me this makes perfect sense in both cases (and
>>> for both C and C++)
>> 
>> Why does ‘.n' also make sense in C++?
> 
> From my perspective, it makes sense because C++ also already
> uses this syntax of designators, so this syntax should already be
> familiar to C++ programmers just like it is for C programmers: 
> https://godbolt.org/z/7saEofhEb

Okay. Makes sense to me too. :-)
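
A short sketch of the existing designator syntax being referred to, assuming C++20 (older modes may accept it only as an extension). The names are hypothetical; the point is that '.n' always names the member, even when a global 'n' is in scope.

```cpp
// A global that happens to share its name with a member.
constexpr int n = 42;

struct foo
{
  int n;
  char tag;
};

// '.n' designates the member of the object being initialized; the
// plain 'n' on the right-hand side is the global.
constexpr foo a = { .n = n, .tag = 'x' };

static_assert(a.n == 42);
static_assert(a.tag == 'x');
```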

> 
> It would also disambiguate the name lookup just as it does in C.
> 
> So it seems to be a possible way forward while avoiding
> language divergence and without introducing anything too novel
> in either language.

Yes, I agree.  This looks like a good compromise between the two languages.
> 
> (But others still have concerns about .n and prefer __self__.)

Yeah, need to think about this a little more.

Qing


> 
> Martin
> 
> 
> 



Re: The COBOL front end, version 3, now in 14 easy pieces (C++14)

2025-03-13 Thread James K. Lowden
On Thu, 13 Mar 2025 13:37:32 -0400
Paul Koning  wrote:

> > 4.  cast pointers formatted with %p as (void*)
> 
> Could that be (const void *) instead?

Yes.  Nothing is committed yet; I'll make that change first.  

Could you explain why it matters, please, for my edification?  
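
One possible reason, offered here as an assumption rather than as Paul's answer: a pointer to const converts implicitly to const void* but not to void*, so casting to const void* works for every object pointer without discarding a qualifier. A hypothetical helper shows the shape:

```cpp
#include <cstdio>
#include <type_traits>

// A pointer to const converts to const void*, but not to void*.
static_assert(std::is_convertible_v<const int*, const void*>);
static_assert(!std::is_convertible_v<const int*, void*>);

// Prints a pointer value; static_cast suffices because no qualifier
// is dropped.  Returns what printf returns.
inline int print_pointer(const int* p)
{
  return std::printf("%p\n", static_cast<const void*>(p));
}
```

With a C-style (void*) cast the const would be silently cast away, which compilers can flag with -Wcast-qual.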

--jkl



Re: [PATCH][v3] Simple cobol.dg testsuite

2025-03-13 Thread James K. Lowden
On Tue, 11 Mar 2025 14:40:19 +0100 (CET)
Richard Biener  wrote:

> > Looking at pretty much all of the above, it seems very Fortran
> > specific with its weird diagnostic output (capital Warning:/Error:,
> > the (1) and (2) in the diagnostics with later printing of those
> > lines and the like. Unless cobol1 does it the same, this should be
> > replaced with what is done for other FEs (likely almost nothing).
> 
> OK, I've done that and amended the set of testcases with one
> exercising dg-error.  I had to prune the spurious
> 
> cobol1: error: failed compiling t.cob
> 
> message we emit.  I don't see any warnings emitted from the frontend
> and wasn't able to create a Cobol program where the middle-end 
> would emit one.
> 
> I'll wait for an ACK from the Cobol folks before pushing.

ok by me, and thanks.

--jkl


Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-03-13 Thread Qing Zhao


> On Mar 12, 2025, at 12:58, Joseph Myers  wrote:
> 
> On Wed, 12 Mar 2025, Martin Uecker wrote:
> 
>> For a designator
>> 
>> struct foo { int n; } a = { .n = 1 };
>> 
>> we also refer to a member 'n' of an instance 'a' of a structure type.
>> The instance is simply implied by the context.
>> 
>> For 
>> 
>> struct foo { int n; char *x __counted_by(.n) };
>> 
>> it also refers to a member of an instance of the struct. The
>> instance is the 'a' which is later used in an expression 'a.x'.
>> So the instance would again be implied by the context.
>> 
>> So for me this makes perfect sense in both cases (and
>> for both C and C++)
> 
> The main concern with the designator syntax is if you try to embed it in 
> arbitrary expressions (that is, say that __counted_by takes an expression, 
> but with an additional kind of primary-expression .IDENTIFIER that can be 
> used as a sub-expression therein).  The above is fine, but
> 
> struct foo { int n; char *x __counted_by((struct bar){.n = 1}.n) };
> 
> leaves an ambiguity of whether ".n = 1" is a designated initializer in the 
> struct bar compound literal, or an assignment expression where .n refers 
> to the member of the struct foo for which the number of elements of x is 
> being counted.  Note that N3188 definitely does not allow .IDENTIFIER as 
> part of an expression, only as an alternative to an expression in an array 
> declarator.

Is N3188 the following:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3188.htm

What’s the status of this proposal?

Qing

> 
> -- 
> Joseph S. Myers
> josmy...@redhat.com
> 



[PATCH, part2, committed] Fortran: improve checking of substring bounds [PR119118]

2025-03-13 Thread Harald Anlauf

Dear all,

while testing other things using code from the initial commit,
I found that I had copy-pasted erroneous code containing a loop
that never updates its control variable (and so can run forever),
and a switch statement with an unhandled REF_INQUIRY case.

Fixed after regtesting and committed as simple and obvious, also in
the function I plagiarized it from... ;-)

See r15-8040-ga5d56278d145d4 and attached.

Thanks,
Harald

Am 06.03.25 um 23:00 schrieb Steve Kargl:

On Thu, Mar 06, 2025 at 10:49:08PM +0100, Harald Anlauf wrote:


Thanks for the speedy review!



It was a bit easier than normal.  After I submitted
the PR, I started to poke around in fortran/resolve.cc
to see if I could deal with the issue.  I saw that you
grabbed the PR last night, and left you to work your
magic.

From a5d56278d145d439092adf6f65c865c85623f881 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 13 Mar 2025 18:46:54 +0100
Subject: [PATCH] Fortran: improve checking of substring bounds [PR119118]

Commit r15-7873 copy-pasted erroneous code containing a non-terminating
loop that did not progress its control variable, and a switch statement
with an unhandled case leading to a gcc_unreachable () with suitable input.

	PR fortran/119118

gcc/fortran/ChangeLog:

	* dependency.cc (contains_forall_index_p): Let loop over elements
	of a constructor update its control variable.  Handle REF_INQUIRY
	in switch statement.
	(gfc_contains_implied_index_p): Likewise.

gcc/testsuite/ChangeLog:

	* gfortran.dg/bounds_check_26.f90: Update test.
---
 gcc/fortran/dependency.cc | 6 --
 gcc/testsuite/gfortran.dg/bounds_check_26.f90 | 4 +++-
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/fortran/dependency.cc b/gcc/fortran/dependency.cc
index 28b872f6638..57c0c49391b 100644
--- a/gcc/fortran/dependency.cc
+++ b/gcc/fortran/dependency.cc
@@ -1853,7 +1853,7 @@ contains_forall_index_p (gfc_expr *expr)
 case EXPR_STRUCTURE:
 case EXPR_ARRAY:
   for (c = gfc_constructor_first (expr->value.constructor);
-	   c; gfc_constructor_next (c))
+	   c; c = gfc_constructor_next (c))
 	if (contains_forall_index_p (c->expr))
 	  return true;
   break;
@@ -1874,6 +1874,7 @@ contains_forall_index_p (gfc_expr *expr)
 	break;
 
   case REF_COMPONENT:
+  case REF_INQUIRY:
 	break;
 
   case REF_SUBSTRING:
@@ -1933,7 +1934,7 @@ gfc_contains_implied_index_p (gfc_expr *expr)
 case EXPR_STRUCTURE:
 case EXPR_ARRAY:
   for (c = gfc_constructor_first (expr->value.constructor);
-	   c; gfc_constructor_next (c))
+	   c; c = gfc_constructor_next (c))
 	if (gfc_contains_implied_index_p (c->expr))
 	  return true;
   break;
@@ -1954,6 +1955,7 @@ gfc_contains_implied_index_p (gfc_expr *expr)
 	break;
 
   case REF_COMPONENT:
+  case REF_INQUIRY:
 	break;
 
   case REF_SUBSTRING:
diff --git a/gcc/testsuite/gfortran.dg/bounds_check_26.f90 b/gcc/testsuite/gfortran.dg/bounds_check_26.f90
index 69ac9fbe2f2..ddfcbd07f3c 100644
--- a/gcc/testsuite/gfortran.dg/bounds_check_26.f90
+++ b/gcc/testsuite/gfortran.dg/bounds_check_26.f90
@@ -19,6 +19,8 @@ program main
   print *,  str(-n:11)  ! 2 checked bounds
   print *, len (str(-n:11)) ! 2 checked bounds
 
+  print *,  str(-n*n%kind:sum(n-[0,n%kind])) ! 2 checked bounds
+
 end program main
 
-! { dg-final { scan-tree-dump-times "Substring out of bounds:" 10 "original" } }
+! { dg-final { scan-tree-dump-times "Substring out of bounds:" 12 "original" } }
-- 
2.43.0
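
The two fixes in the patch above can be sketched outside of GCC as follows.
This is a stand-alone illustration under stated assumptions, not gfortran
code: node, next_node, ref_kind and ref_needs_scan are hypothetical stand-ins
for the constructor list, gfc_constructor_next and the REF_* switch.

```cpp
#include <cstdlib>

// Hypothetical singly linked list standing in for a constructor list.
struct node {
  int value;
  node *next;
};

// Returns the successor; like an iterator helper, it does not modify 'n'.
node *next_node(node *n) { return n ? n->next : nullptr; }

// The buggy form was 'for (c = head; c; next_node (c))': the successor is
// computed and thrown away, so 'c' never changes and the loop cannot end
// for a non-empty list.  Assigning the result back fixes it.
int count_nodes(node *head) {
  int count = 0;
  for (node *c = head; c; c = next_node(c))
    ++count;
  return count;
}

// Hypothetical enumeration mirroring the kind of switch in the patch.
enum ref_kind { REF_ARRAY, REF_COMPONENT, REF_INQUIRY, REF_SUBSTRING };

// Every enumerator needs a case; one that is forgotten falls through to
// the abort below, which is the analogue of hitting gcc_unreachable ().
bool ref_needs_scan(ref_kind kind) {
  switch (kind) {
    case REF_COMPONENT:
    case REF_INQUIRY:  // the case that was missing
      return false;
    case REF_ARRAY:
    case REF_SUBSTRING:
      return true;
  }
  std::abort();
}
```

With a three-element list count_nodes returns 3, and ref_needs_scan
(REF_INQUIRY) returns false instead of aborting.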



Re: [PATCH] libstdc++: Use __is_invocable/nothrow_invocable builtins more

2025-03-13 Thread Jonathan Wakely
On Thu, 6 Mar 2025 at 16:29, Jonathan Wakely  wrote:
>
> On Wed, 29 Jan 2025 at 18:55, Patrick Palka wrote:
> >
> > Tested on x86_64-pc-linux-gnu, I suppose this is stage 1 material?
>
> Yes, I think so, since it could impact C++17 programs.
>
> OK for trunk as soon as stage 1 opens though, thanks.

Oh I just realised we're already using the built-in for
is_nothrow_invocable, so this just extends it to the internal
__is_nothrow_invocable and the variable template. Sorry for not
realising the context of this.

So it shouldn't be risky. It's just completing the changes in the
commits you mentioned below.

OK for trunk now, rather than waiting for stage 1.

>
>
> >
> > -- >8 --
> >
> > As a follow-up to r15-1253 and r15-1254.
> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/std/type_traits (__is_invocable): Define in terms of
> > corresponding builtin if available.
> > (__is_nothrow_invocable): Likewise.
> > (is_invocable_v): Likewise.
> > (is_nothrow_invocable_v): Likewise.
> > ---
> >  libstdc++-v3/include/std/type_traits | 19 ++-
> >  1 file changed, 18 insertions(+), 1 deletion(-)
> >
> > diff --git a/libstdc++-v3/include/std/type_traits 
> > b/libstdc++-v3/include/std/type_traits
> > index 33892818257..f576d5b1426 100644
> > --- a/libstdc++-v3/include/std/type_traits
> > +++ b/libstdc++-v3/include/std/type_traits
> > @@ -3211,7 +3211,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >
> >template
> >  struct __is_invocable
> > +#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_invocable)
> > +: __bool_constant<__is_invocable(_Fn, _ArgTypes...)>
> > +#else
> >  : __is_invocable_impl<__invoke_result<_Fn, _ArgTypes...>, void>::type
> > +#endif
> >  { };
> >
> >template
> > @@ -3262,8 +3266,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >// __is_nothrow_invocable (std::is_nothrow_invocable for C++11)
> >template
> >  struct __is_nothrow_invocable
> > +#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_nothrow_invocable)
> > +: __bool_constant<__is_nothrow_invocable(_Fn, _Args...)>
> > +#else
> >  : __and_<__is_invocable<_Fn, _Args...>,
> >   __call_is_nothrow_<_Fn, _Args...>>::type
> > +#endif
> >  { };
> >
> >  #pragma GCC diagnostic push
> > @@ -3702,10 +3710,19 @@ template 
> >inline constexpr bool is_convertible_v = is_convertible<_From, 
> > _To>::value;
> >  #endif
> >  template
> > -  inline constexpr bool is_invocable_v = is_invocable<_Fn, 
> > _Args...>::value;
> > +  inline constexpr bool is_invocable_v
> > +#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_invocable)
> > += __is_invocable(_Fn, _Args...);
> > +#else
> > += is_invocable<_Fn, _Args...>::value;
> > +#endif
> >  template
> >inline constexpr bool is_nothrow_invocable_v
> > +#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_nothrow_invocable)
> > += __is_nothrow_invocable(_Fn, _Args...);
> > +#else
> >  = is_nothrow_invocable<_Fn, _Args...>::value;
> > +#endif
> >  template
> >inline constexpr bool is_invocable_r_v
> >  = is_invocable_r<_Ret, _Fn, _Args...>::value;
> > --
> > 2.48.1.131.gda898a5c64
> >



Re: [PATCH] aarch64: remove SVE2 requirement from SME and diagnose it as unsupported

2025-03-13 Thread Richard Sandiford
"Andre Vieira (lists)"  writes:
> Apologies for the delay, had been waiting on some other relevant patches 
> to go in to make sure we didn't break any valid existing behaviours. It 
> should all be working properly now. I think I've also addressed all your 
> comments.  Most notable change is that it now uses the 'sorry' mechanism.
>
> Bootstrapped and regression tested on aarch64-none-linux-gnu.
>
>
> aarch64: remove SVE2 requirement from SME and diagnose it as unsupported
>
> As per the AArch64 ISA FEAT_SME does not require FEAT_SVE2, so we are 
> removing
> that false dependency in GCC.  However, we chose for now to not support this
> combination of features and will diagnose the combination of FEAT_SME 
> without
> FEAT_SVE2 as unsupported by GCC.  We may choose to support this in the 
> future.
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-arches.def (SME): Remove SVE2 as prerequisite
>   and add in FCMA and F16FML.
>   * config/aarch64/aarch64.cc (aarch64_override_options_internal):
>   Diagnose use of SME without SVE2.
>   * doc/invoke.texi (sme): Document that this can only be used with the
>   sve2 extension.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/no-sve-with-sme-1.c: New.
>   * gcc.target/aarch64/no-sve-with-sme-2.c: New.
>   * gcc.target/aarch64/no-sve-with-sme-3.c: New.
>   * gcc.target/aarch64/sme/streaming_mode_4.c: Diagnose new error.
>   * gcc.target/aarch64/pragma_cpp_predefs_4.c: Pass +sve2 to existing
>   +sme pragma.
>   * gcc.target/aarch64/sve/acle/general-c/binary_int_opt_single_n_2.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/binary_opt_single_n_2.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/binary_single_1.c: Likewise.
>   * 
> gcc.target/aarch64/sve/acle/general-c/binary_za_slice_int_opt_single_1.c:
>   * gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_2.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_3.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_4.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/binary_za_slice_opt_single_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/binary_za_slice_opt_single_2.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/binary_za_slice_opt_single_3.c:
>   Likewise.
>   * 
> gcc.target/aarch64/sve/acle/general-c/binary_za_slice_uint_opt_single_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/binaryxn_2.c: Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/clamp_1.c: Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/compare_scalar_count_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/dot_za_slice_int_lane_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/dot_za_slice_lane_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/dot_za_slice_lane_2.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/dot_za_slice_uint_lane_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/shift_right_imm_narrowxn_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/storexn_1.c: Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_1.c:
>   Likewise.
>   * 
> gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_group_selection_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/ternary_qq_or_011_lane_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/unary_convertxn_1.c: Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrow_1.c:
>   Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/unaryxn_1.c: Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/write_za_1.c: Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/write_za_slice_1.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlalb_lane_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlalb_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlallbb_lane_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlallbb_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlallbt_lane_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlallbt_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlalltb_lane_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlalltb_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlalltt_lane_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlalltt_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlalt_lane_mf8.c: Likewise.
>   * gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c: Likewise.

Thanks for the update.  I'll go through the tests prope

Re: [PATCH] c++: Make explicit instantiations not vague linkage

2025-03-13 Thread Jason Merrill

On 3/13/25 11:16 AM, Nathaniel Shead wrote:

I discovered from some further testing that I broke 'import std' in some
cases with my last patch; this fixes that.

I've found that this still isn't sufficient to fix PR119154 completely, as
there are still more cases where this assert fires due to performing
import_export_decl on non-DECL_REALLY_EXTERN at EOF.  I'll continue
trying to reduce and find each case.

But either way I think this is a needed improvement; bootstrapped and
regtested on x86_64-pc-linux-gnu (so far just modules.exp), OK for trunk
if full regtest succeeds?

-- >8 --

My change in r15-8012 for PR c++/119154 caused a bug with explicit
instantiation declarations.  The change cleared DECL_INTERFACE_KNOWN for
all vague-linkage entities, including explicit instantiations.  When we
then perform lazy loading at EOF (due to processing deferred function
bodies), expand_or_defer_fn ends up calling import_export_decl which
will error because DECL_INTERFACE_KNOWN is still unset but no definition
is available in the file, violating some assertions.

It turns out that for function templates marked inline we would not
respect an 'extern template' imported in general, either; this patch
fixes both of these issues by always treating explicit instantiations as
external, and so marking DECL_INTERFACE_KNOWN eagerly.

For an explicit instantiation declaration we don't want to emit the body
of the function as it must be emitted in a different TU anyway.  And for
explicit instantiation definitions we similarly know that it will have
been emitted in the interface TU we streamed it in from, so there's
no need to emit it.

gcc/cp/ChangeLog:

* decl2.cc (vague_linkage_p): Explicit instantiations are not
vague linkage.


This seems like a big hammer for fixing a modules-specific issue; let's 
address it in the modules code, not here.



gcc/testsuite/ChangeLog:

* g++.dg/modules/extern-tpl-3_a.C: New test.
* g++.dg/modules/extern-tpl-3_b.C: New test.
* g++.dg/modules/extern-tpl-4_a.C: New test.
* g++.dg/modules/extern-tpl-4_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/decl2.cc   |  3 ++
  gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C | 11 +
  gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C | 12 +
  gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C | 24 ++
  gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C | 46 +++
  5 files changed, 96 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C
  create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C
  create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C
  create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 4a9fb1c3c00..712fdc45d40 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -2480,6 +2480,9 @@ vague_linkage_p (tree decl)
/* Unfortunately, import_export_decl has not always been called
   before the function is processed, so we cannot simply check
   DECL_COMDAT.  */
+  if (DECL_LANG_SPECIFIC (decl)
+  && DECL_EXPLICIT_INSTANTIATION (decl))
+return false;
if (DECL_COMDAT (decl)
|| (TREE_CODE (decl) == FUNCTION_DECL
  && DECL_DECLARED_INLINE_P (decl)
diff --git a/gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C 
b/gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C
new file mode 100644
index 000..def3cd1413d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C
@@ -0,0 +1,11 @@
+// { dg-additional-options "-fmodules -Wno-global-module" }
+// { dg-module-cmi M }
+
+module;
+template 
+struct S {
+  S() {}
+};
+export module M;
+extern template class S;
+S s;
diff --git a/gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C 
b/gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C
new file mode 100644
index 000..5d96937ce02
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C
@@ -0,0 +1,12 @@
+// { dg-additional-options "-fmodules" }
+
+template 
+struct S {
+  S() {}
+};
+
+void foo() { S x;}
+
+import M;
+
+// Lazy loading of extern S at EOF should not ICE
diff --git a/gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C 
b/gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C
new file mode 100644
index 000..16f1b041307
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C
@@ -0,0 +1,24 @@
+// { dg-additional-options "-fmodules" }
+// { dg-module-cmi M }
+
+export module M;
+
+export template  inline void a() {}
+extern template void a();
+extern template void a();
+template void a();
+
+export template  void b() {}
+extern template void b();
+extern template void b();
+template void b();
+
+export template  inline int c = 123;
+extern template int c;
+extern template int c;
+template int c;
+
+export template  int d = 123;
+extern template int d;
+extern template int d;
+template int d;
diff --git a/gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C 
b/gcc/testsuite/g++.dg/modules/extern-tp

Re: The COBOL front end, version 3, now in 14 easy pieces (C++14)

2025-03-13 Thread Paul Koning



> On Mar 13, 2025, at 11:27 AM, James K. Lowden  
> wrote:
> 
> On Mon, 10 Mar 2025 19:10:26 +0100
> Richard Biener  wrote:
> 
>>> What is the right answer?  Designated initializers are part of C99,
>>> but weren't added to C++ until C++20
>>> (https://en.cppreference.com/w/cpp/language/initialization).
>>> Strictly speaking, we should remove all of them, because our
>>> baseline is C++14.
>> 
>> Yes.
> 
> I have completed work on this, where "completed" means "9 tests fail".  Bob 
> and I are looking into why, because of course my changes weren't meant to 
> change anything.  
> 
> The changes include
> 
> 1.  remove VLAs in favor of std::vector
> 2.  write constructors for classes containing unions
> 3.  eliminate use of designated initializers
> 4.  cast pointers formatted with %p as (void*)

Could that be (const void *) instead?

paul

Re: [PATCH] c, c++: Support musttail attribute even using __attribute__ form [PR116545]

2025-03-13 Thread Jason Merrill

On 3/13/25 11:27 AM, Jakub Jelinek wrote:

Hi!

Apparently some programs in the wild use
#if __has_attribute(musttail)
   __attribute__((musttail)) return foo ();
#else
   return foo ();
#endif
clang supports musttail both as a standard attribute ([[clang::musttail]]
which we also support for compatibility) and the above worked just
fine with GCC 14 which had __has_attribute(musttail) 0.  Now that it is
non-zero, this doesn't compile anymore.
So, either we need to ensure that __has_attribute(musttail) is 0
and just __has_c{,pp}_attribute({gnu,clang}::musttail) are non-zero,
or IMHO better we just make it work in the attribute form, especially for
C < C23 I can see why some projects would prefer that form.
After all, [[gnu::musttail]] is rejected as an error in C11 etc. before GCC 15,
rather than just being handled as an unknown attribute.
I view this as both a regression and compatibility issue.
The patch handles it in similar spots to fallthrough/assume attributes
inside of __attribute__.

While working on it, I've noticed we weren't diagnosing arguments to the
clang::musttail attribute (fixed by the c-attribs.cc hunk) and newly
on the __attribute__ form attribute (in that case the arguments aren't just
skipped, they are always parsed and because we don't call decl_attributes
etc., it wouldn't be diagnosed without a manual check).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2025-03-13  Jakub Jelinek  

PR c/116545
gcc/
* doc/extend.texi (musttail statement attribute): Document
that musttail GNU attribute can be used as well.
gcc/c-family/
* c-attribs.cc (c_common_clang_attributes): Add musttail.
gcc/c/
* c-parser.cc (c_parser_declaration_or_fndef): Parse
__attribute__((musttail)) return.
(c_parser_handle_musttail): Diagnose attribute arguments.
(c_parser_statement_after_labels): Parse
__attribute__((musttail)) return.
gcc/cp/
* parser.cc (cp_parser_expression_statement): Parse
__attribute__((musttail)) return.


Parsing a jump-statement under cp_parser_expression_statement just 
because it happens to start with __attribute is pretty strange.


How about changing cp_parser_std_attribute_spec_seq in 
cp_parser_statement to cp_parser_attributes_opt?


Jason



Re: [PATCH] COBOL v3: 3/14 80K bld: config and build machinery

2025-03-13 Thread Paul Koning



> On Mar 13, 2025, at 2:33 PM, James K. Lowden  wrote:
> 
> ...
> 3.  --enable-languages=cobol, and
> 4.  the host and target are "plausible", 64-bit LE.

Why does it need LE?  I understand 64 bits.  I might try it on my PowerPC based 
Mac.  :-)

paul



Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-03-13 Thread JeanHeyd Meneide
On Thu, Mar 13, 2025 Martin Uecker  wrote:

> ...
>
> So it seems to be a possible way forward while avoiding
> language divergence and without introducing anything too novel
> in either language.
>
> (But others still have concerns about .n and prefer __self__.)


 I would like to gently push back about __self__, or __self, or self,
because all of these identifiers are fairly common identifiers in code.
When I was writing the paper for __self_func (
https://thephd.dev/_vendor/future_cxx/papers/C%20-%20__self_func.html ), I
searched GitHub and other source code indexing and repository services:
__self, __self__, and self all have a substantial number of uses. If there's an
alternative spelling to consider, I think that would be helpful.

I would also like to offer that other people have approached me about
`::` as a way to help disambiguate identifiers and prevent local shadowing
in macros ( see: https://github.com/ThePhD/future_cxx/issues/69 ). However,
I don't think it helps with the case of this GCC extension:

int main () {
int n = 1; // a local variable n
struct foo {
int n; // a member variable n
int a[n + 10];  // for VLA, this n refers to the local variable n.
//char *b __attribute__ ((counted_by(n + 10)))
// for counted_by, this n refers to the member variable n.
};
}

  If you use `::n`, this allows you to reference a global variable. But
the contentious `n` here isn't a global variable, it's a local. So it's not
of much help here. If you stack another "n" at the global scope, you then
have another problem:

extern int n;
int main () {
int n = 1; // a local variable n, shadows global
struct foo {
int n; // a member variable n
int a[n + 10];  // for VLA, this n refers to the local variable n.
//char *b __attribute__ ((counted_by(n + 10)))
// for counted_by, this n refers to the member variable n.
};
}

Now, even if you use C++-style `::n`, and then use the rules proposed by
context-sensitivity, it becomes impossible to refer to the local variable
outside of the struct without additional annotation. You get the opposite
of this problem with `${KEYWORD}.n` (${KEYWORD} as a placeholder for
__self, since I still have the above-named problems with __self): it
enables referring to the structure variable with ${KEYWORD}, and the local
variable with nothing, but still leaves the global variable
non-referenceable. Part of this problem is self-inflicted: VLAs in
structures are a GNU extension and not an ISO C feature (for reasons like
this one). But it's still technically a problem, and we can't necessarily
step on GCC's affordance to make an extension in this space, so whatever we
come up with we will have both problems to fix.
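
For comparison, here is how the three scopes resolve in C++ as it exists
today. This is only a sketch of current C++ lookup with hypothetical names
(lookup_demo, member_n, global_n), not a statement about what counted_by
should do in C:

```cpp
int n = 100;  // global n

int lookup_demo() {
  int n = 1;  // local n, shadows the global

  struct foo {
    int n;  // member n

    // Unqualified lookup inside the class finds the member first.
    int member_n() const { return n; }

    // '::n' always names the global, regardless of shadowing.
    int global_n() const { return ::n; }

    // No spelling available here names the enclosing function's local n.
  };

  foo f{7};
  // Outside the struct, plain 'n' is the local variable again.
  return n + f.member_n() + f.global_n();
}
```

lookup_demo() adds the local (1), the member (7) and the global (100), which
shows that `::n` reaches the global but nothing reaches the local from inside
the struct.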

 I see 2 plausible ways forward, though I've only thought about this
for 4 days:

 (0) Accept that Yeoul (and the others) are correct in that issuing an
error (diagnostic) for this case would be better. Effectively, it's just
bad code and you ask the user to change the local variable from e.g. `n`,
which is something they should have control over (theoretically). Then,
standardize `::n` to refer to the global. The local variable could have a
different name, the name in the structure might be similar to a global (but
is found by counted_by's lookup), and the global variable has to be named
explicitly with `::n`. This does not necessarily solve the forward
reference problem, but all solutions proposed here require delayed
resolution (especially to deal with the common struct case), so this seems
like a moot point in-general.

 (1) Accept that we need ${KEYWORD}, or ${DOT} , to refer to locals.
This does not solve the problem where a local variable shadows a global
variable, so even if this path is taken I would still suggest `::n` to go
with it, so that we can solve the problem where a local variable shadows a
global variable. Then there's no new real "lookup rule", so people who feel
like we're violating C's core design space might feel less uneasy because
you have to use the new syntax (a keyword or `.`) to access in-struct
things. This still has a forward reference problem, so it's once again moot
whether or not the forward reference problem can be solved here.

 The (0) solution can be seen as more "natural"; there's no dots, no
keyword, but it requires a potential change in local variables for
conflicting cases. `::global` comes along for the ride as the way to
separate member fields from globals. I could see this working and, as I
understand it, this is the choice Clang was currently progressing with (?).

 The (1) solution can be seen as less "natural"; it requires extra
syntax to say what is, overwhelmingly, the common use case and ISO
standard-supported use case to make way for a pathological GNU extension in
VLA members. It becomes a bit more natural if you use {DOT}, rather than
{KEYWORD}, thanks to designated initializers being in both Standard

Re: [PATCH] aarch64: arm: testsuite: Enable vld1xN and vst1xN tests [PR71233]

2025-03-13 Thread Christophe Lyon
On Tue, 11 Mar 2025 at 17:21, Christophe Lyon
 wrote:
>
> r14-7202-gc8ec3e1327cb1e added vld1xN and vst1xN intrinsics and some
> tests on arm, but didn't enable some existing tests.
>
> Since these tests are shared with aarch64, this patch replaces the
> 'dg-skip-if "unimplemented" { arm*-*-* }' directives with:
> dg-require-effective-target arm_neon_ok { target arm*-*-* }
> dg-add-options arm_neon
>
> so that we add the options needed on arm only when targeting arm.
>
> float16 intrinsics would require neon-fp16 FPU, poly64 intrinsics
> would require crypto-neon-fp-armv8: the patch enables the
> corresponding tests on aarch64 only, since they are already covered by
> other tests in gcc.target/arm/simd/.  For some reason, poly64 tests
> were missing from x2 and x3 tests, so the patch adds them as needed.
>
> Tested on aarch64-linux-gnu (no change) and arm-linux-gnueabihf (the
> additional tests pass).
>
> gcc/testsuite/
> PR target/71233
> * gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Enable on arm.
> * gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: Likewise.
> * gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Likewise.
> * gcc.target/aarch64/advsimd-intrinsics/vst1x2.c: Likewise.
> * gcc.target/aarch64/advsimd-intrinsics/vst1x3.c: Likewise.
> * gcc.target/aarch64/advsimd-intrinsics/vst1x4.c: Likewise.
> ---
>  .../gcc.target/aarch64/advsimd-intrinsics/vld1x2.c  | 11 ---
>  .../gcc.target/aarch64/advsimd-intrinsics/vld1x3.c  | 11 ---
>  .../gcc.target/aarch64/advsimd-intrinsics/vld1x4.c  | 13 -
>  .../gcc.target/aarch64/advsimd-intrinsics/vst1x2.c  | 11 ---
>  .../gcc.target/aarch64/advsimd-intrinsics/vst1x3.c  | 11 ---
>  .../gcc.target/aarch64/advsimd-intrinsics/vst1x4.c  | 13 -
>  6 files changed, 48 insertions(+), 22 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c 
> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
> index 6e56ff171f8..bc2aad09a9c 100644
> --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x2.c
> @@ -1,7 +1,8 @@
>  /* We haven't implemented these intrinsics for arm yet.  */
>  /* { dg-do run } */
> -/* { dg-skip-if "unimplemented" { arm*-*-* } } */
>  /* { dg-options "-O3" } */
> +/* { dg-require-effective-target arm_neon_ok { target arm*-*-* } } */
> +/* { dg-add-options arm_neon } */

Sorry, this does not work when we run the tests with flags setting
arch/cpu to M-profile without multilib.

For instance, if we force:
-mthumb -march=armv8-m.main+dsp+fp -mtune=cortex-m33 -mfloat-abi=hard -mfpu=auto
on a toolchain configured using:
--disable-multilib --with-mode=thumb --with-cpu=cortex-m33 --with-float=hard
arm_neon_ok computes that it needs to add:
-mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a
but we fail to link the testcase because we do not have crt0/libs in
softfp mode.

Sigh :-)

Christophe


>
>  #include 
>  #include "arm-neon-ref.h"
> @@ -40,7 +41,6 @@ VARIANT (int32, 2, _s32)  \
>  VARIANT (int64, 1, _s64)   \
>  VARIANT (poly8, 8, _p8)\
>  VARIANT (poly16, 4, _p16)  \
> -VARIANT (float16, 4, _f16) \
>  VARIANT (float32, 2, _f32) \
>  VARIANT (uint8, 16, q_u8)  \
>  VARIANT (uint16, 8, q_u16) \
> @@ -52,11 +52,16 @@ VARIANT (int32, 4, q_s32)   \
>  VARIANT (int64, 2, q_s64)  \
>  VARIANT (poly8, 16, q_p8)  \
>  VARIANT (poly16, 8, q_p16) \
> -VARIANT (float16, 8, q_f16)\
>  VARIANT (float32, 4, q_f32)
>
> +/* On arm, poly64 and float16 have dedicated tests
> +   (gcc.target/arm/simd/vld1*) */
>  #ifdef __aarch64__
>  #define VARIANTS(VARIANT) VARIANTS_1(VARIANT)  \
> +VARIANT (poly64, 1, _p64)  \
> +VARIANT (poly64, 2, q_p64) \
> +VARIANT (float16, 4, _f16) \
> +VARIANT (float16, 8, q_f16)\
>  VARIANT (mfloat8, 8, _mf8) \
>  VARIANT (mfloat8, 16, q_mf8)   \
>  VARIANT (float64, 1, _f64) \
> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c 
> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
> index 42aeadf1c7d..5b86ed32930 100644
> --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vld1x3.c
> @@ -1,7 +1,8 @@
>  /* We haven't implemented these intrinsics for arm yet.  */
>  /* { dg-do run } */
> -/* { dg-skip-if "unimplemented" { arm*-*-* } } */
>  /* { dg-options "-O3" } */
> +/* { dg-require-effective-target arm_neon_ok { target arm*-*-* } } */
> +/* { dg-add-options arm_neon } */
>
>  #include 
>  #include "arm-neon-ref.h"
> @@ -41,7 +42,6 @@ VARIANT (int32, 2, _s32)  \
>  VARIANT (int64, 1, _s64)   \
>  VARIANT (poly8, 8, _p8)\
>  VARIANT (poly16, 4, _p16)  \
> -VARIANT (float16, 

Re: [PATCH] libstdc++: Work around C++20 tuple> constraint recursion [PR116440]

2025-03-13 Thread Ville Voutilainen
On Thu, 13 Mar 2025 at 23:28, Patrick Palka  wrote:
> > Oh, never mind. The pack is just deduced as an empty pack.
>
> Yep that's my understanding, though I don't know where in the standard
> this is specified, a quick Ctrl+F is failing me.

I'll go with https://eel.is/c++draft/temp.variadic#9, and in particular
https://eel.is/c++draft/temp.variadic#9.sentence-4

> I can use template or template if that's
> preferred :)

I think I would prefer template, that wouldn't cause
the pack-deduction
hallucinations. :)
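
For readers following along, here is a minimal sketch of the two spellings being compared. The types pack_ctor and default_arg_ctor are hypothetical and are not the libstdc++ code; the sketch only shows that a constructor template stays callable whether its template parameter is a trailing pack that ends up empty or a parameter with a default argument.

```cpp
#include <cassert>

// Hypothetical types for illustration only; not the libstdc++ code.
template<typename... Elements>
struct pack_ctor
{
  // The trailing pack Dummy cannot be deduced from the arguments,
  // so it is deduced as empty and the constructor stays callable.
  template<typename... Dummy>
    constexpr explicit
    pack_ctor(const Elements&...) : dummy_count(sizeof...(Dummy)) { }

  unsigned dummy_count;
};

template<typename... Elements>
struct default_arg_ctor
{
  // The default template argument makes it more obvious how this is called.
  template<typename = void>
    constexpr explicit
    default_arg_ctor(const Elements&...) : called(true) { }

  bool called;
};

static_assert(pack_ctor<int, long>(1, 2L).dummy_count == 0, "empty pack");
static_assert(default_arg_ctor<int, long>(1, 2L).called, "default argument");
```

Both forms behave the same for callers; the preference expressed in the thread is about readability, not semantics.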


[c-family] Fix PR ada/119265

2025-03-13 Thread Eric Botcazou
This plugs a small loophole in the pattern matching done by -fdump-ada-spec.

Tested on x86-64/Linux, applied on mainline, 14 and 13 branches.


2025-03-13  Eric Botcazou  

PR ada/119265
* c-ada-spec.cc (dump_ada_node) : Deal with typedefs
of unsigned __int128.

-- 
Eric Botcazou

diff --git a/gcc/c-family/c-ada-spec.cc b/gcc/c-family/c-ada-spec.cc
index 152fb2093df..c7ae032230a 100644
--- a/gcc/c-family/c-ada-spec.cc
+++ b/gcc/c-family/c-ada-spec.cc
@@ -2255,8 +2255,8 @@ dump_ada_node (pretty_printer *pp, tree node, tree type, int spc,
 case BOOLEAN_TYPE:
   if (TYPE_NAME (node)
 	  && !(TREE_CODE (TYPE_NAME (node)) == TYPE_DECL
-	   && !strcmp (IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (node))),
-			   "__int128")))
+	   && !strncmp (IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (node))),
+			   "__int128", 8)))
 	{
 	  if (TREE_CODE (TYPE_NAME (node)) == IDENTIFIER_NODE)
 	pp_ada_tree_identifier (pp, TYPE_NAME (node), node,
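
As a rough illustration of what the strcmp to strncmp change does, the sketch below compares an exact match against a prefix match on the first 8 characters. The helper names are hypothetical, and the spelling "__int128 unsigned" for the unsigned variant is an assumption made for the example, not something stated in the patch.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helpers; the names are not from c-ada-spec.cc.

// Old condition: only the exact name "__int128" matches.
static bool matches_exact(const char *name)
{ return std::strcmp(name, "__int128") == 0; }

// New condition: any name whose first 8 characters are "__int128" matches.
static bool matches_prefix(const char *name)
{ return std::strncmp(name, "__int128", 8) == 0; }
```

With these, a name like "__int128 unsigned" (assumed spelling) is rejected by matches_exact but accepted by matches_prefix, which is presumably the loophole being plugged.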


[PATCH] COBOL: Prevent use of ASM_EXPR for optimized COBOL compilations [PR119214]

2025-03-13 Thread Robert Dubner
Based on the PR119214 discussion about "-O -ftracer" causing the assembler
to fail, I offer the following patch.

Okay for trunk?  (Gives me shivers to say that the first time!)

Author: Robert Dubner 
Date:   Thu Mar 13 21:03:46 2025 -0400

COBOL: Prevent use of ASM_EXPR for optimized COBOL compilations
[PR119214]

The creation of assembler labels using ASM_EXPR causes name collisions in the
assembly language because some optimizations repeat code, and those labels
can get repeated. Use of "if( !optimize )" prevents (at least) that problem
when it cropped up with "-O -ftracer".

gcc/cobol/ChangeLog:

PR cobol/119214
* prevent use of ASM_EXPR for optimized cobol compilations
This temporary fix will be replaced with a LABEL_DECL solution

diff --git a/gcc/cobol/gengen.cc b/gcc/cobol/gengen.cc
index 4fc0a830c1e..e4331204d0a 100644
--- a/gcc/cobol/gengen.cc
+++ b/gcc/cobol/gengen.cc
@@ -3429,30 +3429,35 @@ gg_trans_unit_var_decl(const char *var_name)
 void
 gg_insert_into_assembler(const char *format, ...)
   {
-  // This routine inserts text directly into the assembly language stream.
-
-  // Note that if for some reason your text has to have a '%' character, it
-  // needs to be doubled in the GENERIC tag.  And that means if it is in the
-  // 'format' variable, it needs to be quadrupled.
+  // Temporarily defeat all ASM_EXPR for optimized code per PR119214
+  // The correct solution using LABEL_DECL is forthcoming
+  if( !optimize )
+{
+// This routine inserts text directly into the assembly language stream.
+
+// Note that if for some reason your text has to have a '%' character, it
+// needs to be doubled in the GENERIC tag.  And that means if it is in the
+// 'format' variable, it needs to be quadrupled.
+
+// Create the string to be inserted:
+char ach[256];
+va_list ap;
+va_start(ap, format);
+vsnprintf(ach, sizeof(ach), format, ap);
+va_end(ap);
+
+// Create the required generic tag
+tree asm_expr = build5_loc( location_from_lineno(),
+ASM_EXPR,
+VOID,
+build_string(strlen(ach), ach),
+NULL_TREE,
+NULL_TREE,
+NULL_TREE,
+NULL_TREE);
+//SET_EXPR_LOCATION (asm_expr, UNKNOWN_LOCATION);

-  // Create the string to be inserted:
-  char ach[256];
-  va_list ap;
-  va_start(ap, format);
-  vsnprintf(ach, sizeof(ach), format, ap);
-  va_end(ap);
-
-  // Create the required generic tag
-  tree asm_expr = build5_loc( location_from_lineno(),
-  ASM_EXPR,
-  VOID,
-  build_string(strlen(ach), ach),
-  NULL_TREE,
-  NULL_TREE,
-  NULL_TREE,
-  NULL_TREE);
-  //SET_EXPR_LOCATION (asm_expr, UNKNOWN_LOCATION);
-
-  // And insert it as a statement
-  gg_append_statement(asm_expr);
+// And insert it as a statement
+gg_append_statement(asm_expr);
+}
   }
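
The shape of the patched routine can be sketched outside the compiler as follows. This is only a model under stated assumptions: optimize_level stands in for GCC's global "optimize", and the vector "emitted" stands in for the statement list that gg_append_statement fills; neither name is from gengen.cc.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-ins: optimize_level models GCC's global 'optimize',
// and 'emitted' models the statement list the real routine appends to.
static int optimize_level = 0;
static std::vector<std::string> emitted;

void insert_into_assembler(const char *format, ...)
{
  // Mirror the temporary fix: emit nothing when optimizing.
  if (optimize_level)
    return;

  // Format into a fixed buffer; vsnprintf truncates anything past 255 chars.
  char buf[256];
  va_list ap;
  va_start(ap, format);
  std::vsnprintf(buf, sizeof(buf), format, ap);
  va_end(ap);

  emitted.push_back(buf);
}
```

Note that in this model, as in the patch, optimized builds silently lose the inserted text altogether, which is why the commit message calls it a temporary fix.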


Re: [PATCH] libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]

2025-03-13 Thread Ville Voutilainen
On Fri, 14 Mar 2025 at 00:04, Ville Voutilainen
 wrote:
>
> On Fri, 14 Mar 2025 at 00:03, Ville Voutilainen
>  wrote:
> >
> > On Thu, 13 Mar 2025 at 23:57, Jonathan Wakely  wrote:
> > > > Do we also want to constrain the tuple(const _Elements&...)
> > > > constructor with requires sizeof...(_Elements) >= 1, which is present
> > > > on the C++17 version?
> > >
> > > Oh we don't need that constraint, because we have an explicit
> > > specialization for tuple<>.
> >
> > Are.. ..you sure? What about CTAD? That will look into the primary
> > template only, and will perform its madness based on that
> > and nothing else.
>
> Oh, but CTAD would fail if there's no arguments passed.
>
> So.. maybe we never needed that constraint, and the only reason I
> added it was not trusting my (or anyone else's) understanding
> of partial ordering and overload resolution.

Chances are of course that that's not it, and there's been some
over-eagerness in the compiler to instantiate
those declarations for the template/specialization overload
resolution, and then some of the original meta-helpers
were instantiated with empty packs. They didn't like that. We used
that constraint to send that evaluation to
the false-case of what's now _TupleConstraints.


Re: [PATCH] COBOL v3: 3/14 80K bld: config and build machinery

2025-03-13 Thread James K. Lowden
On Thu, 13 Mar 2025 14:45:23 -0400
Paul Koning  wrote:

> > 3.  --enable-languages=cobol, and
> > 4.  the host and target are "plausible", 64-bit LE.
> 
> Why does it need LE?  I understand 64 bits.  I might try it on my
> PowerPC based Mac.  :-)

That's a good point.  "Need" I don't know.  I'm not even sure we test
for it to determine "plausible".  Thanks for pointing that out.  

Talking out of school, I think Bob would say if it's never been tried,
it's more likely than not to fail.  For speed of computation, we start
off with minimal size integers, as small as one byte (which the user
may define), and scale up the intermediate value as needs dictate.  How
that all behaves on a big endian system is a phenomenon yet
unobserved.  
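
To make the concern concrete, here is a generic sketch of one way byte order can matter when a small integer is widened. It is not taken from the COBOL front end; all three function names are hypothetical, and the sketch assumes n is at most 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helpers; none of these come from the COBOL front end.

// Portable: interpret n bytes (n <= 8) as a little-endian unsigned value.
std::uint64_t widen_le(const unsigned char *p, std::size_t n)
{
  std::uint64_t v = 0;
  for (std::size_t i = n; i-- > 0;)
    v = (v << 8) | p[i];
  return v;
}

// Host-order dependent: copy n bytes (n <= 8) to the start of a wider object.
std::uint64_t widen_memcpy(const unsigned char *p, std::size_t n)
{
  std::uint64_t v = 0;
  std::memcpy(&v, p, n);
  return v;
}

// True when the lowest-addressed byte holds the least significant bits.
bool host_is_little_endian()
{
  std::uint32_t one = 1;
  unsigned char first;
  std::memcpy(&first, &one, 1);
  return first == 1;
}
```

On a little-endian host both widening functions agree; on a big-endian host widen_memcpy places the bytes at the most significant end, so code written in that style would need attention.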

On your old Mac (I had one, too) we could start on Monday and check
back on Tuesday.  ;-)

Seriously, once things settle down (they will, right?) we'll get access
to the cfarm and see where the bits fly.  

--jkl



Re: [PATCH] libstdc++: Work around C++20 tuple<tuple<any>> constraint recursion [PR116440]

2025-03-13 Thread Jonathan Wakely
On Thu, 13 Mar 2025 at 21:54, Jonathan Wakely  wrote:
>
> On Thu, 13 Mar 2025 at 21:29, Patrick Palka  wrote:
> >
> > On Thu, 13 Mar 2025, Ville Voutilainen wrote:
> >
> > > On Thu, 13 Mar 2025 at 23:16, Ville Voutilainen
> > >  wrote:
> > > >
> > > > On Thu, 13 Mar 2025 at 23:03, Patrick Palka  wrote:
> > > > > +  // Defined as a template to work around PR libstdc++/116440.
> > > > > +  template
> > > > > +   constexpr explicit(!__convertible())
> > > > > +   tuple(const _Elements&... __elements)
> > > >
> > > > I don't understand how a constructor template declared like this can
> > > > ever be called. The template parameter pack
> > > > can't be provided or deduced, and can't have a default. So we're
> > > > effectively making this signature always lose
> > > > overload resolution to the one that takes a pack of _UElements&&.
> > > >
> > > > Which may be fine. I can't head-compile a test that would fail in that
> > > > case. If any of the incoming argument isn't one
> > > > of _Elements, that constructor wins overload resolution anyway. If the
> > > > incoming arguments are exactly _Elements, that
> > > > constructor does the same thing as this one. I think.
> > >
> > > Oh, never mind. The pack is just deduced as an empty pack.
> >
> > Yep that's my understanding, though I don't know where in the standard
> > this is specified, a quick Ctrl+F is failing me.
> >
> > I can use template or template if that's
> > preferred :)
>
> I would prefer template to the empty pack, I think
> the default template argument makes it a little more obvious how that
> constructor can be called (I'm sure Ville won't be the only one to
> raise an eyebrow at that).
>
> Thanks for figuring this out, and noticing that the template-ness
> of that constructor is what changed between C++17 and C++20. I think
> when I re-implemented it using concepts I assumed the template-ness
> was there for the _ImplicitCtor / _ExplicitCtor stuff, which is done
> using explicit(bool) in C++20. I wasn't looking at the tuple(const
> _Elements&...) constructor at all, because the errors all pointed to
> tuple(_UTypes&&...).
>
> Do we also want to constrain the tuple(const _Elements&...)
> constructor with requires sizeof...(_Elements) >= 1, which is present
> on the C++17 version?

Oh we don't need that constraint, because we have an explicit
specialization for tuple<>.
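
A minimal sketch of that point, using a hypothetical mini_tuple rather than std::tuple: once an explicit specialization exists for the empty pack, naming mini_tuple<> never selects the primary template, so the primary's constructor is not the one considered for that type. The sketch covers only which class definition is selected and says nothing about CTAD.

```cpp
#include <cassert>

// Hypothetical type for illustration only; not std::tuple.
template<typename... Ts>
struct mini_tuple
{
  static constexpr bool is_primary = true;

  // No "sizeof...(Ts) >= 1" constraint here.
  constexpr explicit mini_tuple(const Ts&...) { }
};

// Explicit specialization for the empty pack.
template<>
struct mini_tuple<>
{
  static constexpr bool is_primary = false;

  constexpr mini_tuple() = default;
};

static_assert(mini_tuple<int>::is_primary, "primary template used");
static_assert(!mini_tuple<>::is_primary, "explicit specialization used");
```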



Re: c++/modules: Stream section, tls_model, and comdat_group

2025-03-13 Thread Jason Merrill

On 3/12/25 10:57 AM, Nathaniel Shead wrote:
> On Mon, Mar 10, 2025 at 02:52:07PM -0400, Jason Merrill wrote:
> > On 3/10/25 9:52 AM, Nathaniel Shead wrote:
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > Or should this wait for GCC16?
> > >
> > > -- >8 --
> > >
> > > While looking at PR c++/119154 I noticed some more properties that we
> > > don't stream or check properly.  This patch adds the ones we were
> > > missing, and adds checks that the values don't clash with existing
> > > decls.
> >
> > These seem to me like properties that should be recomputed rather than
> > streamed; aren't we already streaming the section attribute?
>
> Hm, right; we're streaming the attributes but in general we don't
> reapply any of the effects they have that aren't streamed regardless.
> We should probably do that somehow; it looks like using decl_attributes
> directly for this might do too much so I suppose we would need another
> interface for this.

I'd probably go ahead trying to use decl_attributes and deal with any
problems.

> This is probably the way to go to fix PR108080, too; rather than messing
> around with trying to work out how to stream OPTIMIZATION_NODEs etc. we
> can just reapply the attribute, which also would probably have the
> expected behaviour in the case of mismatching optimisation flags between
> exporter and consumer.
>
> > This comment also applies to the existing streaming of tls_model.
>
> I assume we can just use 'decl_default_tls_model' in the no-attribute
> case?  Is there anything we should worry about wrt to it potentially
> providing different results in different modules (due to different
> choices of -ftls-model)?

I think then you get what you get, just as with headers.  If it breaks,
you shouldn't build your project with inconsistent ABI flags.

> Also, while reworking my other patch I'm now semi-convinced that there's
> no need to stream comdat group, it should be recalculated correctly
> anyway in all cases that I could find.  But I definitely might have
> missed something; either way it hasn't bitten us yet.
>
> Nathaniel





Re: [PATCH v2] c++/modules: Always treat explicit instantiations as external

2025-03-13 Thread Nathaniel Shead
On Thu, Mar 13, 2025 at 01:37:37PM -0400, Jason Merrill wrote:
> On 3/13/25 11:16 AM, Nathaniel Shead wrote:
> > I discovered from some further testing that I broke 'import std' in some
> > cases with my last patch; this fixes that.
> > 
> > This still isn't sufficient I've found to fix PR119154 completely, as
> > there's still more cases where this assert fires due to performing
> > import_export_decl on non-DECL_REALLY_EXTERN at EOF.  I'll continue
> > trying to reduce and find each case.
> > 
> > But either way I think this is a needed improvement; bootstrapped and
> > regtested on x86_64-pc-linux-gnu (so far just modules.exp), OK for trunk
> > if full regtest succeeds?
> > 
> > -- >8 --
> > 
> > My change in r15-8012 for PR c++/119154 caused a bug with explicit
> > instantiation declarations.  The change cleared DECL_INTERFACE_KNOWN for
> > all vague-linkage entities, including explicit instantiations.  When we
> > then perform lazy loading at EOF (due to processing deferred function
> > bodies), expand_or_defer_fn ends up calling import_export_decl which
> > will error because DECL_INTERFACE_KNOWN is still unset but no definition
> > is available in the file, violating some assertions.
> > 
> > It turns out that for function templates marked inline we would not
> > respect an 'extern template' imported in general, either; this patch
> > fixes both of these issues by always treating explicit instantiations as
> > external, and so marking DECL_INTERFACE_KNOWN eagerly.
> > 
> > For an explicit instantiation declaration we don't want to emit the body
> > of the function as it must be emitted in a different TU anyway.  And for
> > explicit instantiation definitions we similarly know that it will have
> > been emitted in the interface TU we streamed it in from, so there's
> > no need to emit it.
> > 
> > gcc/cp/ChangeLog:
> > 
> > * decl2.cc (vague_linkage_p): Explicit instantiations are not
> > vague linkage.
> 
> This seems like a big hammer for fixing a modules-specific issue; let's
> address it in the modules code, not here.

Fair enough, like this then?

Tested modules.exp so far, OK for trunk if full bootstrap and regtest
succeeds?

-- >8 --

Subject: [PATCH] c++/modules: Always treat explicit instantiations as external

My change in r15-8012 for PR c++/119154 caused a bug with explicit
instantiation declarations.  The change cleared DECL_INTERFACE_KNOWN for
all vague-linkage entities, including explicit instantiations.  When we
then perform lazy loading at EOF (due to processing deferred function
bodies), expand_or_defer_fn ends up calling import_export_decl which
will error because DECL_INTERFACE_KNOWN is still unset but no definition
is available in the file, violating some assertions.

It turns out that for function templates marked inline we would not
respect an 'extern template' imported in general, either; this patch
fixes both of these issues by always treating explicit instantiations as
external, and so marking DECL_INTERFACE_KNOWN eagerly.

For an explicit instantiation declaration we don't want to emit the body
of the function as it must be emitted in a different TU anyway.  And for
explicit instantiation definitions we similarly know that it will have
been emitted in the interface TU we streamed it in from, so there's
no need to emit it.
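
For reference, the two kinds of explicit instantiation discussed above look like this in plain (non-modules) C++. The function template "twice" is hypothetical; both forms are placed in one translation unit only so that the sketch links on its own, whereas normally the definition would live in a single other TU.

```cpp
#include <cassert>

// Hypothetical inline function template, used only for illustration.
template<typename T>
inline T twice(T x) { return x + x; }

// Explicit instantiation declaration ("extern template"): promises that
// twice<int> is instantiated elsewhere, so this TU need not emit it.
extern template int twice<int>(int);

// Explicit instantiation definition: emits twice<int> here.  When both
// appear in the same TU, the definition must follow the declaration.
template int twice<int>(int);
```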

gcc/cp/ChangeLog:

* module.cc (trees_out::core_bools): Always consider explicit
instantiations as external.

gcc/testsuite/ChangeLog:

* g++.dg/modules/extern-tpl-3_a.C: New test.
* g++.dg/modules/extern-tpl-3_b.C: New test.
* g++.dg/modules/extern-tpl-4_a.C: New test.
* g++.dg/modules/extern-tpl-4_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/module.cc  | 11 -
 gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C | 11 +
 gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C | 12 +
 gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C | 24 ++
 gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C | 46 +++
 5 files changed, 102 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-3_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-3_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-4_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/extern-tpl-4_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 0d9e50bba7f..7440a9015b4 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -5660,7 +5660,11 @@ trees_out::core_bools (tree t, bits_out& bits)
   we need to import or export any vague-linkage entities on
   stream-in.  */
bool interface_known = t->decl_common.lang_flag_5;
-   if (interface_known && vague_linkage_p (t))
+   if (interface_known && vague_linkage_p (t)
+   /* But explicit instantiations are not vague linkage; we can always
+  rely on there being a definition in another TU.  */
+   && !(DECL_LANG_SPECIFIC (t)
+  

[PATCH 2/3] Aarch64: Add __sqrt and __sqrtf intrinsics to arm_acle.h

2025-03-13 Thread Ayan Shafqat
This patch introduces two new inline functions, __sqrt and __sqrtf, in
arm_acle.h for Aarch64 targets. These functions wrap the new builtins
__builtin_aarch64_sqrtdf and __builtin_aarch64_sqrtsf, respectively,
providing direct access to hardware instructions without relying on the
standard math library or optimization levels.

gcc/ChangeLog:

* config/aarch64/arm_acle.h (__sqrt, __sqrtf): New functions.

Signed-off-by: Ayan Shafqat 
---
 gcc/config/aarch64/arm_acle.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
index 7976c117daf..d972a4e7e7e 100644
--- a/gcc/config/aarch64/arm_acle.h
+++ b/gcc/config/aarch64/arm_acle.h
@@ -118,6 +118,20 @@ __revl (unsigned long __value)
 return __rev (__value);
 }
 
+__extension__ extern __inline double
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__sqrt(double  __x)
+{
+return __builtin_aarch64_sqrtdf (__x);
+}
+
+__extension__ extern __inline float
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__sqrtf(float __x)
+{
+return __builtin_aarch64_sqrtsf (__x);
+}
+
 #pragma GCC push_options
 #pragma GCC target ("+nothing+jscvt")
 __extension__ extern __inline int32_t
-- 
2.43.0
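
To illustrate the idea of reaching the hardware square-root instruction regardless of math-library or optimization settings, here is a rough standalone sketch. It does not use the builtins from this patch series; hw_sqrt and hw_sqrtf are hypothetical names, the AArch64 path uses GNU inline asm as an assumption about how one might do this by hand, and other targets fall back to std::sqrt.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names; these are not the arm_acle.h functions from the patch.
inline double hw_sqrt(double x)
{
#if defined(__aarch64__) && defined(__GNUC__)
  double r;
  // Emit FSQRT on double-precision registers directly.
  __asm__ ("fsqrt %d0, %d1" : "=w" (r) : "w" (x));
  return r;
#else
  return std::sqrt(x);  // fallback on other targets
#endif
}

inline float hw_sqrtf(float x)
{
#if defined(__aarch64__) && defined(__GNUC__)
  float r;
  // Emit FSQRT on single-precision registers directly.
  __asm__ ("fsqrt %s0, %s1" : "=w" (r) : "w" (x));
  return r;
#else
  return std::sqrt(x);  // fallback on other targets
#endif
}
```

A builtin-based wrapper, as proposed in the patch, has the advantage over inline asm that the compiler can still reason about the operation.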


