date:20250407

Re: [Patch, fortran] PR119460 - gfortran.dg/reduce_1.f90 FAILs

2025-04-07 Thread Christophe Lyon

On Sun, 6 Apr 2025 at 14:39, Paul Richard Thomas
 wrote:
>
> Hi All,
>
> As far as I can tell, the attached patch fixes the problems with the reduce 
> intrinsic. I would be grateful to the reporters if they would confirm that 
> this is the case.
>
> The key to the fix appears in reduce_3.f90, which failed even with -m64. 
> Although it was not apparent from the tree dump, the scalar result was going 
> on the stack. Once it became larger than the word size, it pushed the 
> arguments out of alignment with the library prototype.
>
> I took the opportunity to add character length checking to the library. I 
> think that it might be redundant and so might not appear in the submitted 
> version. Thus far, I have failed to trigger the errors because the frontend 
> seems to catch them all. reduce_c and reduce_scalar_c will look a lot neater 
> without them.
>
> Harald has been enormously helpful in hunting out remaining problems and 
> providing fixes. These are woven into the patch.
>
> Regtests on FC41/x86_64 - OK for mainline after confirmations from the 
> reporters?
>

Hi!

I've just verified that all gfortran.dg/reduce_*.f90 now pass on
arm-unknown-linux-gnueabihf.

Thanks,

Christophe


> Paul
>

[PATCH] libstdc++: Fix use-after-free in std::format [PR119671]

2025-04-07 Thread Jonathan Wakely

When formatting floating-point values to wide strings there's a case
where we invalidate a std::wstring buffer while a std::wstring_view is
still referring to it.

libstdc++-v3/ChangeLog:

PR libstdc++/119671
* include/std/format (__formatter_fp::format): Do not invalidate
__wstr unless _M_localized returns a valid string.
* testsuite/std/format/functions/format.cc: Check wide string
formatting of floating-point types with classic locale.
---

Tested x86_64-linux.

 libstdc++-v3/include/std/format  |  6 +++---
 .../testsuite/std/format/functions/format.cc | 12 
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 01a53143d1c..2e9319cdda6 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -1838,9 +1838,9 @@ namespace __format
 
  if (_M_spec._M_localized && __builtin_isfinite(__v))
{
- __wstr = _M_localize(__str, __expc, __fc.locale());
- if (!__wstr.empty())
-   __str = __wstr;
+ auto __s = _M_localize(__str, __expc, __fc.locale());
+ if (!__s.empty())
+   __str = __wstr = std::move(__s);
}
 
  size_t __width = _M_spec._M_get_width(__fc);
diff --git a/libstdc++-v3/testsuite/std/format/functions/format.cc 
b/libstdc++-v3/testsuite/std/format/functions/format.cc
index 000f2671816..93c33b456e6 100644
--- a/libstdc++-v3/testsuite/std/format/functions/format.cc
+++ b/libstdc++-v3/testsuite/std/format/functions/format.cc
@@ -370,6 +370,18 @@ test_wchar()
   // P2909R4 Fix formatting of code units as integers (Dude, where’s my char?)
   s = std::format(L"{:d} {:d}", wchar_t(-1), char(-1));
   VERIFY( s.find('-') == std::wstring::npos );
+
+  auto ws = std::format(L"{:L}", 0.5);
+  VERIFY( ws == L"0.5" );
+  // The default C locale.
+  std::locale cloc = std::locale::classic();
+  // PR libstdc++/119671 use-after-free formatting floating-point to wstring
+  ws = std::format(cloc, L"{:L}", 0.5);
+  VERIFY( ws == L"0.5" );
+  // A locale with no name, but with the same facets as the C locale.
+  std::locale locx(cloc, &std::use_facet>(cloc));
+  ws = std::format(locx, L"{:L}", 0.5);
+  VERIFY( ws == L"0.5" );
 }
 
 void
-- 
2.49.0
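
The lifetime problem this patch fixes can be shown outside libstdc++. The following is only a minimal sketch, not the library code: `localize` and `format_value` are hypothetical stand-ins for `_M_localize` and the relevant part of `__formatter_fp::format`, and the comma substitution is invented so that the non-empty path does something visible.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for _M_localize: returns an empty string when no
// localized form is needed, otherwise a newly built string.
std::wstring localize(std::wstring_view in, bool use_comma)
{
  if (!use_comma)
    return {};
  std::wstring out(in);
  for (wchar_t& c : out)
    if (c == L'.')
      c = L',';
  return out;
}

// Hypothetical stand-in for the formatting step touched by the patch.
std::wstring format_value(std::wstring buffer, bool use_comma)
{
  std::wstring_view view = buffer;  // refers to buffer's storage

  // Buggy shape (not used here): buffer = localize(view, use_comma);
  // An empty result would replace buffer's contents while view still
  // refers to the old characters.

  // Fixed shape: localize into a temporary and only replace the buffer
  // (and re-point the view) when there is a result to keep.
  auto tmp = localize(view, use_comma);
  if (!tmp.empty())
    view = buffer = std::move(tmp);

  return std::wstring(view);
}
```

With these stand-ins, `format_value(L"0.5", false)` leaves the original buffer untouched, which corresponds to the classic-locale case the new test exercises.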

Re:[pushed] [PATCH v3] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng


Pushed to r15-9245 and r14-11538.

On 2025/4/7 at 3:44 PM, Lulu Cheng wrote:

In GCC 14, LoongArch added __float128 as an alias for _Float128.
Commit r15-8962 added support for q/Q suffixes for 128-bit floating-point
numbers.  This causes the compiler to link libquadmath automatically
when compiling Fortran programs.  But on LoongArch `long double` is
IEEE quad, so there is no need to implement libquadmath.
As a result, linking fails.

PR target/119408

libgfortran/ChangeLog:

* acinclude.m4: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

libquadmath/ChangeLog:

* configure.ac: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

Signed-off-by: Xi Ruoyao
Signed-off-by: Jakub Jelinek

---
v1 -> v2:
Corrected typos in commit information.
v2 -> v3:
Regenerate libgfortran/configure using gnu autoconf2.69.
---
  libgfortran/acinclude.m4 | 4 
  libgfortran/configure| 8 
  libquadmath/configure| 8 
  libquadmath/configure.ac | 4 
  4 files changed, 24 insertions(+)

diff --git a/libgfortran/acinclude.m4 b/libgfortran/acinclude.m4
index a73207e5465..23fd621e518 100644
--- a/libgfortran/acinclude.m4
+++ b/libgfortran/acinclude.m4
@@ -274,6 +274,10 @@ AC_DEFUN([LIBGFOR_CHECK_FLOAT128], [
AC_CACHE_CHECK([whether we have a usable _Float128 type],
   libgfor_cv_have_float128, [
 GCC_TRY_COMPILE_OR_LINK([
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only for porting existing code easier.
+#endif
+
  _Float128 foo (_Float128 x)
  {
   _Complex _Float128 z1, z2;
diff --git a/libgfortran/configure b/libgfortran/configure
index 11a1bc5f070..9898a94a372 100755
--- a/libgfortran/configure
+++ b/libgfortran/configure
@@ -30283,6 +30283,10 @@ else
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
  /* end confdefs.h.  */
  
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only for porting existing code easier.
+#endif
+
  _Float128 foo (_Float128 x)
  {
   _Complex _Float128 z1, z2;
@@ -30336,6 +30340,10 @@ fi
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
  /* end confdefs.h.  */
  
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only for porting existing code easier.
+#endif
+
  _Float128 foo (_Float128 x)
  {
   _Complex _Float128 z1, z2;
diff --git a/libquadmath/configure b/libquadmath/configure
index 49d70809218..f82dd3d0d6d 100755
--- a/libquadmath/configure
+++ b/libquadmath/configure
@@ -12843,6 +12843,10 @@ else
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
  /* end confdefs.h.  */
  
+#ifdef __loongarch__
+#error  On LoongArch we should use long double instead; __float128 is only for porting existing code easier.
+#endif
+
  #if (!defined(_ARCH_PPC)) || defined(__LONG_DOUBLE_IEEE128__)
  typedef _Complex float __attribute__((mode(TC))) __complex128;
  #else
@@ -12894,6 +12898,10 @@ fi
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
  /* end confdefs.h.  */
  
+#ifdef __loongarch__
+#error  On LoongArch we should use long double instead; __float128 is only for porting existing code easier.
+#endif
+
  #if (!defined(_ARCH_PPC)) || defined(__LONG_DOUBLE_IEEE128__)
  typedef _Complex float __attribute__((mode(TC))) __complex128;
  #else
diff --git a/libquadmath/configure.ac b/libquadmath/configure.ac
index 349be2607c6..c64a8489219 100644
--- a/libquadmath/configure.ac
+++ b/libquadmath/configure.ac
@@ -233,6 +233,10 @@ AM_CONDITIONAL(LIBQUAD_USE_SYMVER_SUN, [test "x$quadmath_use_symver" = xsun])
  
  AC_CACHE_CHECK([whether __float128 is supported], [libquad_cv_have_float128],
[GCC_TRY_COMPILE_OR_LINK([
+#ifdef __loongarch__
+#error  On LoongArch we should use long double instead; __float128 is only for porting existing code easier.
+#endif
+
  #if (!defined(_ARCH_PPC)) || defined(__LONG_DOUBLE_IEEE128__)
  typedef _Complex float __attribute__((mode(TC))) __complex128;
  #else
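
The reason libquadmath is unnecessary on LoongArch is that `long double` already has the IEEE binary128 format there. As a rough illustration (not part of the patch), that property can be checked at compile time with `std::numeric_limits`; the helper name `long_double_is_ieee_quad` is invented for this sketch.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: true when long double looks like IEEE binary128
// (113-bit significand, maximum binary exponent 16384).
constexpr bool long_double_is_ieee_quad()
{
  using lim = std::numeric_limits<long double>;
  return lim::is_iec559
         && lim::radix == 2
         && lim::digits == 113
         && lim::max_exponent == 16384;
}
```

On a target where this returns true, a separate 128-bit type adds no precision over `long double`, which is the situation the `#ifdef __loongarch__` guard above encodes by making the configure check fail.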

[PATCH] deref-before-check-pr113253.c: Fix bogus warnings on lp32

2025-04-07 Thread Jonathan Yong


Attached patch OK for master branch?
Will push soon if there are no objections.
From 66c30c0db9560af4f61ebda0742d0eb7da45f474 Mon Sep 17 00:00:00 2001
From: Jonathan Yong <10wa...@gmail.com>
Date: Mon, 7 Apr 2025 15:40:05 +
Subject: [PATCH] deref-before-check-pr113253.c: Fix bogus warnings on lp32

Warnings about pointer sizes cause the test to fail
incorrectly. A dummy return value is also added to
set_marker_internal for completeness to suppress a
-Wreturn-type warning even though gcc does not issue
it by default.

Signed-off-by: Jonathan Yong <10wa...@gmail.com>

gcc/testsuite/ChangeLog:

	* gcc.dg/analyzer/deref-before-check-pr113253.c:
	(ptrdiff_t): use stddef.h type.
	(uintptr_t): ditto.
	(EMACS_INT): ditto.
	(set_marker_internal): Add dummy 0 to suppress -Wreturn-type.
---
 .../gcc.dg/analyzer/deref-before-check-pr113253.c| 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr113253.c b/gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr113253.c
index d9015accd6a..1890312cc1a 100644
--- a/gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr113253.c
+++ b/gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr113253.c
@@ -5,12 +5,12 @@
 
 /* { dg-additional-options "-O2 -g" } */
 
-typedef long int ptrdiff_t;
-typedef unsigned long int uintptr_t;
-typedef long int EMACS_INT;
+typedef __PTRDIFF_TYPE__ ptrdiff_t;
+typedef __UINTPTR_TYPE__ uintptr_t;
+typedef __PTRDIFF_TYPE__ EMACS_INT;
 enum
 {
-  EMACS_INT_WIDTH = 64,
+  EMACS_INT_WIDTH = sizeof(EMACS_INT) * 8,
   VALBITS = EMACS_INT_WIDTH - 3,
 };
 typedef struct Lisp_X* Lisp_Word;
@@ -151,4 +151,5 @@ set_marker_internal(Lisp_Object position, Lisp_Object buffer)
   struct buffer* b = live_buffer(buffer);
   if (NILP(position) || (MARKERP(position) && !XMARKER(position)->buffer) || !b) /* { dg-bogus "Wanalyzer-deref-before-check" } */
 unchain_marker();
+  return 0;
 }
-- 
2.49.0
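
The change replaces hard-coded LP64 assumptions with types and widths derived from the target. A small C++ sketch of the same idea follows; the names `emacs_int`, `emacs_int_width` and `valbits` only mirror the test case and are not taken from any real header.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical alias mirroring EMACS_INT in the test: pointer-difference
// sized on every target, instead of assuming that long is 64 bits.
using emacs_int = std::ptrdiff_t;

// Derive the width from the type rather than hard-coding 64.
constexpr int emacs_int_width = static_cast<int>(sizeof(emacs_int) * CHAR_BIT);
constexpr int valbits = emacs_int_width - 3;

// uintptr_t is defined to round-trip a pointer, whatever the data model.
static_assert(sizeof(std::uintptr_t) >= sizeof(void*),
              "uintptr_t must be able to hold a pointer");
```

On a 32-bit target this gives a width of 32 where the original test assumed 64, so pointer-to-integer conversions no longer change size.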

vsetvl abnormal edge (was Re: [PATCH v2] RISC-V: vsetvl: skip abnormal edge on vsetvl insertion [PR119533])

2025-04-07 Thread Vineet Gupta

On 3/31/25 21:54, Jeff Law wrote:
> And if that's the case then you can't simply skip an abnormal edge.  You 
> have to do something sensible.
>
> That "something sensible" has traditionally been to ensure there is 
> never a need propagated to an edge since you can't insert on an abnormal 
> critical edge.
>
>> [ ... ]
> If there is a need in the block, then every path to the block must have 
> the right value.  That's precisely my point.  You can't simply skip an 
> edge in this case.
>
> What needs to happen is you need to find a way to ensure there is no 
> need at the start of the block that has incoming abnormal/EH edges. 
> This is a classic problem in LCM algorithms.
> See "prune_expressions" in gcse.cc.  I think the moral equivalent for 
> vsetvl generation is to conceptually kill every vsetvl at the entry 
> point to any block that has an incoming abnormal edge.  That should push 
> the insertion point into the block instead of to the incoming edges.

P.S: Apologies for the long email, but I think we agreed on that precedent
already ;-)

OK, it seems vsetvl is using pretty much the same logic as gcse.cc.
As pointed out earlier, invalid_opt_bb_p () does indeed identify such BBs, and
they are removed from the anticipatable and transparent sets for LCM, but that
doesn't seem to be reflected in the LCM output (see details below).
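
To make the pruning idea concrete, here is a deliberately simplified sketch of "kill the need at the entry of any block with an incoming abnormal edge". It does not use GCC's data structures: `Edge`, `prune_for_abnormal_preds` and the one-flag-per-block representation are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, minimal CFG edge: source block, destination block, and
// whether the edge is abnormal (e.g. an EH edge).
struct Edge {
  std::size_t src;
  std::size_t dest;
  bool abnormal;
};

// Given a per-block "anticipatable" flag for one expression, clear it for
// every block that has an incoming abnormal edge.  Nothing can be inserted
// on such an edge, so the need must not be propagated onto it.
std::vector<bool> prune_for_abnormal_preds(std::vector<bool> antic,
                                           const std::vector<Edge>& edges)
{
  for (const Edge& e : edges)
    if (e.abnormal && e.dest < antic.size())
      antic[e.dest] = false;
  return antic;
}
```

In a real LCM setup the same masking would apply to the transparent set as well, and the flags would be bitmaps over all tracked expressions rather than a single boolean per block.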

Thx to my colleague Mark Ryan, a resident Go expert, we have a simpler Go test
now (attached here)

Following is the annotated asm of interest.
 - bb 17 is the one with V insns, needing vsetvl - which LCM tries to "bubble
up" to 15, 16, .. 31
 - bb 38 is the one with the unwinder call - note its dump is code_label/s, so it is
only referenced as an rtl struct (via the EH table, no direct call flow leading to it).

...

    ld    s9,176(sp)
    li    s10,0
    j    .L9

    # (code_label 165 164 166 31 7 (nil) [2 uses])
    # (note 166 165 167 31 [bb 31] NOTE_INSN_BASIC_BLOCK)
.L7:
    addi    s10,s10,1
    beq    s10,s11,.L10

    # (code_label 168 7 73 15 9 (nil) [1 uses])
    # (note 73 168 74 15 [bb 15] NOTE_INSN_BASIC_BLOCK)
.L9:
    slli    a5,s10,4
    add    a5,s9,a5
    ld    a4,0(a5)

    # (note 246 76 77 16 [bb 16] NOTE_INSN_BASIC_BLOCK)
    sd    a4,32(sp)
    ld    a5,8(a5)

    # (note 247 78 79 17 [bb 17] NOTE_INSN_BASIC_BLOCK) <---

    sd    a5,40(sp)
    vsetivli    zero,2,e64,m1,ta,ma    # 405
    vle64.v    v1,0(s6)    # 361
    addi    a5,sp,112
    vsetivli    zero,2,e64,m1,ta,ma    # 406
    vse64.v    v1,0(a5)    # 362
    beq    s0,zero,.L7
    ld    a1,112(sp)
    ld    a2,120(sp)
    mv    a0,s4
    call    runtime.ifaceE2T2P
    sd    a1,136(sp)
    sd    a1,56(sp)

    # (note 248 91 102 19 [bb 19] NOTE_INSN_BASIC_BLOCK)

    sd    a0,128(sp)
    sd    a0,48(sp)
    andi    a1,a1,0xff
    beq    a1,zero,.L7
...
...
...
    # (note 255 152 156 30 [bb 30] NOTE_INSN_BASIC_BLOCK)

    sd    a0,144(sp)
    sd    a0,80(sp)
    sd    a1,152(sp)
    sd    a1,88(sp)
    ld    a5,0(a0)

    # Another instance of [bb 31] created by later bbro
.LEHE1:
    addi    s10,s10,1
    bne    s10,s11,.L9
.L10:
    addi    s1,s1,1
    bne    s1,s2,.L5
    li    a4,0
...
...
    # (code_label/s 256 347 258 38 15 (nil) [1 uses])
    # (note 258 256 222 38 [bb 38] NOTE_INSN_BASIC_BLOCK)   # culprit EH block which leads to EDGE_ABNORMAL
.L15:
.LEHB4:
    call    _Unwind_Resume
.LEHE4:
.LFE1:
    .section    .gcc_except_table,"a",@progbits
...
    .uleb128 .L15-.LFB1
    .uleb128 0
    .uleb128 .LEHB4-.LFB1


We see the following output, where LCM is working its way up (read from the
bottom up):

;; 19 succs { 20 31 }    # P2 lift up 3 (succ 31 was lift up 2)
;; 30 succs { 38 31 }    # P2 lift up 3 (succ 31 was lift up 2)

;; 31 succs { 44 32 }    # P2 lift up 2b (succ 44 was succ of 15 which was lift up 1)
;; 44 succs { 15 }    # P2 lift up 2a (succ 15 was lift up 1)

;; 15 succs { 38 16 }    # P2. lift up 1 (succ 16 was lift up 1)

;; 16 succs { 38 17 }    # P2. lift up 0 (succ 17 had vsetvl locally needed)

;; 17 succs { 18 31 }    # P1. VSETVL locally needed


However, I do have evidence that bb 16, 17 etc. with abnormal edges are being
skipped in LCM phases 2 and 3.
When they are identified, we clear them from the Anticipated and Transparent sets.

Phase 2: Lift up vsetvl info.

  Try lift up 0.

 --- skipping abnormal SUCC edge bb 5 -> bb 38):  --- skipping bb 5):
 --- skipping abnormal SUCC edge bb 6 -> bb 38):  --- skipping bb 6):
 --- skipping abnormal SUCC edge bb 7 -> bb 38):  --- skipping bb 7):
 --- skipping abnormal SUCC edge bb 8 -> bb 38):  --- skipping bb 8):
 --- skipping abnormal SUCC edge bb 9 -> bb 38):  --- skipping bb 9):
 -

Re: [PATCH] libstdc++: Fix use-after-free in std::format [PR119671]

2025-04-07 Thread Tomasz Kaminski

On Tue, Apr 8, 2025 at 1:25 AM Jonathan Wakely  wrote:

> When formatting floating-point values to wide strings there's a case
> where we invalidate a std::wstring buffer while a std::wstring_view is
> still referring to it.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/119671
> * include/std/format (__formatter_fp::format): Do not invalidate
> __wstr unless _M_localized returns a valid string.
> * testsuite/std/format/functions/format.cc: Check wide string
> formatting of floating-point types with classic locale.
> ---
>
> Tested x86_64-linux.
>
LGTM.

>
>  libstdc++-v3/include/std/format  |  6 +++---
>  .../testsuite/std/format/functions/format.cc | 12 
>  2 files changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/format
> b/libstdc++-v3/include/std/format
> index 01a53143d1c..2e9319cdda6 100644
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -1838,9 +1838,9 @@ namespace __format
>
>   if (_M_spec._M_localized && __builtin_isfinite(__v))
> {
> - __wstr = _M_localize(__str, __expc, __fc.locale());
> - if (!__wstr.empty())
> -   __str = __wstr;
> + auto __s = _M_localize(__str, __expc, __fc.locale());
> + if (!__s.empty())
> +   __str = __wstr = std::move(__s);
> }
>
>   size_t __width = _M_spec._M_get_width(__fc);
> diff --git a/libstdc++-v3/testsuite/std/format/functions/format.cc
> b/libstdc++-v3/testsuite/std/format/functions/format.cc
> index 000f2671816..93c33b456e6 100644
> --- a/libstdc++-v3/testsuite/std/format/functions/format.cc
> +++ b/libstdc++-v3/testsuite/std/format/functions/format.cc
> @@ -370,6 +370,18 @@ test_wchar()
>// P2909R4 Fix formatting of code units as integers (Dude, where’s my
> char?)
>s = std::format(L"{:d} {:d}", wchar_t(-1), char(-1));
>VERIFY( s.find('-') == std::wstring::npos );
> +
> +  auto ws = std::format(L"{:L}", 0.5);
> +  VERIFY( ws == L"0.5" );
> +  // The default C locale.
> +  std::locale cloc = std::locale::classic();
> +  // PR libstdc++/119671 use-after-free formatting floating-point to
> wstring
> +  ws = std::format(cloc, L"{:L}", 0.5);
> +  VERIFY( ws == L"0.5" );
> +  // A locale with no name, but with the same facets as the C locale.
> +  std::locale locx(cloc, &std::use_facet>(cloc));
> +  ws = std::format(locx, L"{:L}", 0.5);
> +  VERIFY( ws == L"0.5" );
>  }
>
>  void
> --
> 2.49.0
>
>

Re: [PATCH v5 5/5] libgomp: Add AArch64 SVE target tests to libgomp.

2025-04-07 Thread Tejas Belagod


On 4/7/25 3:33 PM, Jakub Jelinek wrote:

On Mon, Apr 07, 2025 at 03:28:29PM +0530, Tejas Belagod wrote:

Add AArch64 SVE target execute tests to test various workshare constructs and
clauses with SVE types.

libgomp/ChangeLog:

* testsuite/libgomp.c-target/aarch64/aarch64.exp: Test driver.
* testsuite/libgomp.c-target/aarch64/firstprivate.c: New test.
* testsuite/libgomp.c-target/aarch64/lastprivate.c: Likewise.
* testsuite/libgomp.c-target/aarch64/private.c: Likewise.
* testsuite/libgomp.c-target/aarch64/shared.c: Likewise.
* testsuite/libgomp.c-target/aarch64/simd-aligned.c: Likewise.
* testsuite/libgomp.c-target/aarch64/simd-nontemporal.c: Likewise.
* testsuite/libgomp.c-target/aarch64/threadprivate.c: Likewise.
* testsuite/libgomp.c-target/aarch64/udr-sve.c: Likewise.


Ok.



Thanks for the reviews. All patches now applied on trunk.

Thanks,
Tejas.

Re: [PATCH] cobol: Address some iconv issues.

2025-04-07 Thread Iain Sandoe

> On 22 Mar 2025, at 23:13, Robert Dubner  wrote:

> But, by all means, if you have a fix for something I am not seeing, a fix
> that doesn't mess with the status quo ante, then by all means, apply it.

I applied the simplest fix possible - which was to remove the trailing //
from the conversion specifier.

Re-tested on all the platforms with a working implementation and applied
to trunk as attached.

thanks
Iain

0001-cobol-Address-some-iconv-issues.patch
Description: Binary data

Re: [Stage1][Middle-end][object-size][PATCH v1] Evaluate the object size by the size of the pointee type

2025-04-07 Thread Siddhesh Poyarekar


On 2025-04-07 14:53, Qing Zhao wrote:

Is there a reason to do this at the very end like this and not in the 
GIMPLE_ASSIGN case in the switch block?  Something like this:

 tree rhs = gimple_assign_rhs1 (stmt);
 tree counted_by_ref = NULL_TREE;
 if (gimple_assign_rhs_code (stmt) == POINTER_PLUS_EXPR
 || (gimple_assign_rhs_code (stmt) == ADDR_EXPR
 && TREE_CODE (TREE_OPERAND (rhs, 0)) == MEM_REF))
   reexamine = plus_stmt_object_size (osi, var, stmt);
 else if (gimple_assign_rhs_code (stmt) == COND_EXPR)
   reexamine = cond_expr_object_size (osi, var, stmt);
 else if (gimple_assign_single_p (stmt)
  || gimple_assign_unary_nop_p (stmt))
   {
 if (TREE_CODE (rhs) == SSA_NAME
 && POINTER_TYPE_P (TREE_TYPE (rhs)))
   reexamine = merge_object_sizes (osi, var, rhs);
 else
   expr_object_size (osi, var, rhs);
   }
+else if ((counted_by_ref = fam_struct_with_counted_by (rhs)))
+  record_fam_object_size (osi, var, counted_by_ref);
 else
   unknown_object_size (osi, var);

where you can then fold in all your gating conditions, including getting the 
counted_by ref into the fam_struct_with_counted_by and then limit the 
record_fam_object_size to just evaluating the type size + counted_by size.

This may even help avoid the insertion order hack you have to do, i.e. the 
gsi_insert_seq_before vs gsi_insert_seq_after.


This is a good suggestion. I will try this to see whether any issues come up.

My initial thought is to give the counted_by information the lowest priority
if there is other information (for example, malloc) available.

Do you see any issue here?


No, that is the right idea, but I don't know if you'll actually need to 
choose.  AFAICT, you'll either be able to trace the pointer to an 
allocation, in which case the fallback is unnecessary.  Otherwise you'll 
trace it to one of the following:


1. An assignment from an expression that returns the pointer
2. A NOP with a var_decl, which is handled in the GIMPLE_NOP case; you'd 
need to add a similar hook there.


I can't think of any other cases off the top of my head, how about you?


Also, it seems like simply making build_counted_by_ref available may be 
unnecessary and maybe you could explore associating the counted_by 
component_ref with the parent type somehow.  Alternatively, how about building 
an .ACCESS_WITH_SIZE for types with FAM that have a counted_by?  That would 
allow the current access_with_size_object_size() to work out of the box.


I am not sure about this though.

Our initial design is to change every component_ref (which is a reference to
the FAM) in the data flow of the routine that has a counted_by attribute to a
call to .ACCESS_WITH_SIZE, and then put this call to .ACCESS_WITH_SIZE into
the data flow of the routine.

Now, if we try to associate counted_by information with the parent type, how
can we add such information to the data flow of the routine if there is no
explicit reference to the array itself?


I'm not entirely sure, but maybe whenever there is an access on a ptr to 
the parent struct, add a call to .ACCESS_WITH_SIZE there, with a built 
expression for its size?  e.g for:


struct S
{
  size_t c;
  char a[] __counted_by (c);
};

void foo (struct S *s)
{
  ...
  sz = __builtin_dynamic_object_size (s, 0);
  ...
}

we could generate:

void foo (struct S *s)
{
  ...
  sz_exp = c + sizeof (struct S);
  s_1 = .ACCESS_WITH_SIZE (&s..., &c, ...);
  ...
  sz = __builtin_dynamic_object_size (*s_1, 0);
}

or something like that.  But like I said, it's an alternative idea to 
avoid special-casing in tree-object-size, which should provide size 
information across all passes not just for object size.
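
For reference, the size expression in the sketch above (`sz_exp = c + sizeof (struct S)`) amounts to "fixed part plus count times element size". The snippet below spells that out in C++; `Header` and `fam_object_size` are hypothetical names, the flexible array member itself is left out because it is not standard C++, and the exact formula GCC would use (for example, how padding is treated) is an assumption here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the fixed part of struct S (everything before
// the flexible array member).
struct Header {
  std::size_t c;  // the counted_by field
};

// Object size as suggested by the sz_exp expression in the mail:
// fixed part plus count * element size.
constexpr std::size_t fam_object_size(std::size_t header_size,
                                      std::size_t count,
                                      std::size_t elem_size)
{
  return header_size + count * elem_size;
}
```

For a `char` array the element size is 1, so this reduces to the `c + sizeof (struct S)` form used in the example.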


Thanks,
Sid

Re: [PATCH] c: fix checking for a tag for variably modified tagged types [PR119612]

2025-04-07 Thread Joseph Myers

On Sat, 5 Apr 2025, Martin Uecker wrote:

> 
> 
> The checking assertion added for PR118765
> 
> https://gcc.gnu.org/cgit/gcc/commit/?id=accbc1b90bd942aa36ac1485a21056b774ce02df
> 
> did indeed catch some case I hadn't considered.  I think there
> might be other cases in the C FE where we test for !TYPE_NAME
> and where this might be slightly wrong, but I only touched
> cases where this is now exposed by C23.
> 
> Bootstrapped and regression tested for x86_64.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH 1/2] testsuite, cobol: Add libquadmath paths.

2025-04-07 Thread Richard Biener

On Sun, Apr 6, 2025 at 10:03 AM Iain Sandoe  wrote:
>
> Even when we are using IEC 128b floating point, the quadmath library can
> be pulled in 'as needed'.

LGTM.

> gcc/testsuite/ChangeLog:
>
> * lib/cobol.exp: Add libquadmath paths.
>
> Signed-off-by: Iain Sandoe 
> ---
>  gcc/testsuite/lib/cobol.exp | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/gcc/testsuite/lib/cobol.exp b/gcc/testsuite/lib/cobol.exp
> index 968e7f3bca3..6819b930bd1 100644
> --- a/gcc/testsuite/lib/cobol.exp
> +++ b/gcc/testsuite/lib/cobol.exp
> @@ -122,6 +122,15 @@ proc cobol_link_flags { paths } {
> }
> append ld_library_path ":${gccpath}/libgcobol/.libs"
>}
> +  if { [file exists "${gccpath}/libquadmath/.libs/libquadmath.a"] ||
> +[file exists "${gccpath}/libquadmath/.libs/libquadmath.${shlib_ext}"] } {
> +   if { $target_wants_B_option } {
> +  append flags "-B${gccpath}/libquadmath/.libs "
> +   } else {
> +  append flags "-L${gccpath}/libquadmath/.libs "
> +   }
> +   append ld_library_path ":${gccpath}/libquadmath/.libs"
> +  }
>if { [file exists "${gccpath}/libstdc++-v3/src/.libs/libstdc++.a"] ||
>[file exists "${gccpath}/libstdc++-v3/src/.libs/libstdc++.${shlib_ext}"] } {
> if { $target_wants_B_option } {
> --
> 2.39.2 (Apple Git-143)
>

Re: [PATCH] tailc: Extend the IPA-VRP workaround [PR119614]

2025-04-07 Thread Richard Biener

On Mon, 7 Apr 2025, Jakub Jelinek wrote:

> Hi!
> 
> The IPA-VRP workaround in the tailc/musttail passes was just comparing
> the singleton constant from a tail call candidate return with the ret_val.
> This unfortunately doesn't work in the following testcase, where we have
>[local count: 152205050]:
>   baz (); [must tail call]
>   goto ; [100.00%]
> 
>[local count: 762356696]:
>   _8 = foo ();
> 
>[local count: 1073741824]:
>   # _3 = PHI <0B(4), _8(6)>
>   return _3;
> and in the unreduced testcase even more PHIs before we reach the return
> stmt.
> 
> Normally when the call has lhs, whenever we follow a (non-EH) successor
> edge, it calls propagate_through_phis and that walks the PHIs in the
> destination bb of the edge and when it sees a PHI whose argument matches
> that of the currently tracked value (ass_var), it updates ass_var to
> PHI result of that PHI.  I think it is theoretically dangerous that it
> picks the first one, perhaps there could be multiple PHIs, so perhaps safer
> would be walk backwards from the return value up to the call.
> 
> Anyway, this PR is about the IPA-VRP workaround, there ass_var is NULL
> because the potential tail call has no lhs, but ret_var is not TREE_CONSTANT
> but SSA_NAME with PHI as SSA_NAME_DEF_STMT.  The following patch handles
> it by pushing the edges we've walked through when ass_var is NULL into a
> vector and if ret_var is SSA_NAME set to PHI result, it attempts to walk
> back from the ret_var through arguments of PHIs corresponding to the
> edges we've walked back until we reach a constant and compare that constant
> against the singleton value as well.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

I do wonder with all these patches whether it would be better to
preserve the LHS on musttail calls instead?
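
The backwards walk described in the quoted message (from the returned value, through PHI arguments selected by the edges that were traversed, until a constant is reached) can be modelled in a few lines. This is only an illustrative sketch: `Value`, `Edge` and `walk_back` are invented here and do not correspond to GCC's SSA or CFG types, and the patch itself is truncated in this archive, so it should not be read as a transcription of it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <vector>

// Hypothetical CFG edge: an id plus the block it enters.
struct Edge {
  int id;
  int dest;
};

// Hypothetical SSA value: either a constant, or a PHI result defined in
// def_block whose arguments are keyed by incoming edge id.
struct Value {
  bool is_constant = false;
  long constant = 0;
  int def_block = -1;
  std::map<int, std::size_t> phi_args;  // edge id -> index of argument value
};

// Walk from ret_var back through PHIs, consuming the recorded edges from
// last to first, until a constant is found.  Returns nothing if the chain
// cannot be followed.
std::optional<long> walk_back(const std::vector<Value>& values,
                              std::size_t ret_var,
                              const std::vector<Edge>& edges)
{
  std::size_t rv = ret_var;
  std::size_t i = edges.size();
  while (!values[rv].is_constant)
    {
      bool stepped = false;
      for (; i && !stepped; --i)
        {
          const Edge& e = edges[i - 1];
          if (e.dest != values[rv].def_block)
            continue;
          auto it = values[rv].phi_args.find(e.id);
          if (it == values[rv].phi_args.end())
            return std::nullopt;
          rv = it->second;
          stepped = true;
        }
      if (!stepped)
        return std::nullopt;
    }
  return values[rv].constant;
}
```

The caller would then compare the returned constant against the singleton value from IPA-VRP, in the same way the original check compared `ret_var` directly.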

> 2025-04-07  Jakub Jelinek  
> 
>   PR tree-optimization/119614
>   * tree-tailcall.cc (find_tail_calls): Remember edges which have been
>   walked through if !ass_var.  Perform IPA-VRP workaround even when
>   ret_var is not TREE_CONSTANT, in that case check in a loop if it is
>   a PHI result and in that case look at the PHI argument from
>   corresponding edge in the edge vector.
> 
>   * g++.dg/opt/pr119613.C: Change { c || c++11 } in obviously C++ only
>   test to just c++11.
>   * g++.dg/opt/pr119614.C: New test.
> 
> --- gcc/tree-tailcall.cc.jj   2025-04-04 20:52:34.450015821 +0200
> +++ gcc/tree-tailcall.cc  2025-04-05 14:50:50.106693562 +0200
> @@ -920,6 +920,7 @@ find_tail_calls (basic_block bb, struct
>auto_bitmap to_move_defs;
>auto_vec to_move_stmts;
>bool is_noreturn = gimple_call_noreturn_p (call);
> +  auto_vec edges;
>  
>abb = bb;
>agsi = gsi;
> @@ -933,6 +934,8 @@ find_tail_calls (basic_block bb, struct
>   {
> edge e = single_non_eh_succ_edge (abb);
> ass_var = propagate_through_phis (ass_var, e);
> +   if (!ass_var)
> + edges.safe_push (e);
> abb = e->dest;
> agsi = gsi_start_bb (abb);
>   }
> @@ -1040,9 +1043,7 @@ find_tail_calls (basic_block bb, struct
>/* If IPA-VRP proves called function always returns a singleton range,
>the return value is replaced by the only value in that range.
>For tail call purposes, pretend such replacement didn't happen.  */
> -  if (ass_var == NULL_TREE
> -   && !tail_recursion
> -   && TREE_CONSTANT (ret_var))
> +  if (ass_var == NULL_TREE && !tail_recursion)
>   if (tree type = gimple_range_type (call))
> if (tree callee = gimple_call_fndecl (call))
>   if ((INTEGRAL_TYPE_P (type)
> @@ -1052,9 +1053,43 @@ find_tail_calls (basic_block bb, struct
> type)
>   && useless_type_conversion_p (TREE_TYPE (ret_var), type)
>   && ipa_return_value_range (val, callee)
> - && val.singleton_p (&valr)
> - && operand_equal_p (ret_var, valr, 0))
> -   ok = true;
> + && val.singleton_p (&valr))
> +   {
> + tree rv = ret_var;
> + unsigned int i = edges.length ();
> + /* If ret_var is equal to valr, we can tail optimize.  */
> + if (operand_equal_p (ret_var, valr, 0))
> +   ok = true;
> + else
> +   /* Otherwise, if ret_var is a PHI result, try to find out
> +  if valr isn't propagated through PHIs on the path from
> +  call's bb to SSA_NAME_DEF_STMT (ret_var)'s bb.  */
> +   while (TREE_CODE (rv) == SSA_NAME
> +  && gimple_code (SSA_NAME_DEF_STMT (rv)) == GIMPLE_PHI)
> + {
> +   tree nrv = NULL_TREE;
> +   gimple *g = SSA_NAME_DEF_STMT (rv);
> +   for (; i; --i)
> + {
> +   if (edges[i - 1]->dest == gimple_bb (g))
> +

[PATCH] cobol: Fix up make html for COBOL [PR119227]

2025-04-07 Thread Jakub Jelinek

Hi!

What make html does for COBOL is quite inconsistent with all
other FEs.  Normally make html creates HTML/gcc-15.0.1/
subdirectory and puts there subdirectories like gcc, cpp, gccint, gfortran
etc. and only those contain *.html files.  COBOL puts gcobol.html and
gcobol-io.html into the current directory instead.

The following patch puts them into $(build_htmldir)/gcobol/ directory.

Tested on x86_64-linux with make html, ok for trunk?

2025-04-07  Jakub Jelinek  

PR web/119227
* Make-lang.in (GCOBOL_HTML_FILES): New variable.
(cobol.install-html, cobol.html, cobol.srchtml): Use
$(GCOBOL_HTML_FILES) instead of gcobol.html gcobol-io.html.
(gcobol.html): Rename goal to ...
($(build_htmldir)/gcobol/gcobol.html): ... this.  Run mkinstalldirs.
(gcobol-io.html): Rename goal to ...
($(build_htmldir)/gcobol/gcobol-io.html): ... this.  Run mkinstalldirs.

--- gcc/cobol/Make-lang.in.jj   2025-03-31 21:26:51.107135693 +0200
+++ gcc/cobol/Make-lang.in  2025-04-07 12:19:59.451852320 +0200
@@ -40,6 +40,8 @@ GCOBOL_TARGET_INSTALL_NAME := $(target_n
 GCOBC_INSTALL_NAME := $(shell echo gcobc|sed '$(program_transform_name)')
 GCOBC_TARGET_INSTALL_NAME := $(target_noncanonical)-$(shell echo gcobc|sed '$(program_transform_name)')
 
+GCOBOL_HTML_FILES = $(addprefix $(build_htmldir)/gcobol/,gcobol.html gcobol-io.html)
+
 cobol: cobol1$(exeext)
 cobol.serial = cobol1$(exeext)
 .PHONY: cobol
@@ -303,8 +305,8 @@ cobol.install-pdf: installdirs gcobol.pd
 
 cobol.install-plugin:
 
-cobol.install-html: installdirs gcobol.html gcobol-io.html
-   $(INSTALL_DATA) gcobol.html gcobol-io.html $(DESTDIR)$(htmldir)/
+cobol.install-html: installdirs $(GCOBOL_HTML_FILES)
+   $(INSTALL_DATA) $(GCOBOL_HTML_FILES) $(DESTDIR)$(htmldir)/
 
 cobol.info:
 cobol.srcinfo:
@@ -323,14 +325,16 @@ gcobol-io.pdf: $(srcdir)/cobol/gcobol.3
groff -mdoc -T pdf  $^ > $@~
@mv $@~ $@
 
-cobol.html: gcobol.html gcobol-io.html
-cobol.srchtml: gcobol.html gcobol-io.html
+cobol.html: $(GCOBOL_HTML_FILES)
+cobol.srchtml: $(GCOBOL_HTML_FILES)
ln $^ $(srcdir)/cobol/
 
-gcobol.html: $(srcdir)/cobol/gcobol.1
+$(build_htmldir)/gcobol/gcobol.html: $(srcdir)/cobol/gcobol.1
+   $(mkinstalldirs) $(build_htmldir)/gcobol
mandoc -T html $^ > $@~
@mv $@~ $@
-gcobol-io.html: $(srcdir)/cobol/gcobol.3
+$(build_htmldir)/gcobol/gcobol-io.html: $(srcdir)/cobol/gcobol.3
+   $(mkinstalldirs) $(build_htmldir)/gcobol
mandoc -T html $^ > $@~
@mv $@~ $@
 

Jakub

Re: [PATCH v2] aarch64, Darwin: Initial implementation of Apple cores [PR113257].

2025-04-07 Thread Kyrylo Tkachov



> On 7 Apr 2025, at 10:21, Tamar Christina  wrote:
> 
>> -Original Message-
>> From: Kyrylo Tkachov 
>> Sent: Monday, March 31, 2025 1:43 PM
>> To: i...@sandoe.co.uk
>> Cc: Tamar Christina; GCC Patches; Alice Carlotti; Richard Sandiford; s...@gentoo.org
>> Subject: Re: [PATCH v2] aarch64, Darwin: Initial implementation of Apple cores [PR113257].
>> 
>> Hi Iain,
>> 
>>> On 22 Mar 2025, at 15:31, Iain Sandoe  wrote:
>>> 
>>> 0. Sorry this has taken some time to close off; partly because of waiting
>>>  for input, but mostly that I've been stretched with other work.
>>> 1. As per the commit message, the apparent non-conformance with 8.5/6
>>>  because FEAT_SPECRES returns 0, is a result of the query operating
>>>  at user priv.  The cores are confirmed to support this for priv.
>>>  code.
>>> 2. I added entries for the apple-m1,2,3 cores in invoke.texi.
>>> 3. Following Andrew's suggestion and with some measurements by Tamar
>>>  and me, figured out the LITTLE.big chip ids (at least for a
>>>  subset).
>>> 
>>> This has been in use for a while on aarch64-darwin branches and I've
>>> checked manually that it gives the right .arch lines on cfarm185.
>>> 
>>> OK for trunk? (if so, when?)
>>> thanks
>>> Iain
>>> 
>>> --- 8< ---
>>> 
>>> After discussion with the open source support team at Apple, we have
>>> established that the cores conform to the 8.5 and 8.6 requirements.
>>> One of the mandatory features (FEAT_SPECRES) is not exposed (or
>>> available) in user-space code but is supported for privileged code.
>>> 
>>> The values for chip IDs and the LITTLE.big variants have been taken
>>> from lists in the XNU and LLVM sources.
>>> 
>>> PR target/113257
>>> 
>>> gcc/ChangeLog:
>>> 
>>> * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Apple-a12,
>>> Apple-M1, Apple-M2, Apple-M3 with expanded names to allow for the
>>> LITTLE.big versions.
>>> * config/aarch64/aarch64-tune.md: Regenerate.
>>> * doc/invoke.texi: Add apple-m1,2 and 3 cores to the ones listed
>>> for arch and tune selections.
>>> 
>>> Signed-off-by: Iain Sandoe 
>>> ---
>>> gcc/config/aarch64/aarch64-cores.def | 16 
>>> gcc/config/aarch64/aarch64-tune.md   |  2 +-
>>> gcc/doc/invoke.texi  |  5 +++--
>>> 3 files changed, 20 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/gcc/config/aarch64/aarch64-cores.def
>> b/gcc/config/aarch64/aarch64-cores.def
>>> index 0e22d72976e..7f204fd0ac9 100644
>>> --- a/gcc/config/aarch64/aarch64-cores.def
>>> +++ b/gcc/config/aarch64/aarch64-cores.def
>>> @@ -173,6 +173,22 @@ AARCH64_CORE("cortex-a76.cortex-a55",
>> cortexa76cortexa55, cortexa53, V8_2A,  (F
>>> AARCH64_CORE("cortex-r82", cortexr82, cortexa53, V8R, (), cortexa53, 0x41,
>> 0xd15, -1)
>>> AARCH64_CORE("cortex-r82ae", cortexr82ae, cortexa53, V8R, (), cortexa53,
>> 0x41, 0xd14, -1)
>>> 
>>> +/* Apple (A12 and M) cores.
>>> +   Known part numbers as listed in other public sources.
>>> +   Placeholders for schedulers, generic_armv8_a for costs.
>>> +   A12 seems mostly 8.3, M1 is 8.5 without BTI, M2 and M3 are 8.6
>>> +   From measurements made so far the odd-number core IDs are performance.
>> */
>>> +AARCH64_CORE("apple-a12", applea12, cortexa53, V8_3A,  (),
>> generic_armv8_a, 0x61, 0x12, -1)
>>> +AARCH64_CORE("apple-m1", applem1_0, cortexa57, V8_5A,  (),
>> generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x21, 0x20), -1)
>>> +AARCH64_CORE("apple-m1", applem1_1, cortexa57, V8_5A,  (),
>> generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x23, 0x22), -1)
>>> +AARCH64_CORE("apple-m1", applem1_2, cortexa57, V8_5A,  (),
>> generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x25, 0x24), -1)
>>> +AARCH64_CORE("apple-m1", applem1_3, cortexa57, V8_5A,  (),
>> generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x29, 0x28), -1)
>>> +AARCH64_CORE("apple-m2", applem2_0, cortexa57, V8_6A,  (),
>> generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x31, 0x30), -1)
>>> +AARCH64_CORE("apple-m2", applem2_1, cortexa57, V8_6A,  (),
>> generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x33, 0x32), -1)
>>> +AARCH64_CORE("apple-m2", applem2_2, cortexa57, V8_6A,  (),
>> generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x35, 0x34), -1)
>>> +AARCH64_CORE("apple-m2", applem2_3, cortexa57, V8_6A,  (),
>> generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x39, 0x38), -1)
>>> +AARCH64_CORE("apple-m3", applem3_0, cortexa57, V8_6A,  (),
>> generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x49, 0x48), -1)
>> 
>> I don’t think we have a precedent for different MIDR part numbers resolving
>> to the same -mcpu string, but I think it should all work as expected.
> 
> Indeed, I think for the current usage it should work fine.
> 
>> As long as you and Tamar are happy with the feature set here, no objections
>> from me.
> 
> FWIW no objections from me.  This should unblock folks 😊
> 
> Thanks,
> Tamar
> 
>> Looks ok to me for GCC 15 with a documentation comment below…
>> 
>>> +
>>> /* Armv9.0-A Architecture Processors.  */
>>> 
>>> /* Arm ('A') cores. */
>>>
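
For readers following the part-number discussion above, here is a minimal
sketch (not part of the patch) of how the implementer and part-number fields
sit in a MIDR_EL1 value: implementer in bits [31:24], part number in bits
[15:4]. The macro and function names are hypothetical. The odd/even reading
relies only on the patch comment ("From measurements made so far the
odd-number core IDs are performance") and on the M-series pairs listed there;
it is an assumption, not a documented rule.

```c
#include <assert.h>
#include <stdint.h>

/* Field extraction for a 32-bit MIDR_EL1 value.  */
#define MIDR_IMPLEMENTER(midr) (((midr) >> 24) & 0xffu)
#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfffu)

/* Implementer code used for the Apple entries in the patch.  */
#define IMPLEMENTER_APPLE 0x61u

/* Hypothetical helper: returns 1 for an odd part number (performance core
   according to the measurements quoted in the patch comment), 0 for an even
   one, and -1 if the implementer is not Apple.  */
static int
apple_core_is_performance (uint32_t midr)
{
  if (MIDR_IMPLEMENTER (midr) != IMPLEMENTER_APPLE)
    return -1;
  return (int) (MIDR_PARTNUM (midr) & 1u);
}

/* Hypothetical helper: build a MIDR value from implementer and part number,
   leaving variant, architecture and revision as zero.  */
static uint32_t
make_midr (uint32_t implementer, uint32_t partnum)
{
  assert (implementer <= 0xffu && partnum <= 0xfffu);
  return (implementer << 24) | (partnum << 4);
}
```

With the pairs from the patch, make_midr (0x61, 0x23) would be classified as
a performance core and make_midr (0x61, 0x22) as an efficiency core.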

[PATCH v5 5/5] libgomp: Add AArch64 SVE target tests to libgomp.

2025-04-07 Thread Tejas Belagod

Add AArch64 SVE target execute tests to test various workshare constructs and
clauses with SVE types.

libgomp/ChangeLog:

* testsuite/libgomp.c-target/aarch64/aarch64.exp: Test driver.
* testsuite/libgomp.c-target/aarch64/firstprivate.c: New test.
* testsuite/libgomp.c-target/aarch64/lastprivate.c: Likewise.
* testsuite/libgomp.c-target/aarch64/private.c: Likewise.
* testsuite/libgomp.c-target/aarch64/shared.c: Likewise.
* testsuite/libgomp.c-target/aarch64/simd-aligned.c: Likewise.
* testsuite/libgomp.c-target/aarch64/simd-nontemporal.c: Likewise.
* testsuite/libgomp.c-target/aarch64/threadprivate.c: Likewise.
* testsuite/libgomp.c-target/aarch64/udr-sve.c: Likewise.
---
 .../libgomp.c-target/aarch64/aarch64.exp  |  57 
 .../libgomp.c-target/aarch64/firstprivate.c   | 127 +
 .../libgomp.c-target/aarch64/lastprivate.c| 169 +++
 .../libgomp.c-target/aarch64/private.c| 105 +++
 .../libgomp.c-target/aarch64/shared.c | 264 ++
 .../libgomp.c-target/aarch64/simd-aligned.c   |  49 
 .../aarch64/simd-nontemporal.c|  49 
 .../libgomp.c-target/aarch64/threadprivate.c  |  45 +++
 .../libgomp.c-target/aarch64/udr-sve.c|  98 +++
 9 files changed, 963 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/aarch64.exp
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/firstprivate.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/lastprivate.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/private.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/shared.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/simd-aligned.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/simd-nontemporal.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/threadprivate.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/udr-sve.c

diff --git a/libgomp/testsuite/libgomp.c-target/aarch64/aarch64.exp b/libgomp/testsuite/libgomp.c-target/aarch64/aarch64.exp
new file mode 100644
index 000..02d5503c48b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-target/aarch64/aarch64.exp
@@ -0,0 +1,57 @@
+# Copyright (C) 2006-2025 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Load support procs.
+load_lib libgomp-dg.exp
+load_gcc_lib gcc-dg.exp
+
+# Exit immediately if this isn't an AArch64 target.
+if {![istarget aarch64*-*-*] } then {
+  return
+}
+
+lappend ALWAYS_CFLAGS "compiler=$GCC_UNDER_TEST"
+
+if { [check_effective_target_aarch64_sve] } {
+set sve_flags ""
+} else {
+set sve_flags "-march=armv8.2-a+sve"
+}
+
+# Initialize `dg'.
+dg-init
+
+#if ![check_effective_target_fopenmp] {
+#  return
+#}
+
+# Turn on OpenMP.
+lappend ALWAYS_CFLAGS "additional_flags=-fopenmp"
+
+# Gather a list of all tests.
+set tests [lsort [find $srcdir/$subdir *.c]]
+
+set ld_library_path $always_ld_library_path
+append ld_library_path [gcc-set-multilib-library-path $GCC_UNDER_TEST]
+set_ld_library_path_env_vars
+
+# Main loop.
+dg-runtest $tests "" $sve_flags
+
+# All done.
+dg-finish
diff --git a/libgomp/testsuite/libgomp.c-target/aarch64/firstprivate.c b/libgomp/testsuite/libgomp.c-target/aarch64/firstprivate.c
new file mode 100644
index 000..930ca6215b6
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-target/aarch64/firstprivate.c
@@ -0,0 +1,127 @@
+/* { dg-do run { target aarch64_sve256_hw } } */
+/* { dg-options "-msve-vector-bits=256 -fopenmp -O2" } */
+
+#include 
+#include 
+
+static void __attribute__ ((noipa))
+vec_compare (svint32_t *x, svint32_t y)
+{
+  svbool_t p = svnot_b_z (svptrue_b32 (), svcmpeq_s32 (svptrue_b32 (), *x, y));
+
+  if (svptest_any (svptrue_b32 (), p))
+__builtin_abort ();
+}
+
+void __attribute__ ((noipa))
+firstprivate_sections ()
+{
+  int b[8], c[8];
+  svint32_t vb, vc;
+  int i;
+
+#pragma omp parallel for
+  for (i = 0; i < 8; i++)
+{
+  b[i] = i;
+  c[i] = i + 1;
+}
+
+  vb = svld1_s32 (svptrue_b32 (), b);
+  vc = svld1_s32 (svptrue_b32 (), c);
+
+#pragma omp parallel sections firstprivate (vb, vc)
+  {
+#pragma omp section
+vec_compare (&vb, svindex_s32 (0, 1));
+vec_compare (&vc, svindex_s32 (1, 1));
+
+#pragma omp section
+

Re: [PATCH v5 5/5] libgomp: Add AArch64 SVE target tests to libgomp.

2025-04-07 Thread Jakub Jelinek

On Mon, Apr 07, 2025 at 03:28:29PM +0530, Tejas Belagod wrote:
> Add AArch64 SVE target execute tests to test various workshare constructs and
> clauses with SVE types.
> 
> libgomp/ChangeLog:
> 
>   * testsuite/libgomp.c-target/aarch64/aarch64.exp: Test driver.
>   * testsuite/libgomp.c-target/aarch64/firstprivate.c: New test.
>   * testsuite/libgomp.c-target/aarch64/lastprivate.c: Likewise.
>   * testsuite/libgomp.c-target/aarch64/private.c: Likewise.
>   * testsuite/libgomp.c-target/aarch64/shared.c: Likewise.
>   * testsuite/libgomp.c-target/aarch64/simd-aligned.c: Likewise.
>   * testsuite/libgomp.c-target/aarch64/simd-nontemporal.c: Likewise.
>   * testsuite/libgomp.c-target/aarch64/threadprivate.c: Likewise.
>   * testsuite/libgomp.c-target/aarch64/udr-sve.c: Likewise.

Ok.

Jakub

[PATCH v5 0/5] [AArch64, OpenMP] Support SVE types with various OpenMP clauses and constructs

2025-04-07 Thread Tejas Belagod

Hi,

I've combined the two patchsets

  https://gcc.gnu.org/pipermail/gcc-patches/2025-March/678086.html

and the associated tests in
  https://gcc.gnu.org/pipermail/gcc-patches/2025-April/680030.html

into a single patchset with the review comments incorporated. Thanks
Jakub and RichardS for the reviews.

Patches 1-4 have been OKed with the suggested changes. Patch 5 has yet to
get an OK, so I will wait until it is OKed before applying.

Thanks,
Tejas.

Richard Sandiford (1):
  gomp: Various fixes for SVE types [PR101018]

Tejas Belagod (4):
  Add function to strip pointer type and get down to the actual pointee
type.
  AArch64: Diagnose OpenMP offloading when SVE types involved.
  AArch64: Add OpenMP target compile error tests
  libgomp: Add AArch64 SVE target tests to libgomp.

 gcc/config/aarch64/aarch64-sve-builtins.cc|   37 +-
 gcc/fold-const.cc |7 +
 gcc/gimplify.cc   |   60 +-
 gcc/omp-low.cc|2 +-
 gcc/poly-int.h|   19 +
 gcc/target.h  |   37 +-
 .../gcc.target/aarch64/sve/gomp/gomp.exp  |   46 +
 .../aarch64/sve/gomp/target-device.c  |  201 ++
 .../gcc.target/aarch64/sve/gomp/target-link.c |   57 +
 .../gcc.target/aarch64/sve/gomp/target.c  | 2049 +
 gcc/tree.h|   11 +
 .../libgomp.c-target/aarch64/aarch64.exp  |   57 +
 .../libgomp.c-target/aarch64/firstprivate.c   |  127 +
 .../libgomp.c-target/aarch64/lastprivate.c|  169 ++
 .../libgomp.c-target/aarch64/private.c|  105 +
 .../libgomp.c-target/aarch64/shared.c |  264 +++
 .../libgomp.c-target/aarch64/simd-aligned.c   |   49 +
 .../aarch64/simd-nontemporal.c|   49 +
 .../libgomp.c-target/aarch64/threadprivate.c  |   45 +
 .../libgomp.c-target/aarch64/udr-sve.c|   98 +
 20 files changed, 3475 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/gomp/gomp.exp
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/gomp/target-device.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/gomp/target-link.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/gomp/target.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/aarch64.exp
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/firstprivate.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/lastprivate.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/private.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/shared.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/simd-aligned.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/simd-nontemporal.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/threadprivate.c
 create mode 100644 libgomp/testsuite/libgomp.c-target/aarch64/udr-sve.c

-- 
2.25.1

[PATCH v5 4/5] AArch64: Add OpenMP target compile error tests

2025-04-07 Thread Tejas Belagod

Add compile-only OpenMP error tests for target clause used with SVE types.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/gomp/gomp.exp: Test driver.
* gcc.target/aarch64/sve/gomp/target-device.c: New test.
* gcc.target/aarch64/sve/gomp/target-link.c: Likewise.
* gcc.target/aarch64/sve/gomp/target.c: Likewise.
---
 .../gcc.target/aarch64/sve/gomp/gomp.exp  |   46 +
 .../aarch64/sve/gomp/target-device.c  |  201 ++
 .../gcc.target/aarch64/sve/gomp/target-link.c |   57 +
 .../gcc.target/aarch64/sve/gomp/target.c  | 2049 +
 4 files changed, 2353 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/gomp/gomp.exp
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/gomp/target-device.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/gomp/target-link.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/gomp/target.c

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/gomp/gomp.exp b/gcc/testsuite/gcc.target/aarch64/sve/gomp/gomp.exp
new file mode 100644
index 000..376985de1ff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/gomp/gomp.exp
@@ -0,0 +1,46 @@
+# Copyright (C) 2006-2025 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if {![istarget aarch64*-*-*] } then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+if ![check_effective_target_fopenmp] {
+  return
+}
+
+if { [check_effective_target_aarch64_sve] } {
+set sve_flags ""
+} else {
+set sve_flags "-march=armv8.2-a+sve"
+}
+
+# Main loop.
+dg-runtest [lsort [find $srcdir/$subdir *.c]] "$sve_flags -fopenmp" ""
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/gomp/target-device.c b/gcc/testsuite/gcc.target/aarch64/sve/gomp/target-device.c
new file mode 100644
index 000..75dd39bb24a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/gomp/target-device.c
@@ -0,0 +1,201 @@
+/* { dg-do compile } */
+/* { dg-options "-msve-vector-bits=256 -fopenmp -O2" } */
+
+#include 
+
+#define N __ARM_FEATURE_SVE_BITS
+
+int64_t __attribute__ ((noipa))
+target_device_ptr_vla (svbool_t vp, svint32_t *vptr)
+{
+
+  int a[N], b[N], c[N];
+  svint32_t va, vb, vc;
+  int64_t res;
+  int i;
+
+#pragma omp parallel for
+  for (i = 0; i < N; i++)
+{
+  b[i] = i;
+  c[i] = i + 1;
+}
+/* { dg-error {SVE type 'svint32_t \*' not allowed in 'target' device clauses} "" { target *-*-* } .+1 } */
+#pragma omp target data use_device_ptr (vptr) map (to: b, c)
+/* { dg-error {SVE type 'svint32_t \*' not allowed in 'target' device clauses} "" { target *-*-* } .+1 } */
+#pragma omp target is_device_ptr (vptr) map (to: b, c) map (from: res)
+  for (i = 0; i < 8; i++)
+{
+  /* { dg-error "cannot reference 'svint32_t' object types in 'target' region" "" { target *-*-* } .+1 } */
+  vb = *vptr;
+  /* { dg-error "cannot reference 'svint32_t' object types in 'target' region" "" { target *-*-* } .+2 } */
+  /* { dg-error "cannot reference 'svbool_t' object types in 'target' region" "" { target *-*-* } .+1 } */
+  vc = svld1_s32 (vp, c);
+  /* { dg-error "cannot reference 'svint32_t' object types in 'target' region" "" { target *-*-* } .+1 } */
+  va = svadd_s32_z (vp, vb, vc);
+  res = svaddv_s32 (svptrue_b32 (), va);
+}
+
+  return res;
+}
+
+int64_t __attribute__ ((noipa))
+target_device_addr_vla (svbool_t vp, svint32_t *vptr)
+{
+
+  int a[N], b[N], c[N];
+  svint32_t va, vb, vc;
+  int64_t res;
+  int i;
+
+#pragma omp parallel for
+  for (i = 0; i < N; i++)
+{
+  b[i] = i;
+  c[i] = i + 1;
+}
+
+/* { dg-error "SVE type 'svint32_t' not allowed in 'target' device clauses" "" {  target *-*-* } .+1 } */
+#pragma omp target data use_device_addr (vb) map (to: b, c)
+/* { dg-error {SVE type 'svint32_t \*' not allowed in 'target' device clauses} "" { target *-*-* } .+1 } */
+#pragma omp target is_device_ptr (vptr) map (to: b, c) map (from: res)
+  for (i = 0; i < 8; i++)
+{
+  /* { dg-error "cannot reference 'svint32_t' object types in 'target' region" "" { target *-*-* } .+1 } */
+  vb = *vptr;
+  /* { dg-error "cannot reference 'svint32_t' object types in 'ta

nvptx: Support '-mfake-ptx-alloca': defer failure to run-time 'alloca' usage (was: [PUSHED] nvptx: Support '-mfake-ptx-alloca')

2025-04-07 Thread Thomas Schwinge

Hi!

On 2025-02-27T21:51:11+0100, I wrote:
> With '-mfake-ptx-alloca' enabled, the user-visible behavior changes only
> for configurations where PTX 'alloca' is not available.  Rather than a
> compile-time 'sorry, unimplemented: dynamic stack allocation not supported'
> in presence of dynamic stack allocation, compilation and assembly then
> succeeds.  However, attempting to link in such '*.o' files then fails due
> to unresolved symbol '__GCC_nvptx__PTX_alloca_not_supported'.
>
> This is meant to be used in scenarios where large volumes of code are
> compiled, a small fraction of which runs into dynamic stack allocation, but
> these parts are not important for specific use cases, and we'd thus like the
> build to succeed, and error out just upon actual, very rare use of the
> offending '*.o' files.

This can be tuned further; I've pushed to trunk branch
commit 199f1abeef579912b4c40c42519825cedca6530f
"nvptx: Support '-mfake-ptx-alloca': defer failure to run-time 'alloca' usage",
see attached.
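
To illustrate the idea of deferring the failure, here is a host-side analogy
in plain C. It is only a sketch under stated assumptions, not the nvptx
implementation: 'fake_alloca' and 'sum_values' are hypothetical names, and the
message text is borrowed from the run-time diagnostic quoted in the commit
log. The program builds and links; it only aborts if the rare path that needs
dynamic stack space is actually executed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an unsupported 'alloca': it links fine, but
   reports the problem and aborts when it is actually called.  */
static void *
fake_alloca (size_t size)
{
  (void) size;
  fputs ("sorry, unimplemented: dynamic stack allocation not supported\n",
	 stderr);
  abort ();
}

/* Sums N values.  Only the rarely taken WANT_SCRATCH path needs dynamic
   stack space, so only that path fails, and only at run time.  */
static int
sum_values (const int *v, int n, int want_scratch)
{
  int total = 0;

  assert (n >= 0);
  if (want_scratch)
    {
      int *scratch = fake_alloca ((size_t) n * sizeof *scratch);
      scratch[0] = 0;	/* Not reached: fake_alloca does not return.  */
    }
  for (int i = 0; i < n; i++)
    total += v[i];
  return total;
}
```

Callers that never set 'want_scratch' are unaffected, which mirrors the case
of large code bases where only a small fraction of the code runs into dynamic
stack allocation.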


Similar to the implicit FAIL -> UNSUPPORTED due to compile-time detection
of 'sorry, unimplemented: dynamic stack allocation not supported' in
'gcc/testsuite/lib/gcc-dg.exp:gcc-dg-prune', we should also be able to
implement a corresponding thing for this run-time failure, via new
machinery in the replacement 'gcc/testsuite/lib/gcc-dg.exp:${tool}_load',
via scanning and modifying 'result'.  (..., but I'm not working on that
right now.)


Grüße
 Thomas


From 199f1abeef579912b4c40c42519825cedca6530f Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Sun, 6 Apr 2025 17:44:18 +0200
Subject: [PATCH] nvptx: Support '-mfake-ptx-alloca': defer failure to run-time
 'alloca' usage

Follow-up to commit 1146410c0feb0e82c689b1333fdf530a2b34dc2b
"nvptx: Support '-mfake-ptx-alloca'".  '-mfake-ptx-alloca' is applicable only
for configurations where PTX 'alloca' is not supported, where target libraries
are built with it enabled (that is, libstdc++, libgfortran).

This change progresses:

[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++26 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
UNSUPPORTED: g++.dg/tree-ssa/pr20458.C  -std=gnu++98: exception handling not supported

..., and "enables" a few test cases:

FAIL: g++.old-deja/g++.other/sibcall1.C  -std=gnu++17 (test for excess errors)
[Etc.]

FAIL: g++.old-deja/g++.other/unchanging1.C  -std=gnu++17 (test for excess errors)
[Etc.]

..., which now (unrelatedly to 'alloca', and in the same way as configurations
where PTX 'alloca' is supported) FAIL due to:

unresolved symbol _Unwind_DeleteException
collect2: error: ld returned 1 exit status

Most importantly, it progresses ~830 libstdc++ test cases:

[-FAIL:-]{+PASS:+} [...] (test for excess errors)

..., with (if applicable, for most of them):

[-UNRESOLVED:-]{+PASS:+} [...] [-compilation failed to produce executable-]{+execution test+}

..., or just a few 'FAIL: [...] execution test' where these test cases also
FAIL in configurations where PTX 'alloca' is supported, or ~120 instances of
'FAIL: [...]  execution test' due to run-time
'GCC/nvptx: sorry, unimplemented: dynamic stack allocation not supported'.

This change also resolves the cases noted in
commit bac2d8a246892334e24dfa7d62be0cd0648c5606
"nvptx: Build libgfortran with '-mfake-ptx-alloca' [PR107635]":

| With '-mfake-ptx-alloca', libgfortran again succeeds to build, and compared
| to before, we've got only a small number of regressions due to nvptx 'ld'
| complaining about 'unresolved symbol __GCC_nvptx__PTX_alloca_not_supported':
|
| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)

[-FAIL:-]{+PASS:+} gfortran.dg/coarray/codimension_2.f90 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)

| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single [-execution test-]{+compilation failed to produce executable+}

[-FAIL:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} gfortran.dg/coarray/event_4.f08 -fcoarray=lib  -O2  -lcaf_single [-compilation failed to produce executable-]{+execution test+}

| [-PASS:-]{+FAIL:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib  -O2  -lcaf_single (test for excess errors)
| [-PASS:-]{+UNRESOLVED:+} gfortran.dg/coarray/fail_image_2.f08 -fcoarray=lib  -O2  -lcaf_single [-execution test-]{+compilation failed to produce executable+}

[-FAIL:-]{+

[PUSHED] GCN, nvptx libstdc++: Force use of '__atomic' builtins [PR119645]

2025-04-07 Thread Thomas Schwinge

For both GCN and nvptx, this gets rid of 'configure'-time:

configure: WARNING: No native atomic operations are provided for this platform.
configure: WARNING: They will be faked using a mutex.
configure: WARNING: Performance of certain classes will degrade as a result.

..., and changes:

-checking for lock policy for shared_ptr reference counts... mutex
+checking for lock policy for shared_ptr reference counts... atomic

That means, '[...]/[target]/libstdc++-v3/', 'Makefile's change:

-ATOMICITY_SRCDIR = config/cpu/generic/atomicity_mutex
+ATOMICITY_SRCDIR = config/cpu/generic/atomicity_builtins

..., and '[...]/[target]/libstdc++-v3/config.h' changes:

/* Defined if shared_ptr reference counting should use atomic operations. */
-/* #undef HAVE_ATOMIC_LOCK_POLICY */
+#define HAVE_ATOMIC_LOCK_POLICY 1

/* Define if the compiler supports C++11 atomics. */
-/* #undef _GLIBCXX_ATOMIC_BUILTINS */
+#define _GLIBCXX_ATOMIC_BUILTINS 1

..., and '[...]/[target]/libstdc++-v3/include/[target]/bits/c++config.h'
changes:

/* Defined if shared_ptr reference counting should use atomic operations. */
-/* #undef _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY */
+#define _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY 1

/* Define if the compiler supports C++11 atomics. */
-/* #undef _GLIBCXX_ATOMIC_BUILTINS */
+#define _GLIBCXX_ATOMIC_BUILTINS 1

This means that '[...]/[target]/libstdc++-v3/libsupc++/atomicity.cc' (and thus
'[...]/[target]/libstdc++-v3/libsupc++/atomicity.o') then uses atomic
instructions for synchronization instead of C++ static local variables, whose
guard variables, via 'libstdc++-v3/libsupc++/guard.cc', in turn used
'libgcc/gthr.h' recursive mutexes, which currently are unsupported for GCN.
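
As a rough illustration of the two policies, here is a sketch in plain C
rather than the libstdc++ sources. The type 'atomic_word' and both function
names are hypothetical stand-ins; the first variant uses the GCC '__atomic'
builtins directly, the second takes a lock around an ordinary
read-modify-write, here using a POSIX mutex as an assumption about what a
mutex-based fallback looks like.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef int atomic_word;	/* Hypothetical stand-in for a counter type.  */

/* Builtin-based variant: one atomic read-modify-write, no lock needed.
   Returns the value the counter had before the addition.  */
static atomic_word
exchange_and_add_builtin (atomic_word *mem, int val)
{
  assert (mem != NULL);
  return __atomic_fetch_add (mem, val, __ATOMIC_ACQ_REL);
}

/* Mutex-based variant: the same operation, serialized through a lock.  */
static pthread_mutex_t fallback_mutex = PTHREAD_MUTEX_INITIALIZER;

static atomic_word
exchange_and_add_mutex (atomic_word *mem, int val)
{
  atomic_word old;

  assert (mem != NULL);
  pthread_mutex_lock (&fallback_mutex);
  old = *mem;
  *mem += val;
  pthread_mutex_unlock (&fallback_mutex);
  return old;
}
```

Both variants return the previous value and leave the counter incremented;
the builtin one simply does not depend on any mutex support in the target
run-time.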

For GCN, this turns ~500 libstdc++ execution test FAILs into PASSes, and also
progresses:

PASS: g++.dg/tree-ssa/pr20458.C  -std=gnu++17 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++17 execution test
PASS: g++.dg/tree-ssa/pr20458.C  -std=gnu++26 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++26 execution test
UNSUPPORTED: g++.dg/tree-ssa/pr20458.C  -std=gnu++98: exception handling not supported

(For nvptx, there is no effective change, due to other misconfiguration.)

PR target/119645
libstdc++-v3/
* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY) [GCN, nvptx]:
Hard-code results.
* configure: Regenerate.
* configure.host [GCN, nvptx] (atomicity_dir): Set to
'cpu/generic/atomicity_builtins'.
---
 libstdc++-v3/acinclude.m4   |  7 ---
 libstdc++-v3/configure  | 11 ++-
 libstdc++-v3/configure.host | 11 +++
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 02fd349e11df..a0094c2dd95b 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4023,10 +4023,11 @@ AC_DEFUN([GLIBCXX_ENABLE_LOCK_POLICY], [
 dnl Why don't we check 8-byte CAS for sparc64, where _Atomic_word is long?!
 dnl New targets should only check for CAS for the _Atomic_word type.
 AC_TRY_COMPILE([
-#if defined __riscv
+#if defined __AMDGCN__ || defined __nvptx__
+/* Yes, please.  */
+#elif defined __riscv
 # error "Defaulting to mutex-based locks for ABI compatibility"
-#endif
-#if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
+#elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
 # error "No 2-byte compare-and-swap"
 #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
 # error "No 4-byte compare-and-swap"
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index 56d0bcb297ea..819a1d82876a 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -16394,10 +16394,11 @@ ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
-#if defined __riscv
+#if defined __AMDGCN__ || defined __nvptx__
+/* Yes, please.  */
+#elif defined __riscv
 # error "Defaulting to mutex-based locks for ABI compatibility"
-#endif
-#if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
+#elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
 # error "No 2-byte compare-and-swap"
 #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
 # error "No 4-byte compare-and-swap"
@@ -16444,7 +16445,7 @@ $as_echo "mutex" >&6; }
   # unnecessary for this test.
 
 cat > conftest.$ac_ext << EOF
-#line 16447 "configure"
+#line 16448 "configure"
 int main()
 {
   _Decimal32 d1;
@@ -16486,7 +16487,7 @@ ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
   # unnecessary for this test.
 
   cat > conftest.$ac_ext << EOF
-#line 16489 "configure"
+#line 16490 "configure"
 template
   struct same
   { typedef T2 type; };
diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
index 8375764bf4dc..253e5a9ad0db 100644
--- a/libstdc+

Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-04-07 Thread Martin Uecker

Am Montag, dem 07.04.2025 um 14:44 +0200 schrieb Michael Matz:
> Hello,
> 
> On Sat, 5 Apr 2025, Bill Wendling wrote:
> 
> > > > > > So, a different attribute name “counted_by_exp” might be better?
> > > > > 
> > > > > I would prefer Martins empty-decl idea to that: "counted_by(;len+0)"
> > > > > (looks up 'len' normally, i.e. doesn't look into current struct).  It
> > > > > would naturally fit the either decl+expr or lone-ident parse.
> > > > > It may look weird but empty declarations are okayish IMHO.
> > > > > 
> > > > > But overall: I just don't know, it all looks a bit unsexy, there only
> > > > > seem to be rocks and hard places :)
> > > > 
> > > > I would not worry about this case too much, because I do not expect this
> > > > to be a common use case anyway.  That it looks strange may even be
> > > > an advantage here, as it alerts the reader that this is unusual.
> > > 
> > > This is an interesting point and also a good point. -:)
> > > 
> > > The other thought that bothers me a little bit is:
> > > 
> > > For the same attribute, counted_by, is it strange to have two different
> > > lookup rules depending on the number of arguments?
> > > 
> > 
> > Sorry for the HTML. On my phone.
> > 
> > I think adding a ';' isn't the best option. It's too easy to overlook when
> > reading the attribute and forget when writing the attribute.  Using a
> > separate attribute name is much cleaner, IMO. Then again, I've been wrong
> > before. :-)
> 
> So, what specifically would the two attributes do differently?  FWIW: what 
> worries me about accepting a generic expression in counted_by, that isn't 
> prefixed by a (possibly empty) decl, is that after seeing a non-type 
> identifier the parser doesn't yet know if it's the lone-ident case (look 
> up in struct scope) or the expression case (look up everything in global 
> scope).  It requires look-ahead to decide this.
> 
> Would that be the difference between the attributes?  One accepting _only_ 
> a lone-ident or the decl+expr syntax, and the other _only_ expressions 
> that are never looked up in struct-scope (not even if its lone-ident)?

My understanding is that one accepts only a lone identifier and nothing
else, i.e.

counted_by(identifier)

and the other only accepts expressions, possibly including a forward
declaration.

counted_by_expr(expression)
counted_by_expr(decl; expression)


And

counted_by_expr(size_t id; id)

would be equivalent to

counted_by(id)

when the struct has a member "id" with type size_t.

Martin
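
For reference, a small sketch of the lone-identifier form described above.
This is an illustration only: the 'COUNTED_BY' macro, 'struct buf' and
'buf_new' are hypothetical names, and the attribute is applied only when the
compiler reports support for 'counted_by' (otherwise the macro expands to
nothing). The 'counted_by_expr' spelling is the proposal under discussion in
this thread, so it appears only in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Apply counted_by only where the compiler knows the attribute.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct buf
{
  size_t id;			/* Number of elements in data[].  */
  int data[] COUNTED_BY (id);	/* Lone-identifier form: counted_by(id).  */
  /* Proposed equivalent from this thread (not compiled here):
       counted_by_expr(size_t id; id)  */
};

/* Allocates a buffer with N elements and sets the count before any
   element is touched.  Returns NULL on allocation failure.  */
static struct buf *
buf_new (size_t n)
{
  struct buf *b = malloc (sizeof *b + n * sizeof b->data[0]);

  if (b != NULL)
    {
      b->id = n;
      for (size_t i = 0; i < n; i++)
	b->data[i] = (int) i;
    }
  return b;
}
```

The count member is looked up in the enclosing struct, which is the
struct-scope lookup the lone-identifier form refers to.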

Re: GCN, nvptx libstdc++: Force use of '__atomic' builtins [PR119645]

2025-04-07 Thread Jonathan Wakely

On Mon, 7 Apr 2025 at 12:38, Andrew Stubbs  wrote:
> Otherwise LGTM. At least GCN certainly does support atomics, so the
> configure test must be broken somehow.

I'm very curious why they fail.

[PATCH] tree-optimization/119640 - ICE with vectorized shift placement

2025-04-07 Thread Richard Biener

When the whole shift is invariant but the shift amount needs
to be converted and a vector shift is used, we can mess up the placement
of vector stmts because we do not make SLP scheduling aware of
the need to insert code for it.  The following mitigates this
by placing such code more conservatively in vectorizable_shift.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/119640
* tree-vect-stmts.cc (vectorizable_shift): Always insert code
for one of our SLP operands before the code for the vector
shift itself.

* gcc.dg/vect/pr119640.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr119640.c | 17 +
 gcc/tree-vect-stmts.cc   | 11 +++
 2 files changed, 24 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr119640.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr119640.c b/gcc/testsuite/gcc.dg/vect/pr119640.c
new file mode 100644
index 000..8872817ac31
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr119640.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-funswitch-loops" } */
+
+int save, mask_nbits;
+
+void execute(long imm)
+{
+  long shift = 0;
+  int destReg[4];
+  for (unsigned i = 0; i < 4; i++)
+{
+  if (imm)
+   shift = 1ULL << mask_nbits;
+  destReg[i] = shift;
+  save = destReg[0];
+}
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 37a54bf5000..ffca2ab50d5 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -6758,13 +6758,16 @@ vectorizable_shift (vec_info *vinfo,
 {
   if (was_scalar_shift_arg)
{
- /* If the argument was the same in all lanes create
-the correctly typed vector shift amount directly.  */
+ /* If the argument was the same in all lanes create the
+correctly typed vector shift amount directly.  Note
+we made SLP scheduling think we use the original scalars,
+so place the compensation code next to the shift which
+is conservative.  See PR119640 where it otherwise breaks.  */
  op1 = fold_convert (TREE_TYPE (vectype), op1);
  op1 = vect_init_vector (vinfo, stmt_info, op1, TREE_TYPE (vectype),
- !loop_vinfo ? gsi : NULL);
+ gsi);
  vec_oprnd1 = vect_init_vector (vinfo, stmt_info, op1, vectype,
-!loop_vinfo ? gsi : NULL);
+gsi);
  vec_oprnds1.create (slp_node->vec_stmts_size);
  for (k = 0; k < slp_node->vec_stmts_size; k++)
vec_oprnds1.quick_push (vec_oprnd1);
-- 
2.43.0

Re: [PATCH] lto: lto-opts fixes [PR119625]

2025-04-07 Thread Richard Biener

On Fri, 4 Apr 2025, Jakub Jelinek wrote:

> On Fri, Apr 04, 2025 at 08:52:10PM +0200, Richard Biener wrote:
> > > Or do you want something further (like
> > > switch (global_options.x_flag_cf_protection & ~CF_SET)
> > > )?
> > 
> > Dunno what that CF_SET is, we’re supposed to record options like the user 
> > specified so we can merge them.  Why does the backend alter this?
> 
> The option user specified was -fhardened but that for some reason
> isn't present in gcc.lto_.opts at all.
> Also, it is unclear to me if the options that -fhardened sets
> should be marked also as OPTION_SET_P (as if the user specified
> all those options explicitly when specifying -fhardened explicitly)
> or not.
> CCing Marek on that.
> 
> And I admit I have no idea what that CF_SET is.

Maybe HJ can explain?  We do seem to mask it out in most places,
so we probably should for lto-opts as well, also given we have no
way of specifying it.

Richard.

[PATCH] cobol: Fix up update_web_docs_git for COBOL [PR119227]

2025-04-07 Thread Jakub Jelinek

Hi!

As mentioned in the PR, the COBOL documentation is currently not present
in onlinedocs at all.
While the script generates gcobol{,-io}.{pdf,html}, it generates them in
the gcc/gcc/cobol/ subdirectory of the update_web_docs_git temporary
directory and nothing finds them there afterwards; all the processing is done on
for file in */*.html *.ps *.pdf *.tar; do
So, this patch puts gcobol{,-io}.html into the gcobol/ subdirectory and
gcobol{,-io}.pdf into the current directory, so that they are picked up.
With this they make it into onlinedocs:
find . -name \*cobol\*
./onlinedocs/gcobol.pdf.gz
./onlinedocs/gcobol.pdf
./onlinedocs/gcobol_io.pdf.gz
./onlinedocs/gcobol_io.pdf
./onlinedocs/gcobol
./onlinedocs/gcobol/gcobol_io.html.gz
./onlinedocs/gcobol/gcobol_io.html
./onlinedocs/gcobol/gcobol.html.gz
./onlinedocs/gcobol/gcobol.html
./onlinedocs/gnat_rm/gnat_005frm_002finterfacing_005fto_005fother_005flanguages-interfacing-to-cobol.html.gz
./onlinedocs/gnat_rm/gnat_005frm_002finterfacing_005fto_005fother_005flanguages-interfacing-to-cobol.html
./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-f-7-cobol-support.html.gz
./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-f-7-cobol-support.html
./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-b-4-95-98-interfacing-with-cobol.html.gz
./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-b-4-95-98-interfacing-with-cobol.html

So far just tested running the script locally with a few vars tweaked, ok
for trunk?
I'd then test it on sourceware proper...
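
Why the old output locations were missed can be sketched in C with POSIX fnmatch(), which with FNM_PATHNAME follows the same rule as the shell glob in the loop above: a '*' never matches a '/'. This is only an illustration of the matching rule, not part of the script; picked_up is a hypothetical helper name.

```c
#include <fnmatch.h>
#include <stddef.h>

/* The four globs used by the script's "for file in ..." loop. */
static const char *const patterns[] = { "*/*.html", "*.ps", "*.pdf", "*.tar" };

/* Return 1 if PATH (relative to the working directory) matches one of
   the globs, 0 otherwise.  FNM_PATHNAME keeps '*' from crossing '/'.  */
int picked_up(const char *path)
{
  for (size_t i = 0; i < sizeof patterns / sizeof patterns[0]; i++)
    if (fnmatch(patterns[i], path, FNM_PATHNAME) == 0)
      return 1;
  return 0;
}
```

Under this rule gcobol/gcobol.html and gcobol.pdf are matched, while gcc/gcc/cobol/gcobol.html and gcc/gcc/cobol/gcobol.pdf are not, because they sit more than one directory level down.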

2025-04-07  Jakub Jelinek  

PR web/119227
* update_web_docs_git: Rename mdoc2pdf_html to cobol_mdoc2pdf_html,
perform mkdir -p $DOCSDIR/gcobol gcobol, remove $d/ from pdf and in
html replace it with gcobol/; update uses of the renamed function.

--- maintainer-scripts/update_web_docs_git.jj   2025-03-11 09:18:22.169127780 
+0100
+++ maintainer-scripts/update_web_docs_git  2025-04-07 13:22:41.785354269 
+0200
@@ -205,11 +205,12 @@ done
 #
 # The COBOL FE maintains man pages.  Convert them to HTML and PDF.
 #
-mdoc2pdf_html() {
+cobol_mdoc2pdf_html() {
+mkdir -p $DOCSDIR/gcobol gcobol
 input="$1"
 d="${input%/*}"
-pdf="$d/$2"
-html="$d/$3"
+pdf="$2"
+html="gcobol/$3"
 groff -mdoc -T pdf "$input" > "${pdf}~"
 mv "${pdf}~" "${pdf}"
 mandoc -T html "$filename" > "${html}~"
@@ -221,10 +222,10 @@ find . -name gcobol.[13] |
 do
 case ${filename##*.} in
 1)
-mdoc2pdf_html "$filename" gcobol.pdf gcobol.html
+cobol_mdoc2pdf_html "$filename" gcobol.pdf gcobol.html
 ;;
 3)
-mdoc2pdf_html "$filename" gcobol_io.pdf gcobol_io.html
+cobol_mdoc2pdf_html "$filename" gcobol_io.pdf gcobol_io.html
 ;;
 esac
 done

Jakub

Re: [PATCH] lto: lto-opts fixes [PR119625]

2025-04-07 Thread H.J. Lu

On Mon, Apr 7, 2025 at 2:53 AM Richard Biener  wrote:
>
> On Fri, 4 Apr 2025, Jakub Jelinek wrote:
>
> > On Fri, Apr 04, 2025 at 08:52:10PM +0200, Richard Biener wrote:
> > > > Or do you want something further (like
> > > > switch (global_options.x_flag_cf_protection & ~CF_SET)
> > > > )?
> > >
> > > Dunno what that CF_SET is, we’re supposed to record options like the user 
> > > specified so we can merge them.  Why does the backend alter this?
> >
> > The option user specified was -fhardened but that for some reason
> > isn't present in gcc.lto_.opts at all.
> > Also, it is unclear to me if the options that -fhardened sets
> > should be marked also as OPTION_SET_P (as if the user specified
> > all those options explicitly when specifying -fhardened explicitly)
> > or not.
> > CCing Marek on that.
> >
> > And I admit I have no idea what that CF_SET is.
>
> Maybe HJ can explain?  We do seem to mask it out in most places,
> so we probably should for lto-opts as well, also given we have no
> way of specifying it.
>
> Richard.

See:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84248

-- 
H.J.

[Stage1][Middle-end][object-size][PATCH v1] Evaluate the object size by the size of the pointee type

2025-04-07 Thread Qing Zhao

when the type is a structure with flexible array member.

In tree-object-size.cc, if the size is UNKNOWN after evaluating the use-def
chain, we can evaluate the SIZE of the pointee TYPE only when this TYPE
is a structure type with a flexible array member (FAM): a structure with a
FAM cannot be an element of an array, so the pointer must point to a
single object of this structure type.

This is only available for C now.
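
A small C sketch of the kind of query this affects, under stated assumptions: struct packet, COUNTED_BY and packet_object_size are illustrative names, and the function is kept out of line so the object-size pass cannot see where the pointer came from. No particular return value is claimed here; without size information the builtin reports (size_t)-1, and the intent of the patch is that a size derived from the pointee type can be reported in that situation instead.

```c
#include <stddef.h>
#include <stdlib.h>

#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(m) __attribute__((counted_by(m)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(m)
#endif

struct packet {
  size_t len;
  char payload[] COUNTED_BY(len);  /* flexible array member */
};

/* Ask the compiler how large the object behind P is.  P cannot be an
   element of an array of struct packet, because a structure with a
   flexible array member cannot be an array element.  */
#if defined(__GNUC__)
__attribute__((noinline))
#endif
size_t packet_object_size(struct packet *p)
{
#if defined(__has_builtin)
# if __has_builtin(__builtin_dynamic_object_size)
  return __builtin_dynamic_object_size(p, 1);
# else
  return __builtin_object_size(p, 1);
# endif
#else
  (void) p;
  return (size_t)-1;               /* builtin availability unknown */
#endif
}
```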

bootstrapped and regression tested on both x86 and aarch64.

Okay for stage1?

thanks.

Qing

gcc/c/ChangeLog:

* c-lang.cc (LANG_HOOKS_BUILD_COUNTED_BY_REF):
Define to below function.
* c-tree.h (c_build_counted_by_ref): New extern function.
* c-typeck.cc (build_counted_by_ref): Rename to ...
(c_build_counted_by_ref): ...this.
(handle_counted_by_for_component_ref): Call the renamed function.

gcc/ChangeLog:

* langhooks-def.h (LANG_HOOKS_BUILD_COUNTED_BY_REF):
New language hook.
* langhooks.h (struct lang_hooks_for_types): Add
build_counted_by_ref.
* tree-object-size.cc (struct object_size_info): Add a new field
insert_after.
(gimplify_size_expressions): Insert sequence after or before
depending on the new field insert_after.
(compute_builtin_object_size): Init the new field to false;
(record_with_fam_object_size): New function.
(collect_object_sizes_for): Call record_with_fam_object_size.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-3.c: Update test;
* gcc.dg/flex-array-counted-by-4.c: Likewise.
* gcc.dg/flex-array-counted-by-5.c: Likewise.
---
 gcc/c/c-lang.cc   |   3 +
 gcc/c/c-tree.h|   1 +
 gcc/c/c-typeck.cc |   6 +-
 gcc/langhooks-def.h   |   4 +-
 gcc/langhooks.h   |   5 +
 .../gcc.dg/flex-array-counted-by-3.c  |   5 +
 .../gcc.dg/flex-array-counted-by-4.c  |  34 --
 .../gcc.dg/flex-array-counted-by-5.c  |   4 +
 gcc/tree-object-size.cc   | 106 +-
 9 files changed, 152 insertions(+), 16 deletions(-)

diff --git a/gcc/c/c-lang.cc b/gcc/c/c-lang.cc
index c69077b2a93..e9ec9e6e64a 100644
--- a/gcc/c/c-lang.cc
+++ b/gcc/c/c-lang.cc
@@ -51,6 +51,9 @@ enum c_language_kind c_language = clk_c;
 #undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE
 #define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE c_get_sarif_source_language
 
+#undef LANG_HOOKS_BUILD_COUNTED_BY_REF
+#define LANG_HOOKS_BUILD_COUNTED_BY_REF c_build_counted_by_ref
+
 /* Each front end provides its own lang hook initializer.  */
 struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
 
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 743ec5cbae6..66bec5d92fa 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -776,6 +776,7 @@ extern struct c_switch *c_switch_stack;
 
 extern bool null_pointer_constant_p (const_tree);
 
+extern tree c_build_counted_by_ref (tree, tree, tree *);
 
 inline bool
 c_type_variably_modified_p (tree t)
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 19e79b554dc..6efc3fb3e5d 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2936,8 +2936,8 @@ should_suggest_deref_p (tree datum_type)
 &(p->k)
 
 */
-static tree
-build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
+tree
+c_build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
 {
   tree type = TREE_TYPE (datum);
   if (!c_flexible_array_member_type_p (TREE_TYPE (subdatum)))
@@ -3031,7 +3031,7 @@ handle_counted_by_for_component_ref (location_t loc, tree 
ref)
   tree datum = TREE_OPERAND (ref, 0);
   tree subdatum = TREE_OPERAND (ref, 1);
   tree counted_by_type = NULL_TREE;
-  tree counted_by_ref = build_counted_by_ref (datum, subdatum,
+  tree counted_by_ref = c_build_counted_by_ref (datum, subdatum,
  &counted_by_type);
   if (counted_by_ref)
 ref = build_access_with_size_for_counted_by (loc, ref,
diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index 6b34d324ab5..7909ea8a92c 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -221,6 +221,7 @@ extern tree lhd_unit_size_without_reusable_padding (tree);
 #define LANG_HOOKS_TYPE_DWARF_ATTRIBUTElhd_type_dwarf_attribute
 #define LANG_HOOKS_UNIT_SIZE_WITHOUT_REUSABLE_PADDING 
lhd_unit_size_without_reusable_padding
 #define LANG_HOOKS_CLASSTYPE_AS_BASE   hook_tree_const_tree_null
+#define LANG_HOOKS_BUILD_COUNTED_BY_REF NULL
 
 #define LANG_HOOKS_FOR_TYPES_INITIALIZER { \
   LANG_HOOKS_MAKE_TYPE, \
@@ -248,7 +249,8 @@ extern tree lhd_unit_size_without_reusable_padding (tree);
   LANG_HOOKS_GET_FIXED_POINT_TYPE_INFO, \
   LANG_HOOKS_TYPE_DWARF_ATTRIBUTE, \
   LANG_HOOKS_UNIT_SIZE_WITHOUT_REUSABLE_PADDING, \
-  LANG_HOOKS_CLASSTYPE_AS_BASE \
+  LANG_HOOKS_CLASSTYPE_AS_BASE, \
+  LANG_HOOKS_BUILD_COUNTED_BY_REF \
 }
 
 /* Declar

Re: [committed] cobol: Eliminate cobolworx UAT errors when compiling with -Os

2025-04-07 Thread Richard Biener

On Fri, Apr 4, 2025 at 11:50 PM Robert Dubner  wrote:
>
> Anybody who might have gotten interested should stand down.
>
> As usual, that analysis got me thinking.
>
> I got focused on where var_decl_return_code was being used.  (I was wrong.
> I made the mistake because I had just eliminated two sets of errors caused
> by the optimization actually optimizing away things I need, so I had that
> in the front of my brain.)  Richard told me of something odd at the point
> where var_decl_return was being established.  I finally decided to look at
> that.
>
> Turned out, ultimately, to be a SHORT / USHORT mismatch on the variables
> being given to a MODIFY_EXPR.  Apparently the optimization algorithms can
> be extremely cranky about value types.

Yes.  Unless you implement the GET_ALIAS_SET language hook even more
so than when using C.  But I expect COBOL to be a quite restrictive language
with respect to typing - is there something like pointers so a COBOL program
can access the bit-representation of some variable as a different type?

Richard.
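
For readers who, like the original poster, had to look the term up: the C-level question is whether storage of one type is read through an lvalue of an unrelated type. A minimal sketch of a well-defined way to view a value's bit representation as a different type is to copy the bytes. float_bits is an illustrative name, and the sketch assumes a 32-bit float.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Return the object representation of F as a 32-bit integer.
   Copying bytes is well defined; reading F through a uint32_t pointer
   (*(uint32_t *)&f) would instead break the strict aliasing rule, and
   the optimizers are allowed to assume that never happens.  */
uint32_t float_bits(float f)
{
  uint32_t u;
  memcpy(&u, &f, sizeof u);
  return u;
}
```

On targets that use the IEEE 754 single format, float_bits(1.0f) yields 0x3f800000.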

>
> In any event, with that straightened out, everything is working without
> the flag_strict_aliasing modification.
>
> Thanks for asking, and thanks for listening.
>
> > -Original Message-
> > From: Robert Dubner 
> > Sent: Friday, April 4, 2025 16:02
> > To: Sam James 
> > Cc: GCC Patches 
> > Subject: RE: [committed] cobol: Eliminate cobolworx UAT errors when
> > compiling with -Os
> >
> > > -Original Message-
> > > From: Sam James 
> > > Sent: Friday, April 4, 2025 14:28
> > > To: Robert Dubner 
> > > Cc: 'GCC Patches' 
> > > Subject: Re: [committed] cobol: Eliminate cobolworx UAT errors when
> > > compiling with -Os
> > >
> > > Robert Dubner  writes:
> > >
> > > > From e70fe5ed46ab129a8b1da961c47d3fb75b11b988 Mon Sep 17 00:00:00 2001
> > > > From: Bob Dubner mailto:rdub...@symas.com
> > > > Date: Fri, 4 Apr 2025 13:48:58 -0400
> > > > Subject: [PATCH] cobol: Eliminate cobolworx UAT errors when compiling
> > > > with -Os
> > > >
> > > > Testcases compiled with -Os were failing because static functions and
> > > > static variables were being optimized away, because of improper data
> > > > type casts, and because strict aliasing (whatever that is) was
> > > > resulting in some loss of
> > >
> > > Are you unfamiliar with that from C and C++? See
> > > https://gist.github.com/shafik/848ae25ee209f698763cffee272a58f8 (we all
> > > have our favourite documents to explain it) but I don't know if COBOL is
> > > amenable to the concept of TBAA.
> > >
> > > > data.
> > > > These changes eliminate those known problems.
> > >
> > > I'd suggest that this should be accompanied by some question, otherwise
> > > it's going to live there forever and it's not necessarily right (though
> > > see above - if COBOL is incompatible with the idea, it might need
> > > something along those lines, though not sure this is the right way of
> > > doing that).
> > >
> > > That is, while you're free to approve your own COBOL patches, you're
> > > also free to CC others and ask them for advice even if not explicit
> > > approval before pushing them if something doesn't seem to be correct or
> > > is a hack.
> >
> > I am not any kind of compiler expert, except, possibly in one way: I don't
> > trust them.  I have never trusted them.  I don't trust them, and I don't
> > trust the computers they run on.  I regard compilers the same way a
> > lion-tamer regards the big cats they work with every day.  I treat them
> > with firm respect; I expect them to behave the way they've been trained --
> > and I never turn my back on them.  So, no, I don't understand TBAA; I had
> > to look it up.  I don't understand aliasing.
> >
> > The problem at hand is not a COBOL problem.  It is a Bob problem.  Richard
> > pointed me at the original root of the thing.  I believe I have addressed
> > that.  But the problem has not gone away.  The "flag_strict_aliasing = 0;"
> > solution makes the symptoms go away.  I spent a couple of hours messing
> > with this, and I was unable to resolve it.  So I have shrugged and used
> > what I have.
> >
> > You probably have figured out by now that when I need help, it's because
> > 1) I have missed something stupid, 2) I was just plain ignorant, or 3)
> > It's a hard problem.
> >
> > And because it's a hard problem, it's hard to describe.  But you sort of
> > asked, so I will sort of try to explain what's going on.
> >
> > IBM-flavored COBOL has the concept of a RETURN-CODE, a global 16-bit
> > integer that is shared by all PROGRAM-ID modules.  (Each is implemented as
> > a C type function.)
> >
> > In GCOBOL, a COBOL variable is actually a structure with a data area,
> > because COBOL variables have lots of metadata associated with them,
> > including the type of storage, the amount of storage, in many

[committed] libgomp.texi: Add GCN doc for omp_target_memcpy_rect

2025-04-07 Thread Tobias Burnus


omp_target_memcpy_rect uses Nvidia's (CUDA) and AMD's (ROCr/HSA)
features to transfer noncontiguous rectangular data efficiently.

While for CUDA, the wording was already there, it was missing
for HSA. (Both landed in GCC 14, albeit HSA half a year later.)

This patch adds it also for AMD GPUs; for nvptx, I moved the
bullet point down; in the current version, the API call comes
between the stack memory and memory allocation bullets, which
seems to be misplaced.

Additionally, I added a crossref to the two API functions.
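
The documentation text in the patch notes that higher dimensions are handled by calling the 3D copy function in a loop. A host-only C sketch of that idea follows; it is not libgomp's implementation. copy_rect_3d stands in for the runtime's native 3D copy, copy_rect_4d is a hypothetical wrapper, and the parameters (volume, offsets and dimensions counted in elements, row-major layout) are only modelled on those of omp_target_memcpy_rect.

```c
#include <stddef.h>
#include <string.h>

/* Copy a vol[0] x vol[1] x vol[2] block of elements of ELEM bytes
   between two row-major 3-D arrays with dimensions DDIM and SDIM,
   starting at element offsets DOFF and SOFF.  */
static void copy_rect_3d(char *dst, const char *src, size_t elem,
                         const size_t *vol,
                         const size_t *doff, const size_t *soff,
                         const size_t *ddim, const size_t *sdim)
{
  for (size_t i = 0; i < vol[0]; i++)
    for (size_t j = 0; j < vol[1]; j++)
      {
        size_t d = ((doff[0] + i) * ddim[1] + doff[1] + j) * ddim[2] + doff[2];
        size_t s = ((soff[0] + i) * sdim[1] + soff[1] + j) * sdim[2] + soff[2];
        /* The innermost dimension is contiguous, so copy it in one go.  */
        memcpy(dst + d * elem, src + s * elem, vol[2] * elem);
      }
}

/* 4-D case: loop over the outermost dimension and issue one 3-D copy
   per slice.  */
void copy_rect_4d(char *dst, const char *src, size_t elem,
                  const size_t *vol,
                  const size_t *doff, const size_t *soff,
                  const size_t *ddim, const size_t *sdim)
{
  size_t dslice = ddim[1] * ddim[2] * ddim[3] * elem;  /* bytes per slice */
  size_t sslice = sdim[1] * sdim[2] * sdim[3] * elem;
  for (size_t i = 0; i < vol[0]; i++)
    copy_rect_3d(dst + (doff[0] + i) * dslice,
                 src + (soff[0] + i) * sslice,
                 elem, vol + 1, doff + 1, soff + 1, ddim + 1, sdim + 1);
}
```

The same pattern extends to more dimensions by adding further outer loops, which is why only the low-dimensional copy needs native support.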

Tobias
commit 0c63c7524bd523ea82933e90689b63d80e16d67e
Author: Tobias Burnus 
Date:   Mon Apr 7 09:04:53 2025 +0200

libgomp.texi: Add GCN doc for omp_target_memcpy_rect

libgomp/ChangeLog:

* libgomp.texi (omp_target_memcpy_rect_async,
omp_target_memcpy_rect): Add @ref to 'Offload-Target Specifics'.
(AMD Radeon (GCN)): Document how memcpy_rect is implemented.
(nvptx): Move item about memcpy_rect item down; use present tense.
---
 libgomp/libgomp.texi | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 4217c29dd37..fed9d5efb6a 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -2316,7 +2316,7 @@ the initial device.
 @end multitable
 
 @item @emph{See also}:
-@ref{omp_target_memcpy_rect_async}, @ref{omp_target_memcpy}
+@ref{omp_target_memcpy_rect_async}, @ref{omp_target_memcpy}, @ref{Offload-Target Specifics}
 
 @item @emph{Reference}:
 @uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.8.6
@@ -2391,7 +2391,7 @@ the initial device.
 @end multitable
 
 @item @emph{See also}:
-@ref{omp_target_memcpy_rect}, @ref{omp_target_memcpy_async}
+@ref{omp_target_memcpy_rect}, @ref{omp_target_memcpy_async}, @ref{Offload-Target Specifics}
 
 @item @emph{Reference}:
 @uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.8.8
@@ -6911,6 +6911,11 @@ The implementation remark:
   @code{omp_thread_mem_alloc}, all use low-latency memory as first
   preference, and fall back to main graphics memory when the low-latency
   pool is exhausted.
+@item The OpenMP routines @code{omp_target_memcpy_rect} and
+  @code{omp_target_memcpy_rect_async} and the @code{target update}
+  directive for non-contiguous list items use the 3D memory-copy function
+  of the HSA library.  Higher dimensions call this functions in a loop and
+  are therefore supported.
 @item The unique identifier (UID), used with OpenMP's API UID routines, is the
   value returned by the HSA runtime library for @code{HSA_AMD_AGENT_INFO_UUID}.
   For GPUs, it is currently @samp{GPU-} followed by 16 lower-case hex digits,
@@ -7048,11 +7053,6 @@ The implementation remark:
   devices (``host fallback'').
 @item The default per-warp stack size is 128 kiB; see also @code{-msoft-stack}
   in the GCC manual.
-@item The OpenMP routines @code{omp_target_memcpy_rect} and
-  @code{omp_target_memcpy_rect_async} and the @code{target update}
-  directive for non-contiguous list items will use the 2D and 3D
-  memory-copy functions of the CUDA library.  Higher dimensions will
-  call those functions in a loop and are therefore supported.
 @item Low-latency memory (@code{omp_low_lat_mem_space}) is supported when the
   the @code{access} trait is set to @code{cgroup}, and libgomp has
   been built for PTX ISA version 4.1 or higher (such as in GCC's
@@ -7070,6 +7070,11 @@ The implementation remark:
   @code{omp_thread_mem_alloc}, all use low-latency memory as first
   preference, and fall back to main graphics memory when the low-latency
   pool is exhausted.
+@item The OpenMP routines @code{omp_target_memcpy_rect} and
+  @code{omp_target_memcpy_rect_async} and the @code{target update}
+  directive for non-contiguous list items use the 2D and 3D memory-copy
+  functions of the CUDA library.  Higher dimensions call those functions
+  in a loop and are therefore supported.
 @item The unique identifier (UID), used with OpenMP's API UID routines, consists
   of the @samp{GPU-} prefix followed by the 16-bytes UUID as returned by
   the CUDA runtime library.  This UUID is output in grouped lower-case

Re: ping: COBOL: testsuite and running NIST85

2025-04-07 Thread Richard Biener

On Mon, Apr 7, 2025 at 9:00 AM Simon Sobisch  wrote:
>
> My question stands on integrating COBOLworx' UAT as-is for now
> (Copyright is all on FSF; built automatically [it is autoconf, which is
> a requirement for VCS checkouts], possibly also hooked into the current
> test target) - with the goal to get rid of UAT later (next GCC version,
> not GCC 15).
>
> There's also the question about integrating NIST into GCC upstream -
> that is a subfolder and would only be executed upon explicit call by
> maintainers (newcob.val / newcob.val.gz may be either included in VCS or
> even downloaded manually...).

As I repeatedly said I'd welcome a test harness like Ada ACATS for running
the NIST testsuite plus a contrib/download_cobol_nist script that downloads
the NIST file and prepares it for use.  I'd suggest to, similar as with ACATS,
have a separate make target for testing (but still invoked with make check,
when present).

As for UAT, I understand it's work in progress to get that converted to dejagnu?

> With UAT, gcobol would have MUCH more test coverage directly for
> everyone, with NIST developers would have the chance to run "what is not
> disabled" from that testsuite for bigger changes like the
> FLOAT_128/libmath adjustment and when working on a new target.
>
> Both parts are already in the COBOLworx repo and work, can be used
> directly to check for regressions and the move from UAT to dejagnu can
> still be done after the increasing pile of bugs (which, as a COBOL
> programmer I partly find quite severe) and possibly some feature
> requests (especially around huge codegen) are taken care by the "rare
> resources" Bob and Jim.
>
> Concerning NIST: please take care to not get on the same low level like
> COBOL-IT and others, claiming gcobol passes NIST - it doesn't (no
> current compiler does pass all modules, and I think GnuCOBOL is the
> single one that nearly passes everything [and is able to at least parse
> the parts that are disabled - around the COMMUNICATION module which was
> obsolete in COBOL85 and was kind of resurrected by COBOL2023's Message
> Control System [MCS]).
>
> Simon

Re: ping: COBOL: testsuite and running NIST85

2025-04-07 Thread Simon Sobisch





On 07.04.2025 at 09:30, Richard Biener wrote:
> On Mon, Apr 7, 2025 at 9:00 AM Simon Sobisch  wrote:
> >
> > My question stands on integrating COBOLworx' UAT as-is for now
> > (Copyright is all on FSF; built automatically [it is autoconf, which is
> > a requirement for VCS checkouts], possibly also hooked into the current
> > test target) - with the goal to get rid of UAT later (next GCC version,
> > not GCC 15).
> >
> > There's also the question about integrating NIST into GCC upstream -
> > that is a subfolder and would only be executed upon explicit call by
> > maintainers (newcob.val / newcob.val.gz may be either included in VCS or
> > even downloaded manually...).
>
> As I repeatedly said I'd welcome a test harness like Ada ACATS for running
> the NIST testsuite plus a contrib/download_cobol_nist script that downloads
> the NIST file and prepares it for use.  I'd suggest to, similar as with ACATS,
> have a separate make target for testing (but still invoked with make check,
> when present).


Sounds good. As it is a single curl operation that can also be done with 
make in that subfolder, do we still need a separate download script?


I understand that in any case the test harness would check for the 
newcob.val file existing (builddir only is fine, right?), and if it exists, 
then execute a `${MAKE} -C builddir/...test.../nist`.


If Jim doesn't find the time to do this (please respond on this), I can 
prepare a patch (contributing mostly COBOLworx' work for that setup and 
config).



> As for UAT, I understand it's work in progress to get that converted to dejagnu?


It is, but full UAT will take weeks, if not months, as far as I've 
understood Bob. I feel the urge to have his time spent on other things 
than a conversion (which _does_ provide additional benefits like testing 
with more configurations and being better integrated with the rest of GCC's 
tests), as GCC 15 is near and the amount of things to do doesn't get smaller.


But whatever you guys decide will happen, I mostly wanted to raise my 
concern.



> > With UAT, gcobol would have MUCH more test coverage directly for
> > everyone, with NIST developers would have the chance to run "what is not
> > disabled" from that testsuite for bigger changes like the
> > FLOAT_128/libmath adjustment and when working on a new target.
> >
> > Both parts are already in the COBOLworx repo and work, can be used
> > directly to check for regressions and the move from UAT to dejagnu can
> > still be done after the increasing pile of bugs (which, as a COBOL
> > programmer I partly find quite severe) and possibly some feature
> > requests (especially around huge codegen) are taken care by the "rare
> > resources" Bob and Jim.
> >
> > Concerning NIST: please take care to not get on the same low level like
> > COBOL-IT and others, claiming gcobol passes NIST - it doesn't (no
> > current compiler does pass all modules, and I think GnuCOBOL is the
> > single one that nearly passes everything [and is able to at least parse
> > the parts that are disabled - around the COMMUNICATION module which was
> > obsolete in COBOL85 and was kind of resurrected by COBOL2023's Message
> > Control System [MCS]).


Simon

[PATCH v3] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng

In GCC 14, LoongArch added __float128 as an alias for _Float128.
Commit r15-8962 added support for q/Q suffixes for 128-bit floating-point
numbers.  This causes the compiler to automatically link libquadmath
when compiling Fortran programs.  But on LoongArch `long double` is
IEEE quad, so there is no need to implement libquadmath.
This causes a link failure.
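
The property the message relies on, that `long double` already has the IEEE quad (binary128) format, can be checked from C with the <float.h> limits. This is a minimal sketch and not the configure test from the patch; long_double_is_binary128 and target_is_loongarch are illustrative names.

```c
#include <float.h>

/* Return 1 when long double has the IEEE binary128 parameters:
   a 113-bit significand and a maximum exponent of 16384.  */
int long_double_is_binary128(void)
{
  return LDBL_MANT_DIG == 113 && LDBL_MAX_EXP == 16384;
}

/* Return 1 when compiling for LoongArch, using the same predefined
   macro that the patch's configure checks test.  */
int target_is_loongarch(void)
{
#ifdef __loongarch__
  return 1;
#else
  return 0;
#endif
}
```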

PR target/119408

libgfortran/ChangeLog:

* acinclude.m4: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

libquadmath/ChangeLog:

* configure.ac: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

Signed-off-by: Xi Ruoyao 
Signed-off-by: Jakub Jelinek 

---
v1 -> v2:
Corrected typos in commit information.
v2 -> v3:
Regenerate libgfortran/configure using GNU autoconf 2.69.
---
 libgfortran/acinclude.m4 | 4 
 libgfortran/configure| 8 
 libquadmath/configure| 8 
 libquadmath/configure.ac | 4 
 4 files changed, 24 insertions(+)

diff --git a/libgfortran/acinclude.m4 b/libgfortran/acinclude.m4
index a73207e5465..23fd621e518 100644
--- a/libgfortran/acinclude.m4
+++ b/libgfortran/acinclude.m4
@@ -274,6 +274,10 @@ AC_DEFUN([LIBGFOR_CHECK_FLOAT128], [
   AC_CACHE_CHECK([whether we have a usable _Float128 type],
  libgfor_cv_have_float128, [
GCC_TRY_COMPILE_OR_LINK([
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only 
for porting existing code easier.
+#endif
+
 _Float128 foo (_Float128 x)
 {
  _Complex _Float128 z1, z2;
diff --git a/libgfortran/configure b/libgfortran/configure
index 11a1bc5f070..9898a94a372 100755
--- a/libgfortran/configure
+++ b/libgfortran/configure
@@ -30283,6 +30283,10 @@ else
   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only 
for porting existing code easier.
+#endif
+
 _Float128 foo (_Float128 x)
 {
  _Complex _Float128 z1, z2;
@@ -30336,6 +30340,10 @@ fi
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only 
for porting existing code easier.
+#endif
+
 _Float128 foo (_Float128 x)
 {
  _Complex _Float128 z1, z2;
diff --git a/libquadmath/configure b/libquadmath/configure
index 49d70809218..f82dd3d0d6d 100755
--- a/libquadmath/configure
+++ b/libquadmath/configure
@@ -12843,6 +12843,10 @@ else
   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
+#ifdef __loongarch__
+#error  On LoongArch we should use long double instead; __float128 is only 
for porting existing code easier.
+#endif
+
 #if (!defined(_ARCH_PPC)) || defined(__LONG_DOUBLE_IEEE128__)
 typedef _Complex float __attribute__((mode(TC))) __complex128;
 #else
@@ -12894,6 +12898,10 @@ fi
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
+#ifdef __loongarch__
+#error  On LoongArch we should use long double instead; __float128 is only 
for porting existing code easier.
+#endif
+
 #if (!defined(_ARCH_PPC)) || defined(__LONG_DOUBLE_IEEE128__)
 typedef _Complex float __attribute__((mode(TC))) __complex128;
 #else
diff --git a/libquadmath/configure.ac b/libquadmath/configure.ac
index 349be2607c6..c64a8489219 100644
--- a/libquadmath/configure.ac
+++ b/libquadmath/configure.ac
@@ -233,6 +233,10 @@ AM_CONDITIONAL(LIBQUAD_USE_SYMVER_SUN, [test 
"x$quadmath_use_symver" = xsun])
 
 AC_CACHE_CHECK([whether __float128 is supported], [libquad_cv_have_float128],
   [GCC_TRY_COMPILE_OR_LINK([
+#ifdef __loongarch__
+#error  On LoongArch we should use long double instead; __float128 is only 
for porting existing code easier.
+#endif
+
 #if (!defined(_ARCH_PPC)) || defined(__LONG_DOUBLE_IEEE128__)
 typedef _Complex float __attribute__((mode(TC))) __complex128;
 #else
-- 
2.34.1

ping: COBOL: testsuite and running NIST85

2025-04-07 Thread Simon Sobisch

My question stands on integrating COBOLworx' UAT as-is for now 
(Copyright is all on FSF; built automatically [it is autoconf, which is 
a requirement for VCS checkouts], possibly also hooked into the current 
test target) - with the goal to get rid of UAT later (next GCC version, 
not GCC 15).


There's also the question about integrating NIST into GCC upstream - 
that is a subfolder and would only be executed upon explicit call by 
maintainers (newcob.val / newcob.val.gz may be either included in VCS or 
even downloaded manually...).


With UAT, gcobol would have MUCH more test coverage directly for 
everyone, with NIST developers would have the chance to run "what is not 
disabled" from that testsuite for bigger changes like the 
FLOAT_128/libmath adjustment and when working on a new target.


Both parts are already in the COBOLworx repo and work, can be used 
directly to check for regressions and the move from UAT to dejagnu can 
still be done after the increasing pile of bugs (which, as a COBOL 
programmer I partly find quite severe) and possibly some feature 
requests (especially around huge codegen) are taken care by the "rare 
resources" Bob and Jim.


Concerning NIST: please take care to not get on the same low level like 
COBOL-IT and others, claiming gcobol passes NIST - it doesn't (no 
current compiler does pass all modules, and I think GnuCOBOL is the 
single one that nearly passes everything [and is able to at least parse 
the parts that are disabled - around the COMMUNICATION module which was 
obsolete in COBOL85 and was kind of resurrected by COBOL2023's Message 
Control System [MCS]).


Simon

Re: [PATCH] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng




On 2025/4/7 at 3:19 PM, Jakub Jelinek wrote:
> On Mon, Apr 07, 2025 at 03:12:22PM +0800, Lulu Cheng wrote:
>
> The above hunks clearly show that you're regenerating it with some patched
> autoconf or something like that.  Please manually remove those hunks or use
> vanilla upstream autoconf 2.69.  Otherwise the CI will complain.
>
> Otherwise the patch LGTM (though I think Signed-off-by headers belong after
> the ChangeLog entry, not before it).
>
> Jakub


I will fix these in v3.

Thanks!

Re: [PATCH] cobol: Diagnose ignored SECTIONs [PR119632].

2025-04-07 Thread Iain Sandoe



> On 7 Apr 2025, at 01:19, Simon Sobisch  wrote:
> 
> As noted in bug #119632, ignored section segments should be a warning, 
> ideally with an option to raise that to an error, like -Wignored, which 
> should be included with -Wall (note bug #119329) and is fine to be also 
> raised by default (you'd have to disable it explicitly).

Well, I’ll leave those decisions to the COBOL maintainers;
this was a patch I already had in my tree to help with debugging issues.
Attached is an updated version using %qs as suggested by Jakub.
Thanks,
Iain



0001-cobol-Diagnose-ignored-SECTIONs-PR119632.patch
Description: Binary data

Re: combine: Re-enable 2->2 combinations, with limits [PR116398]

2025-04-07 Thread Richard Sandiford

Sam James  writes:
> Richard Sandiford  writes:
>
>> This series is an update of:
>>
>>   https://gcc.gnu.org/pipermail/gcc-patches/2025-April/679924.html
>>
>> As discussed in that thread, the changes since last time are to make
>> distribute_links start from the last use, where easy, and to avoid
>> an unnecessary insn walk for split_i2i3.
>>
>
> FWIW, I've regtested the series on x86_64 so far (will do the other
> targets we support downstream too) and started building
> packages and had no problems, with
> https://inbox.sourceware.org/gcc-patches/mpt7c443wm7@arm.com/ on
> top.

A bootstrap & regression test on powerpc64le-linux-gnu (gcc120)
also passed.

I've now pushed the series and the simplify-rtx prepatch.  Thanks to
everyone for the reviews and the testing.

Richard

Re: ping: COBOL: testsuite and running NIST85

2025-04-07 Thread Simon Sobisch


On 07.04.2025 at 09:36, Jakub Jelinek wrote:

On Mon, Apr 07, 2025 at 09:30:59AM +0200, Richard Biener wrote:

On Mon, Apr 7, 2025 at 9:00 AM Simon Sobisch  wrote:

My question stands on integrating COBOLworx' UAT as-is for now
(Copyright is all on FSF; built automatically [it is autoconf, which is
a requirement for VCS checkouts], possibly also hooked into the current
test target) - with the goal to get rid of UAT later (next GCC version,
not GCC 15).

There's also the question about integrating NIST into GCC upstream -
that is a subfolder and would only be executed upon explicit call by
maintainers (newcob.val / newcob.val.gz may be either included in VCS or
even downloaded manually...).


As I repeatedly said I'd welcome a test harness like Ada ACATS for running
the NIST testsuite plus a contrib/download_cobol_nist script that downloads
the NIST file and prepares it for use.  I'd suggest to, similar as with ACATS,
have a separate make target for testing (but still invoked with make check,
when present).


But it would be much better if the harness for NIST testing was in dejagnu
rather than anything else, only that can handle easily cross-compilations
with target boards, parallelization respecting make job reserve, seamless
result integration.

My understanding has been NIST is a single file from which some tool needs
to dig up individual testcases (tcl string support should be able to deal
with that), figure out what options etc. to pass to each test, and from
somewhere find out the expected output for each test.

Jakub


The source is available as "newcob.val".

From there a COBOL program EXEC85 is extracted, which is then to be 
compiled.
This extracted COBOL program is to be run to extract the requested NIST 
modules for test with the given configuration that it reads from a file.


These modules contain more COBOL sources which are to be compiled and 
run, which produce a test report each.


A final step is then to compare the results with the expectation.
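
A minimal sketch of that last comparison step, in Python for brevity. It assumes each test program's report has already been reduced to pass/fail counts; the function names, the result shape and the program names are all hypothetical and are not what the COBOLworx or GnuCOBOL setups actually use.

```python
import difflib


def summarize(results):
    """Turn {program: (passed, failed)} into sorted, diff-friendly lines.

    The mapping shape is an assumption made for this sketch.
    """
    return [
        f"{program} pass={passed} fail={failed}"
        for program, (passed, failed) in sorted(results.items())
    ]


def compare_with_expectation(expected, actual):
    """Return unified-diff lines; an empty list means no difference."""
    return list(
        difflib.unified_diff(
            summarize(expected),
            summarize(actual),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )
```

An empty result would count as "no regression"; anything else shows which program's counts moved.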

I don't know enough about dejagnu to say if/how it can handle the several 
parts (either it compiles EXEC85 first and then uses that, with its 
configurations, to extract the modules and run those - or you just 
compile and run it once for extraction via make, then run the 
modules multiple times using dejagnu).


The GnuCOBOL driver for NIST is written in perl. It "only" handles the 
part of running the module tests:
- compiles the programs, partially with different options as requested 
  for some modules (like the DB one, which gcc cobol does not test currently),
- executes them,
- partially checks compiler messages, where the compiler's ability to 
  "flag" something is to be tested,
- checks the log files to gather the number of single tests within each 
  program, counting pass/fail (or compile error up front),
- records the execution time,
- and finally creates an "overview" report.log and execution time for each 
  module, which is then compared with plain diff + make.
As I understood it, COBOLworx has put this all into make (and dropped 
part of the things that the gnucobol driver does).


Clone or look at 
https://gitlab.cobolworx.com/COBOLworx/gcc-cobol/-/tree/parser/gcc/cobol/nist 
to find out what COBOLworx did exactly and
https://sourceforge.net/p/gnucobol/code/HEAD/tree/branches/gnucobol-3.x/tests/cobol85/ 
[1] to find out what GC does.


Simon

[1] or 
https://github.com/OCamlPro/gnucobol/tree/gcos4gnucobol-3.x/tests/cobol85, 
if you prefer a git mirror

Re: [PATCH v3] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Jakub Jelinek

On Mon, Apr 07, 2025 at 03:44:52PM +0800, Lulu Cheng wrote:
> In GCC14, LoongArch added __float128 as an alias for _Float128.
> In commit r15-8962, support was added for q/Q suffixes for 128-bit floating
> point numbers.  This will cause the compiler to automatically link libquadmath
> when compiling Fortran programs.  But on LoongArch `long double` is
> IEEE quad, so there is no need to implement libquadmath.
> This causes link failure.
> 
>   PR target/119408
> 
> libgfortran/ChangeLog:
> 
>   * acinclude.m4: When checking for __float128 support, determine
>   whether the current architecture is LoongArch.  If so, return false.
>   * configure: Regenerate.
> 
> libquadmath/ChangeLog:
> 
>   * configure.ac: When checking for __float128 support, determine
>   whether the current architecture is LoongArch.  If so, return false.
>   * configure: Regenerate.
> 
>   Signed-off-by: Xi Ruoyao 
>   Signed-off-by: Jakub Jelinek 
> 
> ---
> v1 -> v2:
>   Corrected typos in commit information.
> v2 -> v3:
>   Regenerate libgfortran/configure using GNU autoconf 2.69.

LGTM.

Jakub

GCN, nvptx libstdc++: Force use of '__atomic' builtins [PR119645] (was: GCN, nvptx: Allow for "hosted" libstdc++ build)

2025-04-07 Thread Thomas Schwinge

Hi!

On 2025-03-14T11:39:20+0100, I wrote:
> As the first of a few patches to enable libstdc++ for GCN, nvptx targets,
> [...]

> some more fine-tuning is to follow later on.)

Any comments before I push the attached
"GCN, nvptx libstdc++: Force use of '__atomic' builtins [PR119645]"?

Jonathan, please put a sharp eye on the
'libstdc++-v3/acinclude.m4:GLIBCXX_ENABLE_LOCK_POLICY' change; to make
sure this only affects GCN, nvptx, but nothing else.


Grüße
 Thomas


From 1d3278050f9560666f6debcd2ead711660bebd4e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Sat, 5 Apr 2025 23:11:23 +0200
Subject: [PATCH] GCN, nvptx libstdc++: Force use of '__atomic' builtins
 [PR119645]

For both GCN, nvptx, this gets rid of 'configure'-time:

configure: WARNING: No native atomic operations are provided for this platform.
configure: WARNING: They will be faked using a mutex.
configure: WARNING: Performance of certain classes will degrade as a result.

..., and changes:

-checking for lock policy for shared_ptr reference counts... mutex
+checking for lock policy for shared_ptr reference counts... atomic

That means, '[...]/[target]/libstdc++-v3/', 'Makefile's change:

-ATOMICITY_SRCDIR = config/cpu/generic/atomicity_mutex
+ATOMICITY_SRCDIR = config/cpu/generic/atomicity_builtins

..., and '[...]/[target]/libstdc++-v3/config.h' changes:

/* Defined if shared_ptr reference counting should use atomic operations. */
-/* #undef HAVE_ATOMIC_LOCK_POLICY */
+#define HAVE_ATOMIC_LOCK_POLICY 1

/* Define if the compiler supports C++11 atomics. */
-/* #undef _GLIBCXX_ATOMIC_BUILTINS */
+#define _GLIBCXX_ATOMIC_BUILTINS 1

..., and '[...]/[target]/libstdc++-v3/include/[target]/bits/c++config.h'
changes:

/* Defined if shared_ptr reference counting should use atomic operations. */
-/* #undef _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY */
+#define _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY 1

/* Define if the compiler supports C++11 atomics. */
-/* #undef _GLIBCXX_ATOMIC_BUILTINS */
+#define _GLIBCXX_ATOMIC_BUILTINS 1

This means that '[...]/[target]/libstdc++-v3/libsupc++/atomicity.cc',
'[...]/[target]/libstdc++-v3/libsupc++/atomicity.o' then uses atomic
instructions for synchronization instead of C++ static local variables, which
in turn for their guard variables, via 'libstdc++-v3/libsupc++/guard.cc', used
'libgcc/gthr.h' recursive mutexes, which currently are unsupported for GCN.

For GCN, this turns ~500 libstdc++ execution test FAILs into PASSes, and also
progresses:

PASS: g++.dg/tree-ssa/pr20458.C  -std=gnu++17 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++17 execution test
PASS: g++.dg/tree-ssa/pr20458.C  -std=gnu++26 (test for excess errors)
[-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C  -std=gnu++26 execution test
UNSUPPORTED: g++.dg/tree-ssa/pr20458.C  -std=gnu++98: exception handling not supported

(For nvptx, there is no effective change, due to other misconfiguration.)

	PR target/119645
	libstdc++-v3/
	* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY) [GCN, nvptx]:
	Hard-code results.
	* configure: Regenerate.
	* configure.host [GCN, nvptx] (atomicity_dir): Set to
	'cpu/generic/atomicity_builtins'.
---
 libstdc++-v3/acinclude.m4   |  7 ---
 libstdc++-v3/configure  | 11 ++-
 libstdc++-v3/configure.host | 11 +++
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 912e85a46679..5b9f0c5ee89d 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4007,10 +4007,11 @@ AC_DEFUN([GLIBCXX_ENABLE_LOCK_POLICY], [
 dnl Why don't we check 8-byte CAS for sparc64, where _Atomic_word is long?!
 dnl New targets should only check for CAS for the _Atomic_word type.
 AC_TRY_COMPILE([
-#if defined __riscv
+#if defined __AMDGCN__ || defined __nvptx__
+/* Yes, please.  */
+#elif defined __riscv
 # error "Defaulting to mutex-based locks for ABI compatibility"
-#endif
-#if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
+#elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
 # error "No 2-byte compare-and-swap"
 #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
 # error "No 4-byte compare-and-swap"
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index 0b4481f0739c..2b2eedeb2e71 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -16393,10 +16393,11 @@ ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
-#if defined __riscv
+#if defined __AMDGCN__ || defined __nvptx__
+/* Yes, please.  */
+#elif defined __riscv
 # error "Defaulting to mutex-based locks for ABI compatibility"
-#endif
-#if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
+#elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
 # error "No 2-byte compare-and-swap"
 #elif ! define
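
To make the reordered preprocessor chain easier to follow, here is a toy model in Python (not part of the patch). It covers only the branches visible in the hunk above, and it assumes that a compile failure of this snippet selects the "mutex" policy and success selects "atomic", as the configure output quoted in the commit message suggests. The function name is made up for the sketch.

```python
CAS_2 = "__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2"
CAS_4 = "__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4"


def lock_policy(defined_macros):
    """Model the patched #if/#elif chain for a set of predefined macros."""
    macros = set(defined_macros)
    if "__AMDGCN__" in macros or "__nvptx__" in macros:
        return "atomic"  # new first branch: no #error, compile succeeds
    if "__riscv" in macros:
        return "mutex"   # #error "Defaulting to mutex-based locks ..."
    if CAS_2 not in macros:
        return "mutex"   # #error "No 2-byte compare-and-swap"
    if CAS_4 not in macros:
        return "mutex"   # #error "No 4-byte compare-and-swap"
    return "atomic"
```

Because the GCN/nvptx test comes first and everything else hangs off `#elif`, the remaining targets go through the same checks as before.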

Re: ping: COBOL: testsuite and running NIST85

2025-04-07 Thread Richard Biener

On Mon, Apr 7, 2025 at 9:41 AM Simon Sobisch  wrote:
>
>
>
> On 07.04.2025 at 09:30, Richard Biener wrote:
> > On Mon, Apr 7, 2025 at 9:00 AM Simon Sobisch  wrote:
> >>
> >> My question stands on integrating COBOLworx' UAT as-is for now
> >> (Copyright is all on FSF; built automatically [it is autoconf, which is
> >> a requirement for VCS checkouts], possibly also hooked into the current
> >> test target) - with the goal to get rid of UAT later (next GCC version,
> >> not GCC 15).
> >>
> >> There's also the question about integrating NIST into GCC upstream -
> >> that is a subfolder and would only be executed upon explicit call by
> >> maintainers (newcob.val / newcob.val.gz may be either included in VCS or
> >> even downloaded manually...).
> >
> > As I repeatedly said I'd welcome a test harness like Ada ACATS for running
> > the NIST testsuite plus a contrib/download_cobol_nist script that downloads
> > the NIST file and prepares it for use.  I'd suggest to, similar as with ACATS,
> > have a separate make target for testing (but still invoked with make check,
> > when present).
>
> Sounds good. As it is a single curl operation that can also be done with
> make in that subfolder, do we still need a separate download script?

I think that would be better, yes.

> I understand that in any case the test harness would check for the
> newcob.val file existing (builddir only is fine, right?), and if it exists,
> then execute a `${MAKE} -C builddir/...test.../nist`.

Yes.  I agree that the actual testing should ideally be done via dejagnu,
but OTOH all the extracting part can be done on the host and thus in
a script/Makefile.
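
A small sketch of that "run only when newcob.val is present" check, written in Python purely for illustration. The function name, the `nist_subdir` parameter and the assumption that newcob.val sits inside that subdirectory of the build directory are hypothetical; the thread does not fix the actual location.

```python
import os


def nist_make_command(builddir, nist_subdir, make="make"):
    """Return the make argv to run the NIST tests, or None to skip them.

    Assumes (for this sketch only) that newcob.val is looked up inside
    builddir/nist_subdir.
    """
    nist_dir = os.path.join(builddir, nist_subdir)
    if not os.path.isfile(os.path.join(nist_dir, "newcob.val")):
        return None  # file not downloaded: skip quietly
    return [make, "-C", nist_dir]
```

Returning None rather than failing matches the idea that the tests are invoked with make check only when the file is present.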

> If Jim doesn't find the time to do this (please respond on this), I can
> prepare a patch (contributing mostly COBOLworx' work for that setup and
> config).
>
> > As for UAT, I understand it's work in progress to get that converted to dejagnu?
>
> It is, but full UAT will take weeks, if not months, as far as I've
> understood Bob. I feel the urge to have his time spent on other things
> than a conversion (which _does_ provide additional benefits like testing
> with more configurations and being better integrated with the rest of GCC's
> tests), as GCC15 is near and the number of things to do isn't getting smaller.
>
> But whatever you guys decide will happen, I mostly wanted to raise my
> concern.

Note the set of Cobol tests in GCC 15 does not need to stay fixed, tests
can be added there even after the GCC 15.1 release.  Mind GCC 15 will
live for quite some time.

Richard.

> >> With UAT, gcobol would have MUCH more test coverage directly for
> >> everyone; with NIST, developers would have the chance to run "what is not
> >> disabled" from that testsuite for bigger changes like the
> >> FLOAT_128/libmath adjustment and when working on a new target.
> >>
> >> Both parts are already in the COBOLworx repo and work, and can be used
> >> directly to check for regressions. The move from UAT to dejagnu can
> >> still be done after the increasing pile of bugs (some of which, as a COBOL
> >> programmer, I find quite severe) and possibly some feature
> >> requests (especially around huge codegen) are taken care of by the "rare
> >> resources" Bob and Jim.
> >>
> >> Concerning NIST: please take care not to sink to the same low level as
> >> COBOL-IT and others by claiming gcobol passes NIST - it doesn't (no
> >> current compiler passes all modules, and I think GnuCOBOL is the
> >> only one that nearly passes everything [and is able to at least parse
> >> the parts that are disabled - around the COMMUNICATION module, which was
> >> obsolete in COBOL85 and was kind of resurrected by COBOL2023's Message
> >> Control System (MCS)]).
>
> Simon
>
>

RE: [PATCH v2] aarch64, Darwin: Initial implementation of Apple cores [PR113257].

2025-04-07 Thread Tamar Christina

> -----Original Message-----
> From: Kyrylo Tkachov 
> Sent: Monday, March 31, 2025 1:43 PM
> To: i...@sandoe.co.uk
> Cc: Tamar Christina ; GCC Patches  patc...@gcc.gnu.org>; Alice Carlotti ; Richard 
> Sandiford
> ; s...@gentoo.org
> Subject: Re: [PATCH v2] aarch64, Darwin: Initial implementation of Apple cores
> [PR113257].
> 
> Hi Iain,
> 
> > On 22 Mar 2025, at 15:31, Iain Sandoe  wrote:
> >
> > 0. Sorry this has taken some time to close off; partly because of waiting
> >   for input, but mostly that I've been stretched with other work.
> > 1. As per the commit message, the apparent non-conformance with 8.5/6
> >   because FEAT_SPECRES returns 0, is a result of the query operating
> >   at user priv.  The cores are confirmed to support this for priv.
> >   code.
> > 2. I added entries for the apple-m1,2,3 cores in invoke.texi.
> > 3. Following Andrew's suggestion and with some measurements by Tamar
> >   and me, figured out the LITTLE.big chip ids (at least for a sub-
> >   set).
> >
> > This has been in use for a while on aarch64-darwin branches and I've
> > checked manually that it gives the right .arch lines on cfarm185.
> >
> > OK for trunk? (if so, when?)
> > thanks
> > Iain
> >
> > --- 8< ---
> >
> > After discussion with the open source support team at Apple, we have
> > established that the cores conform to the 8.5 and 8.6 requirements.
> > One of the mandatory features (FEAT_SPECRES) is not exposed (or
> > available) in user-space code but is supported for privileged code.
> >
> > The values for chip IDs and the LITTLE.big variants have been taken
> > from lists in the XNU and LLVM sources.
> >
> > PR target/113257
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add Apple-a12,
> > Apple-M1, Apple-M2, Apple-M3 with expanded names to allow for the
> > LITTLE.big versions.
> > * config/aarch64/aarch64-tune.md: Regenerate.
> > * doc/invoke.texi: Add apple-m1,2 and 3 cores to the ones listed
> > for arch and tune selections.
> >
> > Signed-off-by: Iain Sandoe 
> > ---
> > gcc/config/aarch64/aarch64-cores.def | 16 
> > gcc/config/aarch64/aarch64-tune.md   |  2 +-
> > gcc/doc/invoke.texi  |  5 +++--
> > 3 files changed, 20 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
> > index 0e22d72976e..7f204fd0ac9 100644
> > --- a/gcc/config/aarch64/aarch64-cores.def
> > +++ b/gcc/config/aarch64/aarch64-cores.def
> > @@ -173,6 +173,22 @@ AARCH64_CORE("cortex-a76.cortex-a55", cortexa76cortexa55, cortexa53, V8_2A,  (F
> > AARCH64_CORE("cortex-r82", cortexr82, cortexa53, V8R, (), cortexa53, 0x41, 0xd15, -1)
> > AARCH64_CORE("cortex-r82ae", cortexr82ae, cortexa53, V8R, (), cortexa53, 0x41, 0xd14, -1)
> >
> > +/* Apple (A12 and M) cores.
> > +   Known part numbers as listed in other public sources.
> > +   Placeholders for schedulers, generic_armv8_a for costs.
> > +   A12 seems mostly 8.3, M1 is 8.5 without BTI, M2 and M3 are 8.6
> > +   From measurements made so far the odd-number core IDs are performance. */
> > +AARCH64_CORE("apple-a12", applea12, cortexa53, V8_3A,  (), generic_armv8_a, 0x61, 0x12, -1)
> > +AARCH64_CORE("apple-m1", applem1_0, cortexa57, V8_5A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x21, 0x20), -1)
> > +AARCH64_CORE("apple-m1", applem1_1, cortexa57, V8_5A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x23, 0x22), -1)
> > +AARCH64_CORE("apple-m1", applem1_2, cortexa57, V8_5A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x25, 0x24), -1)
> > +AARCH64_CORE("apple-m1", applem1_3, cortexa57, V8_5A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x29, 0x28), -1)
> > +AARCH64_CORE("apple-m2", applem2_0, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x31, 0x30), -1)
> > +AARCH64_CORE("apple-m2", applem2_1, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x33, 0x32), -1)
> > +AARCH64_CORE("apple-m2", applem2_2, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x35, 0x34), -1)
> > +AARCH64_CORE("apple-m2", applem2_3, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x39, 0x38), -1)
> > +AARCH64_CORE("apple-m3", applem3_0, cortexa57, V8_6A,  (), generic_armv8_a, 0x61, AARCH64_BIG_LITTLE (0x49, 0x48), -1)
> 
> I don’t think we have precedent of different MIDR part numbers resolving to the
> same -mcpu string, but I think it should all work as expected.

Indeed, I think for the current usage it should work fine.
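
As an illustration of "different part numbers resolving to the same -mcpu string", the sketch below puts the big/little part pairs from the quoted AARCH64_CORE entries into a small lookup table. It is written in Python only to keep it short; the function name and the lookup scheme are assumptions and do not describe how GCC's driver actually matches cores.

```python
APPLE_IMPLEMENTER = 0x61

# (name, big part, little part), copied from the quoted AARCH64_CORE entries.
APPLE_BIG_LITTLE = [
    ("apple-m1", 0x21, 0x20),
    ("apple-m1", 0x23, 0x22),
    ("apple-m1", 0x25, 0x24),
    ("apple-m1", 0x29, 0x28),
    ("apple-m2", 0x31, 0x30),
    ("apple-m2", 0x33, 0x32),
    ("apple-m2", 0x35, 0x34),
    ("apple-m2", 0x39, 0x38),
    ("apple-m3", 0x49, 0x48),
]


def mcpu_for(implementer, big, little):
    """Return the -mcpu name for a (big, little) part pair, or None."""
    if implementer != APPLE_IMPLEMENTER:
        return None
    for name, big_part, little_part in APPLE_BIG_LITTLE:
        if (big_part, little_part) == (big, little):
            return name
    return None
```

Four distinct pairs map to "apple-m1" and four to "apple-m2", which is the many-to-one situation discussed above; in every pair the odd part number is the big (performance) core, matching the comment in the patch.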

> As long as you and Tamar are happy with the feature set here no objections from
> me.

FWIW no objections from me.  This should unblock folks 😊

Thanks,
Tamar

> Looks ok to me for GCC 15 with a documentation comment below…
> 
> > +
> > /* Armv9.0-A Architecture Processors.  */
> >
> > /* Arm ('A') cores. */
> > diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
> > index 56a914f12b9..982074c2c21

[PATCH] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng

In GCC14, LoongArch added __float128 as an alias for _Float128.
In commit r15-8962, support was added for q/Q suffixes for 128-bit floating
point numbers.  This will cause the compiler to automatically link libquadmath
when compiling Fortran programs.  But on LoongArch `long double` is
IEEE quad, so there is no need to implement libquadmath.
This causes link failure.

Signed-off-by: Xi Ruoyao <1...@xry111.site>
Signed-off-by: Jakub Jelinek 

PR target/119408

libgfortran/ChangeLog:

* acinclude.m4: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

libquadmath/ChangeLog:

* configure.ac: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

---
 libgfortran/acinclude.m4 |  4 
 libgfortran/configure| 18 +-
 libquadmath/configure|  8 
 libquadmath/configure.ac |  4 
 4 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/libgfortran/acinclude.m4 b/libgfortran/acinclude.m4
index a73207e5465..23fd621e518 100644
--- a/libgfortran/acinclude.m4
+++ b/libgfortran/acinclude.m4
@@ -274,6 +274,10 @@ AC_DEFUN([LIBGFOR_CHECK_FLOAT128], [
   AC_CACHE_CHECK([whether we have a usable _Float128 type],
  libgfor_cv_have_float128, [
GCC_TRY_COMPILE_OR_LINK([
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only for porting existing code easier.
+#endif
+
 _Float128 foo (_Float128 x)
 {
  _Complex _Float128 z1, z2;
diff --git a/libgfortran/configure b/libgfortran/configure
index 11a1bc5f070..6ee56839968 100755
--- a/libgfortran/configure
+++ b/libgfortran/configure
@@ -16413,7 +16413,7 @@ else
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -16459,7 +16459,7 @@ else
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -16483,7 +16483,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -16528,7 +16528,7 @@ else
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -16552,7 +16552,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -30283,6 +30283,10 @@ else
   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only for porting existing code easier.
+#endif
+
 _Float128 foo (_Float128 x)
 {
  _Complex _Float128 z1, z2;
@@ -30336,6 +30340,10 @@ fi
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only for porting existing code easier.
+#endif
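
As a quick sanity check on the LARGE_OFF_T hunks in the configure diff above: the removed and the added definitions compute the same value, which fits the observation in the review that those hunks only reflect a different autoconf version. The sketch below redoes the arithmetic in Python, whose integers are unbounded, so it says nothing about C overflow behaviour; the function names are made up for the example.

```python
def large_off_t_removed():
    # (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
    return (1 << 62) - 1 + (1 << 62)


def large_off_t_added():
    # ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
    return ((1 << 31) << 31) - 1 + ((1 << 31) << 31)


def off_t_is_large(value):
    """The two remainder checks used in the off_t_is_large array size."""
    return value % 2147483629 == 721 and value % 2147483647 == 1
```

Both functions return 2**63 - 1, that is 9223372036854775807, and that value passes both remainder checks.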

Re: [PATCH] sra: Avoid creating TBAA hazards (PR118924)

2025-04-07 Thread Martin Jambor

Hi,

On Tue, Apr 01 2025, Richard Biener wrote:
> On Mon, 31 Mar 2025, Martin Jambor wrote:
>
>> Hi,
>> 
>> the testcase in PR 118924, when compiled on Aarch64, contains a
>> gimple aggregate assignment statement between different types which
>> are types_compatible_p but behave differently for the purposes of
>> alias analysis.
>> 
>> SRA replaces the statement with a series of scalar assignments whose
>> LHS access chains, however, are modeled on the RHS type and so do not
>> alias with subsequent reads and so are DSEd.
>> 
[...]
>> 
>> The issue here is that even when the access path is the same, it must
>> not be bolted on an aggregate type that does not match.  This patch
>> does that, taking just one simple function from the
>> ao_compare::compare_ao_refs machinery and using it to detect the
>> situation.  The rest is just merging the information in between
>> accesses of the same access group.
>> 
>
> I'm unsure about the lto_streaming_safe arg, in fact the implementation is
>
>   if (lto_streaming_safe)
> return type1 == type2;
>   else
> return TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2);
>
> but I think we guarantee (we _need_ to guarantee!) that when
> TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2), after LTO streaming
> it's still TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2).  Otherwise
> assignments previously valid GIMPLE might no longer be valid and
> things that aliased previously might no longer alias.
>
> That's an implementation bug in types_equal_for_same_type_for_tbaa_p,
> but I think you should be able to pass in false as if the implementation
> were fixed.  IMO if necessary the implementation itself should
> use something like if (flag_lto && !in_lto_p) rather than leaving it up to
> the caller ... the only existing caller uses
> lto_streaming_expected_p () as arg, which is similar to my proposal.
>
> I'd say you want to export a forwarder
>
> bool
> types_equal_for_same_type_for_tbaa_p (tree type1, tree type2)
> {
>   return types_equal_for_same_type_for_tbaa_p (type1, type2, lto_streaming_expected_p ());
> }
>
> instead as it should be an internal detail.
>
> OK with that change.
>
> Can you fixup the comment
>
> /* Return ture if TYPE1 and TYPE2 will always give the same answer
>when compared wit hother types using same_type_for_tbaa_p.  */
>
> when you are there?  The function is called same_type_for_tbaa
> and 'wit hother' should be 'with other'

Thanks, I am about to commit the following then (I did another way of
bootstrapping and testing on x86_64-linux and Aarch64-linux).

Martin
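
For readers who do not have the GCC sources at hand, the predicate and the forwarder discussed in the review can be modelled roughly as follows. This is a toy in Python, not GCC code: the `Type` class, the module-level flag and the name `types_equal_impl` are inventions for the sketch, and only the two comparisons quoted in the review are modelled.

```python
class Type:
    """Toy stand-in for a GCC type node; only the canonical link is modelled."""

    def __init__(self, name, canonical=None):
        self.name = name
        # A type that is its own canonical type points at itself.
        self.canonical = canonical if canonical is not None else self


# Toy replacement for the state behind lto_streaming_expected_p ().
LTO_STREAMING_EXPECTED = False


def lto_streaming_expected_p():
    return LTO_STREAMING_EXPECTED


def types_equal_impl(type1, type2, lto_streaming_safe):
    """Three-argument variant, following the logic quoted in the review."""
    if lto_streaming_safe:
        return type1 is type2
    return type1.canonical is type2.canonical


def types_equal_for_same_type_for_tbaa_p(type1, type2):
    """Forwarder: callers no longer choose the LTO flag themselves."""
    return types_equal_impl(type1, type2, lto_streaming_expected_p())
```

The point of the forwarder is that the LTO decision stays an internal detail, so a caller such as SRA only passes the two types.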


gcc/ChangeLog:

2025-04-04  Martin Jambor  

PR tree-optimization/118924
* tree-ssa-alias-compare.h (types_equal_for_same_type_for_tbaa_p):
Declare.
* tree-ssa-alias.cc: Include ipa-utils.h.
(types_equal_for_same_type_for_tbaa_p): New public overloaded variant.
* tree-sra.cc: Include tree-ssa-alias-compare.h.
(create_access): Initialize grp_same_access_path to true.
(build_accesses_from_assign): Detect tbaa hazards and clear
grp_same_access_path fields of involved accesses when they occur.
(sort_and_splice_var_accesses): Take previous values of
grp_same_access_path into account.

gcc/testsuite/ChangeLog:

2025-03-25  Martin Jambor  

PR tree-optimization/118924
* g++.dg/tree-ssa/pr118924.C: New test.
---
 gcc/testsuite/g++.dg/tree-ssa/pr118924.C | 29 
 gcc/tree-sra.cc  | 17 +++---
 gcc/tree-ssa-alias-compare.h |  2 ++
 gcc/tree-ssa-alias.cc| 13 ++-
 4 files changed, 57 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr118924.C

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr118924.C b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
new file mode 100644
index 00000000000..c95eacafc9c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr118924.C
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-std=c++17 -O2" } */
+
+template <int Size> struct Vector {
+  int m_data[Size];
+  Vector(int, int, int) {}
+};
+enum class E { POINTS, LINES, TRIANGLES };
+
+__attribute__((noipa))
+void getName(E type) {
+  static E check = E::POINTS;
+  if (type == check)
+check = (E)((int)check + 1);
+  else
+__builtin_abort ();
+}
+
+int main() {
+  int arr[]{0, 1, 2};
+  for (auto dim : arr) {
+Vector<3> localInvs(1, 1, 1);
+localInvs.m_data[dim] = 8;
+  }
+  E types[] = {E::POINTS, E::LINES, E::TRIANGLES};
+  for (auto primType : types)
+getName(primType);
+  return 0;
+}
diff --git a/gcc/tree-sra.cc b/gcc/tree-sra.cc
index c26559edc66..ae7cd57a5f2 100644
--- a/gcc/tree-sra.cc
+++ b/gcc/tree-sra.cc
@@ -100,6 +100,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "tree-sra.h"
 #include "opts.h"
+#include "tree-ssa-alias-compare.h"
 
 /* Enumeration of all aggregate reductions we can do.  */
 enum sra_mode { SRA_MO

Re: [PATCH] tailc: Extend the IPA-VRP workaround [PR119614]

2025-04-07 Thread Jakub Jelinek

On Mon, Apr 07, 2025 at 11:43:27AM +0200, Richard Biener wrote:
> I do wonder with all these patches whether it would be better to
> preserve the LHS on musttail calls instead?

It can't be instead, because without musttail all those tests regress
as well (before IPA-VRP they were successfully tail called, now they
aren't); it can perhaps be in addition.  Though, I'm really worried
about how many places optimize a singleton range into just the constant;
clearly it isn't just the vrp1/vrp2 passes, but also dom and also something
already during the IPA passes.

Jakub

Re: [PATCH] testsuite: arm: Tighten compile options for short-vfp-1.c [PR119556]

2025-04-07 Thread Richard Earnshaw


On 06/04/2025 19:49, Christophe Lyon wrote:

The previous version of this test required arch v6+ (for sxth), and
the number of vmov depended on the floating-point ABI (where softfp
needed more of them to transfer floating-point values to and from
general registers).

With this patch we require arch v7-a, vfp FPU and -mfloat-abi=hard, we
also use -O2 to clean the generated code and convert
scan-assembler-times directives into check-function-bodies.

Tested on arm-none-linux-gnueabihf and several flavours of
arm-none-eabi.

gcc/testsuite/ChangeLog:

PR target/119556
* gcc.target/arm/short-vfp-1.c: Improve dg directives.


OK.

R.


---
  gcc/testsuite/gcc.target/arm/short-vfp-1.c | 46 ++
  1 file changed, 38 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/short-vfp-1.c b/gcc/testsuite/gcc.target/arm/short-vfp-1.c
index f6866c4f601..418fc279af0 100644
--- a/gcc/testsuite/gcc.target/arm/short-vfp-1.c
+++ b/gcc/testsuite/gcc.target/arm/short-vfp-1.c
@@ -1,45 +1,75 @@
  /* { dg-do compile } */
-/* { dg-require-effective-target arm_vfp_ok } */
-/* { dg-add-options arm_vfp } */
+/* { dg-require-effective-target arm_arch_v7a_fp_hard_ok } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_arch_v7a_fp_hard } */
+/* { dg-final { check-function-bodies "**" "" } } */
  
+/*
+** test_sisf:
+**	vcvt.s32.f32	(s[0-9]+), s0
+**	vmov	r0, \1	@ int
+**	bx	lr
+*/
  int
  test_sisf (float x)
  {
return (int)x;
  }
  
+/*
+** test_hisf:
+**	vcvt.s32.f32	(s[0-9]+), s0
+**	vmov	(r[0-9]+), \1	@ int
+**	sxth	r0, \2
+**	bx	lr
+*/
  short
  test_hisf (float x)
  {
return (short)x;
  }
  
+/*
+** test_sfsi:
+**	vmov	(s[0-9]+), r0	@ int
+**	vcvt.f32.s32	s0, \1
+**	bx	lr
+*/
  float
  test_sfsi (int x)
  {
return (float)x;
  }
  
+/*
+** test_sfhi:
+**	vmov	(s[0-9]+), r0	@ int
+**	vcvt.f32.s32	s0, \1
+**	bx	lr
+*/
  float
  test_sfhi (short x)
  {
return (float)x;
  }
  
+/*
+** test_hisi:
+**	sxth	r0, r0
+**	bx	lr
+*/
  short
  test_hisi (int x)
  {
return (short)x;
  }
  
+/*
+** test_sihi:
+**	bx	lr
+*/
  int
  test_sihi (short x)
  {
return (int)x;
  }
-
-/* { dg-final { scan-assembler-times {vcvt\.s32\.f32\ts[0-9]+, s[0-9]+} 2 } } 
*/
-/* { dg-final { scan-assembler-times {vcvt\.f32\.s32\ts[0-9]+, s[0-9]+} 2 } } 
*/
-/* { dg-final { scan-assembler-times {vmov\tr[0-9]+, s[0-9]+} 2 } } */
-/* { dg-final { scan-assembler-times {vmov\ts[0-9]+, r[0-9]+} 2 } } */
-/* { dg-final { scan-assembler-times {sxth\tr[0-9]+, r[0-9]+} 2 } } */
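
As a rough illustration of why test_hisi is expected to consist of a single
sxth, the narrowing conversion from int to short can be modelled as
sign-extending the low 16 bits.  The helper names sxth_model and
narrow_to_short are hypothetical (they are not part of the patch), and the
sketch assumes a 16-bit short, a 32-bit int, and GCC's documented modulo
behaviour for out-of-range conversions to signed types:

```c
#include <stdint.h>

/* Hypothetical model of the ARM sxth instruction: keep the low 16 bits
   of X and sign-extend them to 32 bits.  */
int32_t
sxth_model (int32_t x)
{
  uint32_t low = (uint32_t) x & 0xFFFFu;

  /* Bit 15 set means the 16-bit value is negative.  */
  return (low & 0x8000u) ? (int32_t) low - 0x10000 : (int32_t) low;
}

/* The same conversion that test_hisi exercises in the patch.  The result
   for out-of-range values is implementation-defined in ISO C; GCC reduces
   the value modulo 2^16.  */
short
narrow_to_short (int x)
{
  return (short) x;
}
```

Under those assumptions the two functions agree for every int input, which
is consistent with the expected body containing only the sxth and the
return.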

[PATCH v5 2/5] Add function to strip pointer type and get down to the actual pointee type.

2025-04-07 Thread Tejas Belagod

Add a function to traverse down the pointer layers to the pointee type.

gcc/ChangeLog:
* tree.h (strip_pointer_types): New.
---
 gcc/tree.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/gcc/tree.h b/gcc/tree.h
index 55f97f9f999..99f26177628 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5053,6 +5053,17 @@ strip_array_types (tree type)
   return type;
 }
 
+/* Recursively traverse down pointer type layers to pointee type.  */
+
+inline const_tree
+strip_pointer_types (const_tree type)
+{
+  while (POINTER_TYPE_P (type))
+    type = TREE_TYPE (type);
+
+  return type;
+}
+
 /* Desription of the reason why the argument of valid_constant_size_p
is not a valid size.  */
 enum cst_size_error {
-- 
2.25.1
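
The loop in strip_pointer_types can be tried outside of GCC with a toy type
representation.  This is only a sketch: struct toy_type, enum toy_kind and
toy_strip_pointer_types are hypothetical stand-ins for GCC's tree nodes,
POINTER_TYPE_P and TREE_TYPE, not the real data structures.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kinds of type node we care about.  */
enum toy_kind { TOY_INT, TOY_POINTER };

/* Hypothetical stand-in for a GCC type node.  */
struct toy_type
{
  enum toy_kind kind;
  const struct toy_type *pointee;  /* Only meaningful for TOY_POINTER.  */
};

/* Follow pointer layers until a non-pointer type is reached, mirroring
   the while loop in the patch.  */
const struct toy_type *
toy_strip_pointer_types (const struct toy_type *type)
{
  while (type->kind == TOY_POINTER)
    type = type->pointee;

  return type;
}
```

For a toy "int **" this returns the node for "int"; a non-pointer type is
returned unchanged.  Note that the traversal is a plain loop rather than a
recursion.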

[Stage 1][Middle-end][PATCH v5 1/3] Provide more contexts for -Warray-bounds, -Wstringop-* warning messages due to code movements from compiler transformation (Part 1) [PR109071, PR85788, PR88771, PR106

2025-04-07 Thread Qing Zhao

[PR109071,PR85788,PR88771,PR106762,PR108770,PR115274,PR117179]

Control this with a new option -fdiagnostics-details.

$ cat t.c
extern void warn(void);
static inline void assign(int val, int *regs, int *index)
{
  if (*index >= 4)
    warn();
  *regs = val;
}
struct nums {int vals[4];};

void sparx5_set (int *ptr, struct nums *sg, int index)
{
  int *val = &sg->vals[index];

  assign(0,ptr, &index);
  assign(*val, ptr, &index);
}

$ gcc -Wall -O2  -c -o t.o t.c
t.c: In function ‘sparx5_set’:
t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ 
[-Warray-bounds=]
   12 |   int *val = &sg->vals[index];
  |   ^~~
t.c:8:18: note: while referencing ‘vals’
8 | struct nums {int vals[4];};
  |  ^~~~

Although the warning above is correct in theory, the message itself is
confusing to the end user, since it contains information that cannot be
connected to the source code directly.

It would be a nice improvement to add more information to the warning message
to report where such an index value comes from.

In order to achieve this, we add a new data structure "move_history" to record
1. the "condition" that triggers the code movement;
2. whether the code movement is on the true path of the "condition";
3. the "compiler transformation" that triggers the code movement.

Whenever there is a code movement along control flow graph due to some
specific transformations, such as jump threading, path isolation, tree
sinking, etc., a move_history structure is created and attached to the
moved gimple statement.

During array out-of-bound checking or -Wstringop-* warning checking, the
"move_history" that was attached to the gimple statement is used to form
a sequence of diagnostic events that are added to the corresponding rich
location to be used to report the warning message.

This behavior is controlled by the new option -fdiagnostics-details
which is off by default.

With this change, adding -fdiagnostics-details turns the warning message
for the above test case into:

$ gcc -Wall -O2 -fdiagnostics-details -c -o t.o t.c
t.c: In function ‘sparx5_set’:
t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ 
[-Warray-bounds=]
   12 |   int *val = &sg->vals[index];
  |   ^~~
  ‘sparx5_set’: events 1-2
4 |   if (*index >= 4)
  |  ^
  |  |
  |  (1) when the condition is evaluated to true
..
   12 |   int *val = &sg->vals[index];
  |   ~~~
  |   |
  |   (2) out of array bounds here
t.c:8:18: note: while referencing ‘vals’
8 | struct nums {int vals[4];};
  |  ^~~~

The change was divided into 3 parts:

Part 1: Add new data structure move_history, record move_history during
transformation;
Part 2: In warning analysis, use the new move_history to form a rich
location with a sequence of events, to report more context info
for the warnings.
Part 3: Add debugging mechanism for move_history.
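
To make the recorded information concrete, here is a minimal sketch of a
move history chain.  It is not the code from diagnostic-move-history.cc: the
toy_ names are hypothetical, the condition is reduced to a source line
number, and only the three recorded items listed above plus a link to the
previous move are modelled.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Reasons for a code movement (names follow the patch description).  */
enum toy_move_reason
{
  TOY_COPY_BY_THREAD_JUMP,
  TOY_COPY_BY_ISOLATE_PATH,
  TOY_MOVE_BY_SINK
};

/* One recorded movement; the chain is ordered newest first.  */
struct toy_move_history
{
  int cond_line;                  /* Line of the triggering condition.  */
  bool is_true_path;              /* Moved onto the true path?  */
  enum toy_move_reason reason;    /* Transformation that moved the code.  */
  struct toy_move_history *prev_move;
};

/* Hypothetical statement that can carry a history chain.  */
struct toy_stmt
{
  int line;
  struct toy_move_history *history;
};

/* Record one more movement on STMT.  Return false if allocation fails.  */
bool
toy_set_move_history (struct toy_stmt *stmt, int cond_line,
                      bool is_true_path, enum toy_move_reason reason)
{
  struct toy_move_history *h = malloc (sizeof *h);
  if (!h)
    return false;

  h->cond_line = cond_line;
  h->is_true_path = is_true_path;
  h->reason = reason;
  h->prev_move = stmt->history;
  stmt->history = h;
  return true;
}

/* Drop the whole chain, as would be needed when STMT is removed.  */
void
toy_remove_move_history (struct toy_stmt *stmt)
{
  while (stmt->history)
    {
      struct toy_move_history *prev = stmt->history->prev_move;
      free (stmt->history);
      stmt->history = prev;
    }
}
```

A statement that is first duplicated by jump threading and later sunk would
end up with a two-element chain whose head describes the sinking.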

PR tree-optimization/109071
PR tree-optimization/85788
PR tree-optimization/88771
PR tree-optimization/106762
PR tree-optimization/108770
PR tree-optimization/115274
PR tree-optimization/117179

gcc/ChangeLog:

* Makefile.in (OBJS): Add diagnostic-move-history.o.
* gcc/common.opt (fdiagnostics-details): New option.
* gcc/doc/invoke.texi (fdiagnostics-details): Add
documentation for the new option.
* gimple-iterator.cc (gsi_remove): Remove the move
history when removing the gimple.
* gimple-pretty-print.cc (pp_gimple_stmt_1): Emit MV_H marking
if the gimple has a move_history.
* gimple-ssa-isolate-paths.cc (isolate_path): Set move history
for the gimples of the duplicated blocks.
* tree-ssa-sink.cc (sink_code_in_bb): Create move_history for
stmt when it is sunk.
* toplev.cc (toplev::finalize):  Call move_history_finalize.
* tree-ssa-threadupdate.cc (ssa_redirect_edges): Create move_history
for stmts when they are duplicated.
(back_jt_path_registry::duplicate_thread_path): Likewise.
* diagnostic-move-history.cc: New file.
* diagnostic-move-history.h: New file.

gcc/testsuite/ChangeLog:

* gcc.dg/pr117375.c: New test.
---
 gcc/Makefile.in |   1 +
 gcc/common.opt  |   4 +
 gcc/diagnostic-move-history.cc  | 265 
 gcc/diagnostic-move-history.h   |  92 +++
 gcc/doc/invoke.texi |  11 ++
 gcc/gimple-iterator.cc  |   3 +
 gcc/gimple-pretty-print.cc  |   4 +
 gcc/gimple-ssa-isolate-paths.cc |  11 ++
 gcc/testsuite/gcc.dg/pr117375.c |  13 ++
 gcc/toplev.cc   |   3 +
 gcc/tree-ssa-sink.cc|  57 +++
 gcc/tree-ssa-threadupdate.cc|  25 +

[Stage 1][Middle-end][PATCH v5 2/3] Provide more contexts for -Warray-bounds, -Wstringop-* warning messages due to code movements from compiler transformation (Part 2) [PR109071, PR85788, PR88771, PR10

2025-04-07 Thread Qing Zhao

[PR109071,PR85788,PR88771,PR106762,PR108770,PR115274,PR117179]

During array out-of-bound checking or -Wstringop-* warning checking, the
"move_history" that was attached to the gimple statement is used to form
a sequence of diagnostic events that are added to the corresponding rich
location to be used to report the warning message.
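
A sketch of how a history chain could be turned into numbered events
follows.  The toy_ names, the oldest-first ordering, the "line N:" prefix
and the wording for the false path are all assumptions made for
illustration; only the two event texts shown in the sample output of part 1
come from the patch series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified history node (newest first).  */
struct toy_move
{
  int cond_line;
  bool is_true_path;
  const struct toy_move *prev_move;  /* Older movement, or NULL.  */
};

/* Append one "(N) line L: TEXT" event to BUF; return the new length.  */
static size_t
append_event (char *buf, size_t size, size_t used,
              int num, int line, const char *text)
{
  if (used >= size)
    return used;

  int n = snprintf (buf + used, size - used, "(%d) line %d: %s\n",
                    num, line, text);
  if (n < 0)
    return used;

  /* On truncation only SIZE - USED - 1 characters were stored.  */
  size_t room = size - used - 1;
  return used + ((size_t) n > room ? room : (size_t) n);
}

/* Emit events for HIST oldest first; return how many were emitted.  */
static int
emit_history (const struct toy_move *hist, char *buf, size_t size,
              size_t *used)
{
  if (!hist)
    return 0;

  int num = emit_history (hist->prev_move, buf, size, used) + 1;
  *used = append_event (buf, size, *used, num, hist->cond_line,
                        hist->is_true_path
                        ? "when the condition is evaluated to true"
                        : "when the condition is evaluated to false");
  return num;
}

/* Format all events for a warning at WARN_LINE; return the event count.  */
int
toy_format_events (const struct toy_move *hist, int warn_line,
                   char *buf, size_t size)
{
  size_t used = 0;
  if (size)
    buf[0] = '\0';

  int num = emit_history (hist, buf, size, &used) + 1;
  append_event (buf, size, used, num, warn_line, "out of array bounds here");
  return num;
}
```

With a single history entry for the condition on line 4 and a warning on
line 12, this produces two events, analogous to events 1-2 in the sample
diagnostic.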

PR tree-optimization/109071
PR tree-optimization/85788
PR tree-optimization/88771
PR tree-optimization/106762
PR tree-optimization/108770
PR tree-optimization/115274
PR tree-optimization/117179

gcc/ChangeLog:

* Makefile.in (OBJS): Add move-history-rich-location.o.
* gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add
one new parameter. Use rich location with details for warning_at.
(array_bounds_checker::check_array_ref): Use rich location with
details for warning_at.
(array_bounds_checker::check_mem_ref): Add one new parameter.
Use rich location with details for warning_at.
(array_bounds_checker::check_addr_expr): Use rich location with
move_history_diagnostic_path for warning_at.
(array_bounds_checker::check_array_bounds): Call check_mem_ref with
one more parameter.
* gimple-array-bounds.h: Update prototype for check_mem_ref.
* gimple-ssa-warn-restrict.cc (maybe_diag_access_bounds): Use
rich location with details for warning_at.
* gimple-ssa-warn-access.cc (warn_string_no_nul): Likewise.
(maybe_warn_nonstring_arg): Likewise.
(maybe_warn_for_bound): Likewise.
(warn_for_access): Likewise.
(check_access): Likewise.
(pass_waccess::check_strncat): Likewise.
(pass_waccess::maybe_check_access_sizes): Likewise.
* move-history-rich-location.cc: New file.
* move-history-rich-location.h: New file.

gcc/testsuite/ChangeLog:

* gcc.dg/pr109071.c: New test.
* gcc.dg/pr109071_1.c: New test.
* gcc.dg/pr109071_2.c: New test.
* gcc.dg/pr109071_3.c: New test.
* gcc.dg/pr109071_4.c: New test.
* gcc.dg/pr109071_5.c: New test.
* gcc.dg/pr109071_6.c: New test.
---
 gcc/Makefile.in   |   1 +
 gcc/gimple-array-bounds.cc|  39 +
 gcc/gimple-array-bounds.h |   2 +-
 gcc/gimple-ssa-warn-access.cc | 131 +-
 gcc/gimple-ssa-warn-restrict.cc   |  25 +++---
 gcc/move-history-rich-location.cc |  56 +
 gcc/move-history-rich-location.h  |  65 +++
 gcc/testsuite/gcc.dg/pr109071.c   |  43 ++
 gcc/testsuite/gcc.dg/pr109071_1.c |  36 
 gcc/testsuite/gcc.dg/pr109071_2.c |  50 
 gcc/testsuite/gcc.dg/pr109071_3.c |  42 ++
 gcc/testsuite/gcc.dg/pr109071_4.c |  41 ++
 gcc/testsuite/gcc.dg/pr109071_5.c |  33 
 gcc/testsuite/gcc.dg/pr109071_6.c |  49 +++
 14 files changed, 530 insertions(+), 83 deletions(-)
 create mode 100644 gcc/move-history-rich-location.cc
 create mode 100644 gcc/move-history-rich-location.h
 create mode 100644 gcc/testsuite/gcc.dg/pr109071.c
 create mode 100644 gcc/testsuite/gcc.dg/pr109071_1.c
 create mode 100644 gcc/testsuite/gcc.dg/pr109071_2.c
 create mode 100644 gcc/testsuite/gcc.dg/pr109071_3.c
 create mode 100644 gcc/testsuite/gcc.dg/pr109071_4.c
 create mode 100644 gcc/testsuite/gcc.dg/pr109071_5.c
 create mode 100644 gcc/testsuite/gcc.dg/pr109071_6.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 29663be1d48..0a4bde23dc0 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1621,6 +1621,7 @@ OBJS = \
mcf.o \
mode-switching.o \
modulo-sched.o \
+   move-history-rich-location.o \
multiple_target.o \
omp-offload.o \
omp-expand.o \
diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 22286cbb4cc..79236e6b9c7 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public 
License
 along with GCC; see the file COPYING3.  If not see
 .  */
 
+#define INCLUDE_MEMORY
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -31,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-dfa.h"
 #include "fold-const.h"
 #include "diagnostic-core.h"
+#include "move-history-rich-location.h"
 #include "intl.h"
 #include "tree-vrp.h"
 #include "alloc-pool.h"
@@ -262,6 +264,7 @@ get_up_bounds_for_array_ref (tree ref, tree *decl,
 
 static bool
 check_out_of_bounds_and_warn (location_t location, tree ref,
+ gimple *stmt,
  tree low_sub_org, tree low_sub, tree up_sub,
  tree up_bound, tree up_bound_p1,
  const irange *vr,
@@ -275,12 +278,13 @@ check_out_of_bounds_and_warn (location_t location, tree 
ref,
   bool warned = fal

[committed] libstdc++: Add new headers to Doxygen config file

2025-04-07 Thread Jonathan Wakely

libstdc++-v3/ChangeLog:

* doc/doxygen/user.cfg.in (INPUT): Add flat_map, flat_set,
text_encoding, stdbit.h and stdckdint.h.
---

Pushed to trunk.

 libstdc++-v3/doc/doxygen/user.cfg.in | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libstdc++-v3/doc/doxygen/user.cfg.in 
b/libstdc++-v3/doc/doxygen/user.cfg.in
index ae50f6dd0c7..19ae67a67ba 100644
--- a/libstdc++-v3/doc/doxygen/user.cfg.in
+++ b/libstdc++-v3/doc/doxygen/user.cfg.in
@@ -861,6 +861,8 @@ INPUT  = @srcdir@/doc/doxygen/doxygroups.cc 
\
  include/deque \
  include/expected \
  include/filesystem \
+ include/flat_map \
+ include/flat_set \
  include/forward_list \
  include/format \
  include/fstream \
@@ -906,6 +908,7 @@ INPUT  = @srcdir@/doc/doxygen/doxygroups.cc 
\
  include/string_view \
  include/syncstream \
  include/system_error \
+ include/text_encoding \
  include/thread \
  include/tuple \
  include/typeindex \
@@ -942,6 +945,8 @@ INPUT  = @srcdir@/doc/doxygen/doxygroups.cc 
\
  include/cwchar \
  include/cuchar \
  include/cwctype \
+ include/stdbit.h \
+ include/stdckdint.h \
  include/ \
  include/bits \
  include/@host_alias@/bits \
-- 
2.49.0

[Stage 1][Middle-end][PATCH v5 3/3] Provide more contexts for -Warray-bounds, -Wstringop-* warning messages due to code movements from compiler transformation (Part 3) [PR109071, PR85788, PR88771, PR1

2025-04-07 Thread Qing Zhao

[PR109071,PR85788,PR88771,PR106762,PR108770,PR115274,PR117179]

Add debugging for move history.

PR tree-optimization/109071
PR tree-optimization/85788
PR tree-optimization/88771
PR tree-optimization/106762
PR tree-optimization/108770
PR tree-optimization/115274
PR tree-optimization/117179

gcc/ChangeLog:

* diagnostic-move-history.cc (dump_move_history): New routine.
(dump_move_history_for): Likewise.
(debug_mv_h): Likewise.
* diagnostic-move-history.h (dump_move_history): New prototype.
(dump_move_history_for): Likewise.
* gimple-ssa-isolate-paths.cc (isolate_path): Add debugging message
when setting move history for statements.
* tree-ssa-sink.cc (sink_code_in_bb): Likewise.
* tree-ssa-threadupdate.cc (ssa_redirect_edges): Likewise.
(back_jt_path_registry::duplicate_thread_path): Likewise.
---
 gcc/diagnostic-move-history.cc  | 67 +
 gcc/diagnostic-move-history.h   |  2 +
 gcc/gimple-ssa-isolate-paths.cc | 10 +
 gcc/tree-ssa-sink.cc| 10 -
 gcc/tree-ssa-threadupdate.cc| 18 +
 5 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/gcc/diagnostic-move-history.cc b/gcc/diagnostic-move-history.cc
index 83d8a42b577..045b0d3d5d1 100644
--- a/gcc/diagnostic-move-history.cc
+++ b/gcc/diagnostic-move-history.cc
@@ -25,6 +25,7 @@
 #include "backend.h"
 #include "tree.h"
 #include "gimple.h"
+#include "tree-pretty-print.h"
 #include "gimple-iterator.h"
 #include "cfganal.h"
 #include "diagnostic-move-history.h"
@@ -263,3 +264,69 @@ set_move_history_to_stmts_in_bb (basic_block bb, edge 
entry,
 
   return true;
 }
+
+/* Dump the move_history data structure MV_HISTORY.  */
+
+void
+dump_move_history (FILE *file, move_history_t mv_history)
+{
+  fprintf (file, "The move history is: \n");
+  if (!mv_history)
+{
+  fprintf (file, "No move history.\n");
+  return;
+}
+
+  for (move_history_t cur_ch = mv_history; cur_ch;
+   cur_ch = cur_ch->prev_move)
+{
+  expanded_location exploc_cond = expand_location (cur_ch->condition);
+
+  if (exploc_cond.file)
+   fprintf (file, "[%s:", exploc_cond.file);
+  fprintf (file, "%d, ", exploc_cond.line);
+  fprintf (file, "%d] ", exploc_cond.column);
+
+  fprintf (file, "%s ", cur_ch->is_true_path ? "true" : "false");
+  const char *reason = NULL;
+  switch (cur_ch->reason)
+   {
+   case COPY_BY_THREAD_JUMP:
+ reason = "copy_by_thread_jump";
+ break;
+   case COPY_BY_ISOLATE_PATH:
+ reason = "copy_by_isolate_path";
+ break;
+   case MOVE_BY_SINK:
+ reason = "move_by_sink";
+ break;
+   default:
+ reason = "UNKNOWN";
+ break;
+   }
+  fprintf (file, "%s \n", reason);
+}
+}
+
+/* Dump the move_history data structure attached to the gimple STMT.  */
+void
+dump_move_history_for (FILE *file, const gimple *stmt)
+{
+  move_history_t mv_history = get_move_history (stmt);
+  if (!mv_history)
+fprintf (file, "No move history.\n");
+  else
+dump_move_history (file, mv_history);
+}
+
+DEBUG_FUNCTION void
+debug_mv_h (const move_history_t mv_history)
+{
+  dump_move_history (stderr, mv_history);
+}
+
+DEBUG_FUNCTION void
+debug_mv_h (const gimple * stmt)
+{
+  dump_move_history_for (stderr, stmt);
+}
diff --git a/gcc/diagnostic-move-history.h b/gcc/diagnostic-move-history.h
index 9a58766d544..0c56974119d 100644
--- a/gcc/diagnostic-move-history.h
+++ b/gcc/diagnostic-move-history.h
@@ -88,5 +88,7 @@ extern bool set_move_history_to_stmt (gimple *, edge,
of the entry edge.  */
 extern bool set_move_history_to_stmts_in_bb (basic_block, edge,
 bool, enum move_reason);
+extern void dump_move_history (FILE *, move_history_t);
+extern void dump_move_history_for (FILE *, const gimple *);
 
 #endif // DIAGNOSTIC_MOVE_HISTORY_H
diff --git a/gcc/gimple-ssa-isolate-paths.cc b/gcc/gimple-ssa-isolate-paths.cc
index 14c86590b17..bbaba09e192 100644
--- a/gcc/gimple-ssa-isolate-paths.cc
+++ b/gcc/gimple-ssa-isolate-paths.cc
@@ -176,6 +176,16 @@ isolate_path (basic_block bb, basic_block duplicate,
  incoming edge.  */
   if (flag_diagnostics_details)
 {
+  if (dump_file)
+   {
+ fprintf (dump_file, "Set move history for stmts of B[%d]"
+  " as not on the destination of the edge\n",
+  bb->index);
+ fprintf (dump_file, "Set move history for stmts of B[%d]"
+  " as on the destination of the edge\n",
+  duplicate->index);
+   }
+
   set_move_history_to_stmts_in_bb (bb, e, false, COPY_BY_ISOLATE_PATH);
   set_move_history_to_stmts_in_bb (duplicate, e,
   true, COPY_BY_ISOLATE_PATH);
diff --git a/gcc/tree-ssa-sink.cc b/gcc/tree-ssa-sink.cc
index 0b3441e894c..cc096e17

[committed] libstdc++: Add new headers to for PCH

2025-04-07 Thread Jonathan Wakely

This adds the new C23 headers to the PCH, and also removes the
__has_include check for  because we provide that
unconditionally now.

libstdc++-v3/ChangeLog:

* include/precompiled/stdc++.h: Include  and
. Include  unconditionally.
---

Tested x86_64-linux.  Pushed to trunk.

 libstdc++-v3/include/precompiled/stdc++.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/precompiled/stdc++.h 
b/libstdc++-v3/include/precompiled/stdc++.h
index 1ffde3ed450..f4b312d9e47 100644
--- a/libstdc++-v3/include/precompiled/stdc++.h
+++ b/libstdc++-v3/include/precompiled/stdc++.h
@@ -230,15 +230,15 @@
 #include 
 #include 
 #include 
-#if __has_include()
-# include 
-#endif
+#include 
 #include 
 #include 
 #endif
 
 #if __cplusplus > 202302L
 #include 
+#include 
+#include 
 #endif
 
 #endif // HOSTED
-- 
2.49.0

Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-04-07 Thread Michael Matz

Hello,

On Mon, 7 Apr 2025, Martin Uecker wrote:

> > So, what specifically would the two attributes do different?  FWIW: what 
> > worries me about accepting a generic expression in counted_by, that isn't 
> > prefixed by a (possibly empty) decl, is that after seeing a non-type 
> > identifier the parser doesn't yet know if it's the lone-ident case (look 
> > up in struct scope) or the expression case (look up everything in global 
> > scope).  It requires look-ahead to decide this.
> > 
> > Would that be the difference between the attributes?  One accepting _only_ 
> > a lone-ident or the decl+expr syntax, and the other _only_ expressions 
> > that are never looked up in struct-scope (not even if its lone-ident)?
> 
> My understanding is that one accepts only a lone identifier and nothing
> else, i.e.
> 
> counted_by(identifier)
> 
> and the other only accepts expressions, possibly including a forward
> declaration.
> 
> counted_by_expr(expression)
> counted_by_expr(decl; expression)

What exactly happens when counted_by_expr is used with only an identifier 
expression, without decl?  Is the ident looked up normally, i.e. not in 
struct scope?  If so, then good, it would resolve my worry.


Ciao,
Michael.
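
For reference, the lone-identifier form discussed above looks like this in
practice.  This is a minimal sketch: the COUNTED_BY macro, struct buffer and
buffer_alloc are hypothetical names, and the attribute is applied only when
the compiler reports support for it.  The counted_by_expr spelling is still
a proposal in this thread, so it is deliberately not used here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Fall back gracefully on compilers without __has_attribute.  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if __has_attribute (counted_by)
# define COUNTED_BY(member) __attribute__ ((counted_by (member)))
#else
# define COUNTED_BY(member)
#endif

/* The identifier "count" is looked up in the scope of the enclosing
   struct, not in the global scope.  */
struct buffer
{
  size_t count;
  int data[] COUNTED_BY (count);
};

/* Allocate a buffer with room for COUNT elements; return NULL on
   failure.  Overflow of the size computation is not checked here.  */
struct buffer *
buffer_alloc (size_t count)
{
  struct buffer *b = malloc (sizeof *b + count * sizeof b->data[0]);
  if (b)
    b->count = count;
  return b;
}
```

The count member is set before data is accessed, so that any size
information derived from the attribute refers to an initialized value.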

Re: [Stage1][Middle-end][object-size][PATCH v1] Evaluate the object size by the size of the pointee type

2025-04-07 Thread Siddhesh Poyarekar


On 2025-04-07 11:56, Qing Zhao wrote:

when the type is a structure with flexible array member.


Not just when the structure has a flexible array member, but also when the 
FAM size is recorded in a __counted_by__, isn't it?



In tree-object-size.cc, if the size is UNKNOWN after evaluating the use-def
chain, we can evaluate the SIZE of the pointee TYPE ONLY when this TYPE
is a structure type with a flexible array member: since a structure with a
FAM cannot be an element of an array, the pointer must point to a
single object of this structure type.

This is only available for C now.

bootstrapped and regression tested on both x86 and aarch64.

Okay for stage1?

thanks.

Qing

gcc/c/ChangeLog:

* c-lang.cc (LANG_HOOKS_BUILD_COUNTED_BY_REF):
Define to below function.
* c-tree.h (c_build_counted_by_ref): New extern function.
* c-typeck.cc (build_counted_by_ref): Rename to ...
(c_build_counted_by_ref): ...this.
(handle_counted_by_for_component_ref): Call the renamed function.

gcc/ChangeLog:

* langhooks-def.h (LANG_HOOKS_BUILD_COUNTED_BY_REF):
New language hook.
* langhooks.h (struct lang_hooks_for_types): Add
build_counted_by_ref.
* tree-object-size.cc (struct object_size_info): Add a new field
insert_after.
(gimplify_size_expressions): Insert sequence after or before
depending on the new field insert_after.
(compute_builtin_object_size): Init the new field to false;
(record_with_fam_object_size): New function.
(collect_object_sizes_for): Call record_with_fam_object_size.

gcc/testsuite/ChangeLog:

* gcc.dg/flex-array-counted-by-3.c: Update test;
* gcc.dg/flex-array-counted-by-4.c: Likewise.
* gcc.dg/flex-array-counted-by-5.c: Likewise.
---
  gcc/c/c-lang.cc   |   3 +
  gcc/c/c-tree.h|   1 +
  gcc/c/c-typeck.cc |   6 +-
  gcc/langhooks-def.h   |   4 +-
  gcc/langhooks.h   |   5 +
  .../gcc.dg/flex-array-counted-by-3.c  |   5 +
  .../gcc.dg/flex-array-counted-by-4.c  |  34 --
  .../gcc.dg/flex-array-counted-by-5.c  |   4 +
  gcc/tree-object-size.cc   | 106 +-
  9 files changed, 152 insertions(+), 16 deletions(-)

diff --git a/gcc/c/c-lang.cc b/gcc/c/c-lang.cc
index c69077b2a93..e9ec9e6e64a 100644
--- a/gcc/c/c-lang.cc
+++ b/gcc/c/c-lang.cc
@@ -51,6 +51,9 @@ enum c_language_kind c_language = clk_c;
  #undef LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE
  #define LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE c_get_sarif_source_language
  
+#undef LANG_HOOKS_BUILD_COUNTED_BY_REF

+#define LANG_HOOKS_BUILD_COUNTED_BY_REF c_build_counted_by_ref
+
  /* Each front end provides its own lang hook initializer.  */
  struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
  
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h

index 743ec5cbae6..66bec5d92fa 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -776,6 +776,7 @@ extern struct c_switch *c_switch_stack;
  
  extern bool null_pointer_constant_p (const_tree);
  
+extern tree c_build_counted_by_ref (tree, tree, tree *);
  
  inline bool

  c_type_variably_modified_p (tree t)
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 19e79b554dc..6efc3fb3e5d 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2936,8 +2936,8 @@ should_suggest_deref_p (tree datum_type)
  &(p->k)
  
  */

-static tree
-build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
+tree
+c_build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type)
  {
tree type = TREE_TYPE (datum);
if (!c_flexible_array_member_type_p (TREE_TYPE (subdatum)))
@@ -3031,7 +3031,7 @@ handle_counted_by_for_component_ref (location_t loc, tree 
ref)
tree datum = TREE_OPERAND (ref, 0);
tree subdatum = TREE_OPERAND (ref, 1);
tree counted_by_type = NULL_TREE;
-  tree counted_by_ref = build_counted_by_ref (datum, subdatum,
+  tree counted_by_ref = c_build_counted_by_ref (datum, subdatum,
  &counted_by_type);
if (counted_by_ref)
  ref = build_access_with_size_for_counted_by (loc, ref,
diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index 6b34d324ab5..7909ea8a92c 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -221,6 +221,7 @@ extern tree lhd_unit_size_without_reusable_padding (tree);
  #define LANG_HOOKS_TYPE_DWARF_ATTRIBUTE   lhd_type_dwarf_attribute
  #define LANG_HOOKS_UNIT_SIZE_WITHOUT_REUSABLE_PADDING 
lhd_unit_size_without_reusable_padding
  #define LANG_HOOKS_CLASSTYPE_AS_BASE  hook_tree_const_tree_null
+#define LANG_HOOKS_BUILD_COUNTED_BY_REF NULL
  
  #define LANG_HOOKS_FOR_TYPES_INITIALIZER { \

LANG_HOOKS_MAKE_TYPE, \
@@ -248,7 +249,8 @@ extern tree lhd_unit_size_without_reusable_padding (tree);
LANG_HOOKS_GET_FIXED_POINT_TYPE_INFO, \

Re: [PATCH] c-family: Improve location for -Wunknown-pragmas in a _Pragma [PR118838]

2025-04-07 Thread Marek Polacek

On Wed, Feb 12, 2025 at 08:27:37PM -0500, Lewis Hyatt wrote:
> Hello-
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118838
> 
> This patch addresses the issue mentioned in the PR (another instance of
> _Pragma string location issues). bootstrap + regtest all languages on
> aarch64 looks good. Is it OK please for now or for stage 1?  Note, it is not
> a regression, since this never worked in C or C++ frontends; but on the
> other hand, r15-4505 for GCC 15 fixed some related issues, so it could be
> nice if this one gets in along with it. Thanks!
> 
> -Lewis
> 
> -- >8 --
> 
> The warning for -Wunknown-pragmas is issued at the location provided by
> libcpp to the def_pragma() callback. This location is
> cpp_reader::directive_line, which is a location for the start of the line
> only; it is also not a valid location in case the unknown pragma was lexed
> from a _Pragma string. These factors make it impossible to suppress
> -Wunknown-pragmas via _Pragma("GCC diagnostic...") directives on the same
> source line, as in the PR and the test case. Address that by issuing the
> warning at a better location returned by cpp_get_diagnostic_override_loc().
> libcpp already maintains this location to handle _Pragma-related diagnostics
> internally; it was needed also to make a publicly accessible version of it.
> 
> gcc/c-family/ChangeLog:
> 
>   PR c/118838
>   * c-lex.cc (cb_def_pragma): Call cpp_get_diagnostic_override_loc()
>   to get a valid location at which to issue -Wunknown-pragmas, in case
>   it was triggered from a _Pragma.
> 
> libcpp/ChangeLog:
> 
>   PR c/118838
>   * errors.cc (cpp_get_diagnostic_override_loc): New function.
>   * include/cpplib.h (cpp_get_diagnostic_override_loc): Declare.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/118838
>   * c-c++-common/cpp/pragma-diagnostic-loc-2.c: New test.
>   * g++.dg/gomp/macro-4.C: Adjust expected output.
>   * gcc.dg/gomp/macro-4.c: Likewise.
>   * gcc.dg/cpp/Wunknown-pragmas-1.c: Likewise.
> ---
>  libcpp/errors.cc  | 10 +
>  libcpp/include/cpplib.h   |  5 +
>  gcc/c-family/c-lex.cc |  7 +-
>  .../cpp/pragma-diagnostic-loc-2.c | 15 +
>  gcc/testsuite/g++.dg/gomp/macro-4.C   |  8 +++
>  gcc/testsuite/gcc.dg/cpp/Wunknown-pragmas-1.c | 22 +++
>  gcc/testsuite/gcc.dg/gomp/macro-4.c   |  8 +++
>  7 files changed, 57 insertions(+), 18 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-diagnostic-loc-2.c
> 
> diff --git a/libcpp/errors.cc b/libcpp/errors.cc
> index 9621c4b66ea..d9efb6acd30 100644
> --- a/libcpp/errors.cc
> +++ b/libcpp/errors.cc
> @@ -52,6 +52,16 @@ cpp_diagnostic_get_current_location (cpp_reader *pfile)
>  }
>  }
>  
> +/* Sometimes a diagnostic needs to be generated before libcpp has been able
> +   to generate a valid location for the current token; in that case, the
> +   non-zero location returned by this function is the preferred one to use.  
> */
> +
> +location_t
> +cpp_get_diagnostic_override_loc (const cpp_reader *pfile)
> +{
> +  return pfile->diagnostic_override_loc;
> +}
> +
>  /* Print a diagnostic at the given location.  */
>  
>  ATTRIBUTE_CPP_PPDIAG (5, 0)
> diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
> index 90aa3160ebf..04d4621da3c 100644
> --- a/libcpp/include/cpplib.h
> +++ b/libcpp/include/cpplib.h
> @@ -1168,6 +1168,11 @@ extern const char *cpp_probe_header_unit (cpp_reader 
> *, const char *file,
>  extern const char *cpp_get_narrow_charset_name (cpp_reader *) ATTRIBUTE_PURE;
>  extern const char *cpp_get_wide_charset_name (cpp_reader *) ATTRIBUTE_PURE;
>  
> +/* Sometimes a diagnostic needs to be generated before libcpp has been able
> +   to generate a valid location for the current token; in that case, the
> +   non-zero location returned by this function is the preferred one to use.  
> */

I don't love duplicating the comment like this, it's going to get out of sync.

> +extern location_t cpp_get_diagnostic_override_loc (const cpp_reader *);
> +
>  /* This function reads the file, but does not start preprocessing.  It
> returns the name of the original file; this is the same as the
> input file, except for preprocessed input.  This will generate at
> diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
> index e450c9a57f0..df84020de62 100644
> --- a/gcc/c-family/c-lex.cc
> +++ b/gcc/c-family/c-lex.cc
> @@ -248,7 +248,12 @@ cb_def_pragma (cpp_reader *pfile, location_t loc)
>  {
>const unsigned char *space, *name;
>const cpp_token *s;
> -  location_t fe_loc = loc;
> +
> +  /* If we are processing a _Pragma, LOC is not a valid location, but 
> libcpp
> +  will provide a good location via this function instead.  */
> +  location_t fe_loc = cpp_get_diagnostic_override_loc (pfile);
> +  if (!fe_loc)

I think checking == UNKNOWN

Re: [PATCH] cobol: Fix up update_web_docs_git for COBOL [PR119227]

2025-04-07 Thread Richard Biener

On Mon, 7 Apr 2025, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the PR, the COBOL documentation is currently not present
> in onlinedocs at all.
> While the script generates gcobol{,-io}.{pdf,html}, it generates them in
> the gcc/gcc/cobol/ subdirectory of the update_web_docs_git temporary
> directory and nothing finds it there afterwards; all the processing is on
> for file in */*.html *.ps *.pdf *.tar; do
> So, this patch puts gcobol{,-io}.html into gcobol/ subdirectory and
> gcobol{,-io}.pdf into the current directory, so that it is picked up.
> With this it makes it into onlinedocs:
> find . -name \*cobol\*
> ./onlinedocs/gcobol.pdf.gz
> ./onlinedocs/gcobol.pdf
> ./onlinedocs/gcobol_io.pdf.gz
> ./onlinedocs/gcobol_io.pdf
> ./onlinedocs/gcobol
> ./onlinedocs/gcobol/gcobol_io.html.gz
> ./onlinedocs/gcobol/gcobol_io.html
> ./onlinedocs/gcobol/gcobol.html.gz
> ./onlinedocs/gcobol/gcobol.html
> ./onlinedocs/gnat_rm/gnat_005frm_002finterfacing_005fto_005fother_005flanguages-interfacing-to-cobol.html.gz
> ./onlinedocs/gnat_rm/gnat_005frm_002finterfacing_005fto_005fother_005flanguages-interfacing-to-cobol.html
> ./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-f-7-cobol-support.html.gz
> ./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-f-7-cobol-support.html
> ./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-b-4-95-98-interfacing-with-cobol.html.gz
> ./onlinedocs/gnat_rm/gnat_005frm_002fimplementation_005fadvice-rm-b-4-95-98-interfacing-with-cobol.html
> 
> So far just tested running the script locally with a few vars tweaked, ok
> for trunk?
> I'd then test it on sourceware proper...

OK.

Richard.

> 2025-04-07  Jakub Jelinek  
> 
>   PR web/119227
>   * update_web_docs_git: Rename mdoc2pdf_html to cobol_mdoc2pdf_html,
>   perform mkdir -p $DOCSDIR/gcobol gcobol, remove $d/ from pdf and in
>   html replace it with gcobol/; update uses of the renamed function.
> 
> --- maintainer-scripts/update_web_docs_git.jj 2025-03-11 09:18:22.169127780 
> +0100
> +++ maintainer-scripts/update_web_docs_git2025-04-07 13:22:41.785354269 
> +0200
> @@ -205,11 +205,12 @@ done
>  #
>  # The COBOL FE maintains man pages.  Convert them to HTML and PDF.
>  #
> -mdoc2pdf_html() {
> +cobol_mdoc2pdf_html() {
> +mkdir -p $DOCSDIR/gcobol gcobol
>  input="$1"
>  d="${input%/*}"
> -pdf="$d/$2"
> -html="$d/$3"
> +pdf="$2"
> +html="gcobol/$3"
>  groff -mdoc -T pdf "$input" > "${pdf}~"
>  mv "${pdf}~" "${pdf}"
>  mandoc -T html "$filename" > "${html}~"
> @@ -221,10 +222,10 @@ find . -name gcobol.[13] |
>  do
>  case ${filename##*.} in
>  1)
> -mdoc2pdf_html "$filename" gcobol.pdf gcobol.html
> +cobol_mdoc2pdf_html "$filename" gcobol.pdf gcobol.html
>  ;;
>  3)
> -mdoc2pdf_html "$filename" gcobol_io.pdf gcobol_io.html
> +cobol_mdoc2pdf_html "$filename" gcobol_io.pdf gcobol_io.html
>  ;;
>  esac
>  done
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] cobol: Fix up make html for COBOL [PR119227]

2025-04-07 Thread Richard Biener

On Mon, 7 Apr 2025, Jakub Jelinek wrote:

> Hi!
> 
> What make html does for COBOL is quite inconsistent with all
> other FEs.  Normally make html creates HTML/gcc-15.0.1/
> subdirectory and puts there subdirectories like gcc, cpp, gccint, gfortran
> etc. and only those contain *.html files.  COBOL puts gcobol.html and
> gcobol-io.html into the current directory instead.
> 
> The following patch puts them into $(build_htmldir)/gcobol/ directory.
> 
> Tested on x86_64-linux with make html, ok for trunk?

OK.

Richard.

> 2025-04-07  Jakub Jelinek  
> 
>   PR web/119227
>   * Make-lang.in (GCOBOL_HTML_FILES): New variable.
>   (cobol.install-html, cobol.html, cobol.srchtml): Use
>   $(GCOBOL_HTML_FILES) instead of gcobol.html gcobol-io.html.
>   (gcobol.html): Rename goal to ...
>   ($(build_htmldir)/gcobol/gcobol.html): ... this.  Run mkinstalldirs.
>   (gcobol-io.html): Rename goal to ...
>   ($(build_htmldir)/gcobol/gcobol-io.html): ... this.  Run mkinstalldirs.
> 
> --- gcc/cobol/Make-lang.in.jj 2025-03-31 21:26:51.107135693 +0200
> +++ gcc/cobol/Make-lang.in 2025-04-07 12:19:59.451852320 +0200
> @@ -40,6 +40,8 @@ GCOBOL_TARGET_INSTALL_NAME := $(target_n
>  GCOBC_INSTALL_NAME := $(shell echo gcobc|sed '$(program_transform_name)')
>  GCOBC_TARGET_INSTALL_NAME := $(target_noncanonical)-$(shell echo gcobc|sed 
> '$(program_transform_name)')
>  
> +GCOBOL_HTML_FILES = $(addprefix $(build_htmldir)/gcobol/,gcobol.html 
> gcobol-io.html)
> +
>  cobol: cobol1$(exeext)
>  cobol.serial = cobol1$(exeext)
>  .PHONY: cobol
> @@ -303,8 +305,8 @@ cobol.install-pdf: installdirs gcobol.pd
>  
>  cobol.install-plugin:
>  
> -cobol.install-html: installdirs gcobol.html gcobol-io.html
> - $(INSTALL_DATA) gcobol.html gcobol-io.html $(DESTDIR)$(htmldir)/
> +cobol.install-html: installdirs $(GCOBOL_HTML_FILES)
> + $(INSTALL_DATA) $(GCOBOL_HTML_FILES) $(DESTDIR)$(htmldir)/
>  
>  cobol.info:
>  cobol.srcinfo:
> @@ -323,14 +325,16 @@ gcobol-io.pdf: $(srcdir)/cobol/gcobol.3
>   groff -mdoc -T pdf  $^ > $@~
>   @mv $@~ $@
>  
> -cobol.html: gcobol.html gcobol-io.html
> -cobol.srchtml: gcobol.html gcobol-io.html
> +cobol.html: $(GCOBOL_HTML_FILES)
> +cobol.srchtml: $(GCOBOL_HTML_FILES)
>   ln $^ $(srcdir)/cobol/
>  
> -gcobol.html: $(srcdir)/cobol/gcobol.1
> +$(build_htmldir)/gcobol/gcobol.html: $(srcdir)/cobol/gcobol.1
> + $(mkinstalldirs) $(build_htmldir)/gcobol
>   mandoc -T html $^ > $@~
>   @mv $@~ $@
> -gcobol-io.html: $(srcdir)/cobol/gcobol.3
> +$(build_htmldir)/gcobol/gcobol-io.html: $(srcdir)/cobol/gcobol.3
> + $(mkinstalldirs) $(build_htmldir)/gcobol
>   mandoc -T html $^ > $@~
>   @mv $@~ $@
>  
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] cobol: Fix up cobol/{charmaps,valconv}.cc rules

2025-04-07 Thread Jakub Jelinek

On Wed, Apr 02, 2025 at 04:24:30PM -0500, Robert Dubner wrote:
> Solutions have been put in place that don't involve modifying the source 
> code of the copied files. I haven't made an opportunity to understand how 
> they work, but I am choosing to drop the matter.

In order to unbreak Darwin, I've committed the following patch after an ack
from Richi on IRC.

Feel free to adjust it incrementally.

2025-04-07  Jakub Jelinek  

* Make-lang.in (cobol/charmaps.cc, cobol/valconv.cc): Use a BRE
only sed regex.

--- gcc/cobol/Make-lang.in.jj   2025-04-07 13:52:19.385557244 +0200
+++ gcc/cobol/Make-lang.in  2025-04-07 14:19:45.178603788 +0200
@@ -90,9 +90,7 @@ cobol1_OBJS =\
 # so that the .h files can be found.
 
 cobol/charmaps.cc cobol/valconv.cc: cobol/%.cc: $(LIB_SOURCE)/%.cc
-   -l='ec\|common-defs\|io\|gcobolio\|gfileio\|charmaps'; \
-   l=$$l'\|valconv\|exceptl'; \
-   sed -e '/^#include/s,"\('$$l'\)\.h","../../libgcobol/\1.h",' $^ > $@
+   sed -e '/^#include/s,"\([^"]*[^g"].h\)","../../libgcobol/\1",' $^ > $@
 
 LIB_SOURCE_H=$(wildcard $(LIB_SOURCE)/*.h)
 


Jakub
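
To illustrate what the new BRE-only substitution does, here is a minimal C++ sketch that mirrors the sed command with std::regex. The function name rewrite_include is hypothetical and exists only for this illustration; the pattern and replacement are transcribed from the patch, with the BRE group \(...\) written as (...) in ECMAScript syntax.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Mirrors:
//   sed -e '/^#include/s,"\([^"]*[^g"].h\)","../../libgcobol/\1",'
// The unescaped '.' is kept as in the patch, so it matches any character.
// rewrite_include is a hypothetical helper name, not part of GCC.
std::string rewrite_include(const std::string& line) {
  // The sed address /^#include/ limits the substitution to such lines.
  if (line.rfind("#include", 0) != 0)
    return line;
  static const std::regex header(R"re("([^"]*[^g"].h)")re");
  // sed's s command without the g flag replaces only the first match.
  return std::regex_replace(line, header, "\"../../libgcobol/$1\"",
                            std::regex_constants::format_first_only);
}
```

With this sketch, a line such as #include "ec.h" becomes #include "../../libgcobol/ec.h", while a quoted name whose last character before ".h" is 'g' (for example "config.h") is left unchanged, as are lines that do not start with #include.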

[committed] libstdc++: Remove stray pragma in new header [PR119642]

2025-04-07 Thread Jonathan Wakely

libstdc++-v3/ChangeLog:

PR libstdc++/119642
* include/bits/formatfwd.h: Remove stray pragma.
---

Tested x86_64-linux.  Pushed to trunk.

 libstdc++-v3/include/bits/formatfwd.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/libstdc++-v3/include/bits/formatfwd.h 
b/libstdc++-v3/include/bits/formatfwd.h
index 44922cb83fc..a6b5ac8c8ce 100644
--- a/libstdc++-v3/include/bits/formatfwd.h
+++ b/libstdc++-v3/include/bits/formatfwd.h
@@ -67,5 +67,4 @@ namespace __format
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 #endif // __glibcxx_format
-#pragma GCC diagnostic pop
 #endif // _GLIBCXX_FORMAT_FWD_H
-- 
2.49.0

[PATCH] riscv: Add support for riscv*-gnu (GNU/Hurd on RISC-V)

2025-04-07 Thread Hakan Candar



This produces a toolchain that can successfully build binaries targeting
riscv*-gnu.

gcc/ChangeLog:

* config.gcc: Recognize riscv*-*-gnu* targets.
* config/riscv/gnu.h: New file.

Signed-off-by: Hakan Candar 
---
 gcc/config.gcc | 14 ++
 gcc/config/riscv/gnu.h | 59 ++
 2 files changed, 73 insertions(+)
 create mode 100644 gcc/config/riscv/gnu.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f7f2002a45f..a01ba713a6c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2540,6 +2540,20 @@ riscv*-*-linux*)
gcc_cv_initfini_array=yes
with_tls=${with_tls:-trad}
;;
+riscv*-*-gnu*)
+   tm_file="elfos.h gnu-user.h gnu.h glibc-stdint.h ${tm_file} riscv/gnu.h"
+   tmake_file="${tmake_file} riscv/t-riscv"
+   gnu_ld=yes
+   gas=yes
+   case $target in
+   riscv32be-*|riscv64be-*)
+   tm_defines="${tm_defines} TARGET_BIG_ENDIAN_DEFAULT=1"
+   ;;
+   esac
+   # Force .init_array support.  The configure script cannot always
+   # automatically detect that GAS supports it, yet we require it.
+   gcc_cv_initfini_array=yes
+   ;;
 riscv*-*-elf* | riscv*-*-rtems*)
tm_file="elfos.h newlib-stdint.h ${tm_file} riscv/elf.h"
case ${target} in
diff --git a/gcc/config/riscv/gnu.h b/gcc/config/riscv/gnu.h
new file mode 100644
index 000..047399b10ff
--- /dev/null
+++ b/gcc/config/riscv/gnu.h
@@ -0,0 +1,59 @@
+/* Definitions for RISC-V GNU/Hurd systems with ELF format.
+   Copyright (C) 1998-2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#define TARGET_OS_CPP_BUILTINS()   \
+  do { \
+GNU_USER_TARGET_OS_CPP_BUILTINS(); \
+  } while (0)
+
+#define GNU_USER_DYNAMIC_LINKER "/lib/ld-riscv" XLEN_SPEC "-" ABI_SPEC ".so.1"
+
+#define ICACHE_FLUSH_FUNC "__riscv_flush_icache"
+
+#define CPP_SPEC "%{pthread:-D_REENTRANT}"
+
+#define LD_EMUL_SUFFIX \
+  "%{mabi=lp64d:}" \
+  "%{mabi=lp64f:_lp64f}" \
+  "%{mabi=lp64:_lp64}" \
+  "%{mabi=ilp32d:}" \
+  "%{mabi=ilp32f:_ilp32f}" \
+  "%{mabi=ilp32:_ilp32}"
+
+#define LINK_SPEC "\
+-melf" XLEN_SPEC DEFAULT_ENDIAN_SPEC "riscv" LD_EMUL_SUFFIX " \
+%{mno-relax:--no-relax} \
+-X \
+%{mbig-endian:-EB} \
+%{mlittle-endian:-EL} \
+%{shared} \
+  %{!shared: \
+%{!static: \
+  %{!static-pie: \
+   %{rdynamic:-export-dynamic} \
+   -dynamic-linker " GNU_USER_DYNAMIC_LINKER "}} \
+%{static:-static} %{static-pie:-static -pie --no-dynamic-linker -z text}}"
+
+#define STARTFILE_PREFIX_SPEC  \
+   "/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
+   "/usr/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
+   "/lib/ "\
+   "/usr/lib/ "
+
+#define RISCV_USE_CUSTOMISED_MULTI_LIB select_by_abi
-- 
2.47.0

Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-04-07 Thread Michael Matz

Hello,

On Sat, 5 Apr 2025, Bill Wendling wrote:

> > >>> So, a different attribute name “counted_by_exp” might be better?
> > >>
> > >> I would prefer Martins empty-decl idea to that: "counted_by(;len+0)"
> > >> (looks up 'len' normally, i.e. doesn't look into current struct).  It
> > >> would naturally fit the either decl+expr or lone-ident parse.
> > >> It may look weird but empty declarations are okayish IMHO.
> > >>
> > >> But overall: I just don't know, it all looks a bit unsexy, there only
> > seem
> > >> to be rocks and hard places :)
> > >
> > > I would not worry about this case too much, because I do expect this
> > > to be a common use case anyway.  That it looks strange may even be
> > > an advantage here, as it alerts the reader that this is unusual.
> >
> > This is an interesting point and also a good point. -:)
> >
> > The other thought that bother me a little bit is:
> >
> > For the same attribute, counted_by, is it strange to have two different
> > looking up rules
> > > depending on the different number of arguments?
> >
> 
> Sorry for the HTML. On my phone.
> 
> I think adding a ';' isn't the best option. It's too easy to overlook when
> reading the attribute and forget when writing the attribute.  Using a
> separate attribute name is much cleaner, IMO. Then again, I've been wrong
> before. :-)

So, what specifically would the two attributes do different?  FWIW: what 
worries me about accepting a generic expression in counted_by, that isn't 
prefixed by a (possibly empty) decl, is that after seeing a non-type 
identifier the parser doesn't yet know if it's the lone-ident case (look 
up in struct scope) or the expression case (look up everything in global 
scope).  It requires look-ahead to decide this.

Would that be the difference between the attributes?  One accepting _only_ 
a lone-ident or the decl+expr syntax, and the other _only_ expressions 
that are never looked up in struct-scope (not even if it's a lone-ident)?

Ciao,
Michael.

[PATCH v2] x86: Update memcpy/memset inline strategies for -mtune=generic

2025-04-07 Thread H.J. Lu

Simplify memcpy and memset inline strategies to avoid branches for
-mtune=generic:

1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
   load and store for up to 16 * 16 (256) bytes when the data size is
   fixed and known.
2. Inline only if data size is known to be <= 256.
   a. Use "rep movsb/stosb" with simple code sequence if the data size
  is a constant.
   b. Use loop if data size is not a constant.
3. Use memcpy/memset library function if data size is unknown or > 256.
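
The decision in points 2 and 3 can be sketched as a small C++ function. This is only an illustration under stated assumptions: the enum, the constant name and choose_strategy are hypothetical and are not the names used in the x86 backend, and the by-pieces load/store expansion from point 1 is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical names, for illustration only.
enum class InlineStrategy { RepMovsbStosb, Loop, LibraryCall };

// 16 * 16 bytes, the limit quoted in the description above.
constexpr std::size_t kInlineLimit = 16 * 16;
static_assert(kInlineLimit == 256, "limit from the description");

// max_size: upper bound on the data size if one is known, empty otherwise.
// size_is_constant: true when the size is a compile-time constant.
InlineStrategy choose_strategy(std::optional<std::size_t> max_size,
                               bool size_is_constant) {
  // Point 3: unknown size, or larger than 256 bytes, goes to the library.
  if (!max_size || *max_size > kInlineLimit)
    return InlineStrategy::LibraryCall;
  // Point 2a / 2b: inline, picking rep movsb/stosb for constant sizes.
  return size_is_constant ? InlineStrategy::RepMovsbStosb
                          : InlineStrategy::Loop;
}
```

The sketch needs C++17 for std::optional.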

Here is the performance data from March 2021 when the original patch was
submitted.  With -march=x86-64 -O2,

1. On Ice Lake processor,

Performance impacts on SPEC CPU 2017:

500.perlbench_r   0.51%
502.gcc_r         0.55%
505.mcf_r         0.38%
520.omnetpp_r    -0.74%
523.xalancbmk_r  -0.35%
525.x264_r        2.99%
531.deepsjeng_r  -0.17%
541.leela_r      -0.98%
548.exchange2_r   0.89%
557.xz_r          0.70%
Geomean           0.37%

503.bwaves_r      0.04%
507.cactuBSSN_r  -0.01%
508.namd_r       -0.45%
510.parest_r     -0.09%
511.povray_r     -1.37%
519.lbm_r         0.00%
521.wrf_r        -2.56%
526.blender_r    -0.01%
527.cam4_r       -0.05%
538.imagick_r     0.36%
544.nab_r         0.08%
549.fotonik3d_r  -0.06%
554.roms_r        0.05%
Geomean          -0.34%

Significant impacts on eembc benchmarks:

eembc/nnet_test  14.85%
eembc/mp2decoddata2  13.57%

2. On Cascadelake processor,

Performance impacts on SPEC CPU 2017:

500.perlbench_r  -0.02%
502.gcc_r         0.10%
505.mcf_r        -1.14%
520.omnetpp_r    -0.22%
523.xalancbmk_r   0.21%
525.x264_r        0.94%
531.deepsjeng_r  -0.37%
541.leela_r      -0.46%
548.exchange2_r  -0.40%
557.xz_r          0.60%
Geomean          -0.08%

503.bwaves_r     -0.50%
507.cactuBSSN_r   0.05%
508.namd_r       -0.02%
510.parest_r      0.09%
511.povray_r     -1.35%
519.lbm_r         0.00%
521.wrf_r        -0.03%
526.blender_r    -0.83%
527.cam4_r        1.23%
538.imagick_r     0.97%
544.nab_r        -0.02%
549.fotonik3d_r  -0.12%
554.roms_r        0.55%
Geomean           0.00%

Significant impacts on eembc benchmarks:

eembc/nnet_test       9.90%
eembc/mp2decoddata2  16.42%
eembc/textv2data3    -4.86%
eembc/qos            12.90%

3. On Znver3 processor,

Performance impacts on SPEC CPU 2017:

500.perlbench_r  -0.96%
502.gcc_r        -1.06%
505.mcf_r        -0.01%
520.omnetpp_r    -1.45%
523.xalancbmk_r   2.89%
525.x264_r        4.98%
531.deepsjeng_r   0.18%
541.leela_r      -1.54%
548.exchange2_r  -1.25%
557.xz_r         -0.01%
Geomean           0.16%

503.bwaves_r      0.04%
507.cactuBSSN_r   0.85%
508.namd_r       -0.13%
510.parest_r      0.39%
511.povray_r      0.00%
519.lbm_r         0.00%
521.wrf_r         0.28%
526.blender_r    -0.10%
527.cam4_r       -0.58%
538.imagick_r     0.69%
544.nab_r        -0.04%
549.fotonik3d_r  -0.04%
554.roms_r        0.40%
Geomean           0.15%

Significant impacts on eembc benchmarks:

eembc/aifftr01       13.95%
eembc/idctrn01        8.41%
eembc/nnet_test      30.25%
eembc/mp2decoddata2   5.05%
eembc/textv2data3     6.43%
eembc/qos            -5.79%

Code size differences are:

SPEC CPU 2017 with -march=x86-64 -O2

                 before    after     diff
500.perlbench_r  2226178   2226866   0.031%
502.gcc_r        9250727   9253711   0.032%
505.mcf_r        21653     21730     0.356%
520.omnetpp_r    2131839   2133259   0.067%
523.xalancbmk_r  4695615   4696039   0.009%
525.x264_r       490651    490659    0.002%
531.deepsjeng_r  85832     86056     0.261%
541.leela_r      169005    165021   -2.357%
548.exchange2_r  70189     69901    -0.410%
557.xz_r         196314    197506    0.607%
503.bwaves_r     37430     37878     1.197%
507.cactuBSSN_r  3550438   3550622   0.005%
508.namd_r       880455    880519    0.007%
510.parest_r     8561798   8586781   0.292%
511.povray_r     1058268   1058068  -0.019%
519.lbm_r        16415     16415     0.000%
521.wrf_r        23197011  23202227  0.022%
526.blender_r    10408951  10422175  0.127%
527.cam4_r       18979378  18983410  0.021%
538.imagick_r    1999052   1998780  -0.014%
544.nab_r        191416    191688    0.142%
549.fotonik3d_r  384499    384507    0.002%
554.roms_r       853869    854277    0.048%

SPEC CPU 2017 with -march=x86-64 -Ofast -funroll-loops

                 before    after     diff
500.perlbench_r  2940860   2946588   0.195%
502.gcc_r        11577095  11581975  0.042%
505.mcf_r        64469     64546     0.119%
520.omnetpp_r    2549149   2550669   0.060%
523.xalancbmk_r  6992956   6993236   0.004%
525.x264_r       836325    837125    0.096%
531.deepsjeng_r  137280    137464    0.134%
541.leela_r

[PATCH v2] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng

In GCC 14, LoongArch added __float128 as an alias for _Float128.
Commit r15-8962 added support for q/Q suffixes for 128-bit floating point
numbers.  This will cause the compiler to automatically link libquadmath
when compiling Fortran programs.  But on LoongArch `long double` is
IEEE quad, so there is no need to implement libquadmath.
This causes a link failure.
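
As a rough illustration of the "long double is IEEE quad" premise, the following C++ sketch checks whether long double has the binary128 parameters (113-bit significand, maximum binary exponent 16384). The function name long_double_is_ieee_quad is hypothetical; this is not the configure test from the patch, which instead rejects the __float128 path with #error when __loongarch__ is defined.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: true when long double has the IEEE binary128
// parameters, i.e. 113 significand bits and max_exponent 16384.
// On x86 the 80-bit long double has 64 significand bits, so this is false.
constexpr bool long_double_is_ieee_quad() {
  return std::numeric_limits<long double>::digits == 113
         && std::numeric_limits<long double>::max_exponent == 16384;
}
```

On a target where this returns true, long double already covers what __float128 would provide, which is the situation the commit message describes for LoongArch.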

Signed-off-by: Xi Ruoyao 
Signed-off-by: Jakub Jelinek 

PR target/119408

libgfortran/ChangeLog:

* acinclude.m4: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

libquadmath/ChangeLog:

* configure.ac: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

---
v1 -> v2:
Corrected typos in commit information.

---
 libgfortran/acinclude.m4 |  4 
 libgfortran/configure| 18 +-
 libquadmath/configure|  8 
 libquadmath/configure.ac |  4 
 4 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/libgfortran/acinclude.m4 b/libgfortran/acinclude.m4
index a73207e5465..23fd621e518 100644
--- a/libgfortran/acinclude.m4
+++ b/libgfortran/acinclude.m4
@@ -274,6 +274,10 @@ AC_DEFUN([LIBGFOR_CHECK_FLOAT128], [
   AC_CACHE_CHECK([whether we have a usable _Float128 type],
  libgfor_cv_have_float128, [
GCC_TRY_COMPILE_OR_LINK([
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only 
for porting existing code easier.
+#endif
+
 _Float128 foo (_Float128 x)
 {
  _Complex _Float128 z1, z2;
diff --git a/libgfortran/configure b/libgfortran/configure
index 11a1bc5f070..6ee56839968 100755
--- a/libgfortran/configure
+++ b/libgfortran/configure
@@ -16413,7 +16413,7 @@ else
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -16459,7 +16459,7 @@ else
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -16483,7 +16483,7 @@ rm -f core conftest.err conftest.$ac_objext 
conftest.$ac_ext
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -16528,7 +16528,7 @@ else
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -16552,7 +16552,7 @@ rm -f core conftest.err conftest.$ac_objext 
conftest.$ac_ext
 We can't simply define LARGE_OFF_T to be 9223372036854775807,
 since some C++ compilers masquerading as C compilers
 incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
   && LARGE_OFF_T % 2147483647 == 1)
  ? 1 : -1];
@@ -30283,6 +30283,10 @@ else
   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only 
for porting existing code easier.
+#endif
+
 _Float128 foo (_Float128 x)
 {
  _Complex _Float128 z1, z2;
@@ -30336,6 +30340,10 @@ fi
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
 
+#ifdef __loongarch__
+#error On LoongArch we should use long double instead; __float128 is only 
f

[Ada] Fix wrong 'Access to aliased constrained array of controlled type

2025-04-07 Thread Eric Botcazou

For technical reasons, the recently reimplemented finalization machinery for 
controlled types requires arrays of controlled types to be allocated with 
their bounds, including in the case where their nominal subtype is 
constrained.  However, in this case, the type of 'Access for the arrays is 
pointer-to-constrained-array and, therefore, its value must designate the 
array itself and not the bounds.
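
A loose C++ analogy of the layout issue, assuming a hypothetical record that stores the bounds in front of the elements: a pointer to the whole allocation and a pointer to the array differ by the byte offset of the array field, which is the displacement the patch adds with POINTER_PLUS_EXPR. The type and function names below are made up for the illustration and do not correspond to gigi data structures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: bounds first, then the array elements.
struct Bounds { int first; int last; };
struct ArrayWithBounds {
  Bounds bounds;
  int elements[4];
};

// Converts a pointer to the base of the allocation into a pointer to the
// array itself by adding the byte position of the array field.
int* access_to_elements(ArrayWithBounds* base) {
  assert(base != nullptr);
  auto* raw = reinterpret_cast<unsigned char*>(base);
  return reinterpret_cast<int*>(raw + offsetof(ArrayWithBounds, elements));
}
```

Returning the base pointer unchanged would make the result designate the bounds rather than the first element, which is the kind of wrong 'Access value the fix addresses.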

Tested on x86-64/Linux, applied on the mainline (branches are not affected).


2025-04-07  Eric Botcazou  

ada/
* gcc-interface/utils.cc (convert) : Use fold_convert
to convert between thin pointers.  If the source is a thin pointer
with zero offset from the base and the target is a pointer to its
array, displace the pointer after converting it.
* gcc-interface/utils2.cc (build_unary_op) : Use
fold_convert to convert the address before displacing it.

-- 
Eric Botcazou

diff --git a/gcc/ada/gcc-interface/utils.cc b/gcc/ada/gcc-interface/utils.cc
index 1448716acc5..9212827aecf 100644
--- a/gcc/ada/gcc-interface/utils.cc
+++ b/gcc/ada/gcc-interface/utils.cc
@@ -5259,7 +5259,7 @@ convert (tree type, tree expr)
 	  : size_zero_node;
 	  tree byte_diff = size_diffop (type_pos, etype_pos);
 
-	  expr = build1 (NOP_EXPR, type, expr);
+	  expr = fold_convert (type, expr);
 	  if (integer_zerop (byte_diff))
 	return expr;
 
@@ -5267,6 +5267,21 @@ convert (tree type, tree expr)
   fold_convert (sizetype, byte_diff));
 	}
 
+  /* If converting from a thin pointer with zero offset from the base to
+	 a pointer to the array, add the offset of the array field.  */
+  if (TYPE_IS_THIN_POINTER_P (etype)
+	  && !TYPE_UNCONSTRAINED_ARRAY (TREE_TYPE (etype)))
+	{
+	  tree arr_field = DECL_CHAIN (TYPE_FIELDS (TREE_TYPE (etype)));
+
+	  if (TREE_TYPE (type) == TREE_TYPE (arr_field))
+	{
+	  expr = fold_convert (type, expr);
+	  return build_binary_op (POINTER_PLUS_EXPR, type, expr,
+  byte_position (arr_field));
+	}
+	}
+
   /* If converting fat pointer to normal or thin pointer, get the pointer
 	 to the array and then convert it.  */
   if (TYPE_IS_FAT_POINTER_P (etype))
diff --git a/gcc/ada/gcc-interface/utils2.cc b/gcc/ada/gcc-interface/utils2.cc
index 99e592781f5..58418ea7236 100644
--- a/gcc/ada/gcc-interface/utils2.cc
+++ b/gcc/ada/gcc-interface/utils2.cc
@@ -1628,11 +1628,12 @@ build_unary_op (enum tree_code op_code, tree result_type, tree operand)
 		= size_binop (PLUS_EXPR, offset,
 			  size_int (bits_to_bytes_round_down (bitpos)));
 
-	  /* Take the address of INNER, convert it to a pointer to our type
-		 and add the offset.  */
-	  inner = build_unary_op (ADDR_EXPR,
-  build_pointer_type (TREE_TYPE (operand)),
-  inner);
+	  /* Take the address of INNER, formally convert it to a pointer
+		 to the operand type, and finally add the offset.  */
+	  inner = build_unary_op (ADDR_EXPR, NULL_TREE, inner);
+	  inner
+		= fold_convert (build_pointer_type (TREE_TYPE (operand)),
+inner);
 	  result = build_binary_op (POINTER_PLUS_EXPR, TREE_TYPE (inner),
 	inner, offset);
 	  break;

Re: [PATCH v3] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Lulu Cheng




On 2025/4/7 at 4:02 PM, Jakub Jelinek wrote:

On Mon, Apr 07, 2025 at 03:44:52PM +0800, Lulu Cheng wrote:

In GCC 14, LoongArch added __float128 as an alias for _Float128.
Commit r15-8962 added support for q/Q suffixes for 128-bit floating point
numbers.  This will cause the compiler to automatically link libquadmath
when compiling Fortran programs.  But on LoongArch `long double` is
IEEE quad, so there is no need to implement libquadmath.
This causes a link failure.

PR target/119408

libgfortran/ChangeLog:

* acinclude.m4: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

libquadmath/ChangeLog:

* configure.ac: When checking for __float128 support, determine
whether the current architecture is LoongArch.  If so, return false.
* configure: Regenerate.

Signed-off-by: Xi Ruoyao 
Signed-off-by: Jakub Jelinek 

---
v1 -> v2:
Corrected typos in commit information.
v2 -> v3:
Regenerate libgfortran/configure using GNU Autoconf 2.69.

LGTM.

Jakub

Thanks for your review.

Re: [PATCH] LoongArch: Add LoongArch architecture detection to __float128 support in libgfortran and libquadmath [PR119408].

2025-04-07 Thread Jakub Jelinek

On Mon, Apr 07, 2025 at 03:12:22PM +0800, Lulu Cheng wrote:
> In GCC 14, LoongArch added __float128 as an alias for _Float128.
> Commit r15-8962 added support for q/Q suffixes for 128-bit floating point
> numbers.  This will cause the compiler to automatically link libquadmath
> when compiling Fortran programs.  But on LoongArch `long double` is
> IEEE quad, so there is no need to implement libquadmath.
> This causes a link failure.
> 
> Signed-off-by: Xi Ruoyao <1...@xry111.site>
> Signed-off-by: Jakub Jelinek 
> 
>   PR target/119408
> 
> libgfortran/ChangeLog:
> 
>   * acinclude.m4: When checking for __float128 support, determine
>   whether the current architecture is LoongArch.  If so, return false.
>   * configure: Regenerate.
> 
> libquadmath/ChangeLog:
> 
>   * configure.ac: When checking for __float128 support, determine
>   whether the current architecture is LoongArch.  If so, return false.
>   * configure: Regenerate.

> --- a/libgfortran/configure
> +++ b/libgfortran/configure
> @@ -16413,7 +16413,7 @@ else
>  We can't simply define LARGE_OFF_T to be 9223372036854775807,
>  since some C++ compilers masquerading as C compilers
>  incorrectly reject 9223372036854775807.  */
> -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
> +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
>int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
>  && LARGE_OFF_T % 2147483647 == 1)
> ? 1 : -1];
> @@ -16459,7 +16459,7 @@ else
>  We can't simply define LARGE_OFF_T to be 9223372036854775807,
>  since some C++ compilers masquerading as C compilers
>  incorrectly reject 9223372036854775807.  */
> -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
> +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
>int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
>  && LARGE_OFF_T % 2147483647 == 1)
> ? 1 : -1];
> @@ -16483,7 +16483,7 @@ rm -f core conftest.err conftest.$ac_objext 
> conftest.$ac_ext
>  We can't simply define LARGE_OFF_T to be 9223372036854775807,
>  since some C++ compilers masquerading as C compilers
>  incorrectly reject 9223372036854775807.  */
> -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
> +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
>int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
>  && LARGE_OFF_T % 2147483647 == 1)
> ? 1 : -1];
> @@ -16528,7 +16528,7 @@ else
>  We can't simply define LARGE_OFF_T to be 9223372036854775807,
>  since some C++ compilers masquerading as C compilers
>  incorrectly reject 9223372036854775807.  */
> -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
> +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
>int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
>  && LARGE_OFF_T % 2147483647 == 1)
> ? 1 : -1];
> @@ -16552,7 +16552,7 @@ rm -f core conftest.err conftest.$ac_objext 
> conftest.$ac_ext
>  We can't simply define LARGE_OFF_T to be 9223372036854775807,
>  since some C++ compilers masquerading as C compilers
>  incorrectly reject 9223372036854775807.  */
> -#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
> +#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
>int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
>  && LARGE_OFF_T % 2147483647 == 1)
> ? 1 : -1];

The above hunks clearly show that you're regenerating it with some patched
autoconf or something like that.  Please manually remove those hunks or use
vanilla upstream autoconf 2.69.  Otherwise the CI will complain.

Otherwise the patch LGTM (though I think Signed-off-by headers belong after
the ChangeLog entry, not before it).

Jakub
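
For reference, the two LARGE_OFF_T spellings that appear in the unwanted hunks compute the same value, so the difference is purely textual. A minimal C++ sketch, assuming a 64-bit signed off_t (modelled here by the hypothetical alias off64):

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a 64-bit off_t; the alias name is an assumption.
using off64 = std::int64_t;

// Spelling on the removed ('-') lines of the hunks.
constexpr off64 kLargeOffOld = (((off64) 1 << 62) - 1 + ((off64) 1 << 62));
// Spelling on the added ('+') lines of the hunks.
constexpr off64 kLargeOffNew =
    ((((off64) 1 << 31) << 31) - 1 + (((off64) 1 << 31) << 31));

static_assert(kLargeOffOld == kLargeOffNew, "both spellings agree");
static_assert(kLargeOffOld == INT64_MAX, "value is 2^63 - 1");
// The divisibility checks used by the configure test itself.
static_assert(kLargeOffOld % 2147483629 == 721
              && kLargeOffOld % 2147483647 == 1,
              "remainders match the configure check");
```

Since the value is identical, dropping these hunks does not change what the configure test checks.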

[PATCH v5 1/5] gomp: Various fixes for SVE types [PR101018]

2025-04-07 Thread Tejas Belagod

From: Richard Sandiford 

Various parts of the omp code checked whether the size of a decl
was an INTEGER_CST in order to determine whether the decl was
variable-sized or not.  If it was variable-sized, it was expected
to have a DECL_VALUE_EXPR replacement, as for VLAs.

This patch uses poly_int_tree_p instead, so that variable-length
SVE vectors are treated like constant-length vectors.  This means
that some structures become poly_int-sized, with some fields at
poly_int offsets, but we already have code to handle that.

An alternative would have been to handle the data via indirection
instead.  However, that's likely to be more complicated, and it
would contradict is_variable_sized, which already uses a check
for TREE_CONSTANT rather than INTEGER_CST.

gimple_add_tmp_var should probably not add a safelen of 1
for SVE vectors, but that's really a separate thing and might
be hard to test.

Co-authored-by: Tejas Belagod 

gcc/
PR middle-end/101018
* poly-int.h (can_and_p): New function.
* fold-const.cc (poly_int_binop): Use it to optimize BIT_AND_EXPRs
involving POLY_INT_CSTs.
* gimplify.cc (omp_notice_variable): Use poly_int_tree_p instead
of INTEGER_CST when checking for constant-sized omp data.
(gimplify_adjust_omp_clauses_1): Likewise.
(gimplify_adjust_omp_clauses): Likewise.
* omp-low.cc (scan_sharing_clauses): Likewise.
---
 gcc/fold-const.cc |  7 +++
 gcc/gimplify.cc   | 19 +--
 gcc/omp-low.cc|  2 +-
 gcc/poly-int.h| 19 +++
 4 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 3e20538de9f..1275ef75315 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -1284,6 +1284,13 @@ poly_int_binop (poly_wide_int &res, enum tree_code code,
return false;
   break;
 
+case BIT_AND_EXPR:
+  if (TREE_CODE (arg2) != INTEGER_CST
+ || !can_and_p (wi::to_poly_wide (arg1), wi::to_wide (arg2),
+&res))
+   return false;
+  break;
+
 case BIT_IOR_EXPR:
   if (TREE_CODE (arg2) != INTEGER_CST
  || !can_ior_p (wi::to_poly_wide (arg1), wi::to_wide (arg2),
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index e90220cc2a0..55cab7a74a8 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -9301,7 +9301,8 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
decl, bool in_code)
   && (flags & (GOVD_SEEN | GOVD_LOCAL)) == GOVD_SEEN
   && DECL_SIZE (decl))
 {
-  if (TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
+  tree size;
+  if (!poly_int_tree_p (DECL_SIZE (decl)))
{
  splay_tree_node n2;
  tree t = DECL_VALUE_EXPR (decl);
@@ -9312,16 +9313,14 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
decl, bool in_code)
  n2->value |= GOVD_SEEN;
}
   else if (omp_privatize_by_reference (decl)
-  && TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl)))
-  && (TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl))))
-  != INTEGER_CST))
+  && (size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl))))
+  && !poly_int_tree_p (size))
{
  splay_tree_node n2;
- tree t = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl)));
- gcc_assert (DECL_P (t));
- n2 = splay_tree_lookup (ctx->variables, (splay_tree_key) t);
+ gcc_assert (DECL_P (size));
+ n2 = splay_tree_lookup (ctx->variables, (splay_tree_key) size);
  if (n2)
-   omp_notice_variable (ctx, t, true);
+   omp_notice_variable (ctx, size, true);
}
 }
 
@@ -14581,7 +14580,7 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void 
*data)
   if ((gimplify_omp_ctxp->region_type & ORT_ACC) == 0)
OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P (clause) = 1;
   if (DECL_SIZE (decl)
- && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
+ && !poly_int_tree_p (DECL_SIZE (decl)))
{
  tree decl2 = DECL_VALUE_EXPR (decl);
  gcc_assert (INDIRECT_REF_P (decl2));
@@ -15322,7 +15321,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, 
gimple_seq body, tree *list_p,
  if (!DECL_P (decl))
break;
  if (DECL_SIZE (decl)
- && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
+ && !poly_int_tree_p (DECL_SIZE (decl)))
{
  tree decl2 = DECL_VALUE_EXPR (decl);
  gcc_assert (INDIRECT_REF_P (decl2));
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index e369df6e8f1..e1036adab28 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -1461,7 +1461,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
  else
install_var_field (decl, false, 11, ctx);
  if (DECL_SIZE (decl)
- && TREE_CODE (DECL_SIZE (decl)) != INTEGER_CST)
+ && !poly_int_tree_p (DECL_SIZE (decl)))
{

[PATCH v5 3/5] AArch64: Diagnose OpenMP offloading when SVE types involved.

2025-04-07 Thread Tejas Belagod

The target clause in OpenMP is used to offload loop kernels to accelerator
peripherals.  target's 'map' clause is used to move data to and from the
accelerator.  When the data is of an SVE type, this may not be suitable for
various reasons, e.g. the two SVE targets may not agree on vector size, or
some targets may not support variable vector size.  This makes SVE unsuitable
for use in OMP's 'map' clause.  This patch diagnoses all such cases and issues
an error where SVE types are not suitable.

Co-authored-by: Andrea Corallo 

gcc/ChangeLog:

* target.h (type_context_kind): Add new context kinds for target 
clauses.
(omp_type_context): Query if the context is of OMP kind.
* config/aarch64/aarch64-sve-builtins.cc (verify_type_context): Diagnose
SVE types for a given OpenMP context.
(omp_type_context): New.
* gimplify.cc (omp_notice_variable): Diagnose implicitly-mapped SVE
objects in OpenMP regions.
(gimplify_scan_omp_clauses): Diagnose SVE types for various target
clauses.
---
 gcc/config/aarch64/aarch64-sve-builtins.cc | 37 ++-
 gcc/gimplify.cc| 41 +-
 gcc/target.h   | 37 ++-
 3 files changed, 112 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 44e4807325a..36519262efd 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -5174,7 +5174,11 @@ bool
 verify_type_context (location_t loc, type_context_kind context,
 const_tree type, bool silent_p)
 {
-  if (!sizeless_type_p (type))
+  const_tree tmp = type;
+  if (omp_type_context (context) && POINTER_TYPE_P (type))
+tmp = strip_pointer_types (tmp);
+
+  if (!sizeless_type_p (tmp))
 return true;
 
   switch (context)
@@ -5234,6 +5238,37 @@ verify_type_context (location_t loc, type_context_kind 
context,
   if (!silent_p)
error_at (loc, "capture by copy of SVE type %qT", type);
   return false;
+
+case TCTX_OMP_MAP:
+  if (!silent_p)
+   error_at (loc, "SVE type %qT not allowed in %<map%> clause", type);
+  return false;
+
+case TCTX_OMP_MAP_IMP_REF:
+  if (!silent_p)
+   error ("cannot reference %qT object types in %<target%> region", type);
+  return false;
+
+case TCTX_OMP_PRIVATE:
+  if (!silent_p)
+   error_at (loc, "SVE type %qT not allowed in"
+ " %<target%> %<private%> clause", type);
+  return false;
+
+case TCTX_OMP_FIRSTPRIVATE:
+  if (!silent_p)
+   error_at (loc, "SVE type %qT not allowed in"
+ " %<target%> %<firstprivate%> clause", type);
+  return false;
+
+case TCTX_OMP_DEVICE_ADDR:
+  if (!silent_p)
+   error_at (loc, "SVE type %qT not allowed in"
+ " %<target%> device clauses", type);
+  return false;
+
+default:
+  break;
 }
   gcc_unreachable ();
 }
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 55cab7a74a8..51595f6f51e 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -9222,11 +9222,18 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree 
decl, bool in_code)
  | GOVD_MAP_ALLOC_ONLY)) == flags)
{
  tree type = TREE_TYPE (decl);
+ location_t loc = DECL_SOURCE_LOCATION (decl);
 
  if (gimplify_omp_ctxp->target_firstprivatize_array_bases
  && omp_privatize_by_reference (decl))
type = TREE_TYPE (type);
- if (!omp_mappable_type (type))
+
+ if (!verify_type_context (loc, TCTX_OMP_MAP_IMP_REF, type))
+   /* Check if TYPE can appear in a target region.
+  verify_type_context has already issued an error if it
+  can't.  */
+   nflags |= GOVD_MAP | GOVD_EXPLICIT;
+ else if (!omp_mappable_type (type))
{
  error ("%qD referenced in target region does not have "
 "a mappable type", decl);
@@ -12956,6 +12963,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
   unsigned int flags;
   tree decl;
   auto_vec addr_tokens;
+  tree op = NULL_TREE;
+  location_t loc = OMP_CLAUSE_LOCATION (c);
 
   if (grp_end && c == OMP_CLAUSE_CHAIN (grp_end))
{
@@ -12963,6 +12972,36 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
  grp_end = NULL_TREE;
}
 
+  if (code == OMP_TARGET
+ || code == OMP_TARGET_DATA
+ || code == OMP_TARGET_ENTER_DATA
+ || code == OMP_TARGET_EXIT_DATA)
+   /* Do some target-specific type checks for map operands.  */
+   switch (OMP_CLAUSE_CODE (c))
+ {
+ case OMP_CLAUSE_MAP:
+   op = OMP_CLAUSE_OPERAND (c, 0);
+   verify_type_context (loc, TCTX_OMP_MAP, TREE_TYPE (op));
+   break;
+ case OMP_CLAUS

[patch,avr,applied]: Improve __[u]mulhisi3 on AVRrc

2025-04-07 Thread Georg-Johann Lay


When MUL is not available, then the __umulhisi3 and __mulhisi3
functions can use __mulhisi3_helper.  This improves code size,
stack footprint and runtime on AVRrc.  Applied as obvious.

Johann

--


AVRrc: Tweak __[u]mulhisi3.

When MUL is not available, then the __umulhisi3 and __mulhisi3
functions can use __mulhisi3_helper.  This improves code size,
stack footprint and runtime on AVRrc.

libgcc/
* config/avr/lib1funcs.S (__mulhisi3, __umulhisi3): Use
__mulhisi3_helper for better performance on AVRrc.

diff --git a/libgcc/config/avr/lib1funcs.S b/libgcc/config/avr/lib1funcs.S
index 52ce051e00f..dfe99b1ea06 100644
--- a/libgcc/config/avr/lib1funcs.S
+++ b/libgcc/config/avr/lib1funcs.S
@@ -395,29 +395,23 @@ DEFUN __mulhi3
 
 #if defined (L_umulhisi3)
 DEFUN __umulhisi3
-#ifndef __AVR_TINY__
+#ifdef __AVR_TINY__
+;; Save callee saved regs.
+push B0
+push B1
+#endif /* AVR_TINY */
 wmov B0, 24
 ;; Zero-extend B
 clr B2
 clr B3
 ;; Zero-extend A
 wmov A2, B2
-XJMP __mulsi3
+#ifdef __AVR_TINY__
+;; Clear hi16 of the result so we can use __mulsi3_helper.
+wmov CC2, B2
+XJMP __mulsi3_helper
 #else
-;; Push zero-extended R24
-push __zero_reg__
-push __zero_reg__
-push r25
-push r24
-;; Zero-extend R22
-clr R24
-clr R25
-XCALL   __mulsi3
-pop __tmp_reg__
-pop __tmp_reg__
-pop __tmp_reg__
-pop __tmp_reg__
-ret
+XJMP __mulsi3
 #endif /* AVR_TINY? */
 ENDF __umulhisi3
 #endif /* L_umulhisi3 */
@@ -425,54 +419,33 @@ DEFUN __umulhisi3
 #if defined (L_mulhisi3)
 DEFUN __mulhisi3
 #ifdef __AVR_TINY__
-;; Push sign-extended R24
-mov __tmp_reg__, r25
-lsl __tmp_reg__
-sbc __tmp_reg__, __tmp_reg__
-push __tmp_reg__
-push __tmp_reg__
-push r25
-push r24
-;;  Sign-extend R22
-mov r24, r23
-lsl r24
-sbc r24, r24
-sbc r25, r25
-XCALL   __mulsi3
-pop __tmp_reg__
-pop __tmp_reg__
-pop __tmp_reg__
-pop __tmp_reg__
-ret
-#else
+;; Save callee saved regs.
+push B0
+push B1
+#endif /* AVR_TINY */
 wmov B0, 24
 ;; Sign-extend B
 lsl r25
 sbc B2, B2
 mov B3, B2
-#ifdef __AVR_ERRATA_SKIP_JMP_CALL__
-;; Sign-extend A
-clr A2
-sbrc A1, 7
-com A2
-mov A3, A2
-XJMP __mulsi3
-#else /*  no __AVR_ERRATA_SKIP_JMP_CALL__ */
 ;; Zero-extend A and __mulsi3 will run at least twice as fast
 ;; compared to a sign-extended A.
 clr A2
 clr A3
+;; Clear hi16 of the result so we can use __mulsi3_helper.
+wmov CC2, A2
 sbrs A1, 7
-XJMP __mulsi3
+#ifdef __AVR_ERRATA_SKIP_JMP_CALL__
+rjmp 1f
+#else
+XJMP __mulsi3_helper
+#endif /* ERRATA_SKIP */
 ;; If  A < 0  then perform the  B * 0x before the
 ;; very multiplication by initializing the high part of the
 ;; result CC with -B.
-wmov CC2, A2
 sub CC2, B0
 sbc CC3, B1
-XJMP __mulsi3_helper
-#endif /*  __AVR_ERRATA_SKIP_JMP_CALL__ */
-#endif /* AVR_TINY? */
+1:  XJMP __mulsi3_helper
 ENDF __mulhisi3
 #endif /* L_mulhisi3 */
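
The arithmetic behind the signed variant can be checked with a small C model.
This is only a sketch under assumptions: the model_* names are hypothetical,
and the helper is modelled as "accumulator plus zero-extended A times
sign-extended B, modulo 2^32", which is how the comments in the patch
describe its use.  When A is negative, the high 16 bits of the accumulator
are preloaded with -B, which compensates for treating A as unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the helper: add A * B to ACC, modulo 2^32.  */
static uint32_t
model_mulsi3_helper (uint32_t acc, uint32_t a, uint32_t b)
{
  return acc + a * b;
}

/* Signed 16 x 16 -> 32 multiply, returned as a 32-bit pattern.
   A is zero-extended and B is sign-extended.  If A < 0, its unsigned
   reading is too large by 2^16, so B * 2^16 must be subtracted; this is
   done by initializing the high half of the accumulator with -B.  */
uint32_t
model_mulhisi3_bits (int16_t a, int16_t b)
{
  uint32_t a_zext = (uint16_t) a;
  uint32_t b_sext = (uint32_t) (int32_t) b;
  uint32_t acc = 0;

  if (a < 0)
    acc = (uint32_t) (uint16_t) (0u - (uint16_t) b) << 16;

  return model_mulsi3_helper (acc, a_zext, b_sext);
}
```

For instance, model_mulhisi3_bits (-3, 5) yields the bit pattern of -15, the
same value as a plain 32-bit signed multiplication.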

Re: GCN, nvptx libstdc++: Force use of '__atomic' builtins [PR119645]

2025-04-07 Thread Andrew Stubbs


On 07/04/2025 09:07, Thomas Schwinge wrote:

Hi!

On 2025-03-14T11:39:20+0100, I wrote:

As the first of a few patches to enable libstdc++ for GCN, nvptx targets,
[...]

some more fine-tuning is to follow later on.)


Any comments before I push the attached
"GCN, nvptx libstdc++: Force use of '__atomic' builtins [PR119645]"?

Jonathan, please put a sharp eye on the
'libstdc++-v3/acinclude.m4:GLIBCXX_ENABLE_LOCK_POLICY' change; to make
sure this only affects GCN, nvptx, but nothing else.


Grüße
  Thomas


+  amdgcn-*-amdhsa)
+# To avoid greater pain elsewhere, force use of '__atomic' builtins,
+# irregardless of outcome of 'configure' checks; see PR119645
+# "GCN, nvptx: libstdc++ 'checking for atomic builtins [...]... no'".
+atomicity_dir=cpu/generic/atomicity_builtins
+;;


"irregardless" is not a good word. Just use "regardless" (of the 
outcome) or "disregarding" (the outcome).


Otherwise LGTM. At least GCN certainly does support atomics, so the 
configure test must be broken somehow.


Andrew
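
For readers unfamiliar with the '__atomic' builtins mentioned above, here is
a minimal sketch of an exchange-and-add primitive written with them.  The
function name demo_exchange_and_add is hypothetical and is not the libstdc++
code; only the __atomic_fetch_add builtin and the __ATOMIC_ACQ_REL memory
order are real GCC features.

```c
#include <assert.h>

/* Atomically add DELTA to *COUNTER and return the previous value.
   Hypothetical helper; it simply wraps the GCC builtin.  */
int
demo_exchange_and_add (int *counter, int delta)
{
  return __atomic_fetch_add (counter, delta, __ATOMIC_ACQ_REL);
}
```

A caller that starts with a counter of 5 and adds 3 gets 5 back and finds 8
in the counter afterwards.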

[ping^2] [PATCH] includes, Darwin: Handle modular use for macOS SDKs [PR116827].

2025-04-07 Thread Iain Sandoe

Hi Folks

this has gone more than 2 weeks without comment,
(it is darwin-local)
thanks
Iain

> On 29 Mar 2025, at 15:23, Iain Sandoe  wrote:
> 
> C++ modules are not really usable on latest Darwin without resolving this,
> thanks
> Iain
> 
>> On 23 Mar 2025, at 12:29, Iain Sandoe  wrote:
>> 
>> From: Iain Sandoe 
>> 
>> Tested on x86_64/aarch64 Darwin and x86_64-linux,
>> OK for trunk?
>> backports to branches supporting modules?
>> thanks
>> Iain
>> 
>> --- 8< ---
>> 
>> Recent changes to the OS SDKs have altered the way in which include guards
>> are used for a number of headers when C++ modules are enabled.  Instead of
>> placing the guards in the included header, they are being placed in the
>> including header.  This breaks the assumptions in the current GCC stddef.h
>> specifically, that the presence of __PTRDIFF_T and __SIZE_T means that the
>> relevant defs are already made.  However in the case of the module-enabled
>> C++ with these SDKs, that is no longer true.
>> 
>> stddef.h has a large body of special-cases already, but it seems that the
>> only viable solution here is to add a new one specifically for __APPLE__
>> and modular code.
>> 
>> This fixes around 280 new fails in the modules test-suite; it is needed on
>> all open branches that support modules.
>> 
>>  PR target/116827
>> 
>> gcc/ChangeLog:
>> 
>>  * ginclude/stddef.h: Undefine __PTRDIFF_T and __SIZE_T for module-
>>  enabled c++ on Darwin/macOS platforms.
>> 
>> Signed-off-by: Iain Sandoe 
>> ---
>> gcc/ginclude/stddef.h | 11 +++
>> 1 file changed, 11 insertions(+)
>> 
>> diff --git a/gcc/ginclude/stddef.h b/gcc/ginclude/stddef.h
>> index 0d53103ce20..bf9c6e609dc 100644
>> --- a/gcc/ginclude/stddef.h
>> +++ b/gcc/ginclude/stddef.h
>> @@ -89,6 +89,17 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
>> If not, see
>> #undef _PTRDIFF_T_
>> #endif
>> 
>> +#if defined (__APPLE__)
>> +# if defined(__has_feature) && __has_feature(modules)
>> +#  if defined (__need_ptrdiff_t)
>> +#   undef __PTRDIFF_T
>> +#  endif
>> +#  if defined (__need_size_t)
>> +#   undef __SIZE_T
>> +#  endif
>> +# endif
>> +#endif
>> +
>> /* On VxWorks, <type/vxTypesBase.h> may have defined macros like
>>   _TYPE_size_t which will typedef size_t.  fixincludes patched the
>>   vxTypesBase.h so that this macro is only defined if _GCC_SIZE_T is
>> -- 
>> 2.39.2 (Apple Git-143)
>> 
>
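
The guard problem described in the quoted patch can be reproduced in
miniature.  This is a sketch with hypothetical names (DEMO_SIZE_T_GUARD,
demo_need_size_t, demo_size_t) standing in for __SIZE_T, __need_size_t and
size_t; it is not the stddef.h code.  It shows that a guard defined by the
including header, without the typedef, would suppress the definition unless
the guard is undefined first.

```c
#include <assert.h>

/* Step 1: simulate an SDK header that sets the guard in the *including*
   header but does not provide the typedef itself.  */
#define DEMO_SIZE_T_GUARD

/* Step 2: the consumer asks for the type.  */
#define demo_need_size_t

/* Step 3: analogue of the patch: drop the stale guard when the type
   has been requested.  */
#if defined (demo_need_size_t)
# undef DEMO_SIZE_T_GUARD
#endif

/* Step 4: the classic guarded definition now takes effect.  */
#ifndef DEMO_SIZE_T_GUARD
#define DEMO_SIZE_T_GUARD
typedef unsigned long demo_size_t;
#endif

/* Only compiles if the typedef above was actually made.  */
unsigned
demo_size_bytes (void)
{
  return (unsigned) sizeof (demo_size_t);
}
```

Removing step 3 makes the translation unit fail to compile, because
demo_size_t is then never defined.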

Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language

2025-04-07 Thread Qing Zhao




> On Apr 7, 2025, at 10:31, Michael Matz  wrote:
> 
> Hello,
> 
> On Mon, 7 Apr 2025, Martin Uecker wrote:
> 
>>> So, what specifically would the two attributes do different?  FWIW: what 
>>> worries me about accepting a generic expression in counted_by, that isn't 
>>> prefixed by a (possibly empty) decl, is that after seeing a non-type 
>>> identifier the parser doesn't yet know if it's the lone-ident case (look 
>>> up in struct scope) or the expression case (look up everything in global 
>>> scope).  It requires look-ahead to decide this.
>>> 
>>> Would that be the difference between the attributes?  One accepting _only_ 
>>> a lone-ident or the decl+expr syntax, and the other _only_ expressions 
>>> that are never looked up in struct-scope (not even if it's a lone-ident)?
>> 
>> My understanding is that one accepts only a lone identifier and nothing
>> else, i.e.
>> 
>> counted_by(identifier)
>> 
>> and the other only accepts expressions, possibly including a forward
>> declaration.
>> 
>> counted_by_expr(expression)
>> counted_by_expr(decl; expression)
> 
> What exactly happens when counted_by_expr is used with only an identifier 
> expression, without decl?  Is the ident looked up normally, i.e. not in 
> struct scope?

Yes, with the new counted_by_expr, all identifiers in the expression that are 
not declared before the expression will be looked up normally. 

Qing
>  If so, then good, it would resolve my worry.
> 
> 
> Ciao,
> Michael.
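
For context, the existing lone-identifier form of the attribute looks roughly
like the sketch below.  The demo_* names and the DEMO_COUNTED_BY wrapper
macro are hypothetical; the wrapper only exists so the code still compiles
with compilers that lack the attribute.  The counted_by_expr spelling
discussed in this thread is a proposal and is therefore not used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Expand to the attribute where the compiler knows it, else to nothing.  */
#if __has_attribute (counted_by)
# define DEMO_COUNTED_BY(member) __attribute__ ((counted_by (member)))
#else
# define DEMO_COUNTED_BY(member)
#endif

struct demo_buf
{
  size_t count;
  /* 'count' is a lone identifier, looked up in the enclosing struct.  */
  int data[] DEMO_COUNTED_BY (count);
};

/* Allocate a buffer with room for N elements and record N in 'count'.
   Returns NULL if the allocation fails.  */
struct demo_buf *
demo_buf_new (size_t n)
{
  struct demo_buf *b = malloc (sizeof *b + n * sizeof b->data[0]);
  if (b != NULL)
    b->count = n;
  return b;
}
```

The point of the lookup question above is that 'count' here names a member,
whereas in a general expression an identifier would normally be resolved
in the surrounding scope.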

Re: [ping^2] [PATCH] includes, Darwin: Handle modular use for macOS SDKs [PR116827].

2025-04-07 Thread Rainer Orth

Hi Iain,

>>> diff --git a/gcc/ginclude/stddef.h b/gcc/ginclude/stddef.h
>>> index 0d53103ce20..bf9c6e609dc 100644
>>> --- a/gcc/ginclude/stddef.h
>>> +++ b/gcc/ginclude/stddef.h
>>> @@ -89,6 +89,17 @@ see the files COPYING3 and COPYING.RUNTIME respectively. 
>>>  If not, see
>>> #undef _PTRDIFF_T_
>>> #endif
>>> 
>>> +#if defined (__APPLE__)
>>> +# if defined(__has_feature) && __has_feature(modules)
>>> +#  if defined (__need_ptrdiff_t)
>>> +#   undef __PTRDIFF_T
>>> +#  endif
>>> +#  if defined (__need_size_t)
>>> +#   undef __SIZE_T
>>> +#  endif
>>> +# endif
>>> +#endif

shouldn't this have a comment explaining the need for this?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University
